Date: Thu, 4 Feb 2021 09:07:27 +0000 (UTC)
From: Christoph Lameter
To: Vincent Guittot
Cc: Bharata B Rao, Vlastimil Babka, Will Deacon, linux-kernel, linux-mm@kvack.org, David Rientjes, Joonsoo Kim, Andrew Morton, guro@fb.com, Shakeel Butt, Johannes Weiner, aneesh.kumar@linux.ibm.com, Jann Horn, Michal Hocko, Catalin Marinas
Subject: Re: [RFC PATCH v0] mm/slub: Let number of online CPUs determine the slub page order

On Thu, 4 Feb 2021, Vincent
Guittot wrote:

> > So what is preferable here now? Above or other quick fix or reverting
> > the original commit?
>
> I'm fine with whatever the solution as long as we can use keep using
> nr_cpu_ids when other values like num_present_cpus, don't reflect
> correctly the system

AFAICT they are correctly reflecting the current state of the system.

The problem here is the bringup of the system and the tuning therefor.

One additional thing that may help: The slab caches can work in a degraded
mode where no fastpath allocations can occur. That mode is used primarily
for debugging, but maybe it can also help during bootstrap to avoid
having to deal with the per cpu data and so on.

In degraded mode SLUB will take a lock for each operation on an object.

In this mode the following is true:

	kmem_cache_cpu->page == NULL
	kmem_cache_cpu->freelist == NULL
	kmem_cache_debug(s) == true

So if you define a new debug mode and include it in SLAB_DEBUG_FLAGS then
you can force SLUB to fall back to operations where a lock is taken and
where slab allocation can be stopped. This may be ok for bring up.

The debug flags are also tied to some wizardry that can patch the code at
runtime to optimize for debugging or fast operations. You would tie into
that one as well. Start in debug mode by default and switch to fast
operations after all processors are up.
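To make the "start in debug mode, switch after all processors are up" idea
concrete, here is a rough userspace toy model. It is not SLUB code and not
a proposed patch: every name in it (toy_cache, toy_alloc, toy_all_cpus_up,
and so on) is hypothetical, the lock is only simulated by a counter, and a
plain bool stands in for the runtime code patching mentioned above. It only
illustrates the control flow: while the debug flag is set, the per-cpu
freelist stays NULL and every allocation goes through the locked slow path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model. None of these names exist in the kernel. */
struct toy_obj {
	struct toy_obj *next;
};

struct toy_cache {
	bool debug;                   /* models kmem_cache_debug(s) == true */
	struct toy_obj *cpu_freelist; /* models kmem_cache_cpu->freelist */
	struct toy_obj *shared;       /* free objects guarded by the "lock" */
	unsigned int lock_taken;      /* counts simulated lock acquisitions */
};

/* Chain n objects onto the shared list and start in debug mode. */
static void toy_cache_init(struct toy_cache *c, struct toy_obj *objs, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i + 1 < n; i++)
		objs[i].next = &objs[i + 1];
	objs[n - 1].next = NULL;

	c->debug = true;
	c->cpu_freelist = NULL;
	c->shared = objs;
	c->lock_taken = 0;
}

/* Remove and return the first object of a list, or NULL if it is empty. */
static struct toy_obj *toy_pop(struct toy_obj **head)
{
	struct toy_obj *o = *head;

	if (o)
		*head = o->next;
	return o;
}

static void *toy_alloc(struct toy_cache *c)
{
	struct toy_obj *o;

	/* Fast path: only outside debug mode, and only if the per-cpu list is populated. */
	if (!c->debug) {
		o = toy_pop(&c->cpu_freelist);
		if (o)
			return o;
	}

	/* Slow path: a real allocator would take a lock here. */
	c->lock_taken++;
	o = toy_pop(&c->shared);
	if (o && !c->debug) {
		/* Outside debug mode, hand the remaining objects to the per-cpu list. */
		c->cpu_freelist = c->shared;
		c->shared = NULL;
	}
	return o;
}

/* Hypothetical hook: call once all processors are online. */
static void toy_all_cpus_up(struct toy_cache *c)
{
	c->debug = false;
}
```

With a cache initialized over six objects, two calls to toy_alloc() in
debug mode each bump lock_taken and leave cpu_freelist at NULL. After
toy_all_cpus_up(), the next allocation takes the slow path once more to
refill the per-cpu list, and the allocations after that no longer touch
the lock counter.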