Date: Wed, 10 Feb 2021 14:07:12 +0000
From: Mel Gorman <mgorman@techsingularity.net>
To: Vlastimil Babka
Cc: Catalin.Marinas@arm.com, akpm@linux-foundation.org,
	aneesh.kumar@linux.ibm.com, bharata@linux.ibm.com, cl@linux.com,
	guro@fb.com, hannes@cmpxchg.org, iamjoonsoo.kim@lge.com,
	jannh@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	mhocko@kernel.org, rientjes@google.com, shakeelb@google.com,
	vincent.guittot@linaro.org, will@kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] mm, slub: better heuristic for number of cpus when calculating slab order
Message-ID: <20210210140712.GB3697@techsingularity.net>
References: <20210208134108.22286-1-vbabka@suse.cz>
MIME-Version: 1.0
Content-Type: text/plain;
	charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <20210208134108.22286-1-vbabka@suse.cz>
User-Agent: Mutt/1.10.1 (2018-07-13)
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Feb 08, 2021 at 02:41:08PM +0100, Vlastimil Babka wrote:
> When creating a new kmem cache, SLUB determines how large the slab pages
> will be based on a number of inputs, including the number of CPUs in the
> system. Larger slab pages mean that more objects can be allocated/freed
> from per-cpu slabs before accessing shared structures, but also that
> potentially more memory can be wasted due to low slab usage and
> fragmentation.
> The rough idea of using the number of CPUs is that larger systems will be
> more likely to benefit from reduced contention, and should also have
> enough memory to spare.
> 
> So this patch tries to determine the best available value without
> specific arch knowledge:
> - num_present_cpus() if the number is larger than 1, as that means the
>   arch is likely setting it properly
> - nr_cpu_ids otherwise
> 
> This should fix the reported regressions while also keeping the effect
> of 045ab8c9487b for PowerPC systems. It's possible there are
> configurations where num_present_cpus() is 1 during boot while
> nr_cpu_ids is at the same time bloated, so these (if they exist) would
> keep the large orders based on nr_cpu_ids as was the case before
> 045ab8c9487b.

Tested-by: Mel Gorman

Only x86-64 tested, three machines, all showing similar results as would
be expected.
One example:

hackbench-process-sockets
                         5.11.0-rc7             5.11.0-rc7             5.11.0-rc7
                            vanilla            revert-v1r1        vbabka-fix-v1r1
Amean     1        0.3873 (   0.00%)      0.4060 (  -4.82%)      0.3747 (   3.27%)
Amean     4        1.3767 (   0.00%)      0.7700 *  44.07%*      0.7790 *  43.41%*
Amean     7        2.4710 (   0.00%)      1.2753 *  48.39%*      1.2680 *  48.68%*
Amean     12       3.7103 (   0.00%)      1.9570 *  47.26%*      1.9470 *  47.52%*
Amean     21       5.9790 (   0.00%)      2.9760 *  50.23%*      2.9830 *  50.11%*
Amean     30       8.0467 (   0.00%)      4.0590 *  49.56%*      4.0410 *  49.78%*
Amean     48      12.8180 (   0.00%)      6.5167 *  49.16%*      6.4070 *  50.02%*
Amean     79      20.5150 (   0.00%)     10.3580 *  49.51%*     10.3740 *  49.43%*
Amean     110     25.5320 (   0.00%)     14.0453 *  44.99%*     14.0577 *  44.94%*
Amean     141     32.4170 (   0.00%)     17.3267 *  46.55%*     17.4977 *  46.02%*
Amean     172     40.0883 (   0.00%)     21.0360 *  47.53%*     21.1480 *  47.25%*
Amean     203     47.2923 (   0.00%)     25.2367 *  46.64%*     25.4923 *  46.10%*
Amean     234     55.2623 (   0.00%)     29.0720 *  47.39%*     29.3273 *  46.93%*
Amean     265     61.4513 (   0.00%)     33.0260 *  46.26%*     33.0617 *  46.20%*
Amean     296     73.2960 (   0.00%)     36.6920 *  49.94%*     37.2520 *  49.18%*

Comparing just a revert and the patch:

                         5.11.0-rc7             5.11.0-rc7
                        revert-v1r1        vbabka-fix-v1r1
Amean     1        0.4060 (   0.00%)      0.3747 (   7.72%)
Amean     4        0.7700 (   0.00%)      0.7790 (  -1.17%)
Amean     7        1.2753 (   0.00%)      1.2680 (   0.58%)
Amean     12       1.9570 (   0.00%)      1.9470 (   0.51%)
Amean     21       2.9760 (   0.00%)      2.9830 (  -0.24%)
Amean     30       4.0590 (   0.00%)      4.0410 (   0.44%)
Amean     48       6.5167 (   0.00%)      6.4070 (   1.68%)
Amean     79      10.3580 (   0.00%)     10.3740 (  -0.15%)
Amean     110     14.0453 (   0.00%)     14.0577 (  -0.09%)
Amean     141     17.3267 (   0.00%)     17.4977 *  -0.99%*
Amean     172     21.0360 (   0.00%)     21.1480 (  -0.53%)
Amean     203     25.2367 (   0.00%)     25.4923 (  -1.01%)
Amean     234     29.0720 (   0.00%)     29.3273 (  -0.88%)
Amean     265     33.0260 (   0.00%)     33.0617 (  -0.11%)
Amean     296     36.6920 (   0.00%)     37.2520 (  -1.53%)

That's a negligible difference and all but one group (141) was within the
noise. Even for 141, it's very marginal and with the degree of overload
at that group count, it can be ignored.

Thanks!

-- 
Mel Gorman
SUSE Labs