Subject: Re: [RFC PATCH v0] mm/slub: Let number of online CPUs determine the slub page order
From: Vlastimil Babka
To: Michal Hocko, Vincent Guittot
Cc: Christoph Lameter, Bharata B Rao, linux-kernel, linux-mm@kvack.org, David Rientjes, Joonsoo Kim, Andrew Morton, guro@fb.com, Shakeel Butt, Johannes Weiner, aneesh.kumar@linux.ibm.com, Jann Horn
Date: Wed, 27 Jan 2021 14:38:29 +0100
Message-ID: <62d61572-830b-a660-8049-3826128343c5@suse.cz>
In-Reply-To: <20210126135918.GQ827@dhcp22.suse.cz>
References: <20201118082759.1413056-1-bharata@linux.ibm.com> <20210121053003.GB2587010@in.ibm.com> <20210126085243.GE827@dhcp22.suse.cz> <20210126135918.GQ827@dhcp22.suse.cz>
On 1/26/21 2:59 PM, Michal Hocko wrote:
>> On 8 CPUs, I ran hackbench with up to 16 groups, which means 16*40
>> threads. But I raised that to 256 groups, which means 256*40 threads,
>> on the 224-CPU system. In fact, hackbench -g 1 (with 1 group) doesn't
>> regress on the 224-CPU system. The next test, with 4 groups, starts
>> to regress by -7%. But the next one, hackbench -g 16, regresses by 187%
>> (the duration is almost 3 times longer). It seems reasonable to assume
>> that the number of running threads and resources scales with the number
>> of CPUs, because we want to run more stuff.
> 
> OK, I do understand that more jobs scale with the number of CPUs, but I
> would also expect higher-order pages to be generally more expensive to
> get, so this is not really clear cut, especially under some more demand
> on the memory, where allocations are smooth. So the question really is
> whether this is not just optimizing for artificial conditions.

FWIW, I enabled CONFIG_SLUB_STATS, ran "hackbench -l 16000 -g 16" in a
(small) VM, and checked tools/vm/slabinfo -DA as per the config option's
help; it seems to be these 2 caches that are stressed:

Name              Objects    Alloc     Free %Fast Fallb O  CmpX UL
kmalloc-512           812 25655535 25654908  71 1     0 0 20082  0
skbuff_head_cache     304 25602632 25602632  84 1     0 0 11241  0

I guess larger pages mean more batched per-cpu allocations without going
to the shared structures or even the page allocator. But 3 times the
duration is still surprising to me. I'll dig more.
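
For context, the order being debated comes from calculate_order() in
mm/slub.c: when slub_min_objects isn't set on the command line, it
defaults to 4 * (fls(nr_cpu_ids) + 1), and the RFC patch lets the
online-CPU count drive it instead of the possible-CPU count. Below is a
stand-alone userspace sketch of that heuristic (assuming 4K pages and
the default slub_max_order of 3; the real calculate_order() additionally
weighs acceptable per-slab waste), showing why kmalloc-512 ends up at
different orders on an 8-CPU guest versus a 224-CPU machine:

/* Userspace approximation of the mm/slub.c order heuristic.
 * Simplified: the real calculate_order() also tries increasing
 * fractions of acceptable waste before giving up on an order. */
#include <stdio.h>

#define PAGE_SIZE      4096u
#define SLUB_MAX_ORDER 3u	/* default slub_max_order */

/* find last set bit, 1-based, like the kernel's fls() */
static unsigned int fls(unsigned int x)
{
	return x ? 32 - (unsigned int)__builtin_clz(x) : 0;
}

static unsigned int slab_order(unsigned int size, unsigned int nr_cpus)
{
	unsigned int min_objects = 4 * (fls(nr_cpus) + 1);
	unsigned int order;

	/* pick the lowest order whose slab holds min_objects objects */
	for (order = 0; order < SLUB_MAX_ORDER; order++)
		if ((PAGE_SIZE << order) / size >= min_objects)
			break;
	return order;	/* falls through to the cap if nothing fits */
}

int main(void)
{
	unsigned int cpus[] = { 8, 224 };

	for (unsigned int i = 0; i < 2; i++)
		printf("%3u CPUs: min_objects=%2u -> kmalloc-512 order %u\n",
		       cpus[i], 4 * (fls(cpus[i]) + 1),
		       slab_order(512, cpus[i]));
	return 0;
}

This prints order 2 for 8 CPUs (min_objects = 20) and the order-3 cap
for 224 CPUs (min_objects = 36), which lines up with the batching effect
described above: a higher order means more objects are served from the
per-cpu slab before the allocator has to touch the shared structures or
the page allocator.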