Date: Wed, 29 Nov 2023 12:16:17 -0800 (PST)
From: "Christoph Lameter (Ampere)"
To: Vlastimil Babka
cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Matthew Wilcox, "Liam R.
 Howlett", Andrew Morton, Roman Gushchin,
 Hyeonggon Yoo <42.hyeyoo@gmail.com>, Alexander Potapenko, Marco Elver,
 Dmitry Vyukov, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 maple-tree@lists.infradead.org, kasan-dev@googlegroups.com
Subject: Re: [PATCH RFC v3 0/9] SLUB percpu array caches and maple tree nodes
In-Reply-To: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
References: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>

On Wed, 29 Nov 2023, Vlastimil Babka wrote:

> At LSF/MM I've mentioned that I see several use cases for introducing
> opt-in percpu arrays for caching alloc/free objects in SLUB. This is my
> first exploration of this idea, specifically for the use case of maple
> tree nodes. The assumptions are:

Hohumm... So we are not really removing SLAB but merging SLAB features
into SLUB. In addition to per cpu slabs, we now have per cpu queues.

> - percpu arrays will be faster than bulk alloc/free which needs
>   relatively long freelists to work well. Especially in the freeing case
>   we need the nodes to come from the same slab (or small set of those)

Percpu arrays require the code to handle individual objects. Handling
freelists in partial slabs means that numerous objects can be handled at
once by handling the pointer to the list of objects.

In order to make the SLUB in-page freelists work better you need to have
larger freelists, and that comes with larger page sizes.
I.e. boot with slub_min_order=5 or so to increase performance.

Also, the percpu array approach means increased TLB pressure. The in-page
freelists of SLUB cause objects from the same page to be served. The SLAB
queueing approach results in objects being mixed from any address, and
thus neighboring objects may require more TLB entries.

> - preallocation for the worst case of needed nodes for a tree operation
>   that can't reclaim due to locks is wasteful. We could instead expect
>   that most of the time percpu arrays would satisfy the constrained
>   allocations, and in the rare cases it does not we can dip into
>   GFP_ATOMIC reserves temporarily. So instead of preallocation just
>   prefill the arrays.

The partial percpu slabs could already do the same.

> - NUMA locality of the nodes is not a concern as the nodes of a
>   process's VMA tree end up all over the place anyway.

NUMA locality is already controlled by the user through the node
specification for percpu slabs. All objects coming from the same in-page
freelist of SLUB have the same NUMA locality, which simplifies things.

If you would consider NUMA locality for the percpu array then you'd be
back to my beloved alien caches. We were not able to avoid that when we
tuned SLAB for maximum performance.

> Patch 5 adds the per-cpu array caches support. Locking is stolen from
> Mel's recent page allocator's pcplists implementation so it can avoid
> disabling IRQs and just disable preemption, but the trylocks can fail in
> rare situations - in most cases the locks are uncontended so the locking
> should be cheap.

Ok the locking is new but the design follows basic SLAB queue handling.
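
To illustrate the freelist versus array point made earlier in this reply, here is a rough user-space sketch in C. It is not kernel code: struct object, struct percpu_array, ARRAY_CAPACITY and all function names are hypothetical stand-ins, and nothing here mirrors the actual layout in mm/slub.c or in the patch series. It only shows that taking over a linked freelist is a single pointer exchange regardless of length, while filling an array touches each object individually.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a free object: the link to the next free
 * object is stored inside the object itself, as in an in-page freelist. */
struct object {
	struct object *next;
};

/* Freelist handoff: one pointer exchange moves every object on the
 * list, however many there are. */
static struct object *freelist_take_all(struct object **slab_freelist)
{
	struct object *head = *slab_freelist;

	*slab_freelist = NULL;
	return head;
}

#define ARRAY_CAPACITY 8	/* assumed size, for illustration only */

/* Hypothetical per-cpu array cache: a counted array of object pointers. */
struct percpu_array {
	size_t count;
	struct object *objects[ARRAY_CAPACITY];
};

/* Array refill: every object is unlinked and stored one at a time.
 * Returns the number of objects moved. */
static size_t array_refill(struct percpu_array *pca,
			   struct object **slab_freelist)
{
	size_t moved = 0;

	assert(pca != NULL && slab_freelist != NULL);

	while (pca->count < ARRAY_CAPACITY && *slab_freelist != NULL) {
		struct object *obj = *slab_freelist;

		*slab_freelist = obj->next;
		pca->objects[pca->count++] = obj;
		moved++;
	}
	return moved;
}

/* Test helper: chain n objects from a backing buffer into a freelist. */
static struct object *build_freelist(struct object *backing, size_t n)
{
	for (size_t i = 0; i < n; i++)
		backing[i].next = (i + 1 < n) ? &backing[i + 1] : NULL;
	return n ? &backing[0] : NULL;
}

/* Test helper: count the objects on a freelist. */
static size_t freelist_length(const struct object *head)
{
	size_t n = 0;

	for (; head != NULL; head = head->next)
		n++;
	return n;
}
```

freelist_take_all() does the same amount of work for 5 objects as for 500, whereas the loop in array_refill() runs once per object. The sketch ignores locking, per-cpu access and NUMA entirely.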