From: Vlastimil Babka
Date: Mon, 20 Nov 2023 19:34:32 +0100
Subject: [PATCH v2 21/21] mm/slub: optimize free fast path code layout
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <20231120-slab-remove-slab-v2-21-9c9c70177183@suse.cz>
References: <20231120-slab-remove-slab-v2-0-9c9c70177183@suse.cz>
In-Reply-To: <20231120-slab-remove-slab-v2-0-9c9c70177183@suse.cz>
To: David Rientjes, Christoph Lameter, Pekka Enberg, Joonsoo Kim
Cc: Andrew Morton, Hyeonggon Yoo <42.hyeyoo@gmail.com>, Roman Gushchin,
 Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
 Vincenzo Frascino, Marco Elver, Johannes Weiner, Michal Hocko,
 Shakeel Butt, Muchun Song, Kees Cook, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com,
 cgroups@vger.kernel.org, linux-hardening@vger.kernel.org,
 Vlastimil Babka
X-Mailer: b4 0.12.4

Inspection of kmem_cache_free() disassembly showed we could make the
fast path smaller by providing a few more hints to the compiler, and
splitting the memcg_slab_free_hook() into
an inline part that only checks if there's work to do, and an out of line
part doing the actual uncharge.

bloat-o-meter results:
add/remove: 2/0 grow/shrink: 0/3 up/down: 286/-554 (-268)
Function                                     old     new   delta
__memcg_slab_free_hook                         -     270    +270
__pfx___memcg_slab_free_hook                   -      16     +16
kfree                                        828     665    -163
kmem_cache_free                             1116     948    -168
kmem_cache_free_bulk.part                   1701    1478    -223

Checking kmem_cache_free() disassembly now shows the non-fastpath
cases are handled out of line, which should reduce instruction cache
usage.

Signed-off-by: Vlastimil Babka
---
 mm/slub.c | 40 ++++++++++++++++++++++++----------------
 1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 77d259f3d592..3f8b95757106 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1959,20 +1959,11 @@ void memcg_slab_post_alloc_hook(struct kmem_cache *s, struct obj_cgroup *objcg,
 	return __memcg_slab_post_alloc_hook(s, objcg, flags, size, p);
 }
 
-static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
-					void **p, int objects)
+static void __memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
+				   void **p, int objects,
+				   struct obj_cgroup **objcgs)
 {
-	struct obj_cgroup **objcgs;
-	int i;
-
-	if (!memcg_kmem_online())
-		return;
-
-	objcgs = slab_objcgs(slab);
-	if (!objcgs)
-		return;
-
-	for (i = 0; i < objects; i++) {
+	for (int i = 0; i < objects; i++) {
 		struct obj_cgroup *objcg;
 		unsigned int off;
 
@@ -1988,6 +1979,22 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
 		obj_cgroup_put(objcg);
 	}
 }
+
+static __fastpath_inline
+void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
+			  int objects)
+{
+	struct obj_cgroup **objcgs;
+
+	if (!memcg_kmem_online())
+		return;
+
+	objcgs = slab_objcgs(slab);
+	if (likely(!objcgs))
+		return;
+
+	__memcg_slab_free_hook(s, slab, p, objects, objcgs);
+}
 #else /* CONFIG_MEMCG_KMEM */
 static inline struct mem_cgroup *memcg_from_slab_obj(void *ptr)
 {
@@ -2047,7 +2054,7 @@ static __always_inline bool slab_free_hook(struct kmem_cache *s,
 	 * The initialization memset's clear the object and the metadata,
 	 * but don't touch the SLAB redzone.
 	 */
-	if (init) {
+	if (unlikely(init)) {
 		int rsize;
 
 		if (!kasan_has_integrated_init())
@@ -2083,7 +2090,8 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
 		next = get_freepointer(s, object);
 
 		/* If object's reuse doesn't have to be delayed */
-		if (!slab_free_hook(s, object, slab_want_init_on_free(s))) {
+		if (likely(!slab_free_hook(s, object,
+					   slab_want_init_on_free(s)))) {
 			/* Move object to the new freelist */
 			set_freepointer(s, object, *head);
 			*head = object;
@@ -4282,7 +4290,7 @@ static __fastpath_inline void slab_free(struct kmem_cache *s, struct slab *slab,
 	 * With KASAN enabled slab_free_freelist_hook modifies the freelist
 	 * to remove objects, whose reuse must be delayed.
 	 */
-	if (slab_free_freelist_hook(s, &head, &tail, &cnt))
+	if (likely(slab_free_freelist_hook(s, &head, &tail, &cnt)))
 		do_slab_free(s, slab, head, tail, cnt, addr);
 }
 
-- 
2.42.1
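
For readers who want to try the layout trick outside the kernel, here is
a minimal userspace sketch of the same pattern: an always-inlined wrapper
that only tests whether there is work to do, and a noinline helper that
does the work. It is not the mm/slub.c code. struct item,
item_free_hook(), __item_free_hook() and slow_calls are hypothetical
names made up for illustration, and likely()/unlikely() are redefined
locally on top of the GCC/Clang __builtin_expect() builtin, so this
assumes one of those compilers.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's likely()/unlikely() macros. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical object with optional accounting metadata. */
struct item {
	int *acct;	/* NULL when there is nothing to uncharge */
	int nr;		/* number of entries in acct[] */
};

/* Counts slow-path invocations so the effect can be observed. */
static int slow_calls;

/*
 * Out-of-line part: does the actual work. noinline keeps its body out
 * of every caller, so the callers' fast paths stay small.
 */
static __attribute__((noinline))
void __item_free_hook(struct item *it, int *acct)
{
	/* The inline wrapper has already checked this. */
	assert(acct != NULL);

	for (int i = 0; i < it->nr; i++)
		acct[i] = 0;
	slow_calls++;
}

/*
 * Inline part: only checks whether there is work to do. The likely()
 * hint tells the compiler that the early return is the common case.
 */
static inline __attribute__((always_inline))
void item_free_hook(struct item *it)
{
	int *acct = it->acct;

	if (likely(!acct))
		return;

	__item_free_hook(it, acct);
}
```

Calling item_free_hook() on an item whose acct is NULL returns from the
inlined check without a function call. Only an item with accounting data
pays for the call into __item_free_hook(). Whether this makes a given
caller smaller depends on the compiler and flags, so it is worth
comparing the disassembly before and after, as was done for
kmem_cache_free() above.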