To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Daniel Bristot de Oliveira, Valentin Schneider, Frederic Weisbecker
Cc: "Thomas Gleixner", linux-kernel@vger.kernel.org
Message-Id: <0a403120a682a525e6db2d81d1a3ffcc137c3742.1694756831.git.fthain@linux-m68k.org>
From: Finn Thain
Subject: [PATCH] sched: Optimize in_task() and in_interrupt() a bit
Date: Fri, 15 Sep 2023 15:47:11 +1000
X-Mailing-List: linux-kernel@vger.kernel.org

Except on x86, preempt_count is always accessed with READ_ONCE.
Repeated invocations in macros like irq_count() produce repeated loads.
These redundant instructions appear in various fast paths. In the one
shown below, for example, irq_count() is evaluated during kernel entry
if !tick_nohz_full_cpu(smp_processor_id()).

0001ed0a :
   1ed0a:  4e56 0000         linkw %fp,#0
   1ed0e:  200f              movel %sp,%d0
   1ed10:  0280 ffff e000    andil #-8192,%d0
   1ed16:  2040              moveal %d0,%a0
   1ed18:  2028 0008         movel %a0@(8),%d0
   1ed1c:  0680 0001 0000    addil #65536,%d0
   1ed22:  2140 0008         movel %d0,%a0@(8)
   1ed26:  082a 0001 000f    btst #1,%a2@(15)
   1ed2c:  670c              beqs 1ed3a
   1ed2e:  2028 0008         movel %a0@(8),%d0
   1ed32:  2028 0008         movel %a0@(8),%d0
   1ed36:  2028 0008         movel %a0@(8),%d0
   1ed3a:  4e5e              unlk %fp
   1ed3c:  4e75              rts

This patch doesn't prevent the pointless btst and beqs instructions
above, but it does eliminate 2 of the 3 pointless move instructions
here and elsewhere.

On x86, preempt_count is per-cpu data and the problem does not arise,
presumably because the compiler is free to optimize more effectively.

Cc: Thomas Gleixner
Fixes: 15115830c887 ("preempt: Cleanup the macro maze a bit")
Signed-off-by: Finn Thain
---
This patch was tested on m68k and x86. I was expecting no changes
to object code for x86 and mostly that's what I saw. However, there
were a few places where code generation was perturbed for some reason.

The performance issue addressed here is minor on uniprocessor m68k. I
got a 0.01% improvement from this patch for a simple "find /sys -false"
benchmark. For architectures and workloads susceptible to cache line
bouncing, the improvement is expected to be larger. The only SMP
architecture I have is x86, and as x86 is unaffected I have not done
any further measurements.

Changed since v2:
 - Clarify the comment about macros.

Changed since v1:
 - Added a comment that was requested by Frederic.
---
 include/linux/preempt.h | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index 1424670df161..9aa6358a1a16 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -99,14 +99,21 @@ static __always_inline unsigned char interrupt_context_level(void)
 	return level;
 }
 
+/*
+ * These macro definitions avoid redundant invocations of preempt_count()
+ * because such invocations would result in redundant loads given that
+ * preempt_count() is commonly implemented with READ_ONCE().
+ */
+
 #define nmi_count()	(preempt_count() & NMI_MASK)
 #define hardirq_count()	(preempt_count() & HARDIRQ_MASK)
 #ifdef CONFIG_PREEMPT_RT
 # define softirq_count()	(current->softirq_disable_cnt & SOFTIRQ_MASK)
+# define irq_count()		((preempt_count() & (NMI_MASK | HARDIRQ_MASK)) | softirq_count())
 #else
 # define softirq_count()	(preempt_count() & SOFTIRQ_MASK)
+# define irq_count()		(preempt_count() & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK))
 #endif
-#define irq_count()	(nmi_count() | hardirq_count() | softirq_count())
 
 /*
  * Macros to retrieve the current execution context:
@@ -119,7 +126,11 @@ static __always_inline unsigned char interrupt_context_level(void)
 #define in_nmi()		(nmi_count())
 #define in_hardirq()		(hardirq_count())
 #define in_serving_softirq()	(softirq_count() & SOFTIRQ_OFFSET)
-#define in_task()		(!(in_nmi() | in_hardirq() | in_serving_softirq()))
+#ifdef CONFIG_PREEMPT_RT
+# define in_task()		(!((preempt_count() & (NMI_MASK | HARDIRQ_MASK)) | in_serving_softirq()))
+#else
+# define in_task()		(!(preempt_count() & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
+#endif
 
 /*
  * The following macros are deprecated and should not be used in new code:
-- 
2.39.3
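
For anyone who wants to convince themselves of the equivalence without
building a kernel, here is a rough user-space sketch of the
!CONFIG_PREEMPT_RT case. It is not part of the patch. The mask values
are an assumption intended to mirror the bit layout in
include/linux/preempt.h, and fake_count, loads, LOADS_FOR(),
forms_agree(), self_check() and the *_old/*_new macro names are
hypothetical helpers made up for the illustration. preempt_count() here
is only a stand-in: a counted volatile read that the compiler is not
allowed to merge, which is roughly what READ_ONCE() implies.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bit layout, intended to mirror include/linux/preempt.h. */
#define SOFTIRQ_MASK   0x0000ff00u
#define HARDIRQ_MASK   0x000f0000u
#define NMI_MASK       0x00f00000u
#define SOFTIRQ_OFFSET 0x00000100u

static unsigned int fake_count; /* hypothetical stand-in for the real counter */
static unsigned int loads;      /* number of simulated READ_ONCE() loads */

/* Stand-in for preempt_count(): a counted volatile read. */
static unsigned int preempt_count(void)
{
	loads++;
	return *(volatile unsigned int *)&fake_count;
}

#define nmi_count()     (preempt_count() & NMI_MASK)
#define hardirq_count() (preempt_count() & HARDIRQ_MASK)
#define softirq_count() (preempt_count() & SOFTIRQ_MASK)

/* Pre-patch and post-patch forms, !CONFIG_PREEMPT_RT only. */
#define irq_count_old() (nmi_count() | hardirq_count() | softirq_count())
#define irq_count_new() (preempt_count() & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK))
#define in_task_old()   (!(nmi_count() | hardirq_count() | \
			   (softirq_count() & SOFTIRQ_OFFSET)))
#define in_task_new()   (!(preempt_count() & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))

/* Evaluate expr once and yield the number of simulated loads it needed. */
#define LOADS_FOR(expr) (loads = 0, (void)(expr), loads)

/* Returns 1 if the old and new forms agree for the given counter value. */
static int forms_agree(unsigned int value)
{
	fake_count = value;
	return irq_count_old() == irq_count_new() &&
	       in_task_old() == in_task_new();
}

static unsigned int loads_irq_count_old(void) { return LOADS_FOR(irq_count_old()); }
static unsigned int loads_irq_count_new(void) { return LOADS_FOR(irq_count_new()); }
static unsigned int loads_in_task_old(void)   { return LOADS_FOR(in_task_old()); }
static unsigned int loads_in_task_new(void)   { return LOADS_FOR(in_task_new()); }

/* Check a few representative counter values and the load counts. */
static void self_check(void)
{
	static const unsigned int samples[] = {
		0, 1, SOFTIRQ_OFFSET, 0x200, 0x10000, 0x100000, 0x00ffffffu
	};
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(forms_agree(samples[i]));

	assert(loads_irq_count_old() == 3 && loads_irq_count_new() == 1);
	assert(loads_in_task_old() == 3 && loads_in_task_new() == 1);
}
```

Calling self_check() from a main() should pass silently. The sketch
says nothing about CONFIG_PREEMPT_RT, where softirq_count() reads
current->softirq_disable_cnt and one extra access necessarily remains.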