Date: Sun, 28 Aug 2022 18:11:15 +0100
Message-ID: <87r110qong.wl-maz@kernel.org>
From: Marc Zyngier
To: Puyou Lu
Cc: Thomas Gleixner, Robert Richter, Catalin Marinas, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] irqchip/gic-v3: do runtime cpu cap check only when necessary
In-Reply-To: <20220828075610.GA30202@lu-N56VJ>
References: <20220827051328.GA18042@lu-N56VJ> <87wnatra83.wl-maz@kernel.org> <20220828075610.GA30202@lu-N56VJ>

On Sun, 28 Aug 2022 08:56:23 +0100,
Puyou Lu wrote:
>
> On Sat, Aug 27, 2022 at 04:13:00PM +0100,
> Marc Zyngier wrote:
> > On Sat, 27 Aug 2022 06:19:27 +0100,
> > Puyou Lu wrote:
> > >
> > > Now cpu cap check is done every exception happens on every arm64 platform,
> > > but this check is necessary on just few of them, so we can drop this
> > > check at compile time on others. This can decrease exception handle time
> > > on most cases.
> > >
> > > Fixes: 6d4e11c5e2e8 ("irqchip/gicv3: Workaround for Cavium ThunderX erratum 23154")
> > > Signed-off-by: Puyou Lu
> > > ---
> > >  drivers/irqchip/irq-gic-v3.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > > index 262658fd5f9e..3f08c2ef1251 100644
> > > --- a/drivers/irqchip/irq-gic-v3.c
> > > +++ b/drivers/irqchip/irq-gic-v3.c
> > > @@ -237,9 +237,11 @@ static void gic_redist_wait_for_rwp(void)
> > >
> > >  static u64 __maybe_unused gic_read_iar(void)
> > >  {
> > > +#ifdef CONFIG_CAVIUM_ERRATUM_23154
> > >  	if (cpus_have_const_cap(ARM64_WORKAROUND_CAVIUM_23154))
> > >  		return gic_read_iar_cavium_thunderx();
> > >  	else
> > > +#endif
> > >  		return gic_read_iar_common();
> > >  }
> > >  #endif
> >
> > You realise that cpus_have_const_cap() results purely in a couple of
> > branches once the caps have been finalised, right?
> >
> > Please provide data showing that it actually "can decrease exception
> > handle time on most cases", because I'm pretty sure you cannot measure
> > the difference in any meaningful way.
> >
> > 	M.
> >
> > --
> > Without deviation from the norm, progress is not possible.
>
> Hi Marc,
> Thank you for the reply. Actually I did no test, just from the disassembled
> code of vmlinux, I saw about 6 instructions generated by
> cpus_have_const_cap, and about 36 by gic_read_iar_cavium_thunderx, which
> is useless for most CPUs. I think this will waste some cpu cycles, as
> exceptions can occur hundreds or thousands times per second. Also
> (6+36)*4=168 bytes of icache is wasted, and icache misses increase
> somewhere else.
> If I got things wrong, please correct me.

Well, what you got wrong is that these instructions are stepped over
by two branches once the caps are finalised, and that doesn't appear in
the disassembly (you need to look at the code that is actually
executed).

Now, any optimisation of the sort must be backed by some performance
numbers. If you can show that this has a meaningful impact on a given
workload, I'm happy to look into it. But only if you can show that
data.

Thanks,

	M.

--
Without deviation from the norm, progress is not possible.
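
For anyone following the thread, here is a rough userspace model of the
"two branches" point made above. It is only a sketch: every name in it
(caps_finalised, cap_key, cap_bitmap, model_have_const_cap,
model_read_iar, CAP_CAVIUM_23154) is a hypothetical stand-in, and it
assumes the two tests in the kernel are static keys patched in place at
runtime, which plain C variables cannot reproduce. It shows the control
flow only, not the cost.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability index; not the kernel's real numbering. */
enum { CAP_CAVIUM_23154 = 0, NCAPS = 1 };

/* Which IAR accessor the model ended up choosing. */
enum iar_path { IAR_COMMON = 1, IAR_WORKAROUND = 2 };

/*
 * Plain variables standing in for runtime-patched branches (assumption:
 * in the kernel these are static keys, so the taken/not-taken decision
 * is baked into the instruction stream rather than loaded from memory).
 */
bool caps_finalised;      /* models "capabilities have been finalised" */
bool cap_key[NCAPS];      /* models the per-capability key             */
bool cap_bitmap[NCAPS];   /* models the slow path used before that     */

bool model_have_const_cap(int num)
{
	assert(num >= 0 && num < NCAPS);

	if (caps_finalised)          /* branch 1: are the caps finalised?  */
		return cap_key[num]; /* feeds branch 2 in the caller       */

	return cap_bitmap[num];      /* early boot only: test the bitmap   */
}

enum iar_path model_read_iar(void)
{
	/* branch 2: on unaffected CPUs the workaround body is skipped. */
	if (model_have_const_cap(CAP_CAVIUM_23154))
		return IAR_WORKAROUND;

	return IAR_COMMON;
}
```

With caps_finalised set and the key for the erratum left false, which
models an unaffected CPU after boot, model_read_iar() returns
IAR_COMMON having evaluated just the two tests; the workaround body
still exists in the binary but is never entered. Whether skipping it at
compile time is measurable is exactly what a benchmark on a real
workload would have to show.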