Received: by 2002:a05:7412:bb8d:b0:d7:7d3a:4fe2 with SMTP id js13csp1501866rdb; Wed, 16 Aug 2023 13:26:47 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFaFAfMRFpD7MnMvq7HODDKqMeJT5AavlqjFYOqPaJjIyK39VBGbi2CYWqRd0X5oVRq2S8Q X-Received: by 2002:a05:6a20:9706:b0:13e:8ce5:bedd with SMTP id hr6-20020a056a20970600b0013e8ce5beddmr2673594pzc.1.1692217607174; Wed, 16 Aug 2023 13:26:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1692217607; cv=none; d=google.com; s=arc-20160816; b=nDJk1OblUJt1kctwu5VUlEA66+KdZ+LZC49/ZZUp2ZNbe5+v4cxwhnGZ7LL3Ho0Llj RFUwIVwOh55S/V1xDfH5edhwyK2a/jPtiFxFtAkN7aL0A4l3mqB8yozykS/FbOZ1pX37 BYsYlju9grr1AEMfhh7gZqh1dvYttqylL0eTTJLLp411Ocg3GpgEWdf8PMTv69dsSfZt 5/4J04kqAs2f8wlGb7QYWhNb6lzZMJjHTxrLPIGDk9AQrb+CSfx8dy8+fbq4vqptB9EF PP+5FJkVSyTSobVdJlPkmqr1oPCJf2IgFHi6Qnd6+1BYC7R98q3eoaOHA0LqINZoDi9Y kIcg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=KjaQaFDzCjw6ffN/lnwBWBfUti0cQifqhGdu7rjSabc=; fh=TyGMrrbtvJXc9KbgD7MOQ3VFidrITwpBCo6QYPKAj9c=; b=sH3Tn0E2zUE5jpc+3UM4+k792n2x1T3GytDEk5kAU8CXk2KWT2D/zPgQXPlSbos3S9 2kBGYHTOkdM1Uy9e8d3zo8+D2ItFA9LD06/n/XPXy8IcZr3ssAdPIMahh26xpdoRrEjK Sp9rETbwl4h2HMzfC8HIpv0S3tv6KCfRZ2gVI5MBJNrIhaOoY5Sfim1PSSyrlbfaPlUg 4DGZ4UQ0IAZy0Jw9SGVFWjmCUEkBOPeinAzM+MOwQx7hXvwjXu79HiHf5W1ffUzwwfHY /wm9+z8ilKoNp8Obk/oWTENUjQVdC73MU+fezOuQyMCnxnuwYI7ZKIOYVvMUagX68Tol F81g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id s30-20020a63925e000000b00565f2784dabsi2009457pgn.719.2023.08.16.13.26.32; Wed, 16 Aug 2023 13:26:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237360AbjHONGP (ORCPT + 99 others); Tue, 15 Aug 2023 09:06:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52314 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237377AbjHONFx (ORCPT ); Tue, 15 Aug 2023 09:05:53 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 5B88C10C0; Tue, 15 Aug 2023 06:05:48 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3BC931063; Tue, 15 Aug 2023 06:06:30 -0700 (PDT) Received: from FVFF77S0Q05N (unknown [10.57.3.119]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id D01AC3F6C4; Tue, 15 Aug 2023 06:05:45 -0700 (PDT) Date: Tue, 15 Aug 2023 14:05:36 +0100 From: Mark Rutland To: James Clark Cc: Anshuman Khandual , Will Deacon , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, catalin.marinas@arm.com, Mark Brown , Rob Herring , Marc Zyngier , Suzuki Poulose , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , linux-perf-users@vger.kernel.org Subject: Re: [PATCH V13 - RESEND 02/10] arm64/perf: Add BRBE registers and fields Message-ID: References: <20230711082455.215983-1-anshuman.khandual@arm.com> <20230711082455.215983-3-anshuman.khandual@arm.com> <20230728162011.GA22050@willie-the-truck> <89ce4bc4-00c5-a763-3179-e1d3e9f198b7@arm.com> <937468a1-b325-7d05-8daf-765f911c9240@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Aug 15, 2023 at 11:17:19AM +0100, James Clark wrote: > > > On 31/07/2023 10:06, Mark Rutland wrote: > > On Mon, Jul 31, 2023 at 08:03:21AM +0530, Anshuman Khandual wrote: > >> > >> > >> On 7/28/23 22:22, James Clark wrote: > >>> > >>> > >>> On 28/07/2023 17:20, Will Deacon wrote: > >>>> On Tue, Jul 11, 2023 at 01:54:47PM +0530, Anshuman Khandual wrote: > >>>>> This adds BRBE related register definitions and various other related field > >>>>> macros there in. These will be used subsequently in a BRBE driver which is > >>>>> being added later on. > >>>>> > >>>>> Cc: Catalin Marinas > >>>>> Cc: Will Deacon > >>>>> Cc: Marc Zyngier > >>>>> Cc: Mark Rutland > >>>>> Cc: linux-arm-kernel@lists.infradead.org > >>>>> Cc: linux-kernel@vger.kernel.org > >>>>> Tested-by: James Clark > >>>>> Reviewed-by: Mark Brown > >>>>> Signed-off-by: Anshuman Khandual > >>>>> --- > >>>>> arch/arm64/include/asm/sysreg.h | 103 +++++++++++++++++++++ > >>>>> arch/arm64/tools/sysreg | 158 ++++++++++++++++++++++++++++++++ > >>>>> 2 files changed, 261 insertions(+) > >>>>> > >>>>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h > >>>>> index b481935e9314..f95e30c13c8b 100644 > >>>>> --- a/arch/arm64/include/asm/sysreg.h > >>>>> +++ b/arch/arm64/include/asm/sysreg.h > >>>>> @@ -163,6 +163,109 @@ > >>>>> #define SYS_DBGDTRTX_EL0 sys_reg(2, 3, 0, 5, 0) > >>>>> #define SYS_DBGVCR32_EL2 sys_reg(2, 4, 0, 7, 0) > >>>>> > >>>>> +#define __SYS_BRBINFO(n) sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0)) > >>>>> +#define __SYS_BRBSRC(n) sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1)) > >>>>> +#define __SYS_BRBTGT(n) sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2)) > >>>> > >>>> It's that time on a Friday but... aren't these macros busted? I think you > >>>> need brackets before adding the offset, otherwise wouldn't, for example, > >>>> target registers 0-15 all access info register 0 and __SYS_BRBTGT(16) would > >>>> then start accessing source register 0? > >>>> > >>>> I'm surprised that the compiler doesn't warn about this, but even more > >>>> surprised that you managed to test this. > >>>> > >>>> Please tell me I'm wrong! > >>>> > >>>> Will > >>> > >>> No I think you are right, it is wrong. Luckily there is already an > >>> extraneous bracket so you you can fix it by moving one a place down: > >>> > >>> sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 2)) > >>> > >>> It's interesting because the test [1] is doing quite a bit and looking > >>> at the branch info, and that src and targets match up to function names. > >>> I also manually looked at the branch buffers and didn't see anything > >>> obviously wrong like things that looked like branch infos in the source > >>> or target fields. Will have to take another look to see if it would be > >>> possible for the test to catch this. > >>> > >>> James > >>> > >>> [1]: > >>> https://gitlab.arm.com/linux-arm/linux-jc/-/commit/3a7ddce70c2daadb63fcc511de0a89055ca48b32 > >> > >> ((((n) & 0x10)) >> 2 + 2) ---> ((((n) & 0x10) >> 2) + 2) > >> > >> The additional brackets are useful in explicitly telling the compiler but > >> what it the compiler is just doing the right thing implicitly i.e computing > >> the shifting operation before doing the offset addition. > > > > No; that is not correct. In c, '+' has higher precedence than '>>'. > > > > For 'a >> b + c' the compiler *must* treat that as 'a >> (b + c)', and not as > > '(a >> b) + c' > > > > That's trivial to test: > > > > | [mark@gravadlaks:~]% cat shiftadd.c > > | #include > > | > > | unsigned long logshiftadd(unsigned long a, > > | unsigned long b, > > | unsigned long c) > > | { > > | printf("%ld >> %ld + %ld is %ld\n", > > | a, b, c, a >> b + c); > > | } > > | > > | int main(int argc, char *argv) > > | { > > | logshiftadd(0, 0, 0); > > | logshiftadd(0, 0, 1); > > | logshiftadd(0, 0, 2); > > | printf("\n"); > > | logshiftadd(1024, 0, 0); > > | logshiftadd(1024, 0, 1); > > | logshiftadd(1024, 0, 2); > > | printf("\n"); > > | logshiftadd(1024, 2, 0); > > | logshiftadd(1024, 2, 1); > > | logshiftadd(1024, 2, 2); > > | > > | return 0; > > | } > > | [mark@gravadlaks:~]% gcc shiftadd.c -o shiftadd > > | [mark@gravadlaks:~]% ./shiftadd > > | 0 >> 0 + 0 is 0 > > | 0 >> 0 + 1 is 0 > > | 0 >> 0 + 2 is 0 > > | > > | 1024 >> 0 + 0 is 1024 > > | 1024 >> 0 + 1 is 512 > > | 1024 >> 0 + 2 is 256 > > | > > | 1024 >> 2 + 0 is 256 > > | 1024 >> 2 + 1 is 128 > > | 1024 >> 2 + 2 is 64 > > > >> During testing, all > those captured branch records looked alright. > > > > I think we clearly need better testing here. > > > > Thanks, > > Mark. > > Hi Will and Mark, > > So I started looking into the test both with and without the fix, > strangely I couldn't see any difference in the branch outputs, or > anywhere in the driver where it would be flipping or filtering anything > to make it only appear to be working. This was a bit confusing, but > added up with the original point that the test was passing and it was > actually doing something. > > So I started going deeper and found what the issue (non-issue) is. > > Firstly why is there no warning: > > The expression is stringified and passed to the assembler, so it skips > the C compiler warning settings. I can send a patch to fix this, but all > we need to do is get the compiler to evaluate the argument and then > throw it away, luckily there are no other issues found even with an > allyesconfig, so BRBE was the only thing with this bug: > > #define read_sysreg_s(r) ({ > u64 __val; > + u32 __maybe_unused __check_r = (u32)(r); > asm volatile(__mrs_s("%0", r) : "=r" (__val)); > __val; > }) > > > Secondly, why does BRBE actually work: > > Well the assembler (at least in my Clang toolchain) has a different > order of operations to C. I put a minimal repro here: > https://godbolt.org/z/YP9adh5xh > > You can see the op2 should be a 0b100000 (0x20) for BRBSRC and it > appears to be, you can also see that moving the bracket makes no > difference in this case. > > For some more evidence, the disassembler I have locally actually gives > the correct register name, even when the bracket is wrong, and diffing > the .o file gives no difference when moving the bracket: > > 0000000000000008
: > 8: d503245f bti c > c: d503201f nop > 10: d503201f nop > 14: 2a1f03e0 mov w0, wzr > 18: d5318028 mrs x8, brbsrc0_el1 > 1c: d5318128 mrs x8, brbsrc1_el1 > 20: d65f03c0 ret > > Seems completely crazy to me that this is actually the case. So maybe I > am also wrong. Don't know if this counts as a compiler bug or it's just > supposed to be like that. From a quick dig, it's supposed to be like that: the GNU assembler uses a different operator precedence to C, and clang's assembler does the same for compatibility. What a great. Compare: https://ftp.gnu.org/old-gnu/Manuals/gas-2.9.1/html_chapter/as_6.html#SEC66 ... with: https://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B#Operator_precedence Adding the brackets will make this work in either case, so I think that's the right thing to do for now. Thanks, Mark.