Date: Tue, 11 Jul 2023 11:34:24 +0100
From: Mark Rutland
To: "Aiqun(Maria) Yu"
Cc: Will Deacon, corbet@lwn.net, catalin.marinas@arm.com, maz@kernel.org,
    quic_pkondeti@quicinc.com, quic_kaushalk@quicinc.com,
    quic_satyap@quicinc.com, quic_shashim@quicinc.com,
    quic_songxue@quicinc.com, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] arm64: Add the arm64.nolse_atomics command line option

On Tue, Jul 11, 2023 at 06:15:49PM +0800, Aiqun(Maria) Yu wrote:
> On 7/11/2023 4:22 PM, Will Deacon wrote:
> > On Tue, Jul 11, 2023 at 12:02:22PM +0800, Aiqun(Maria) Yu wrote:
> > > On 7/10/2023 5:37 PM, Will Deacon wrote:
> > > > On Mon, Jul 10, 2023 at 01:59:55PM +0800, Maria Yu wrote:
> > > > > In order to be able to disable lse_atomic even if cpu
> > > > > support it, most likely because of memory controller
> > > > > cannot deal with the lse atomic instructions, use a
> > > > > new idreg override to deal with it.
> > > >
> > > > This should not be a problem for cacheable memory though, right?
> > > >
> > > > Given that Linux does not issue atomic operations to non-cacheable mappings,
> > > > I'm struggling to see why there's a problem here.
> > >
> > > The lse atomic operation can be issued on non-cacheable mappings as well.
> > > Even if it is cached data, with different CPUECTLR_EL1 setting, it can also
> > > do far lse atomic operations.
> >
> > Please can you point me to the place in the kernel sources where this
> > happens? The architecture doesn't guarantee that atomics to non-cacheable
> > mappings will work, see "B2.2.6 Possible implementation restrictions on
> > using atomic instructions". Linux, therefore, doesn't issue atomics
> > to non-cacheable memory.
>
> We encounter the issue on third party kernel modules

Which kernel modules?

Those modules are clearly broken; as Will has already said, the architecture
says doing atomics to non-cacheable memory can result in external aborts, and
that's exactly the behaviour that you're reporting as a problem. This is
working *as designed*.

Note that the same is true for LDXR+STXR; so just hiding LSE doesn't make
sense: if the code falls back to LDXR+STXR it still suffers from the exact
same problem.

Regardless, hiding bugs in out-of-tree code is not a justification for
changing the upstream kernel.

> and third party apps instead of linux kernel itself.

Which apps?

Why are those apps using non-cacheable memory?

Why are those apps trying to perform atomics to non-cacheable memory?

> This is a tradeoff of performance and stability. Per my understanding,
> options can be used to enable the lse_atomic to have the most performance
> cared system, and disable the lse_atomic by stability cared most system.

I think that's a misrepresentation of this patch.

This patch disables a feature to *hide* bugs in out-of-tree kernel modules and
userspace software. It's not about making the system more stable, it's about
making broken code appear to work.

The LSE atomics aren't just about performance. They're significantly fairer
than LDXR+STXR in many practical situations, and contribute to the stability
of the system.

Thanks,
Mark.

> > > > Please can you explain the problem that you are trying to solve?
> > >
> > > In our current case, it is a 100% reproducible issue that happened for
> > > uncached data, the cpu which support LSE atomic, but the system's DDR
> > > subsystem is not support this and caused a NOC error and thus synchronous
> > > external abort happened.
> >
> > So? The Arm ARM allows this behaviour and Linux shouldn't run into it.
> >
> > Will
>
> --
> Thx and BRs,
> Aiqun(Maria) Yu
>
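
For readers following the thread, here is a minimal sketch of the point about the LDXR+STXR fallback. It is an illustrative userspace C11 example, not code from the patch or from the kernel, and `counter_add` is a hypothetical helper name. On AArch64 a compiler will typically lower this one function either to an LSE instruction (such as LDADD) when LSE is enabled for the target, or to an LDXR/STXR retry loop otherwise. The source does not change, and in both cases the memory type of the mapping that `counter` lives in is what determines whether the operation is architecturally guaranteed to work.

```c
#include <stdatomic.h>

/*
 * Hypothetical helper (an assumption for illustration only):
 * atomically add `delta` to `*counter` and return the previous value.
 *
 * The choice between an LSE atomic and an exclusive load/store loop is
 * made by the compiler and target options, not by this source code.
 * Either form is expected to be used on normal cacheable memory.
 */
int counter_add(atomic_int *counter, int delta)
{
    return atomic_fetch_add_explicit(counter, delta, memory_order_seq_cst);
}
```

Calling `counter_add(&c, 5)` on a zero-initialised `atomic_int c` returns 0 and leaves `c` at 5, whichever instruction sequence the compiler picked. That is why disabling LSE alone would not make atomics to non-cacheable mappings safe.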