Date: Fri, 8 Mar 2019 11:55:49 +0000
From: Dave Martin
To: Ard Biesheuvel
Cc: Russell King - ARM Linux admin, Mikael Pettersson, Mikael Pettersson, Arnd Bergmann, Peter Zijlstra, Nick Desaulniers, LKML, Ingo Molnar, Darren Hart, Thomas Gleixner, Linux ARM
Subject: Re: [PATCH 2/2] ARM: futex: make futex_detect_cmpxchg more reliable
Message-ID: <20190308115546.GM3567@e103592.cambridge.arm.com>
References: <20190307091514.2489338-1-arnd@arndb.de> <20190307091514.2489338-2-arnd@arndb.de> <20190307234850.nsbpkfcit3lnmytu@shell.armlinux.org.uk> <20190308095308.hjjrzdp4fzbbtnnv@shell.armlinux.org.uk> <20190308103429.ycasmpt6tcpsoqps@shell.armlinux.org.uk>

On Fri, Mar 08, 2019 at 11:45:21AM +0100, Ard Biesheuvel wrote:
> On Fri, 8 Mar 2019 at 11:34, Russell King - ARM Linux admin wrote:
> >
> > On Fri, Mar 08, 2019 at 11:08:40AM +0100, Ard Biesheuvel wrote:
> > > On Fri, 8 Mar 2019 at 10:53, Russell King - ARM Linux admin wrote:
> > > >
> > > > On Fri, Mar 08, 2019 at 09:57:45AM +0100, Ard Biesheuvel wrote:
> > > > > On Fri, 8 Mar 2019 at 00:49, Russell King - ARM Linux admin wrote:
> > > > > >
> > > > > > On Thu, Mar 07, 2019 at 11:39:08AM -0800, Nick Desaulniers wrote:
> > > > > > > On Thu, Mar 7, 2019 at 1:15 AM Arnd Bergmann wrote:
> > > > > > > >
> > > > > > > > Passing registers containing zero as both the address (NULL pointer)
> > > > > > > > and data into cmpxchg_futex_value_locked() leads clang to assign
> > > > > > > > the same register for both inputs on ARM, which triggers a warning
> > > > > > > > explaining that this instruction has unpredictable behavior on ARMv5.
> > > > > > > >
> > > > > > > > /tmp/futex-7e740e.s: Assembler messages:
> > > > > > > > /tmp/futex-7e740e.s:12713: Warning: source register same as write-back base
> > > > > > > >
> > > > > > > > This patch was suggested by Mikael Pettersson back in 2011 (!) with gcc-4.4,
> > > > > > > > as Mikael wrote:
> > > > > > > > "One way of fixing this is to make uaddr an input/output register, since
> > > > > > > > "that prevents it from overlapping any other input or output."
> > > > > > > >
> > > > > > > > but then withdrawn as the warning was determined to be harmless, and it
> > > > > > > > apparently never showed up again with later gcc versions.
> > > > > > > >
> > > > > > > > Now the same problem is back when compiling with clang, and we are trying
> > > > > > > > to get clang to build the kernel without warnings, as gcc normally does.
> > > > > > > >
> > > > > > > > Cc: Mikael Pettersson
> > > > > > > > Cc: Mikael Pettersson
> > > > > > > > Cc: Dave Martin
> > > > > > > > Link: https://lore.kernel.org/linux-arm-kernel/20009.45690.158286.161591@pilspetsen.it.uu.se/
> > > > > > > > Signed-off-by: Arnd Bergmann
> > > > > > > > ---
> > > > > > > >  arch/arm/include/asm/futex.h | 10 +++++-----
> > > > > > > >  1 file changed, 5 insertions(+), 5 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h
> > > > > > > > index 0a46676b4245..79790912974e 100644
> > > > > > > > --- a/arch/arm/include/asm/futex.h
> > > > > > > > +++ b/arch/arm/include/asm/futex.h
> > > > > > > > @@ -110,13 +110,13 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
> > > > > > > >         preempt_disable();
> > > > > > > >         __ua_flags = uaccess_save_and_enable();
> > > > > > > >         __asm__ __volatile__("@futex_atomic_cmpxchg_inatomic\n"
> > > > > > > > -       "1: " TUSER(ldr) " %1, [%4]\n"
> > > > > > > > -       " teq %1, %2\n"
> > > > > > > > +       "1: " TUSER(ldr) " %1, [%2]\n"
> > > > > > > > +       " teq %1, %3\n"
> > > > > > > >         " it eq @ explicit IT needed for the 2b label\n"
> > > > > > > > -       "2: " TUSER(streq) " %3, [%4]\n"
> > > > > > > > +       "2: " TUSER(streq) " %4, [%2]\n"
> > > > > > > >         __futex_atomic_ex_table("%5")
> > > > > > > > -       : "+r" (ret), "=&r" (val)
> > > > > > > > -       : "r" (oldval), "r" (newval), "r" (uaddr), "Ir" (-EFAULT)
> > > > > > > > +       : "+&r" (ret), "=&r" (val), "+&r" (uaddr)
> > > > > > > > +       : "r" (oldval), "r" (newval), "Ir" (-EFAULT)
> > > > > > > >         : "cc", "memory");
> > > > > > > >         uaccess_restore(__ua_flags);
> > > > > > > >
> > > > > > > Underspecification of constraints to extended inline assembly is a
> > > > > > > common issue exposed by other compilers (and possibly but in-effect
> > > > > > > infrequently compiler upgrades).
> > > > > > > So the reordering of the constraints means the in the assembly (notes
> > > > > > > for other reviewers):
> > > > > > > %2 -> %3
> > > > > > > %3 -> %4
> > > > > > > %4 -> %2
> > > > > > > Yep, looks good to me, thanks for finding this old patch and resending, Arnd!
> > > > > >
> > > > > > I don't see what is "underspecified" in the original constraints.
> > > > > > Please explain.
> > > > > >
> > > > >
> > > > > I agree that that statement makes little sense.
> > > > >
> > > > > As Russell points out in the referenced thread, there is nothing wrong
> > > > > with the generated assembly, given that the UNPREDICTABLE opcode is
> > > > > unreachable in practice. Unfortunately, we have no way to flag this
> > > > > diagnostic as a known false positive, and AFAICT, there is no reason
> > > > > we couldn't end up with the same diagnostic popping up for GCC builds
> > > > > in the future, considering that the register assignment matches the
> > > > > constraints. (We have seen somewhat similar issues where constant
> > > > > folded function clones are emitted with a constant argument that could
> > > > > never occur in reality [0])
> > > > >
> > > > > Given the above, the only meaningful way to invoke this function is
> > > > > with different registers assigned to %3 and %4, and so tightening the
> > > > > constraints to guarantee that does not actually result in worse code
> > > > > (except maybe for the instantiations that we won't ever call in the
> > > > > first place). So I think we should fix this.
> > > > >
> > > > > I wonder if just adding
> > > > >
> > > > >     BUG_ON(__builtin_constant_p(uaddr));
> > > > >
> > > > > at the beginning makes any difference - this shouldn't result in any
> > > > > object code differences since the conditional will always evaluate to
> > > > > false at build time for instantiations we care about.
> > > > >
> > > > >
> > > > > [0] https://lore.kernel.org/lkml/9c74d635-d0d1-0893-8093-ce20b0933fc7@redhat.com/
> > > >
> > > > What I'm actually asking is:
> > > >
> > > > The GCC manual says that input operands _may_ overlap output operands
> > > > since GCC assumes that input operands are consumed before output
> > > > operands are written. This is an explicit statement.
> > > >
> > > > The GCC manual does not say that input operands may overlap with each
> > > > other, and the behaviour of GCC thus far (apart from one version,
> > > > presumably caused by a bug) has been that input operands are unique.
> > > >
> > >
> > > Not entirely. I have run into issues where GCC assumes that registers
> > > that are only used for input operands are left untouched by the asm
> > > code. I.e., if you put an asm() block in a loop and modify an input
> > > register, your code may break on the next pass, even if the input
> > > register does not overlap with an output register.
> >
> > GCC has had the expectation for decades that _input_ operands are not
> > changed in value by the code in the assembly. This isn't quite the
> > same thing as the uniqueness of the register allocation for input
> > operands.
> >
> > > To me, that seems to suggest that whether or not inputs may overlap is
> > > irrelevant, since they are not expected to be modified.
> >
> > How is:
> >
> >     stmfd sp!, {r0-r3, ip, lr}
> >     bl foo
> >     ldmfd sp!, {r0-r3, ip, lr}
> >
> > where r1 may be an input operand (to pass an argument to foo) any
> > different from:
> >
> >     ldrt r0, [r1]
> >
> > as far as whether r1 is modified in both cases? In both cases, the
> > value of r1 is read and written by both instructions, but in both
> > cases the value of r1 remains the same no matter what the value of r1
> > was.
> >
> > The "input operands should not be modified" is entirely orthogonal to
> > the input operand register allocation.
> >
>
> The question is whether it is reasonable for GCC to use the same
> register for input operands that have the same value. From the
> assumption that GCC makes that the asm will not modified follows
> directly that we can use the same register for different operands.

Whether "reasonable" or not, GCC does it.  And I don't think this is
new behaviour...

int f(void)
{
	int res;

	asm ("ADD %0, %0, %0" : "=r" (res) : "r" (77), "r" (77));

	return res;
}

->

00000000 <f>:
   0:   e3a0004d        mov     r0, #77 ; 0x4d
   4:   e0800000        add     r0, r0, r0
   8:   e12fff1e        bx      lr

> And in fact, since that asm code (when built in ARM mode) does modify
> the register, uaddr should not be an input operand to begin with. In
> other words, there is an actual bug here, and this patch fixes it.

Does the old code modify the register?  As I read it, the register is
written (in the ARM case) by the underlying STRT instruction, but since
the post-index offset is 0, the value written back is the same as the
value originally read.

In ARMv7-A,

	strt	r0, [r0], #imm

will store the original (unmodified) value of r0 if I read the
pseudocode correctly.  I can't remember the history, but I think that
older architecture versions provided a choice about whether the
unmodified or modified value was stored.

So gas probably just checks whether the registers are the same and
emits a warning to be on the safe side.  If #imm is 0 (as in the
existing futex code here) then it may make no difference in practice
though.

So, I'm not absolutely convinced there's a bug here, unless this is
truly specified as UNPREDICTABLE in older arch versions.

But the warning is at least annoying, and use of "&" to prevent the
compiler allocating things to the same register is already widespread
for "+" asm arguments, even for arm.

If the _value_ of the affected operand is not changed by the asm, I
think we don't strictly need "+", but we are using "&" here for its
register allocation side-effects, and "&" (for its original purpose at
least) is only applicable to output ("=" or "+") operands.

So I think the patch probably makes sense.

IMHO the gas documentation is misleading (or at least unhelpful).

Cheers
---Dave
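
The effect of marking uaddr as "+&r", as discussed above, can be sketched in portable C. This is only an illustration under stated assumptions: cmpxchg_sketch() is a hypothetical name, the asm template is deliberately left empty so that it builds on any GCC or clang target, and the ldr/teq/streq sequence is replaced by plain C. It is not the kernel's futex code.

```c
#include <stdint.h>

/*
 * Hypothetical sketch, not the kernel implementation.  Only the operand
 * constraints are of interest; the asm template itself is empty.
 */
static uint32_t cmpxchg_sketch(uint32_t *uaddr, uint32_t oldval,
			       uint32_t newval)
{
	uint32_t val;

	/*
	 * "+&r" (uaddr): read-write and earlyclobber.  The compiler may
	 * not place uaddr in a register that is also used for an input
	 * operand, so it cannot share a register with oldval or newval
	 * even when all three hold the same value (for example zero).
	 */
	__asm__ __volatile__(""
		: "+&r" (uaddr)
		: "r" (oldval), "r" (newval)
		: "memory");

	/* Plain C stand-in for the ldr/teq/streq sequence. */
	val = *uaddr;
	if (val == oldval)
		*uaddr = newval;

	return val;
}
```

Because the template is empty, the value of uaddr is unchanged; the "+" is there only because "&" applies to output operands, which is the same trade-off described in the message above.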