Received: by 2002:ac0:aed5:0:0:0:0:0 with SMTP id t21csp5773712imb; Fri, 8 Mar 2019 01:54:15 -0800 (PST) X-Google-Smtp-Source: APXvYqw04lpW6PuGr8Zkmy/Op5oIzqBZ3rw2G4hX48OMfkv9MVFituGdt6abx8greh4vPIBPRdQ4 X-Received: by 2002:a63:81c1:: with SMTP id t184mr15359802pgd.228.1552038855856; Fri, 08 Mar 2019 01:54:15 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1552038855; cv=none; d=google.com; s=arc-20160816; b=dFcD3Y/V/ppmoh1kpLKXx4wGpttGv42sAPUX7mxSnA2EX1vIijIcep2218CKpUx0NV zONE0A86f8+xShGMCm2OOnSnhJ2zRKvCmJFDddW4hGFEaCO2hGE/68POAqtlhAQzbfx+ Kj3OEH95j/JBmv547mBMhxVGZiwJdm/M/LXtMuOza88SUauB1z2wC7WIUvNH/k7Eom2p eqiAB/fMGyih1WKzCXFxd/4GBJ261KgVnsBqY2F8ixZmwI4JoFES8t+A+UhdGBYJFqtN cXSUx8f+5+CpBEHXdh8eppKIgsaajzQ2KaDVmXhJ9TbfiaiIte+XsDx58p0gZS+2B5ct DUGQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=0ZCJGvEfKtbagsuS/DZ8eedBtI4jyGFg1QPwORb4FmI=; b=CjtEnCsmSc6NJV0ry13d5T/BsVvu7GzSZ1+LC8ypI1/v9BjAQFGPd56G51JDZAxPcd 3INspozfWA60+pbRZ3c+sjQI01a99KOqfopQgex/wqYFHEaUZpbz3aiqx0fDiQb87L/y 2AAYUlbbHc7Pkr1fw57slNH8TAprpzAWxeZx9M3khbmN/aFx74vYX/WHQdB2PcK5CKqJ zEh0oqSStvVcDbyRlK6NvV8VYLTisp7Lk8I+gbf0bCj2fiB9QEhkjoNySG537T+/JMRc 7Fl2tP3k2Az3gm0YCcDl6lN1iXWi9jzTHhf/T/s5s/wtM2mnsvQKppR6h2opFBmMKOZu hDCQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail (test mode) header.i=@armlinux.org.uk header.s=pandora-2019 header.b=jyvuVYgf; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=armlinux.org.uk Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b14si6480845pfj.144.2019.03.08.01.54.00; Fri, 08 Mar 2019 01:54:15 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail (test mode) header.i=@armlinux.org.uk header.s=pandora-2019 header.b=jyvuVYgf; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=armlinux.org.uk Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726427AbfCHJx2 (ORCPT + 99 others); Fri, 8 Mar 2019 04:53:28 -0500 Received: from pandora.armlinux.org.uk ([78.32.30.218]:33482 "EHLO pandora.armlinux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726386AbfCHJx0 (ORCPT ); Fri, 8 Mar 2019 04:53:26 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=armlinux.org.uk; s=pandora-2019; h=Sender:In-Reply-To:Content-Type: MIME-Version:References:Message-ID:Subject:Cc:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=0ZCJGvEfKtbagsuS/DZ8eedBtI4jyGFg1QPwORb4FmI=; b=jyvuVYgfCEMZnuLKFRxdn/JsB AGJnPAUfbwitP9KbcodBcFgf80nGmDMDp2ZADaboUhF+2KPCOq/k+2Yyy60X1rsztGJ00P62rnvVz n7RV8wGg4S9ORkIZzx6ZaN5GDO1AfDCx7kt1otRyeT+8tq67rhQnXzJD8YW6XWxR7+ul6GoefBmIf 1Uejjg7JxyANBLhsBCg0kEMFMJsT1YXa0khlSM4L4wMRszrMf+VxxuisabPxun3CDcr+cOu0rzFjo jzwLADfo/WNMR/EpVkVheKaN4oy4F8eWo6L/iYNrePoalMgbEcjyEGhPY8TkgTfzO6X/XcyakRWQ8 QOQy6XQ4Q==; Received: from shell.armlinux.org.uk ([2002:4e20:1eda:1:5054:ff:fe00:4ec]:37314) by pandora.armlinux.org.uk with esmtpsa (TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256) (Exim 4.90_1) (envelope-from ) id 1h2CBm-000367-3U; Fri, 08 Mar 2019 09:53:14 +0000 Received: from linux by shell.armlinux.org.uk with local (Exim 4.89) (envelope-from ) id 1h2CBg-0005fX-GL; Fri, 08 Mar 2019 09:53:08 +0000 Date: Fri, 8 Mar 2019 09:53:08 +0000 From: Russell King - ARM Linux admin To: Ard Biesheuvel Cc: Nick Desaulniers , Mikael Pettersson , Mikael Pettersson , Arnd Bergmann , Peter Zijlstra , LKML , Ingo Molnar , Darren Hart , Thomas Gleixner , Dave Martin , Linux ARM Subject: Re: [PATCH 2/2] ARM: futex: make futex_detect_cmpxchg more reliable Message-ID: <20190308095308.hjjrzdp4fzbbtnnv@shell.armlinux.org.uk> References: <20190307091514.2489338-1-arnd@arndb.de> <20190307091514.2489338-2-arnd@arndb.de> <20190307234850.nsbpkfcit3lnmytu@shell.armlinux.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: NeoMutt/20170113 (1.7.2) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Mar 08, 2019 at 09:57:45AM +0100, Ard Biesheuvel wrote: > On Fri, 8 Mar 2019 at 00:49, Russell King - ARM Linux admin > wrote: > > > > On Thu, Mar 07, 2019 at 11:39:08AM -0800, Nick Desaulniers wrote: > > > On Thu, Mar 7, 2019 at 1:15 AM Arnd Bergmann wrote: > > > > > > > > Passing registers containing zero as both the address (NULL pointer) > > > > and data into cmpxchg_futex_value_locked() leads clang to assign > > > > the same register for both inputs on ARM, which triggers a warning > > > > explaining that this instruction has unpredictable behavior on ARMv5. > > > > > > > > /tmp/futex-7e740e.s: Assembler messages: > > > > /tmp/futex-7e740e.s:12713: Warning: source register same as write-back base > > > > > > > > This patch was suggested by Mikael Pettersson back in 2011 (!) with gcc-4.4, > > > > as Mikael wrote: > > > > "One way of fixing this is to make uaddr an input/output register, since > > > > "that prevents it from overlapping any other input or output." > > > > > > > > but then withdrawn as the warning was determined to be harmless, and it > > > > apparently never showed up again with later gcc versions. > > > > > > > > Now the same problem is back when compiling with clang, and we are trying > > > > to get clang to build the kernel without warnings, as gcc normally does. > > > > > > > > Cc: Mikael Pettersson > > > > Cc: Mikael Pettersson > > > > Cc: Dave Martin > > > > Link: https://lore.kernel.org/linux-arm-kernel/20009.45690.158286.161591@pilspetsen.it.uu.se/ > > > > Signed-off-by: Arnd Bergmann > > > > --- > > > > arch/arm/include/asm/futex.h | 10 +++++----- > > > > 1 file changed, 5 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h > > > > index 0a46676b4245..79790912974e 100644 > > > > --- a/arch/arm/include/asm/futex.h > > > > +++ b/arch/arm/include/asm/futex.h > > > > @@ -110,13 +110,13 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, > > > > preempt_disable(); > > > > __ua_flags = uaccess_save_and_enable(); > > > > __asm__ __volatile__("@futex_atomic_cmpxchg_inatomic\n" > > > > - "1: " TUSER(ldr) " %1, [%4]\n" > > > > - " teq %1, %2\n" > > > > + "1: " TUSER(ldr) " %1, [%2]\n" > > > > + " teq %1, %3\n" > > > > " it eq @ explicit IT needed for the 2b label\n" > > > > - "2: " TUSER(streq) " %3, [%4]\n" > > > > + "2: " TUSER(streq) " %4, [%2]\n" > > > > __futex_atomic_ex_table("%5") > > > > - : "+r" (ret), "=&r" (val) > > > > - : "r" (oldval), "r" (newval), "r" (uaddr), "Ir" (-EFAULT) > > > > + : "+&r" (ret), "=&r" (val), "+&r" (uaddr) > > > > + : "r" (oldval), "r" (newval), "Ir" (-EFAULT) > > > > : "cc", "memory"); > > > > uaccess_restore(__ua_flags); > > > > > > Underspecification of constraints to extended inline assembly is a > > > common issue exposed by other compilers (and possibly but in-effect > > > infrequently compiler upgrades). > > > So the reordering of the constraints means the in the assembly (notes > > > for other reviewers): > > > %2 -> %3 > > > %3 -> %4 > > > %4 -> %2 > > > Yep, looks good to me, thanks for finding this old patch and resending, Arnd! > > > > I don't see what is "underspecified" in the original constraints. > > Please explain. > > > > I agree that that statement makes little sense. > > As Russell points out in the referenced thread, there is nothing wrong > with the generated assembly, given that the UNPREDICTABLE opcode is > unreachable in practice. Unfortunately, we have no way to flag this > diagnostic as a known false positive, and AFAICT, there is no reason > we couldn't end up with the same diagnostic popping up for GCC builds > in the future, considering that the register assignment matches the > constraints. (We have seen somewhat similar issues where constant > folded function clones are emitted with a constant argument that could > never occur in reality [0]) > > Given the above, the only meaningful way to invoke this function is > with different registers assigned to %3 and %4, and so tightening the > constraints to guarantee that does not actually result in worse code > (except maybe for the instantiations that we won't ever call in the > first place). So I think we should fix this. > > I wonder if just adding > > BUG_ON(__builtin_constant_p(uaddr)); > > at the beginning makes any difference - this shouldn't result in any > object code differences since the conditional will always evaluate to > false at build time for instantiations we care about. > > > [0] https://lore.kernel.org/lkml/9c74d635-d0d1-0893-8093-ce20b0933fc7@redhat.com/ What I'm actually asking is: The GCC manual says that input operands _may_ overlap output operands since GCC assumes that input operands are consumed before output operands are written. This is an explicit statement. The GCC manual does not say that input operands may overlap with each other, and the behaviour of GCC thus far (apart from one version, presumably caused by a bug) has been that input operands are unique. Clang appears to be different: it allows input operands that are registers, and contain the same constant value to be the same physical register. The assertion is that the constraints are under-specified. I am questioning that assertion. If the constraints are under-specified, I would have expected gcc-4.4's behaviour to have persisted, and we would've been told by gcc's developers to fix our code. That didn't happen, and instead gcc seems to have been fixed. So, my conclusion is that it is intentional that input operands to asm() do not overlap with themselves. It seems to me that the work-around for clang is to change every input operand to be an output operand with a "+&r" contraint - an operand that is both read and written by the "instruction", and that the operand is "earlyclobber". For something that is really only read, that seems strange. Also, reading GCC's manual, it would appear that "+&" is wrong. `+' Means that this operand is both read and written by the instruction. When the compiler fixes up the operands to satisfy the constraints, it needs to know which operands are inputs to the instruction and which are outputs from it. `=' identifies an output; `+' identifies an operand that is both input and output; all other ^^^^^^^^^^^^^^^^^^^^^ operands are assumed to be input only. `&' Means (in a particular alternative) that this operand is an "earlyclobber" operand, which is modified before the instruction is finished using the input operands. Therefore, this operand may not lie in a register that is used as an input operand or as part ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ of any memory address. So "+" says that this operand is an input but "&" says that it must not be in a register that is used as an input. That's contradictory, and I think we can expect GCC to barf or at least end up doing strange stuff, if not with existing versions, then with future versions. Hence, I'm asking for clarification why it is thought that the existing code underspecifies the asm constraints, and I'm trying to get some more thought about what the constraints should be, in case there is a need to use "better" constraints. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up