Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753613AbdGUJLK (ORCPT ); Fri, 21 Jul 2017 05:11:10 -0400 Received: from mail-lf0-f65.google.com ([209.85.215.65]:37432 "EHLO mail-lf0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753349AbdGUJLI (ORCPT ); Fri, 21 Jul 2017 05:11:08 -0400 Subject: Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" To: Josh Poimboeuf Cc: Matthias Kaehlcke , Chris J Arges , Borislav Petkov , Thomas Gleixner , Ingo Molnar , "H . Peter Anvin" , "x86@kernel.org" , LKML , Douglas Anderson , Michael Davidson , Greg Hackmann , Nick Desaulniers , Stephen Hines , Kees Cook , Arnd Bergmann , =?UTF-8?Q?Bernhard_Rosenkr=c3=a4nzer?= References: <20170713184748.GF95735@google.com> <75850bb7-a60e-057d-d88b-afa0c79e94a0@gmail.com> <20170713203416.isvijqbwbcgupgj7@treble> <20170713211245.GG95735@google.com> <20170713213406.gx4ixkx6kxa4ppps@treble> <20170713215704.GJ95735@google.com> <20170719174630.kz5g553evcrnirmr@treble> <20170720151813.5wnpsb5wy7bqrpec@treble> <20170720205652.patvjnfqymrv73ho@treble> From: Andrey Ryabinin Message-ID: <7501b8fc-366d-239d-6358-a403a5bc0eab@gmail.com> Date: Fri, 21 Jul 2017 12:13:31 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0 MIME-Version: 1.0 In-Reply-To: <20170720205652.patvjnfqymrv73ho@treble> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3238 Lines: 74 On 07/20/2017 11:56 PM, Josh Poimboeuf wrote: > On Thu, Jul 20, 2017 at 06:30:24PM +0300, Andrey Ryabinin wrote: >> FWIW bellow is my understanding of what's going on. >> >> It seems clang treats local named register almost the same as ordinary >> local variables. >> The only difference is that before reading the register variable clang >> puts variable's value into the specified register. >> >> So clang just assigns stack slot for the variable __sp where it's >> going to keep variable's value. >> But since __sp is unitialized (we haven't assign anything to it), the >> value of the __sp is some garbage from stack. >> inline asm specifies __sp as input, so clang assumes that it have to >> load __sp into 'rsp' because inline asm is going to use >> it. And it just loads garbage from stack into 'rsp' >> >> In fact, such behavior (I mean storing the value on stack and loading >> into reg before the use) is very useful. >> Clang's behavior allows to keep the value assigned to the >> call-clobbered register across the function calls. >> >> Unlike clang, gcc assigns value to the register right away and doesn't >> store the value anywhere else. So if the reg is >> call clobbered register you have to be absolutely sure that there is >> no subsequent function call that might clobber the register. >> >> E.g. see some real examples >> https://patchwork.kernel.org/patch/4111971/ or 98d4ded60bda("msm: scm: >> Fix improper register assignment"). >> These bugs shouldn't happen with clang. >> >> But the global named register works slightly differently in clang. For >> the global, the value is just the value of the register itself, >> whatever it is. Read/write from global named register is just like >> direct read/write to the register > > Thanks, that clears up a lot of the confusion for me. > > Still, unfortunately, I don't think that's going to work for GCC. > Changing the '__sp' register variable to global in the header file > causes it to make a *bunch* of changes across the kernel, even in > functions which don't do inline asm. It seems to be disabling some > optimizations across the board. All I see is just bunch of reordering of independent instructions, like this: -ffffffff81012760: 5b pop %rbx -ffffffff81012761: 31 c0 xor %eax,%eax +ffffffff81012760: 31 c0 xor %eax,%eax +ffffffff81012762: 5b pop %rbx -ffffffff810c29ae: 48 83 c4 28 add $0x28,%rsp -ffffffff810c29b2: 89 d8 mov %ebx,%eax +ffffffff810c29ae: 89 d8 mov %ebx,%eax +ffffffff810c29b0: 48 83 c4 28 add $0x28,%rsp I haven't noticed any single bad/harmful change. The size of .text remained the same. And btw, arm/arm64 already use global current_stack_pointer just fine. > I do have another idea, which is to replace all uses of > > asm(" ... call foo ... " : outputs : inputs : clobbers); > > with a new ASM_CALL macro: > > ASM_CALL(" ... call foo ... ", outputs, inputs, clobbers); > > Then the compiler differences can be abstracted out, with GCC adding > "sp" as an output constraint and clang doing nothing (for now). >