Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751539AbdITBrP (ORCPT ); Tue, 19 Sep 2017 21:47:15 -0400 Received: from mx1.redhat.com ([209.132.183.28]:57526 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750998AbdITBrO (ORCPT ); Tue, 19 Sep 2017 21:47:14 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 2E30FC057F91 Authentication-Results: ext-mx08.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx08.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=jpoimboe@redhat.com Date: Tue, 19 Sep 2017 20:47:10 -0500 From: Josh Poimboeuf To: Alexander Potapenko Cc: x86@kernel.org, LKML , Ingo Molnar , Thomas Gleixner , "H. Peter Anvin" , Andy Lutomirski , Linus Torvalds , Dmitriy Vyukov , Matthias Kaehlcke , Arnd Bergmann , Peter Zijlstra , Andrey Ryabinin Subject: Re: [PATCH 2/2] x86/asm: Fix inline asm call constraints for clang Message-ID: <20170920014710.wa6iodurvds4kpjr@treble> References: <31e96e6bcfcb47725e15a093b9c31660dfaad430.1505846562.git.jpoimboe@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.6.0.1 (2016-04-01) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Wed, 20 Sep 2017 01:47:14 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5042 Lines: 102 On Tue, Sep 19, 2017 at 03:25:59PM -0700, Alexander Potapenko wrote: > On Tue, Sep 19, 2017 at 2:55 PM, Alexander Potapenko wrote: > > On Tue, Sep 19, 2017 at 11:45 AM, Josh Poimboeuf wrote: > >> For inline asm statements which have a CALL instruction, we list the > >> stack pointer as a constraint to convince GCC to ensure the frame > >> pointer is set up first: > >> > >> static inline void foo() > >> { > >> register void *__sp asm(_ASM_SP); > >> asm("call bar" : "+r" (__sp)) > >> } > >> > >> Unfortunately, that pattern causes clang to corrupt the stack pointer. > >> > >> There's actually an easier way to achieve the same goal in GCC, without > >> causing trouble for clang. If we declare the stack pointer register > >> variable as a global variable, and remove the constraint altogether, > >> that convinces GCC to always set up the frame pointer before inserting > >> *any* inline asm. > >> > >> It basically acts as if *every* inline asm statement has a CALL > >> instruction. It's a bit overkill, but the performance impact should be > >> negligible. > >> > >> Here are the vmlinux .text size differences with the following configs > >> on GCC: > >> > >> - defconfig > >> - defconfig without frame pointers > >> - Fedora distro config > >> - Fedora distro config without frame pointers > >> > >> defconfig defconfig-nofp distro distro-nofp > >> before 9796300 9466764 9076191 8789745 > >> after 9796941 9462859 9076381 8785325 > >> > >> With frame pointers, the text size increases slightly. Without frame > >> pointers, the text size decreases, and a little more significantly. > >> > >> Reported-by: Matthias Kaehlcke > >> Signed-off-by: Josh Poimboeuf > >> --- > >> arch/x86/include/asm/alternative.h | 3 +-- > >> arch/x86/include/asm/asm.h | 9 ++++++++ > >> arch/x86/include/asm/mshyperv.h | 27 ++++++++++-------------- > >> arch/x86/include/asm/paravirt_types.h | 14 ++++++------ > >> arch/x86/include/asm/preempt.h | 15 +++++-------- > >> arch/x86/include/asm/processor.h | 6 ++---- > >> arch/x86/include/asm/rwsem.h | 6 +++--- > >> arch/x86/include/asm/uaccess.h | 5 ++--- > >> arch/x86/include/asm/xen/hypercall.h | 5 ++--- > >> arch/x86/kvm/emulate.c | 3 +-- > >> arch/x86/kvm/vmx.c | 4 +--- > >> arch/x86/mm/fault.c | 3 +-- > >> tools/objtool/Documentation/stack-validation.txt | 20 +++++++++++++----- > >> 13 files changed, 60 insertions(+), 60 deletions(-) > >> > >> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h > >> index 1b020381ab38..7aeb1cef4204 100644 > >> --- a/arch/x86/include/asm/alternative.h > >> +++ b/arch/x86/include/asm/alternative.h > >> @@ -218,10 +218,9 @@ static inline int alternatives_text_reserved(void *start, void *end) > >> #define alternative_call_2(oldfunc, newfunc1, feature1, newfunc2, feature2, \ > >> output, input...) \ > >> { \ > >> - register void *__sp asm(_ASM_SP); \ > >> asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\ > >> "call %P[new2]", feature2) \ > >> - : output, "+r" (__sp) \ > >> + : output \ > >> : [old] "i" (oldfunc), [new1] "i" (newfunc1), \ > >> [new2] "i" (newfunc2), ## input); \ > >> } > >> diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h > >> index 676ee5807d86..ff8921d4615b 100644 > >> --- a/arch/x86/include/asm/asm.h > >> +++ b/arch/x86/include/asm/asm.h > >> @@ -132,4 +132,13 @@ > >> /* For C file, we already have NOKPROBE_SYMBOL macro */ > >> #endif > >> > >> +#ifndef __ASSEMBLY__ > >> +/* > >> + * This variable declaration has the side effect of forcing GCC to always set > >> + * up the frame pointer before inserting any inline asm. This is necessary > >> + * because some inline asm statements have CALL instructions. > >> + */ > >> +register unsigned int __sp_unused asm("esp"); > > Shouldn't this be "asm(_ASM_SP)"? > Answering my own question, this can't be _ASM_SP because of the > realmode code, which is built with -m16 whereas _ASM_SP is "rsp". > The patch works fine with "esp" though - most certainly declaring a > ESP variable is enough to reserve the whole RSP in an x86_64 build. Right, clang failing to build the realmode code was exactly why I did that. -- Josh