Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751540AbeACXnM (ORCPT + 1 other);
        Wed, 3 Jan 2018 18:43:12 -0500
Received: from mail-pl0-f65.google.com ([209.85.160.65]:33181
        "EHLO mail-pl0-f65.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1751048AbeACXnK (ORCPT );
        Wed, 3 Jan 2018 18:43:10 -0500
X-Google-Smtp-Source: ACJfBotx36C/YY8EXsBsfTAJcO9R4G+D9fB72kzANKDNOAz5MLLvKJ2szvFpJvAVV2uX28z3UQT3QQ==
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (1.0)
Subject: Re: CONFIG_PAGE_TABLE_ISOLATION=y on x86_64 causes gcc to segfault when building x86_32 binaries
From: Andy Lutomirski
X-Mailer: iPhone Mail (15C153)
In-Reply-To:
Date: Wed, 3 Jan 2018 15:43:08 -0800
Cc: Andy Lutomirski , Lars Wendler , LKML , X86 ML , Borislav Betkov ,
        Dave Hansen , Peter Zijlstra , Greg KH , Laura Abbott ,
        Boris Ostrovsky , Juergen Gross
Content-Transfer-Encoding: 8BIT
Message-Id: <3C2A7852-06E2-4C95-AD46-AE965B412818@amacapital.net>
References: <20180103123723.1dd26828@abudhabi.paradoxon.rec>
        <20180103143036.60e592eb@abudhabi.paradoxon.rec>
To: Thomas Gleixner
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Return-Path:

> On Jan 3, 2018, at 2:22 PM, Thomas Gleixner wrote:
>
>> On Wed, 3 Jan 2018, Andy Lutomirski wrote:
>>
>>> On Wed, Jan 3, 2018 at 10:52 AM, Thomas Gleixner wrote:
>>>> On Wed, 3 Jan 2018, Thomas Gleixner wrote:
>>>>
>>>>> On Wed, 3 Jan 2018, Lars Wendler wrote:
>>>>> Am Wed, 3 Jan 2018 13:05:38 +0100 (CET)
>>>>> schrieb Thomas Gleixner :
>>>>>> Also can you please try Linus v4.15-rc6 with PTI enabled so we can see
>>>>>> whether that's a backport issue or a general one?
>>>>>
>>>>> Same problem with 4.15-rc6. So I suppose that means it's a general
>>>>> issue.
>>>>
>>>> Just a shot in the dark as I just decoded another issue on a AMD CPU. Can
>>>> you please try the patch below?
>>>
>>> Ok. Found the real issue. This is a problem on AMD boxen.
>>>
>>> Fix below.
>>>
>>> Can Xen folks please have a look at that as well?
>>>
>>> Thanks,
>>>
>>> tglx
>>>
>>> 8<-------------------
>>>
>>>  arch/x86/entry/entry_64_compat.S | 13 ++++++-------
>>>  1 file changed, 6 insertions(+), 7 deletions(-)
>>>
>>> --- a/arch/x86/entry/entry_64_compat.S
>>> +++ b/arch/x86/entry/entry_64_compat.S
>>> @@ -190,8 +190,13 @@ ENTRY(entry_SYSCALL_compat)
>>>  	/* Interrupts are off on entry. */
>>>  	swapgs
>>>
>>> -	/* Stash user ESP and switch to the kernel stack. */
>>> +	/* Stash user ESP */
>>>  	movl %esp, %r8d
>>> +
>>> +	/* Use %rsp as scratch reg. User ESP is stashed in r8 */
>>> +	SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
>>> +
>>> +	/* Switch to the kernel stack */
>>>  	movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>>>
>>>  	/* Construct struct pt_regs on stack */
>>> @@ -220,12 +225,6 @@ GLOBAL(entry_SYSCALL_compat_after_hwfram
>>>  	pushq $0 /* pt_regs->r15 = 0 */
>>>
>>>  	/*
>>> -	 * We just saved %rdi so it is safe to clobber. It is not
>>> -	 * preserved during the C calls inside TRACE_IRQS_OFF anyway.
>>> -	 */
>>> -	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
>>> -
>>> -	/*
>>>  	 * User mode is traced as though IRQs are on, and SYSENTER
>>>  	 * turned them off.
>>>  	 */
>>
>> What's the issue that this is fixing?
>
>>> movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>
> before switching CR3 is obviously broken ...
>
>

Duh. This is what happens when we have five hundred versions of the patches and we change how it all works half way through. And the 0day bot doesn't test the AMD path.

>