Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751624AbdLZXR3 (ORCPT ); Tue, 26 Dec 2017 18:17:29 -0500 Received: from mail-qt0-f195.google.com ([209.85.216.195]:33285 "EHLO mail-qt0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751357AbdLZXR2 (ORCPT ); Tue, 26 Dec 2017 18:17:28 -0500 X-Google-Smtp-Source: ACJfBovqseEHkrec5m5OOuE86o1hUK1YqAEeajB/E28fo+8Nfbd+uGFaDeqSXqtqtySiF8Sq9+48Xg== Date: Tue, 26 Dec 2017 18:19:00 -0500 From: Alexandru Chirvasitu To: Linus Torvalds Cc: Andy Lutomirski , Thomas Gleixner , kernel list , Borislav Petkov , Brian Gerst , Denys Vlasenko , "H. Peter Anvin" , Josh Poimboeuf , Peter Zijlstra , Steven Rostedt , Ingo Molnar Subject: Re: PROBLEM: consolidated IDT invalidation causes kexec to reboot Message-ID: <20171226231900.GB1410@arch-chirva.localdomain> References: <20171224014415.GA5663@chirva-void> <20171225212934.GA1410@arch-chirva.localdomain> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.2 (2017-12-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7527 Lines: 216 On Tue, Dec 26, 2017 at 10:51:12AM -0800, Linus Torvalds wrote: > [ Sorry, I was off-line on Christmas Eve due to festivities, and then > yesterday because I've apparently caught a cold. > > Still not back to normal, but at least I can sit in front of the > computer again ] > > On Mon, Dec 25, 2017 at 1:29 PM, Alexandru Chirvasitu > wrote: > > > > On Mon, Dec 25, 2017 at 06:40:14AM -0800, Andy Lutomirski wrote: > >> > >> This is presumably the same call-tracing-without-TLS-working problem. > >> idt_invalidate() is out-of-line and is compiled with full tracing on, > > Yeah. The difference literally seems to be that in the old case it was > accidentally inlined. > > I say "accidentally", because "load_idt()" itself is explicitly > inlined, but the "set_idt()" in machine_kexec_32.c was not. > > But before that commit ("e802a51ede91 x86/idt: Consolidate IDT > invalidation") the compiler inlined it anyway because it was a small > static function. > > Afterwards, not so much (different C file), and then the stack tracing > code blew up because of the incomplete CPU state. > > > This works.. I went back to the troublesome commit e802a51 and > > modified it as follows: > > > > +/** > > + * idt_invalidate - Invalidate interrupt descriptor table > > + * @addr: The virtual address of the 'invalid' IDT > > + */ > > +static inline void idt_invalidate(void *addr) > > +{ > > + struct desc_ptr idt = { .address = (unsigned long) addr, .size = 0 }; > > + > > + load_idt(&idt); > > +} > > Yes, I suspect that is the right thing to do. It's small enough that > inliningh it makes sense. > > HOWEVER. Would you mind testing a totally different fix instead? > > In particular, take the current top of tree (that doesn't work for > you), and try to just change the order of these two lines: > > set_gdt(phys_to_virt(0), 0); > idt_invalidate(phys_to_virt(0)); > > in arch/x86/kernel/machine_kexec_32.c. > > I think it's a better idea to invalidate the IDT first, because that > is only used for exceptions. In contrast, invalidating the GDT will > obviously make any segment load do horrible things, _and_ any > exceptions would fail anyway (because exceptions need segments too). > > So in many ways, that "set_get()" that invalidates the GDT is the more > destructive thing, and should be done last. > > And if we do it last, maybe the whole "oops, we have tracing code > enabled" thing wouldn't have mattered. > > Does that trivial line switching make the old broken config work for you again? I've tried three more kernels just now: (1) I went back to the initial problematic commit e802a51 and modified it as you suggest: ---------------------------------------------------------- interchanged invalidation instructions in machine_kexec_32 diff --git a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c index 00bc751..4ebf6bf 100644 --- a/arch/x86/kernel/machine_kexec_32.c +++ b/arch/x86/kernel/machine_kexec_32.c @@ -232,8 +232,8 @@ void machine_kexec(struct kimage *image) * The gdt & idt are now invalid. * If you want to load them you must set up your own idt & gdt. */ - set_gdt(phys_to_virt(0), 0); idt_invalidate(phys_to_virt(0)); + set_gdt(phys_to_virt(0), 0); /* now call it */ image->start = relocate_kernel_ptr((unsigned long)image->head, ---------------------------------------------------------- This did not work out for me, but now it fails differently. Both (kexec -l + kexec -e) and (kexec -p + echo c > /proc/sysrq-trigger) end in call traces and freezes. It does seem to be tied to idt_invalidate. One of the last things I see on the screen (which is ends up frozen with the computer inactive) is EIP: idt_invalidate+0x6/0x40 SS:ESP: 0068:f6c47cd0 Well, that address at the end changes on different iterations of this. I also see the usual 'Kernel panic: not syncing', as well as Shuttind down CPUs with NMI and another worrisome line higher up: CPU:0 PID: 682 comm: kexec Tainted G None of this seems to register in logs I can send. For instance, I've grepped -r for 'invalidate' in /var/log/ with no hits. (2) In order to look into the argument-of-idt_invalidate issue, I took commit (1) above and changed it to ---------------------------------------------------------- call idt_invalidate(0) in machine_kexec_32 diff --git a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c index 4ebf6bf..71bd3c0 100644 --- a/arch/x86/kernel/machine_kexec_32.c +++ b/arch/x86/kernel/machine_kexec_32.c @@ -232,7 +232,7 @@ void machine_kexec(struct kimage *image) * The gdt & idt are now invalid. * If you want to load them you must set up your own idt & gdt. */ - idt_invalidate(phys_to_virt(0)); + idt_invalidate(0); set_gdt(phys_to_virt(0), 0); /* now call it */ ---------------------------------------------------------- Same issues as noted in (1) above. I suppose we were expecting this, but since I'm doing this anyway, I figured might as well do it with some degree of thoroughness. (3) Also related to the argument issue: I went back to the commit I described in my previous message (making idt_invalidate static inline in the header) and made the identical argument-0 modification to *that* ---------------------------------------------------------- call idt_invalidate(0) in machine_kexec_32 diff --git a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c index 00bc751..36c1b27 100644 --- a/arch/x86/kernel/machine_kexec_32.c +++ b/arch/x86/kernel/machine_kexec_32.c @@ -233,7 +233,7 @@ void machine_kexec(struct kimage *image) * If you want to load them you must set up your own idt & gdt. */ set_gdt(phys_to_virt(0), 0); - idt_invalidate(phys_to_virt(0)); + idt_invalidate(0); /* now call it */ image->start = relocate_kernel_ptr((unsigned long)image->head, ---------------------------------------------------------- All good here. So passing the argument 0 to idt_invalidate seems to make no difference, either to kexec or to reboot (which also works fine). > > > kexec now works as expected; tested repeatedly, both with direct > > execution and crash triggering. > > > > I had to google 'inline function' :)). > > We'll make a kernel developer out of you yet. You've already found the > most important development tool (I kid, I kid. Google is useful, but > "willingness to try things out" is actually the #1 thing). > > Mind googling "linux kernel patch submission" and adding the required > sign-off, and I suspect the x86 people will happily take your patch? > > That said, I do wonder about a few things: > > - the 'addr' argument is pointless, afaik. I *suspect* it used to be > 0, and then some mindless editing just changed it to that > "phys_to_virt(0)". > > With a zero length, it shouldn't matter what the actual IDT base > address actually is. Any access is going to trap regardless. > > - some people were clearly aware of just how critical that whole > "load_idt()" sequence were, because things were marked "inline" and > "NOKPROBE_SYMBOL()" etc, but there was no comment in the code that > actually did this about how the machine state is total garbage after > the "set_gdt()" in machine_kexec(). > > - the above "I think we should invalidate GDT last" issue. > > Hmm? > > Linus