Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751655AbdLZX4v convert rfc822-to-8bit (ORCPT ); Tue, 26 Dec 2017 18:56:51 -0500 Received: from terminus.zytor.com ([65.50.211.136]:33697 "EHLO mail.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751163AbdLZX4u (ORCPT ); Tue, 26 Dec 2017 18:56:50 -0500 Date: Tue, 26 Dec 2017 15:45:28 -0800 User-Agent: K-9 Mail for Android In-Reply-To: <20171226231900.GB1410@arch-chirva.localdomain> References: <20171224014415.GA5663@chirva-void> <20171225212934.GA1410@arch-chirva.localdomain> <20171226231900.GB1410@arch-chirva.localdomain> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8BIT Subject: Re: PROBLEM: consolidated IDT invalidation causes kexec to reboot To: Alexandru Chirvasitu , Linus Torvalds CC: Andy Lutomirski , Thomas Gleixner , kernel list , Borislav Petkov , Brian Gerst , Denys Vlasenko , Josh Poimboeuf , Peter Zijlstra , Steven Rostedt , Ingo Molnar From: hpa@zytor.com Message-ID: Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 8139 Lines: 239 On December 26, 2017 3:19:00 PM PST, Alexandru Chirvasitu wrote: >On Tue, Dec 26, 2017 at 10:51:12AM -0800, Linus Torvalds wrote: >> [ Sorry, I was off-line on Christmas Eve due to festivities, and then >> yesterday because I've apparently caught a cold. >> >> Still not back to normal, but at least I can sit in front of the >> computer again ] >> >> On Mon, Dec 25, 2017 at 1:29 PM, Alexandru Chirvasitu >> wrote: >> > >> > On Mon, Dec 25, 2017 at 06:40:14AM -0800, Andy Lutomirski wrote: >> >> >> >> This is presumably the same call-tracing-without-TLS-working >problem. >> >> idt_invalidate() is out-of-line and is compiled with full tracing >on, >> >> Yeah. The difference literally seems to be that in the old case it >was >> accidentally inlined. >> >> I say "accidentally", because "load_idt()" itself is explicitly >> inlined, but the "set_idt()" in machine_kexec_32.c was not. >> >> But before that commit ("e802a51ede91 x86/idt: Consolidate IDT >> invalidation") the compiler inlined it anyway because it was a small >> static function. >> >> Afterwards, not so much (different C file), and then the stack >tracing >> code blew up because of the incomplete CPU state. >> >> > This works.. I went back to the troublesome commit e802a51 and >> > modified it as follows: >> > >> > +/** >> > + * idt_invalidate - Invalidate interrupt descriptor table >> > + * @addr: The virtual address of the 'invalid' IDT >> > + */ >> > +static inline void idt_invalidate(void *addr) >> > +{ >> > + struct desc_ptr idt = { .address = (unsigned long) addr, >.size = 0 }; >> > + >> > + load_idt(&idt); >> > +} >> >> Yes, I suspect that is the right thing to do. It's small enough that >> inliningh it makes sense. >> >> HOWEVER. Would you mind testing a totally different fix instead? >> >> In particular, take the current top of tree (that doesn't work for >> you), and try to just change the order of these two lines: >> >> set_gdt(phys_to_virt(0), 0); >> idt_invalidate(phys_to_virt(0)); >> >> in arch/x86/kernel/machine_kexec_32.c. >> >> I think it's a better idea to invalidate the IDT first, because that >> is only used for exceptions. In contrast, invalidating the GDT will >> obviously make any segment load do horrible things, _and_ any >> exceptions would fail anyway (because exceptions need segments too). >> >> So in many ways, that "set_get()" that invalidates the GDT is the >more >> destructive thing, and should be done last. >> >> And if we do it last, maybe the whole "oops, we have tracing code >> enabled" thing wouldn't have mattered. >> >> Does that trivial line switching make the old broken config work for >you again? > >I've tried three more kernels just now: > >(1) > >I went back to the initial problematic commit e802a51 and modified it >as you suggest: > >---------------------------------------------------------- > > interchanged invalidation instructions in machine_kexec_32 > >diff --git a/arch/x86/kernel/machine_kexec_32.c >b/arch/x86/kernel/machine_kexec_32.c >index 00bc751..4ebf6bf 100644 >--- a/arch/x86/kernel/machine_kexec_32.c >+++ b/arch/x86/kernel/machine_kexec_32.c >@@ -232,8 +232,8 @@ void machine_kexec(struct kimage *image) > * The gdt & idt are now invalid. > * If you want to load them you must set up your own idt & gdt. > */ >- set_gdt(phys_to_virt(0), 0); > idt_invalidate(phys_to_virt(0)); >+ set_gdt(phys_to_virt(0), 0); > > /* now call it */ > image->start = relocate_kernel_ptr((unsigned long)image->head, > >---------------------------------------------------------- > >This did not work out for me, but now it fails differently. Both >(kexec -l + kexec -e) and (kexec -p + echo c > /proc/sysrq-trigger) >end in call traces and freezes. > >It does seem to be tied to idt_invalidate. One of the last things I >see on the screen (which is ends up frozen with the computer inactive) >is > >EIP: idt_invalidate+0x6/0x40 SS:ESP: 0068:f6c47cd0 > >Well, that address at the end changes on different iterations of >this. I also see the usual 'Kernel panic: not syncing', as well as > >Shuttind down CPUs with NMI > >and another worrisome line higher up: > >CPU:0 PID: 682 comm: kexec Tainted G > >None of this seems to register in logs I can send. For instance, I've >grepped -r for 'invalidate' in /var/log/ with no hits. > > > >(2) > >In order to look into the argument-of-idt_invalidate issue, I took >commit (1) above and changed it to > >---------------------------------------------------------- > > call idt_invalidate(0) in machine_kexec_32 > >diff --git a/arch/x86/kernel/machine_kexec_32.c >b/arch/x86/kernel/machine_kexec_32.c >index 4ebf6bf..71bd3c0 100644 >--- a/arch/x86/kernel/machine_kexec_32.c >+++ b/arch/x86/kernel/machine_kexec_32.c >@@ -232,7 +232,7 @@ void machine_kexec(struct kimage *image) > * The gdt & idt are now invalid. > * If you want to load them you must set up your own idt & gdt. > */ >- idt_invalidate(phys_to_virt(0)); >+ idt_invalidate(0); > set_gdt(phys_to_virt(0), 0); > > /* now call it */ > >---------------------------------------------------------- > >Same issues as noted in (1) above. I suppose we were expecting this, >but since I'm doing this anyway, I figured might as well do it with >some degree of thoroughness. > > > >(3) > >Also related to the argument issue: I went back to the commit I >described in my previous message (making idt_invalidate static inline >in the header) and made the identical argument-0 modification to *that* > >---------------------------------------------------------- > > call idt_invalidate(0) in machine_kexec_32 > >diff --git a/arch/x86/kernel/machine_kexec_32.c >b/arch/x86/kernel/machine_kexec_32.c >index 00bc751..36c1b27 100644 >--- a/arch/x86/kernel/machine_kexec_32.c >+++ b/arch/x86/kernel/machine_kexec_32.c >@@ -233,7 +233,7 @@ void machine_kexec(struct kimage *image) > * If you want to load them you must set up your own idt & gdt. > */ > set_gdt(phys_to_virt(0), 0); >- idt_invalidate(phys_to_virt(0)); >+ idt_invalidate(0); > > /* now call it */ > image->start = relocate_kernel_ptr((unsigned long)image->head, > >---------------------------------------------------------- > >All good here. So passing the argument 0 to idt_invalidate seems to >make no difference, either to kexec or to reboot (which also works >fine). > > > > >> >> > kexec now works as expected; tested repeatedly, both with direct >> > execution and crash triggering. >> > >> > I had to google 'inline function' :)). >> >> We'll make a kernel developer out of you yet. You've already found >the >> most important development tool (I kid, I kid. Google is useful, but >> "willingness to try things out" is actually the #1 thing). >> >> Mind googling "linux kernel patch submission" and adding the required >> sign-off, and I suspect the x86 people will happily take your patch? >> >> That said, I do wonder about a few things: >> >> - the 'addr' argument is pointless, afaik. I *suspect* it used to be >> 0, and then some mindless editing just changed it to that >> "phys_to_virt(0)". >> >> With a zero length, it shouldn't matter what the actual IDT base >> address actually is. Any access is going to trap regardless. >> >> - some people were clearly aware of just how critical that whole >> "load_idt()" sequence were, because things were marked "inline" and >> "NOKPROBE_SYMBOL()" etc, but there was no comment in the code that >> actually did this about how the machine state is total garbage after >> the "set_gdt()" in machine_kexec(). >> >> - the above "I think we should invalidate GDT last" issue. >> >> Hmm? >> >> Linus Again, to me this code is insane. It works out to 3 assembly instructions which *better* be sequential: LIDT LGDT JMP This is exactly the kind of thing to use an assembly stub or even just a single asm() statement for. -- Sent from my Android device with K-9 Mail. Please excuse my brevity.