Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751320AbdLZTjN convert rfc822-to-8bit (ORCPT ); Tue, 26 Dec 2017 14:39:13 -0500 Received: from terminus.zytor.com ([65.50.211.136]:56647 "EHLO mail.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750854AbdLZTjM (ORCPT ); Tue, 26 Dec 2017 14:39:12 -0500 Date: Tue, 26 Dec 2017 11:26:22 -0800 User-Agent: K-9 Mail for Android In-Reply-To: References: <20171224014415.GA5663@chirva-void> <20171225212934.GA1410@arch-chirva.localdomain> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8BIT Subject: Re: PROBLEM: consolidated IDT invalidation causes kexec to reboot To: Linus Torvalds , Alexandru Chirvasitu CC: Andy Lutomirski , Thomas Gleixner , kernel list , Borislav Petkov , Brian Gerst , Denys Vlasenko , Josh Poimboeuf , Peter Zijlstra , Steven Rostedt , Ingo Molnar From: hpa@zytor.com Message-ID: <8A693E1E-4E49-4BD9-B956-EF4AAA79101A@zytor.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4099 Lines: 110 On December 26, 2017 10:51:12 AM PST, Linus Torvalds wrote: >[ Sorry, I was off-line on Christmas Eve due to festivities, and then >yesterday because I've apparently caught a cold. > > Still not back to normal, but at least I can sit in front of the >computer again ] > >On Mon, Dec 25, 2017 at 1:29 PM, Alexandru Chirvasitu > wrote: >> >> On Mon, Dec 25, 2017 at 06:40:14AM -0800, Andy Lutomirski wrote: >>> >>> This is presumably the same call-tracing-without-TLS-working >problem. >>> idt_invalidate() is out-of-line and is compiled with full tracing >on, > >Yeah. The difference literally seems to be that in the old case it was >accidentally inlined. > >I say "accidentally", because "load_idt()" itself is explicitly >inlined, but the "set_idt()" in machine_kexec_32.c was not. > >But before that commit ("e802a51ede91 x86/idt: Consolidate IDT >invalidation") the compiler inlined it anyway because it was a small >static function. > >Afterwards, not so much (different C file), and then the stack tracing >code blew up because of the incomplete CPU state. > >> This works.. I went back to the troublesome commit e802a51 and >> modified it as follows: >> >> +/** >> + * idt_invalidate - Invalidate interrupt descriptor table >> + * @addr: The virtual address of the 'invalid' IDT >> + */ >> +static inline void idt_invalidate(void *addr) >> +{ >> + struct desc_ptr idt = { .address = (unsigned long) addr, >.size = 0 }; >> + >> + load_idt(&idt); >> +} > >Yes, I suspect that is the right thing to do. It's small enough that >inliningh it makes sense. > >HOWEVER. Would you mind testing a totally different fix instead? > >In particular, take the current top of tree (that doesn't work for >you), and try to just change the order of these two lines: > > set_gdt(phys_to_virt(0), 0); > idt_invalidate(phys_to_virt(0)); > >in arch/x86/kernel/machine_kexec_32.c. > >I think it's a better idea to invalidate the IDT first, because that >is only used for exceptions. In contrast, invalidating the GDT will >obviously make any segment load do horrible things, _and_ any >exceptions would fail anyway (because exceptions need segments too). > >So in many ways, that "set_get()" that invalidates the GDT is the more >destructive thing, and should be done last. > >And if we do it last, maybe the whole "oops, we have tracing code >enabled" thing wouldn't have mattered. > >Does that trivial line switching make the old broken config work for >you again? > >> kexec now works as expected; tested repeatedly, both with direct >> execution and crash triggering. >> >> I had to google 'inline function' :)). > >We'll make a kernel developer out of you yet. You've already found the >most important development tool (I kid, I kid. Google is useful, but >"willingness to try things out" is actually the #1 thing). > >Mind googling "linux kernel patch submission" and adding the required >sign-off, and I suspect the x86 people will happily take your patch? > >That said, I do wonder about a few things: > > - the 'addr' argument is pointless, afaik. I *suspect* it used to be >0, and then some mindless editing just changed it to that >"phys_to_virt(0)". > > With a zero length, it shouldn't matter what the actual IDT base >address actually is. Any access is going to trap regardless. > > - some people were clearly aware of just how critical that whole >"load_idt()" sequence were, because things were marked "inline" and >"NOKPROBE_SYMBOL()" etc, but there was no comment in the code that >actually did this about how the machine state is total garbage after >the "set_gdt()" in machine_kexec(). > > - the above "I think we should invalidate GDT last" issue. > >Hmm? > > Linus Can anyone explain why on Earth we do a phys_to_virt() instead of just stuffing in a zero, since this is a null pointer anyway?! The other option would be to use the real mode IDT settings (address 0, limit 0xffff although 0x3ff is equivalent.) -- Sent from my Android device with K-9 Mail. Please excuse my brevity.