From: Linus Torvalds
Date: Thu, 14 Dec 2017 13:39:05 -0800
Subject: Re: BUG: unable to handle kernel paging request in __switch_to
To: Andy Lutomirski
Cc: Thomas Gleixner, syzbot, Borislav Petkov, Dmitry Safonov, Peter Anvin,
    Linux Kernel Mailing List, Kyle Huey, Ingo Molnar,
    syzkaller-bugs@googlegroups.com, "the arch/x86 maintainers"
References: <001a1145e8548cbd3d055f73374f@google.com>

On Thu, Dec 14, 2017 at 1:27 PM, Andy Lutomirski wrote:
> On Thu, Dec 14, 2017 at 11:28 AM, Linus Torvalds wrote:
>> I don't think that's the case. "int3" is entirely synchronous, and
>> doesn't have the same odd issues as a breakpoint trap (which honors RF
>> etc). It's literally just a one-byte shorthand for "int $3".
>
> The SDM says precisely the same thing about INT N, so, whichever way
> you dice it, int3 is a benign exception.

That just means that it doesn't double-fault when it takes the page
fault. Which we already know, because we see a page fault, not a
double fault.

> 0xfffffffffffffff8 is *exactly* where the fault would be if the
> microcoded push of SS faulted if the IST contained zeros.

Yes, I suspect it's the stack that is buggered for some reason.

>> Plus I think the instruction that gets overwritten is just a 5-byte
>> nop isn't it? So it really shouldn't take a fault without the "int3"
>> overwriting.
>
> Unless it was being overwritten the other way and the oops hit while
> tracing was being turned *off*.

Doesn't really matter. The two forms of that instruction are "5-byte
nop" and "unconditional branch". Neither of them will write to
anything - the only page fault they could take is for instruction
fetch.

So it really must be the "int3" that fails. Unless we're looking at
some odd CPU errata, which sounds very very unlikely.

             Linus
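
[Editor's illustration, not part of the original message.] The address
0xfffffffffffffff8 quoted above falls out of simple unsigned arithmetic: if
the stack pointer loaded for the exception is zero, the first 8-byte push
(SS) lands 8 bytes below zero, which wraps around to the top of the 64-bit
address space. The sketch below only models that arithmetic. The function
name first_push_address is hypothetical, and the 16-byte alignment step is
an assumption about how the CPU prepares the new stack pointer in 64-bit
mode; it does not change the result for a zero pointer.

```c
#include <stdint.h>

/*
 * Hypothetical helper: where the first 8-byte push would be written if
 * exception delivery switched to a stack whose top is `stack_top`.
 *
 * Assumption: the new stack pointer is aligned down to 16 bytes before
 * the first push. For stack_top == 0 this step is a no-op.
 */
static uint64_t first_push_address(uint64_t stack_top)
{
    uint64_t rsp = stack_top & ~(uint64_t)0xf; /* align down to 16 bytes */

    /* Unsigned subtraction wraps modulo 2^64, which is well defined in C. */
    return rsp - 8;
}
```

With a zeroed stack pointer, first_push_address(0) evaluates to
0xfffffffffffffff8, matching the faulting address discussed in the thread.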