Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754115AbdLNSzJ (ORCPT ); Thu, 14 Dec 2017 13:55:09 -0500
Received: from mail.kernel.org ([198.145.29.99]:56656 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1753800AbdLNSzI (ORCPT ); Thu, 14 Dec 2017 13:55:08 -0500
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C45F4218DA
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=luto@kernel.org
X-Google-Smtp-Source: ACJfBotqa9+k1Kvg3ApaeskGOZM0sV4aHVdQa1t0d5PxhMYrJoWGXl0yNWnjl5iHLcFYL0421yveN4eEVjODDuJM8NY=
MIME-Version: 1.0
In-Reply-To:
References: <001a1145e8548cbd3d055f73374f@google.com>
From: Andy Lutomirski
Date: Thu, 14 Dec 2017 10:54:46 -0800
X-Gmail-Original-Message-ID:
Message-ID:
Subject: Re: BUG: unable to handle kernel paging request in __switch_to
To: Linus Torvalds
Cc: Thomas Gleixner, syzbot, Borislav Petkov, Dmitry Safonov,
        Peter Anvin, Linux Kernel Mailing List, Andrew Lutomirski,
        Kyle Huey, Ingo Molnar, syzkaller-bugs@googlegroups.com,
        "the arch/x86 maintainers"
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3608
Lines: 84

On Thu, Dec 14, 2017 at 10:42 AM, Linus Torvalds wrote:
> On Thu, Dec 14, 2017 at 9:12 AM, Thomas Gleixner wrote:
>> On Sun, 3 Dec 2017, syzbot wrote:
>>> BUG: unable to handle kernel paging request at fffffffffffffff8
>>> Oops: 0002 [#1] SMP KASAN
>
> System write of a non-existent page.
>
>>> RIP: 0010:switch_fpu_prepare arch/x86/include/asm/fpu/internal.h:535 [inline]
>>> RIP: 0010:__switch_to+0x95b/0x1330 arch/x86/kernel/process_64.c:407
>
> This says it's
>
>         old_fpu->last_cpu = cpu;
>
> and the code disassembly ends up looking something like this:
>
>    0:   48 c1 ea 03             shr    $0x3,%rdx
>    4:   0f b6 04 02             movzbl (%rdx,%rax,1),%eax
>    8:   84 c0                   test   %al,%al
>    a:   74 08                   je     0x14
>    c:   3c 03                   cmp    $0x3,%al
>    e:   0f 8e d5 06 00 00       jle    0x6e9
>   14:   8b 85 70 fe ff ff       mov    -0x190(%rbp),%eax
>   1a:   41 89 84 24 c0 15 00    mov    %eax,0x15c0(%r12)
>   21:   00
>   22:*  cc                      int3       <-- trapping instruction
>
> where that preceding two "mov" instructions look like it might indeed be that
>
>         old_fpu->last_cpu = cpu;
>
> thing, and the register state doesn't look insane for this.
>
> So I think the RIP->line encoding is slightly off, and that "int3" is
> almost certainly due to the very next thing after the write:
>
>         trace_x86_fpu_regs_deactivated(old_fpu);
>
> and that actually makes sense if the test robot is doing some tracing,
> particularly if it's just about to _start_ tracing, and it has
> replaced the first byte of the instruction with 'int3' and is in the
> process of doing the rewrite.
>
> The fact that it then takes a system write fault is because some GDT
> or IDT setup is screwed up. Or possibly the stack is screwed up and
> started out as 0, and then the push to the stack would decrement the
> stack pointer and try to push the error state or something.
>
>> That's the second report I'm staring at today which has CR2
>> fffffffffffffffx and points to a faulting instruction which does not make
>> any sense at all.
>
> That actually does make sense - see above. It just requires that race
> with the instruction rewriting.
>
> *Normally* we never actually take the "int3" exception, because
> normally we'll have completed the rewrite before another CPU actually
> executes the instruction that is being rewritten.
>
> So I'm assuming this is with the page table isolation, and some
> unusual case in exception handling got screwed up.

SDM time.  Assuming the CPU actually decoded int3 and tried to execute
it, I can see a couple possible outcomes:

1. Something's wrong with the IDT and it can't read the vector.  I
think this would end up triple-faulting, though.

2. It actually tries to handle the breakpoint.  A breakpoint is a
benign exception, so any exception encountered while delivering it
would result in serial delivery.  I've never thought that serial
delivery made any sense -- presumably it just cancels the breakpoint
and delivers the other exception.

So this *could* be a page fault hit during delivery of the int3
exception.  I don't believe it's a GDT problem, though, because that
would also likely lead to a triple fault.  What I *would* believe is
that the IST table got messed up and we're seeing the result of trying
to push to the stack with the initial RSP=0 so the fault hits at
address -8.  I have no idea how that would happen, though.  Especially
since int3 from userspace would have exactly the same problem, and we
exercise that code in the selftests.
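
For anyone who wants to check the arithmetic behind the two numbers in
the report, here is a rough sketch.  It decodes the "Oops: 0002" error
code using the architectural page-fault error code bits, and it shows
where the first 8-byte push lands if the stack pointer starts at 0.
The helper names are made up for illustration and are not kernel
functions; this only demonstrates the numbers, it says nothing about
how the IST entry might have ended up as 0.

```python
# Sketch only: the helper names below are hypothetical, not kernel APIs.

MASK64 = (1 << 64) - 1  # addresses wrap modulo 2**64

# x86 page-fault error code bits (low five bits).
PF_PROT = 1 << 0   # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1  # 0: read access, 1: write access
PF_USER = 1 << 2   # 0: supervisor-mode access, 1: user-mode access
PF_RSVD = 1 << 3   # reserved bit set in a paging-structure entry
PF_INSTR = 1 << 4  # fault was an instruction fetch


def decode_pf_error_code(code: int) -> str:
    """Describe a page-fault error code such as the 0002 in the oops."""
    parts = [
        "protection violation" if code & PF_PROT else "non-present page",
        "write" if code & PF_WRITE else "read",
        "user" if code & PF_USER else "supervisor",
    ]
    if code & PF_RSVD:
        parts.append("reserved bit set")
    if code & PF_INSTR:
        parts.append("instruction fetch")
    return ", ".join(parts)


def first_push_address(rsp: int, size: int = 8) -> int:
    """Address written by the first push: RSP is decremented before the store."""
    return (rsp - size) & MASK64


print(decode_pf_error_code(0x0002))  # non-present page, write, supervisor
print(hex(first_push_address(0)))    # 0xfffffffffffffff8
```

The first line of output matches "system write of a non-existent
page", and the second matches the reported CR2, which is consistent
with (but does not prove) the RSP=0 theory.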