MIME-Version: 1.0
In-Reply-To: <alpine.DEB.2.20.1712141807400.4998@nanos>
References: <001a1145e8548cbd3d055f73374f@google.com> <alpine.DEB.2.20.1712141807400.4998@nanos>
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu, 14 Dec 2017 10:42:08 -0800
Message-ID: <CA+55aFzLw629nbk0GL=9=x3sjkxkKjiVW=mL6Pjm7i2vTLwyVw@mail.gmail.com>
Subject: Re: BUG: unable to handle kernel paging request in __switch_to
To: Thomas Gleixner <tglx@linutronix.de>
Cc: syzbot 
        <bot+1f445b1009b8eeededa30fe62ccf685f2ec9d155@syzkaller.appspotmail.com>,
        Borislav Petkov <bp@suse.de>, Dmitry Safonov <dsafonov@virtuozzo.com>,
        Peter Anvin <hpa@zytor.com>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        Andrew Lutomirski <luto@kernel.org>, Kyle Huey <me@kylehuey.com>,
        Ingo Molnar <mingo@redhat.com>, syzkaller-bugs@googlegroups.com,
        "the arch/x86 maintainers" <x86@kernel.org>
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2397
Lines: 63

On Thu, Dec 14, 2017 at 9:12 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Sun, 3 Dec 2017, syzbot wrote:
>> BUG: unable to handle kernel paging request at fffffffffffffff8
>> Oops: 0002 [#1] SMP KASAN

System write of a non-existent page.
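
For anyone who wants to decode these by hand, here is a minimal sketch of how the low three bits of the x86 page-fault error code map to that sentence. The bit meanings are architectural; describe_pf_error() and the PF_* names are hypothetical helpers for illustration, not the kernel's own definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Low bits of the x86 page-fault error code (names are illustrative). */
#define PF_PROT  (1u << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1)  /* 0: read access,      1: write access */
#define PF_USER  (1u << 2)  /* 0: kernel mode,      1: user mode */

/*
 * Hypothetical helper: format a one-line description of the error code
 * into buf. Returns what snprintf() returns.
 */
static int describe_pf_error(uint32_t code, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s %s of a %s page",
                    (code & PF_USER)  ? "User"      : "System",
                    (code & PF_WRITE) ? "write"     : "read",
                    (code & PF_PROT)  ? "protected" : "non-existent");
}
```

Feeding it 0x0002 (the "Oops: 0002" value, which is printed in hex) sets only the write bit, so it is a kernel-mode write to a page that is not present.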

>> RIP: 0010:switch_fpu_prepare arch/x86/include/asm/fpu/internal.h:535 [inline]
>> RIP: 0010:__switch_to+0x95b/0x1330 arch/x86/kernel/process_64.c:407

This says it's

     old_fpu->last_cpu = cpu;

and the code disassembly ends up looking something like this:

   0: 48 c1 ea 03          shr    $0x3,%rdx
   4: 0f b6 04 02          movzbl (%rdx,%rax,1),%eax
   8: 84 c0                test   %al,%al
   a: 74 08                je     0x14
   c: 3c 03                cmp    $0x3,%al
   e: 0f 8e d5 06 00 00    jle    0x6e9
  14: 8b 85 70 fe ff ff    mov    -0x190(%rbp),%eax
  1a: 41 89 84 24 c0 15 00 mov    %eax,0x15c0(%r12)
  21: 00
  22:* cc                    int3    <-- trapping instruction

where those two preceding "mov" instructions look like they might indeed be that

     old_fpu->last_cpu = cpu;

thing, and the register state doesn't look insane for this.

So I think the RIP->line encoding is slightly off, and that "int3" is
almost certainly due to the very next thing after the write:

                trace_x86_fpu_regs_deactivated(old_fpu);

and that actually makes sense if the test robot is doing some tracing,
particularly if it's just about to _start_ tracing, and it has
replaced the first byte of the instruction with 'int3' and is in the
process of doing the rewrite.
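
To make that window concrete, here is a rough userspace simulation of a breakpoint-first rewrite on a plain byte array. It is only a sketch, not the kernel's actual patching code: poke_step() and the stage names are made up for illustration, and the cross-CPU synchronization a real implementation needs between stages is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INT3_OPCODE 0xcc

/* Stages of the simulated rewrite, in the order they are applied. */
enum poke_stage { POKE_INT3, POKE_TAIL, POKE_FIRST_BYTE, POKE_DONE };

/*
 * Hypothetical helper: apply one stage of rewriting text[0..len) into
 * new_insn[0..len) and return the next stage.
 */
static enum poke_stage poke_step(uint8_t *text, const uint8_t *new_insn,
                                 size_t len, enum poke_stage stage)
{
    assert(len >= 1);

    switch (stage) {
    case POKE_INT3:
        /* Make the first byte trap so nobody runs a half-written insn. */
        text[0] = INT3_OPCODE;
        return POKE_TAIL;
    case POKE_TAIL:
        /* Rewrite everything except the first byte. */
        memcpy(text + 1, new_insn + 1, len - 1);
        return POKE_FIRST_BYTE;
    case POKE_FIRST_BYTE:
        /* Finally replace the int3 with the real first byte. */
        text[0] = new_insn[0];
        return POKE_DONE;
    default:
        return POKE_DONE;
    }
}
```

A CPU that fetches text[0] after the first stage and before the last one sees 0xcc and takes the breakpoint, which is the state the disassembly above shows.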

The fact that it then takes a system write fault is because some GDT
or IDT setup is screwed up. Or possibly the stack is screwed up and
started out as 0, and then the push to the stack would decrement the
stack pointer and try to push the error state or something.
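
As a quick sanity check of the zero-stack theory, the arithmetic can be sketched like this, assuming 8-byte pushes and a stack pointer that wraps modulo 2^64 (push_address() is a hypothetical helper, not a kernel function):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: address written by the nth 8-byte push (nth counts
 * from 1) when the stack pointer starts out as rsp.
 */
static uint64_t push_address(uint64_t rsp, unsigned int nth)
{
    assert(nth >= 1);
    /* Unsigned arithmetic wraps, just like the stack pointer does. */
    return rsp - 8 * (uint64_t)nth;
}
```

With rsp == 0, the first push lands at push_address(0, 1) == 0xfffffffffffffff8, which is the faulting address in the report.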

> That's the second report I'm staring at today which has CR2
> fffffffffffffffx and points to a faulting instruction which does not make
> any sense at all.

That actually does make sense - see above.  It just requires that race
with the instruction rewriting.

*Normally* we never actually take the "int3" exception, because
normally we'll have completed the rewrite before another CPU actually
executes the instruction that is being rewritten.

So I'm assuming this is with the page table isolation, and some
unusual case in exception handling got screwed up.

                 Linus