2015-11-11 12:32:04

by Borislav Petkov

Subject: Re: [RFC PATCH] x86/cpu: Fix MSR value truncation issue

On Fri, Oct 30, 2015 at 06:28:25PM +0100, Borislav Petkov wrote:
> More specifically, MSR_STAR[31:0] is being set to 0. That field is
> reserved on Intel and on AMD it is 32-bit SYSCALL Target EIP.
>
> I'd strongly guess because Intel doesn't have SYSCALL in compat/legacy
> mode and we're using SYSENTER and INT80 there. And for compat syscalls
> in long mode we use CSTAR.

So I was wondering what would happen if I used SYSCALL on 32-bit AMD.

This is what happens on a normal system:

$ strace -f ./syscall
execve("./syscall", ["./syscall"], [/* 24 vars */]) = 0
--- SIGILL {si_signo=SIGILL, si_code=ILL_ILLOPN, si_addr=0x80480e8} ---
+++ killed by SIGILL +++
Illegal instruction

I wondered what causes the SIGILL and, after some code staring, it
turns out to be EFER.SCE, which we don't enable on 32-bit.

And, because I like to cause fire (woahahahah... /me rubs hands and
laughs ominously), I went and toggled that bit.

Oh well, we bomb out, as expected:

BUG: sleeping function called from invalid context at /mnt/kernel/kernel/linux-2.6/arch/x86/mm/fault.c:1191
in_atomic(): 0, irqs_disabled(): 1, pid: 2567, name: syscall
1 lock held by syscall/2567:
#0: (&mm->mmap_sem){++++++}, at: [<c10447f7>] __do_page_fault+0xf7/0x3f0
irq event stamp: 1812
hardirqs last enabled at (1811): [<c165f29a>] restore_all_notrace+0x0/0xe
hardirqs last disabled at (1812): [<c1660145>] error_code+0x31/0x3c
softirqs last enabled at (988): [<c1059e5b>] __do_softirq+0x37b/0x440
softirqs last disabled at (965): [<c1005749>] do_softirq_own_stack+0x39/0x50
CPU: 1 PID: 2567 Comm: syscall Not tainted 4.3.0+ #1
Hardware name: LENOVO 30515QG/30515QG, BIOS 8RET30WW (1.12 ) 09/15/2011
00000000 00000000 bff53b20 c12fdfa2 00000000 bff53b48 c107a9bc c181aca4
00000000 00000001 00000a07 f2cb3830 f2cb3500 00000000 00000000 bff53b7c
c107aae6 f453f70c 00000001 bff53bd0 00000000 bff53b7c c109ee4d 00000001
Call Trace:
kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [< (null)>] (null)
*pdpt = 0000000032e0b001 *pde = 0000000000000000
Oops: 0010 [#1] PREEMPT SMP
Modules linked in: ipv6 usbhid kvm_amd rtsx_pci_sdmmc kvm mmc_core snd_hda_codec_conexant snd_hda_codec_generic snd_hda_codec_hdmi pcspkr snd_hda_intel k10temp ohci_pci snd_hda_codec snd_hwdep snd_hda_core snd_pcm rtsx_pci mfd_core ohci_hcd battery snd_timer radeon thinkpad_acpi nvram ehci_pci ehci_hcd snd soundcore video ac button thermal
CPU: 1 PID: 2567 Comm: syscall Not tainted 4.3.0+ #1
Hardware name: LENOVO 30515QG/30515QG, BIOS 8RET30WW (1.12 ) 09/15/2011
task: f2cb3500 ti: f2d74000 task.ti: f2d74000
EIP: 0000:[<00000000>] EFLAGS: 00010086 CPU: 1
EIP is at 0x0
EAX: 00000000 EBX: 00000000 ECX: 080480ea EDX: 00000000
ESI: 00000000 EDI: 00000000 EBP: bff53c1c ESP: bff53c0c
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0008
CR0: 8005003b CR2: 00000000 CR3: 33af5900 CR4: 000006f0
Stack:
00000000 00000000 00000000 00000000 00000000 00000001 bff54df4 00000000
bff54dfe bff54e0c bff54e18 bff54e31 bff54e3c bff54e4c bff54e6e bff54e81
bff54e94 bff54e9e bff54eb2 bff54efe bff54f07 bff54f18 bff54f20 bff54f2b
Call Trace:
Code: Bad EIP value.
EIP: [<00000000>] 0x0 SS:ESP 0008:bff53c0c
CR2: 0000000000000000
---[ end trace fa036c454007a131 ]---
PANIC: double fault, gdt at f7bb7000 [255 bytes]
double fault, tss at f7bbe9c0
eip = c104afc3, esp = bff539dc
eax = 00000000, ebx = f453f680, ecx = ffffffff, edx = f453f680
esi = ffffffff, edi = f453f680

Nice.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


2015-11-11 15:50:30

by Andy Lutomirski

Subject: Re: [RFC PATCH] x86/cpu: Fix MSR value truncation issue

On Wed, Nov 11, 2015 at 4:31 AM, Borislav Petkov wrote:
> On Fri, Oct 30, 2015 at 06:28:25PM +0100, Borislav Petkov wrote:
>> More specifically, MSR_STAR[31:0] is being set to 0. That field is
>> reserved on Intel and on AMD it is 32-bit SYSCALL Target EIP.
>>
>> I'd strongly guess because Intel doesn't have SYSCALL in compat/legacy
>> mode and we're using SYSENTER and INT80 there. And for compat syscalls
>> in long mode we use CSTAR.
>
> So I was wondering what would happen if I used SYSCALL on 32-bit AMD.
>
> This is what happens on a normal system:
>
> $ strace -f ./syscall
> execve("./syscall", ["./syscall"], [/* 24 vars */]) = 0
> --- SIGILL {si_signo=SIGILL, si_code=ILL_ILLOPN, si_addr=0x80480e8} ---
> +++ killed by SIGILL +++
> Illegal instruction
>
> I wondered what causes the SIGILL and, after some code staring, it
> turns out to be EFER.SCE, which we don't enable on 32-bit.
>
> And, because I like to cause fire (woahahahah... /me rubs hands and
> laughs ominously), I went and toggled that bit.
>
> Oh well, we bomb out, as expected:
>

Not terribly surprising :) Someone (I forget who) told me that 32-bit
SYSCALL (native 32-bit, not compat) was so full of errata that it was
unusable. Even without errata, I don't really see how it would work
well -- there's no MSR_SYSCALL_MASK, so we can't mask off TF when
SYSCALL happens, and I don't see how we're expected to handle SYSCALL
with TF set on a 32-bit kernel unless we route #DB through a task
gate, which I'm reasonably confident no one wants to do.

--Andy

2015-11-11 16:05:20

by Borislav Petkov

Subject: Re: [RFC PATCH] x86/cpu: Fix MSR value truncation issue

On Wed, Nov 11, 2015 at 07:50:04AM -0800, Andy Lutomirski wrote:

> Not terribly surprising :) Someone (I forget who) told me that 32-bit
> SYSCALL (native 32-bit, not compat) was so full of errata that it was
> unusable. Even without errata, I don't really see how it would work
> well

No, the showstopper appears much earlier: it is only supported on AMD.
Which would mean yet another vendor-specific special case. And I don't
think it's worth it.

Yeah, yeah, it might still be faster than SYSENTER, but 32-bit?! Srsly?!
I'm surprised that thing still builds even. :-)

> -- there's no MSR_SYSCALL_MASK,

Of course there is:

MSRC000_0084 SYSCALL Flag Mask (SYSCALL_FLAG_MASK):

31:0 - Mask: SYSCALL flag mask. Read-write. Reset: 0000_0000h. This register holds the EFLAGS
mask used by the SYSCALL instruction. 1=Clear the corresponding EFLAGS bit when executing the
SYSCALL instruction.

Intel has that too, except again, no SYSCALL in legacy mode on Intel.

--
Regards/Gruss,
Boris.


2015-11-11 18:07:36

by Brian Gerst

Subject: Re: [RFC PATCH] x86/cpu: Fix MSR value truncation issue

On Wed, Nov 11, 2015 at 11:05 AM, Borislav Petkov wrote:
> On Wed, Nov 11, 2015 at 07:50:04AM -0800, Andy Lutomirski wrote:
>
>> Not terribly surprising :) Someone (I forget who) told me that 32-bit
>> SYSCALL (native 32-bit, not compat) was so full of errata that it was
>> unusable. Even without errata, I don't really see how it would work
>> well

I tried to implement it when the K6 came out, but the major problem
was that the implementation set an internal flag that forced the
return to userspace to go through SYSRET. IRET would fault, which
made task switching a big problem.

Specifically, the SYSCALL description for the K6 has this text:
"The CS and SS registers should not be modified by the operating
system between the execution of the SYSCALL instruction and its
corresponding SYSRET instruction."

It's likely that this behavior has been fixed on modern 64-bit AMD
CPUs running in legacy mode, but I haven't tested it. It's not really
worth pursuing.

> No, the showstopper appears much earlier: it is only supported on AMD.
> Which would mean yet another vendor-specific special case. And I don't
> think it's worth it.
>
> Yeah, yeah, it might still be faster than SYSENTER, but 32-bit?! Srsly?!
> I'm surprised that thing still builds even. :-)
>
>> -- there's no MSR_SYSCALL_MASK,
>
> Of course there is:
>
> MSRC000_0084 SYSCALL Flag Mask (SYSCALL_FLAG_MASK):
>
> 31:0 - Mask: SYSCALL flag mask. Read-write. Reset: 0000_0000h. This register holds the EFLAGS
> mask used by the SYSCALL instruction. 1=Clear the corresponding EFLAGS bit when executing the
> SYSCALL instruction.
>
> Intel has that too, except again, no SYSCALL in legacy mode on Intel.

SYSCALL_FLAG_MASK was added with the 64-bit processors. It's not used
in legacy mode according to the AMD docs.

--
Brian Gerst