2017-12-13 03:10:30

by Wanpeng Li

Subject: [PATCH v2] KVM: X86: Fix host dr6 miss restore

From: Wanpeng Li <[email protected]>

Reported by syzkaller:

WARNING: CPU: 0 PID: 12927 at arch/x86/kernel/traps.c:780 do_debug+0x222/0x250
CPU: 0 PID: 12927 Comm: syz-executor Tainted: G OE 4.15.0-rc2+ #16
RIP: 0010:do_debug+0x222/0x250
Call Trace:
<#DB>
debug+0x3e/0x70
RIP: 0010:copy_user_enhanced_fast_string+0x10/0x20
</#DB>
_copy_from_user+0x5b/0x90
SyS_timer_create+0x33/0x80
entry_SYSCALL_64_fastpath+0x23/0x9a

The syzkaller testcase mmap()s a buffer and passes it as the struct sigevent
parameter of timer_create(). It also calls perf_event_open() to set a breakpoint
(BP) on that buffer, so when the kernel implementation of timer_create() fetches
the struct sigevent with copy_from_user(), rep movsb triggers the BP. The
testcase also sets the debug registers for the guest, but KVM restores the host
debug registers only when the host has active breakpoints. By adding prints I
can sporadically observe the dr6 single-step bit set while
!hw_breakpoint_active() when running the testcase with heavy multithreading.
The do_debug() triggered by rep movsb then warns when
(dr6 & DR_STEP && !user_mode(regs)).

This patch fixes it by restoring the host dr6 on the sched_out path when no
breakpoint is active.

Reported-by: Dmitry Vyukov <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
v1 -> v2:
* move to sched_out path

arch/x86/kvm/x86.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1c5c7a3..76886c4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2964,6 +2964,8 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 	pagefault_enable();
 	kvm_x86_ops->vcpu_put(vcpu);
 	vcpu->arch.last_host_tsc = rdtsc();
+	if (!hw_breakpoint_active())
+		set_debugreg(current->thread.debugreg6, 6);
 }

static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
--
2.7.4
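
The failure sequence described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `hw_dr6`, `guest_exit_model()`, `vcpu_put_model()`, `do_debug_would_warn()` and `check_model()` are hypothetical names, and only the `DR_STEP`/`DR_TRAP0` bit values and the warning condition are taken from the kernel and the message above.

```c
#include <assert.h>
#include <stdbool.h>

/* DR6 bit 14 (BS, single step); arch/x86 names it DR_STEP. */
#define DR_STEP  0x4000UL
/* DR6 bit 0 (B0, breakpoint 0 hit); arch/x86 names it DR_TRAP0. */
#define DR_TRAP0 0x1UL

/* Hypothetical stand-in for one CPU's physical DR6 register. */
static unsigned long hw_dr6;

/* Model a VM exit that leaves the guest's DR6 value in the register. */
static void guest_exit_model(unsigned long guest_dr6)
{
	hw_dr6 = guest_dr6;
}

/* Model the v2 hunk: restore the host value only if no host BP is active. */
static void vcpu_put_model(bool host_bp_active, unsigned long host_dr6)
{
	if (!host_bp_active)
		hw_dr6 = host_dr6;
}

/*
 * Model the check quoted in the commit message. A new breakpoint hit sets
 * its own bit but does not clear a stale single-step bit, so the handler
 * sees both. Returns true where the kernel would emit the warning.
 */
static bool do_debug_would_warn(unsigned long hit_bits, bool user_mode)
{
	unsigned long dr6 = hw_dr6 | hit_bits;

	hw_dr6 = 0;	/* the handler clears DR6 once it has read it */
	return (dr6 & DR_STEP) && !user_mode;
}

/* Walk through the failing sequence and the fixed sequence. */
static void check_model(void)
{
	/* Without the fix: stale DR_STEP survives until the perf BP fires. */
	guest_exit_model(DR_STEP);
	assert(do_debug_would_warn(DR_TRAP0, false));

	/* With the fix: DR6 is restored before the BP fires in kernel mode. */
	guest_exit_model(DR_STEP);
	vcpu_put_model(false, 0);
	assert(!do_debug_would_warn(DR_TRAP0, false));
}
```

Calling `check_model()` exercises both sequences. The model passes `false` for `user_mode` because the breakpoint in the trace fires inside `copy_user_enhanced_fast_string`, that is, in kernel mode.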


2017-12-13 09:18:25

by David Hildenbrand

Subject: Re: [PATCH v2] KVM: X86: Fix host dr6 miss restore

On 13.12.2017 04:10, Wanpeng Li wrote:
> @@ -2964,6 +2964,8 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> pagefault_enable();
> kvm_x86_ops->vcpu_put(vcpu);
> vcpu->arch.last_host_tsc = rdtsc();

Can you add a comment like

/*
 * With active breakpoints we already restored all debugregs in
 * vcpu_enter_guest(); without active breakpoints we have to
 * restore debugreg 6 before being scheduled out.
 */

> + if (!hw_breakpoint_active())
> + set_debugreg(current->thread.debugreg6, 6);
> }
>
> static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
>

Having the restore at the old place was easier for me to understand,
but this moves it out of the hot loop.

--

Thanks,

David / dhildenb

2017-12-13 09:43:00

by Paolo Bonzini

Subject: Re: [PATCH v2] KVM: X86: Fix host dr6 miss restore

On 13/12/2017 10:18, David Hildenbrand wrote:
> On 13.12.2017 04:10, Wanpeng Li wrote:
>> @@ -2964,6 +2964,8 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>> pagefault_enable();
>> kvm_x86_ops->vcpu_put(vcpu);
>> vcpu->arch.last_host_tsc = rdtsc();
>
> Can you add a comment like
>
> /* With active breakpoints we already restored all debugregs in
> vcpu_enter_guest(), however without active breakpoints we have to
> restore debugreg 6 before scheduled out.
> */

Actually, we should make it unconditionally zero, not reset it to
current->thread.debugreg6. That's because the invariant at exit from
do_debug is DR6 = 0.

	/*
	 * do_debug expects dr6 to be cleared after it runs, but here
	 * we might have a stale dr6 from the guest.
	 */
	set_debugreg(0, 6);

I'll push the patch to kvm/queue.

Thanks,

Paolo
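
The difference between the two variants discussed in this thread can be sketched with a small user-space model. It is an illustration under assumptions, not kernel code: `struct cpu_model`, `put_v2()`, `put_zero()`, `invariant_holds()` and `compare_variants()` are hypothetical names, and the case where the saved per-task value still carries `DR_STEP` is assumed purely to show where the variants diverge. The model does not cover the separate restore done when host breakpoints are active.

```c
#include <assert.h>
#include <stdbool.h>

/* DR6 bit 14 (BS, single step); arch/x86 names it DR_STEP. */
#define DR_STEP 0x4000UL

/* Hypothetical model of one CPU's DR6 plus the per-task saved copy. */
struct cpu_model {
	unsigned long dr6;		/* stands in for the hardware register */
	unsigned long thread_dr6;	/* stands in for current->thread.debugreg6 */
	bool host_bp_active;		/* stands in for hw_breakpoint_active() */
};

/* v2 as posted: conditional restore from the saved per-task value. */
static void put_v2(struct cpu_model *c)
{
	if (!c->host_bp_active)
		c->dr6 = c->thread_dr6;
}

/* Variant proposed in the reply: always clear the register. */
static void put_zero(struct cpu_model *c)
{
	c->dr6 = 0;
}

/* The invariant stated in the reply: DR6 is 0 outside the #DB handler. */
static bool invariant_holds(const struct cpu_model *c)
{
	return c->dr6 == 0;
}

/* Compare both variants on the same assumed starting state. */
static void compare_variants(void)
{
	struct cpu_model a = { DR_STEP, DR_STEP, false };
	struct cpu_model b = a;

	put_v2(&a);	/* copies a nonzero saved value back */
	put_zero(&b);	/* ignores the saved value */

	assert(!invariant_holds(&a));
	assert(invariant_holds(&b));
}
```

In this model the unconditional variant needs neither the saved value nor the `hw_breakpoint_active()` check to re-establish DR6 = 0.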

2017-12-13 09:53:11

by Wanpeng Li

Subject: Re: [PATCH v2] KVM: X86: Fix host dr6 miss restore

2017-12-13 17:42 GMT+08:00 Paolo Bonzini <[email protected]>:
> On 13/12/2017 10:18, David Hildenbrand wrote:
>> On 13.12.2017 04:10, Wanpeng Li wrote:
>>> @@ -2964,6 +2964,8 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>> pagefault_enable();
>>> kvm_x86_ops->vcpu_put(vcpu);
>>> vcpu->arch.last_host_tsc = rdtsc();
>>
>> Can you add a comment like
>>
>> /* With active breakpoints we already restored all debugregs in
>> vcpu_enter_guest(), however without active breakpoints we have to
>> restore debugreg 6 before scheduled out.
>> */
>
> Actually, we should make it unconditionally zero, not reset it to
> current->thread.debugreg6. That's because the invariant at exit from
> do_debug is DR6 = 0.
>
> /*
> * do_debug expects dr6 to be cleared after it runs, but here
> * we might have a stale dr6 from the guest.
> */
> set_debugreg(0, 6);
>
> I'll push the patch to kvm/queue.

Do you need me to send a new version?

Regards,
Wanpeng Li