2018-02-07 06:26:40

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH] KVM: X86: Fix SMRAM accessing even if VM is shutdown

From: Wanpeng Li <[email protected]>

Reported by syzkaller:

WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
Call Trace:
vmx_handle_exit+0xbd/0xe20 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
do_vfs_ioctl+0xa4/0x6a0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x25/0x9c

The syzkaller creates a former thread to issue KVM_SMI ioctl, and then creates
a latter thread to mmap and operate on the same vCPU, rsm emulation will not be
executed since there is no something like seabios which implements smi handler
when running syzkaller directly. This triggers a race condition when running
the testcase with multiple threads. Sometimes one thread exit w/ SHUTDOWN
reason, another thread mmaps and operates on the same vCPU, it continues to
use CS=0x30000, IP=0x8000 to access the address of SMI handler which results
in the above ept misconfig. This patch fixes it by bailing out immediately if
the vCPU is marked EXIT_SHUTDOWN reason.

Reported-by: Dmitry Vyukov <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kvm/x86.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 786cd00..445e702 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7458,6 +7458,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
goto out;
}

+ if (unlikely(vcpu->run->exit_reason == KVM_EXIT_SHUTDOWN)) {
+ r = -EINVAL;
+ goto out;
+ }
+
if (vcpu->run->kvm_dirty_regs) {
r = sync_regs(vcpu);
if (r != 0)
--
2.7.4



2018-02-07 06:43:17

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix SMRAM accessing even if VM is shutdown

On Wed, Feb 7, 2018 at 7:25 AM, Wanpeng Li <[email protected]> wrote:
> From: Wanpeng Li <[email protected]>
>
> Reported by syzkaller:
>
> WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
> CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
> RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
> Call Trace:
> vmx_handle_exit+0xbd/0xe20 [kvm_intel]
> kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
> kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
> do_vfs_ioctl+0xa4/0x6a0
> SyS_ioctl+0x79/0x90
> entry_SYSCALL_64_fastpath+0x25/0x9c
>
> The syzkaller creates a former thread to issue KVM_SMI ioctl, and then creates
> a latter thread to mmap and operate on the same vCPU, rsm emulation will not be
> executed since there is no something like seabios which implements smi handler
> when running syzkaller directly. This triggers a race condition when running
> the testcase with multiple threads. Sometimes one thread exit w/ SHUTDOWN
> reason, another thread mmaps and operates on the same vCPU, it continues to
> use CS=0x30000, IP=0x8000 to access the address of SMI handler which results
> in the above ept misconfig. This patch fixes it by bailing out immediately if
> the vCPU is marked EXIT_SHUTDOWN reason.
>
> Reported-by: Dmitry Vyukov <[email protected]>

This was reported by syzbot:
https://groups.google.com/d/msg/syzkaller-bugs/6GrlY0UcDEk/aMShRKq3AwAJ

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed.


> Cc: Dmitry Vyukov <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Radim Krčmář <[email protected]>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---
> arch/x86/kvm/x86.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 786cd00..445e702 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7458,6 +7458,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
> goto out;
> }
>
> + if (unlikely(vcpu->run->exit_reason == KVM_EXIT_SHUTDOWN)) {
> + r = -EINVAL;
> + goto out;
> + }
> +
> if (vcpu->run->kvm_dirty_regs) {
> r = sync_regs(vcpu);
> if (r != 0)
> --
> 2.7.4
>

2018-02-07 14:17:48

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix SMRAM accessing even if VM is shutdown

On 07/02/2018 07:25, Wanpeng Li wrote:
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 786cd00..445e702 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7458,6 +7458,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
> goto out;
> }
>
> + if (unlikely(vcpu->run->exit_reason == KVM_EXIT_SHUTDOWN)) {
> + r = -EINVAL;
> + goto out;
> + }
> +
> if (vcpu->run->kvm_dirty_regs) {
> r = sync_regs(vcpu);
> if (r != 0)
>

This most likely breaks triple faults in the usual case where they
should result in resetting the system; the KVM API doesn't say that you
should clear vcpu->run->exit_reason before entering.

What exactly causes the EPT misconfig to reach the WARN? That is, how
does kvm_mmu_page_fault end up returning a negative errno value? If I
read the code correctly only tdp_page_fault can do so, so my guess would
be kvm_handle_bad_page:

if (pfn == KVM_PFN_ERR_RO_FAULT)
return RET_PF_EMULATE;

if (pfn == KVM_PFN_ERR_HWPOISON) {
kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
current);
return RET_PF_RETRY;
}

/* KVM_PFN_ERR_FAULT */
return -EFAULT;

Maybe it should return RET_PF_EMULATE, which would cause an emulation
failure and then an exit with KVM_EXIT_INTERNAL_ERROR.

Thanks,

Paolo

2018-02-08 07:36:44

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix SMRAM accessing even if VM is shutdown

2018-02-07 22:16 GMT+08:00 Paolo Bonzini <[email protected]>:
> On 07/02/2018 07:25, Wanpeng Li wrote:
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 786cd00..445e702 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -7458,6 +7458,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
>> goto out;
>> }
>>
>> + if (unlikely(vcpu->run->exit_reason == KVM_EXIT_SHUTDOWN)) {
>> + r = -EINVAL;
>> + goto out;
>> + }
>> +
>> if (vcpu->run->kvm_dirty_regs) {
>> r = sync_regs(vcpu);
>> if (r != 0)
>>
>
> This most likely breaks triple faults in the usual case where they
> should result in resetting the system; the KVM API doesn't say that you
> should clear vcpu->run->exit_reason before entering.
>
> What exactly causes the EPT misconfig to reach the WARN? That is, how
> does kvm_mmu_page_fault end up returning a negative errno value? If I
> read the code correctly only tdp_page_fault can do so, so my guess would
> be kvm_handle_bad_page:
>
> if (pfn == KVM_PFN_ERR_RO_FAULT)
> return RET_PF_EMULATE;
>
> if (pfn == KVM_PFN_ERR_HWPOISON) {
> kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
> current);
> return RET_PF_RETRY;
> }
>
> /* KVM_PFN_ERR_FAULT */
> return -EFAULT;
>
> Maybe it should return RET_PF_EMULATE, which would cause an emulation
> failure and then an exit with KVM_EXIT_INTERNAL_ERROR.

Agreed, just do it in v2. In addition, I didn't remove the
RET_PFN_ERR_PO_FAULT check since I think otherwise we will miss the
comments above it.

Regards,
Wanpeng Li

2018-02-08 08:57:44

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix SMRAM accessing even if VM is shutdown



On 02/07/2018 10:16 PM, Paolo Bonzini wrote:
> On 07/02/2018 07:25, Wanpeng Li wrote:
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 786cd00..445e702 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -7458,6 +7458,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
>> goto out;
>> }
>>
>> + if (unlikely(vcpu->run->exit_reason == KVM_EXIT_SHUTDOWN)) {
>> + r = -EINVAL;
>> + goto out;
>> + }
>> +
>> if (vcpu->run->kvm_dirty_regs) {
>> r = sync_regs(vcpu);
>> if (r != 0)
>>
>
> This most likely breaks triple faults in the usual case where they
> should result in resetting the system; the KVM API doesn't say that you
> should clear vcpu->run->exit_reason before entering.
>
> What exactly causes the EPT misconfig to reach the WARN? That is, how
> does kvm_mmu_page_fault end up returning a negative errno value? If I
> read the code correctly only tdp_page_fault can do so, so my guess would
> be kvm_handle_bad_page:
>
> if (pfn == KVM_PFN_ERR_RO_FAULT)
> return RET_PF_EMULATE;
>
> if (pfn == KVM_PFN_ERR_HWPOISON) {
> kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
> current);
> return RET_PF_RETRY;
> }
>
> /* KVM_PFN_ERR_FAULT */
> return -EFAULT;
>
> Maybe it should return RET_PF_EMULATE, which would cause an emulation
> failure and then an exit with KVM_EXIT_INTERNAL_ERROR.

So the root cause is that a running vCPU accessing the memory whose memslot
is being updated (met the condition KVM_MEMSLOT_INVALID is set on the its
memslot).

The normal #PF handler breaks KVM_RUN and returns -EFAULT to userspace, we'd
better to make ept-misconfig's handler follow this style as well.

Actually, the WARN_ON in ept-misconfig's handler is unnecessary as
kvm_mmu_page_fault() will warn us if it is the real ept misconfig, so we can
simply return kvm_mmu_page_fault().


2018-02-08 10:32:45

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix SMRAM accessing even if VM is shutdown

On 08/02/2018 09:57, Xiao Guangrong wrote:
>> Maybe it should return RET_PF_EMULATE, which would cause an emulation
>> failure and then an exit with KVM_EXIT_INTERNAL_ERROR.
>
> So the root cause is that a running vCPU accessing the memory whose memslot
> is being updated (met the condition KVM_MEMSLOT_INVALID is set on the its
> memslot).
>
> The normal #PF handler breaks KVM_RUN and returns -EFAULT to userspace,
> we'd better to make ept-misconfig's handler follow this style as well.

Why return -EFAULT and not attempt emulation (which will fail)?

Paolo

2018-02-09 03:23:55

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix SMRAM accessing even if VM is shutdown



On 02/08/2018 06:31 PM, Paolo Bonzini wrote:
> On 08/02/2018 09:57, Xiao Guangrong wrote:
>>> Maybe it should return RET_PF_EMULATE, which would cause an emulation
>>> failure and then an exit with KVM_EXIT_INTERNAL_ERROR.
>>
>> So the root cause is that a running vCPU accessing the memory whose memslot
>> is being updated (met the condition KVM_MEMSLOT_INVALID is set on the its
>> memslot).
>>
>> The normal #PF handler breaks KVM_RUN and returns -EFAULT to userspace,
>> we'd better to make ept-misconfig's handler follow this style as well.
>
> Why return -EFAULT and not attempt emulation (which will fail)?
>

That is a good question... :)

This case (with KVM_MEMSLOT_INVALID is set) can be easily constructed,
userspace should avoid this case by itself (avoiding vCPU accessing the
memslot which is being updated). If it happens, it's a operation issue
rather than INTERNAL ERROR.

Maybe treat it as MMIO accessing and return to userspace with MMIO_EXIT
is a better solution...

2018-02-09 12:44:04

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix SMRAM accessing even if VM is shutdown

On 09/02/2018 04:22, Xiao Guangrong wrote:
>>
>
> That is a good question... :)
>
> This case (with KVM_MEMSLOT_INVALID is set) can be easily constructed,
> userspace should avoid this case by itself (avoiding vCPU accessing the
> memslot which is being updated). If it happens, it's a operation issue
> rather than INTERNAL ERROR.
>
> Maybe treat it as MMIO accessing and return to userspace with MMIO_EXIT
> is a better solution...

Yeah, that's what emulation would do (except if it's an instruction
fetch, which will cause emulation to fail). I think it's a bug in the
non-EPT #PF case that we return with -EFAULT.

Paolo

2018-02-11 03:20:47

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix SMRAM accessing even if VM is shutdown



On 02/09/2018 08:42 PM, Paolo Bonzini wrote:
> On 09/02/2018 04:22, Xiao Guangrong wrote:
>>>
>>
>> That is a good question... :)
>>
>> This case (with KVM_MEMSLOT_INVALID is set) can be easily constructed,
>> userspace should avoid this case by itself (avoiding vCPU accessing the
>> memslot which is being updated). If it happens, it's a operation issue
>> rather than INTERNAL ERROR.
>>
>> Maybe treat it as MMIO accessing and return to userspace with MMIO_EXIT
>> is a better solution...
>
> Yeah, that's what emulation would do (except if it's an instruction
> fetch, which will cause emulation to fail). I think it's a bug in the
> non-EPT #PF case that we return with -EFAULT.

Wanpeng, could you please do it? :)

2018-02-11 08:57:48

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix SMRAM accessing even if VM is shutdown

2018-02-11 11:20 GMT+08:00 Xiao Guangrong <[email protected]>:
>
>
> On 02/09/2018 08:42 PM, Paolo Bonzini wrote:
>>
>> On 09/02/2018 04:22, Xiao Guangrong wrote:
>>>>
>>>>
>>>
>>> That is a good question... :)
>>>
>>> This case (with KVM_MEMSLOT_INVALID is set) can be easily constructed,
>>> userspace should avoid this case by itself (avoiding vCPU accessing the
>>> memslot which is being updated). If it happens, it's a operation issue
>>> rather than INTERNAL ERROR.
>>>
>>> Maybe treat it as MMIO accessing and return to userspace with MMIO_EXIT
>>> is a better solution...
>>
>>
>> Yeah, that's what emulation would do (except if it's an instruction
>> fetch, which will cause emulation to fail). I think it's a bug in the
>> non-EPT #PF case that we return with -EFAULT.
>
>
> Wanpeng, could you please do it? :)

Thanks for the discussion, I will have a try. :)

Regards,
Wanpeng Li

2018-02-11 10:58:35

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix SMRAM accessing even if VM is shutdown

2018-02-09 20:42 GMT+08:00 Paolo Bonzini <[email protected]>:
> On 09/02/2018 04:22, Xiao Guangrong wrote:
>>>
>>
>> That is a good question... :)
>>
>> This case (with KVM_MEMSLOT_INVALID is set) can be easily constructed,
>> userspace should avoid this case by itself (avoiding vCPU accessing the
>> memslot which is being updated). If it happens, it's a operation issue
>> rather than INTERNAL ERROR.
>>
>> Maybe treat it as MMIO accessing and return to userspace with MMIO_EXIT
>> is a better solution...
>
> Yeah, that's what emulation would do (except if it's an instruction
> fetch, which will cause emulation to fail). I think it's a bug in the

After patch v2, I found that instruction fetch causes emulation to
fail since KVM_MEMSLOT_INVALID.

Regards,
Wanpeng Li