2022-03-18 11:30:26

by Chenyi Qiang

[permalink] [raw]
Subject: [PATCH v5 1/3] KVM: X86: Save&restore the triple fault request

For the triple fault sythesized by KVM, e.g. the RSM path or
nested_vmx_abort(), if KVM exits to userspace before the request is
serviced, userspace could migrate the VM and lose the triple fault.
Fix this issue by adding a new event KVM_VCPUEVENT_TRIPLE_FAULT in
get/set_vcpu_events() to track the triple fault request.

Signed-off-by: Chenyi Qiang <[email protected]>
---
Documentation/virt/kvm/api.rst | 6 ++++++
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/kvm/x86.c | 9 ++++++++-
3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 691ff84444bd..9682b0a438bd 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1146,6 +1146,9 @@ The following bits are defined in the flags field:
fields contain a valid state. This bit will be set whenever
KVM_CAP_EXCEPTION_PAYLOAD is enabled.

+- KVM_VCPUEVENT_TRIPLE_FAULT may be set to signal that there's a
+ triple fault request waiting to be serviced.
+
ARM/ARM64:
^^^^^^^^^^

@@ -1241,6 +1244,9 @@ can be set in the flags field to signal that the
exception_has_payload, exception_payload, and exception.pending fields
contain a valid state and shall be written into the VCPU.

+KVM_VCPUEVENT_TRIPLE_FAULT can be set in flags field to signal that a
+triple fault request should be made.
+
ARM/ARM64:
^^^^^^^^^^

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index bf6e96011dfe..d8ef0d993e86 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -325,6 +325,7 @@ struct kvm_reinject_control {
#define KVM_VCPUEVENT_VALID_SHADOW 0x00000004
#define KVM_VCPUEVENT_VALID_SMM 0x00000008
#define KVM_VCPUEVENT_VALID_PAYLOAD 0x00000010
+#define KVM_VCPUEVENT_TRIPLE_FAULT 0x00000020

/* Interrupt shadow states */
#define KVM_X86_SHADOW_INT_MOV_SS 0x01
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4fa4d8269e5b..fee402a700df 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4891,6 +4891,9 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
if (vcpu->kvm->arch.exception_payload_enabled)
events->flags |= KVM_VCPUEVENT_VALID_PAYLOAD;

+ if (kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu))
+ events->flags |= KVM_VCPUEVENT_TRIPLE_FAULT;
+
memset(&events->reserved, 0, sizeof(events->reserved));
}

@@ -4903,7 +4906,8 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
| KVM_VCPUEVENT_VALID_SIPI_VECTOR
| KVM_VCPUEVENT_VALID_SHADOW
| KVM_VCPUEVENT_VALID_SMM
- | KVM_VCPUEVENT_VALID_PAYLOAD))
+ | KVM_VCPUEVENT_VALID_PAYLOAD
+ | KVM_VCPUEVENT_TRIPLE_FAULT))
return -EINVAL;

if (events->flags & KVM_VCPUEVENT_VALID_PAYLOAD) {
@@ -4976,6 +4980,9 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
}
}

+ if (events->flags & KVM_VCPUEVENT_TRIPLE_FAULT)
+ kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
+
kvm_make_request(KVM_REQ_EVENT, vcpu);

return 0;
--
2.17.1


2022-04-06 12:12:12

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 1/3] KVM: X86: Save&restore the triple fault request

On Tue, Apr 05, 2022, Sean Christopherson wrote:
> On Fri, Mar 18, 2022, Chenyi Qiang wrote:
> > @@ -4976,6 +4980,9 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
> > }
> > }
> >
> > + if (events->flags & KVM_VCPUEVENT_TRIPLE_FAULT)
> > + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> > +
> > kvm_make_request(KVM_REQ_EVENT, vcpu);
>
> Looks correct, but this really needs a selftest, at least for the SET path since
> the intent is to use that for the NOTIFY handling. Doesn't need to be super fancy,
> e.g. do port I/O from L2, inject a triple fault, and verify L1 sees the appropriate
> exit.
>
> Aha! And for the GET path, abuse KVM_X86_SET_MCE with CR4.MCE=0 to coerce KVM into
> making a KVM_REQ_TRIPLE_FAULT, that way there's no need to try and hit a timing
> window to intercept the request.

Drat, I bet that MCE path means the WARN in nested_vmx_vmexit() can be triggered
by userspace. If so, this patch makes it really, really easy to hit, e.g. queue the
request while L2 is active, then do KVM_SET_NESTED_STATE to force an "exit" without
bouncing through kvm_check_nested_events().

WARN_ON_ONCE(kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu))

I don't think SVM has a user-triggerable WARN, but the request should still be
dropped on forced exit from L2, e.g. I believe this is the correct fix:

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 9a6dc2b38fcf..18c5e96b12a5 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1128,6 +1128,7 @@ void svm_leave_nested(struct kvm_vcpu *vcpu)
if (is_guest_mode(vcpu)) {
svm->nested.nested_run_pending = 0;
svm->nested.vmcb12_gpa = INVALID_GPA;
+ kvm_clear_request(KVM_REQ_TRIPLE_FAULT, vcpu);

leave_guest_mode(vcpu);

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f18744f7ff82..0587ef647553 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6276,6 +6276,7 @@ void vmx_leave_nested(struct kvm_vcpu *vcpu)
{
if (is_guest_mode(vcpu)) {
to_vmx(vcpu)->nested.nested_run_pending = 0;
+ kvm_clear_request(KVM_REQ_TRIPLE_FAULT, vcpu);
nested_vmx_vmexit(vcpu, -1, 0, 0);
}
free_nested(vcpu);

2022-04-06 12:51:09

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 1/3] KVM: X86: Save&restore the triple fault request

On Fri, Mar 18, 2022, Chenyi Qiang wrote:
> For the triple fault sythesized by KVM, e.g. the RSM path or
> nested_vmx_abort(), if KVM exits to userspace before the request is
> serviced, userspace could migrate the VM and lose the triple fault.
> Fix this issue by adding a new event KVM_VCPUEVENT_TRIPLE_FAULT in
> get/set_vcpu_events() to track the triple fault request.
>
> Signed-off-by: Chenyi Qiang <[email protected]>
> ---
> Documentation/virt/kvm/api.rst | 6 ++++++
> arch/x86/include/uapi/asm/kvm.h | 1 +
> arch/x86/kvm/x86.c | 9 ++++++++-
> 3 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 691ff84444bd..9682b0a438bd 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -1146,6 +1146,9 @@ The following bits are defined in the flags field:
> fields contain a valid state. This bit will be set whenever
> KVM_CAP_EXCEPTION_PAYLOAD is enabled.
>
> +- KVM_VCPUEVENT_TRIPLE_FAULT may be set to signal that there's a
> + triple fault request waiting to be serviced.

Please avoid "request" in the docs, as before, that's a KVM implemenation detail.
For this one, maybe "there's a pending triple fault event"?

> +
> ARM/ARM64:
> ^^^^^^^^^^
>
> @@ -1241,6 +1244,9 @@ can be set in the flags field to signal that the
> exception_has_payload, exception_payload, and exception.pending fields
> contain a valid state and shall be written into the VCPU.
>
> +KVM_VCPUEVENT_TRIPLE_FAULT can be set in flags field to signal that a
> +triple fault request should be made.


And here, "to signal that KVM should synthesize a triple fault for the guest"?

> +
> ARM/ARM64:
> ^^^^^^^^^^
>
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index bf6e96011dfe..d8ef0d993e86 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -325,6 +325,7 @@ struct kvm_reinject_control {
> #define KVM_VCPUEVENT_VALID_SHADOW 0x00000004
> #define KVM_VCPUEVENT_VALID_SMM 0x00000008
> #define KVM_VCPUEVENT_VALID_PAYLOAD 0x00000010
> +#define KVM_VCPUEVENT_TRIPLE_FAULT 0x00000020
>
> /* Interrupt shadow states */
> #define KVM_X86_SHADOW_INT_MOV_SS 0x01
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4fa4d8269e5b..fee402a700df 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4891,6 +4891,9 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
> if (vcpu->kvm->arch.exception_payload_enabled)
> events->flags |= KVM_VCPUEVENT_VALID_PAYLOAD;
>
> + if (kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu))
> + events->flags |= KVM_VCPUEVENT_TRIPLE_FAULT;
> +
> memset(&events->reserved, 0, sizeof(events->reserved));
> }
>
> @@ -4903,7 +4906,8 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
> | KVM_VCPUEVENT_VALID_SIPI_VECTOR
> | KVM_VCPUEVENT_VALID_SHADOW
> | KVM_VCPUEVENT_VALID_SMM
> - | KVM_VCPUEVENT_VALID_PAYLOAD))
> + | KVM_VCPUEVENT_VALID_PAYLOAD
> + | KVM_VCPUEVENT_TRIPLE_FAULT))
> return -EINVAL;
>
> if (events->flags & KVM_VCPUEVENT_VALID_PAYLOAD) {
> @@ -4976,6 +4980,9 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
> }
> }
>
> + if (events->flags & KVM_VCPUEVENT_TRIPLE_FAULT)
> + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> +
> kvm_make_request(KVM_REQ_EVENT, vcpu);

Looks correct, but this really needs a selftest, at least for the SET path since
the intent is to use that for the NOTIFY handling. Doesn't need to be super fancy,
e.g. do port I/O from L2, inject a triple fault, and verify L1 sees the appropriate
exit.

Aha! And for the GET path, abuse KVM_X86_SET_MCE with CR4.MCE=0 to coerce KVM into
making a KVM_REQ_TRIPLE_FAULT, that way there's no need to try and hit a timing
window to intercept the request.

2022-04-06 14:14:01

by Chenyi Qiang

[permalink] [raw]
Subject: Re: [PATCH v5 1/3] KVM: X86: Save&restore the triple fault request



On 4/6/2022 7:31 AM, Sean Christopherson wrote:
> On Fri, Mar 18, 2022, Chenyi Qiang wrote:
>> For the triple fault sythesized by KVM, e.g. the RSM path or
>> nested_vmx_abort(), if KVM exits to userspace before the request is
>> serviced, userspace could migrate the VM and lose the triple fault.
>> Fix this issue by adding a new event KVM_VCPUEVENT_TRIPLE_FAULT in
>> get/set_vcpu_events() to track the triple fault request.
>>
>> Signed-off-by: Chenyi Qiang <[email protected]>
>> ---
>> Documentation/virt/kvm/api.rst | 6 ++++++
>> arch/x86/include/uapi/asm/kvm.h | 1 +
>> arch/x86/kvm/x86.c | 9 ++++++++-
>> 3 files changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> index 691ff84444bd..9682b0a438bd 100644
>> --- a/Documentation/virt/kvm/api.rst
>> +++ b/Documentation/virt/kvm/api.rst
>> @@ -1146,6 +1146,9 @@ The following bits are defined in the flags field:
>> fields contain a valid state. This bit will be set whenever
>> KVM_CAP_EXCEPTION_PAYLOAD is enabled.
>>
>> +- KVM_VCPUEVENT_TRIPLE_FAULT may be set to signal that there's a
>> + triple fault request waiting to be serviced.
>
> Please avoid "request" in the docs, as before, that's a KVM implemenation detail.
> For this one, maybe "there's a pending triple fault event"?
>
>> +
>> ARM/ARM64:
>> ^^^^^^^^^^
>>
>> @@ -1241,6 +1244,9 @@ can be set in the flags field to signal that the
>> exception_has_payload, exception_payload, and exception.pending fields
>> contain a valid state and shall be written into the VCPU.
>>
>> +KVM_VCPUEVENT_TRIPLE_FAULT can be set in flags field to signal that a
>> +triple fault request should be made.
>
>
> And here, "to signal that KVM should synthesize a triple fault for the guest"?
>
>> +
>> ARM/ARM64:
>> ^^^^^^^^^^
>>
>> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
>> index bf6e96011dfe..d8ef0d993e86 100644
>> --- a/arch/x86/include/uapi/asm/kvm.h
>> +++ b/arch/x86/include/uapi/asm/kvm.h
>> @@ -325,6 +325,7 @@ struct kvm_reinject_control {
>> #define KVM_VCPUEVENT_VALID_SHADOW 0x00000004
>> #define KVM_VCPUEVENT_VALID_SMM 0x00000008
>> #define KVM_VCPUEVENT_VALID_PAYLOAD 0x00000010
>> +#define KVM_VCPUEVENT_TRIPLE_FAULT 0x00000020
>>
>> /* Interrupt shadow states */
>> #define KVM_X86_SHADOW_INT_MOV_SS 0x01
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 4fa4d8269e5b..fee402a700df 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -4891,6 +4891,9 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
>> if (vcpu->kvm->arch.exception_payload_enabled)
>> events->flags |= KVM_VCPUEVENT_VALID_PAYLOAD;
>>
>> + if (kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu))
>> + events->flags |= KVM_VCPUEVENT_TRIPLE_FAULT;
>> +
>> memset(&events->reserved, 0, sizeof(events->reserved));
>> }
>>
>> @@ -4903,7 +4906,8 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
>> | KVM_VCPUEVENT_VALID_SIPI_VECTOR
>> | KVM_VCPUEVENT_VALID_SHADOW
>> | KVM_VCPUEVENT_VALID_SMM
>> - | KVM_VCPUEVENT_VALID_PAYLOAD))
>> + | KVM_VCPUEVENT_VALID_PAYLOAD
>> + | KVM_VCPUEVENT_TRIPLE_FAULT))
>> return -EINVAL;
>>
>> if (events->flags & KVM_VCPUEVENT_VALID_PAYLOAD) {
>> @@ -4976,6 +4980,9 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
>> }
>> }
>>
>> + if (events->flags & KVM_VCPUEVENT_TRIPLE_FAULT)
>> + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
>> +
>> kvm_make_request(KVM_REQ_EVENT, vcpu);
>
> Looks correct, but this really needs a selftest, at least for the SET path since
> the intent is to use that for the NOTIFY handling. Doesn't need to be super fancy,
> e.g. do port I/O from L2, inject a triple fault, and verify L1 sees the appropriate
> exit.
>
> Aha! And for the GET path, abuse KVM_X86_SET_MCE with CR4.MCE=0 to coerce KVM into
> making a KVM_REQ_TRIPLE_FAULT, that way there's no need to try and hit a timing
> window to intercept the request.

OK, will cook a selftest to verify it.

2022-04-07 01:12:00

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 1/3] KVM: X86: Save&restore the triple fault request

On Tue, Apr 05, 2022, Sean Christopherson wrote:
> On Tue, Apr 05, 2022, Sean Christopherson wrote:
> > On Fri, Mar 18, 2022, Chenyi Qiang wrote:
> > > @@ -4976,6 +4980,9 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
> > > }
> > > }
> > >
> > > + if (events->flags & KVM_VCPUEVENT_TRIPLE_FAULT)
> > > + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> > > +
> > > kvm_make_request(KVM_REQ_EVENT, vcpu);
> >
> > Looks correct, but this really needs a selftest, at least for the SET path since
> > the intent is to use that for the NOTIFY handling. Doesn't need to be super fancy,
> > e.g. do port I/O from L2, inject a triple fault, and verify L1 sees the appropriate
> > exit.
> >
> > Aha! And for the GET path, abuse KVM_X86_SET_MCE with CR4.MCE=0 to coerce KVM into
> > making a KVM_REQ_TRIPLE_FAULT, that way there's no need to try and hit a timing
> > window to intercept the request.
>
> Drat, I bet that MCE path means the WARN in nested_vmx_vmexit() can be triggered
> by userspace. If so, this patch makes it really, really easy to hit, e.g. queue the
> request while L2 is active, then do KVM_SET_NESTED_STATE to force an "exit" without
> bouncing through kvm_check_nested_events().
>
> WARN_ON_ONCE(kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu))
>
> I don't think SVM has a user-triggerable WARN, but the request should still be
> dropped on forced exit from L2, e.g. I believe this is the correct fix:

Confirmed the WARN can be triggered by abusing this patch, I'll get a patch out
once I figure out why kvm/queue is broken.

diff --git a/tools/testing/selftests/kvm/x86_64/state_test.c b/tools/testing/selftests/kvm/x86_64/state_test.c
index 2e0a92da8ff5..b7faeae3dcc4 100644
--- a/tools/testing/selftests/kvm/x86_64/state_test.c
+++ b/tools/testing/selftests/kvm/x86_64/state_test.c
@@ -210,6 +210,12 @@ int main(int argc, char *argv[])
memset(&regs1, 0, sizeof(regs1));
vcpu_regs_get(vm, VCPU_ID, &regs1);

+ if (stage == 6) {
+ state->events.flags |= 0x20;
+ vcpu_events_set(vm, VCPU_ID, &state->events);
+ vcpu_nested_state_set(vm, VCPU_ID, &state->nested, false);
+ }
+
kvm_vm_release(vm);

/* Restore state in a new VM. */

2022-04-07 01:32:19

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 1/3] KVM: X86: Save&restore the triple fault request

On Tue, Apr 05, 2022, Sean Christopherson wrote:
> On Fri, Mar 18, 2022, Chenyi Qiang wrote:
> > @@ -4903,7 +4906,8 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
> > | KVM_VCPUEVENT_VALID_SIPI_VECTOR
> > | KVM_VCPUEVENT_VALID_SHADOW
> > | KVM_VCPUEVENT_VALID_SMM
> > - | KVM_VCPUEVENT_VALID_PAYLOAD))
> > + | KVM_VCPUEVENT_VALID_PAYLOAD
> > + | KVM_VCPUEVENT_TRIPLE_FAULT))
> > return -EINVAL;
> >
> > if (events->flags & KVM_VCPUEVENT_VALID_PAYLOAD) {
> > @@ -4976,6 +4980,9 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
> > }
> > }
> >
> > + if (events->flags & KVM_VCPUEVENT_TRIPLE_FAULT)
> > + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> > +
> > kvm_make_request(KVM_REQ_EVENT, vcpu);
>
> Looks correct, but this really needs a selftest, at least for the SET path since
> the intent is to use that for the NOTIFY handling. Doesn't need to be super fancy,
> e.g. do port I/O from L2, inject a triple fault, and verify L1 sees the appropriate
> exit.

It finally dawned on me why all the other events use two "flags", i.e. an actual
flags entry of KVM_VCPUEVENT_VALID_* and then the value itself. Userspace needs
to be able to _clear_ the request, not just set the request. So this needs to
follow the existing pattern of adding a VALID flag and then yet another field to
specify whether or not a triple fault is pending.

2022-04-07 01:59:08

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 1/3] KVM: X86: Save&restore the triple fault request

On Tue, Apr 05, 2022, Sean Christopherson wrote:
> On Tue, Apr 05, 2022, Sean Christopherson wrote:
> > On Fri, Mar 18, 2022, Chenyi Qiang wrote:
> > > @@ -4976,6 +4980,9 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
> > > }
> > > }
> > >
> > > + if (events->flags & KVM_VCPUEVENT_TRIPLE_FAULT)
> > > + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> > > +
> > > kvm_make_request(KVM_REQ_EVENT, vcpu);
> >
> > Looks correct, but this really needs a selftest, at least for the SET path since
> > the intent is to use that for the NOTIFY handling. Doesn't need to be super fancy,
> > e.g. do port I/O from L2, inject a triple fault, and verify L1 sees the appropriate
> > exit.
> >
> > Aha! And for the GET path, abuse KVM_X86_SET_MCE with CR4.MCE=0 to coerce KVM into
> > making a KVM_REQ_TRIPLE_FAULT, that way there's no need to try and hit a timing
> > window to intercept the request.
>
> Drat, I bet that MCE path means the WARN in nested_vmx_vmexit() can be triggered
> by userspace. If so, this patch makes it really, really easy to hit, e.g. queue the
> request while L2 is active, then do KVM_SET_NESTED_STATE to force an "exit" without
> bouncing through kvm_check_nested_events().
>
> WARN_ON_ONCE(kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu))
>
> I don't think SVM has a user-triggerable WARN, but the request should still be
> dropped on forced exit from L2, e.g. I believe this is the correct fix:
>
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index 9a6dc2b38fcf..18c5e96b12a5 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1128,6 +1128,7 @@ void svm_leave_nested(struct kvm_vcpu *vcpu)
> if (is_guest_mode(vcpu)) {
> svm->nested.nested_run_pending = 0;
> svm->nested.vmcb12_gpa = INVALID_GPA;
> + kvm_clear_request(KVM_REQ_TRIPLE_FAULT, vcpu);
>
> leave_guest_mode(vcpu);
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index f18744f7ff82..0587ef647553 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -6276,6 +6276,7 @@ void vmx_leave_nested(struct kvm_vcpu *vcpu)
> {
> if (is_guest_mode(vcpu)) {
> to_vmx(vcpu)->nested.nested_run_pending = 0;
> + kvm_clear_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> nested_vmx_vmexit(vcpu, -1, 0, 0);
> }
> free_nested(vcpu);
>

Wrong again. Once your patch to save/restore via KVM_VCPUEVENT_TRIPLE_FAULT
lands, clearing KVM_REQ_TRIPLE_FAULT is wrong because that will result in KVM
dropping the request when forcing an exit from L2 in order to reload L2 state,
e.g. if userspace restores events before nested state. So I think the only thing
that can be done is to delete the WARN :-(

2022-04-07 08:50:10

by Chenyi Qiang

[permalink] [raw]
Subject: Re: [PATCH v5 1/3] KVM: X86: Save&restore the triple fault request



On 4/7/2022 5:15 AM, Sean Christopherson wrote:
> On Tue, Apr 05, 2022, Sean Christopherson wrote:
>> On Tue, Apr 05, 2022, Sean Christopherson wrote:
>>> On Fri, Mar 18, 2022, Chenyi Qiang wrote:
>>>> @@ -4976,6 +4980,9 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
>>>> }
>>>> }
>>>>
>>>> + if (events->flags & KVM_VCPUEVENT_TRIPLE_FAULT)
>>>> + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
>>>> +
>>>> kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>
>>> Looks correct, but this really needs a selftest, at least for the SET path since
>>> the intent is to use that for the NOTIFY handling. Doesn't need to be super fancy,
>>> e.g. do port I/O from L2, inject a triple fault, and verify L1 sees the appropriate
>>> exit.
>>>
>>> Aha! And for the GET path, abuse KVM_X86_SET_MCE with CR4.MCE=0 to coerce KVM into
>>> making a KVM_REQ_TRIPLE_FAULT, that way there's no need to try and hit a timing
>>> window to intercept the request.
>>
>> Drat, I bet that MCE path means the WARN in nested_vmx_vmexit() can be triggered
>> by userspace. If so, this patch makes it really, really easy to hit, e.g. queue the
>> request while L2 is active, then do KVM_SET_NESTED_STATE to force an "exit" without
>> bouncing through kvm_check_nested_events().
>>
>> WARN_ON_ONCE(kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu))
>>
>> I don't think SVM has a user-triggerable WARN, but the request should still be
>> dropped on forced exit from L2, e.g. I believe this is the correct fix:
>
> Confirmed the WARN can be triggered by abusing this patch, I'll get a patch out
> once I figure out why kvm/queue is broken.
>
> diff --git a/tools/testing/selftests/kvm/x86_64/state_test.c b/tools/testing/selftests/kvm/x86_64/state_test.c
> index 2e0a92da8ff5..b7faeae3dcc4 100644
> --- a/tools/testing/selftests/kvm/x86_64/state_test.c
> +++ b/tools/testing/selftests/kvm/x86_64/state_test.c
> @@ -210,6 +210,12 @@ int main(int argc, char *argv[])
> memset(&regs1, 0, sizeof(regs1));
> vcpu_regs_get(vm, VCPU_ID, &regs1);
>
> + if (stage == 6) {
> + state->events.flags |= 0x20;
> + vcpu_events_set(vm, VCPU_ID, &state->events);
> + vcpu_nested_state_set(vm, VCPU_ID, &state->nested, false);
> + }
> +
> kvm_vm_release(vm);
>
> /* Restore state in a new VM. */

Also verified the WARN with this. Then, is it still necessary to add an
individual selftest about the working flow of save/restore triple fault
event?

>

2022-04-08 03:18:36

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 1/3] KVM: X86: Save&restore the triple fault request

On Thu, Apr 07, 2022, Chenyi Qiang wrote:
>
>
> On 4/7/2022 5:15 AM, Sean Christopherson wrote:
> > On Tue, Apr 05, 2022, Sean Christopherson wrote:
> > > On Tue, Apr 05, 2022, Sean Christopherson wrote:
> > > > On Fri, Mar 18, 2022, Chenyi Qiang wrote:
> > > > > @@ -4976,6 +4980,9 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
> > > > > }
> > > > > }
> > > > > + if (events->flags & KVM_VCPUEVENT_TRIPLE_FAULT)
> > > > > + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> > > > > +
> > > > > kvm_make_request(KVM_REQ_EVENT, vcpu);
> > > >
> > > > Looks correct, but this really needs a selftest, at least for the SET path since
> > > > the intent is to use that for the NOTIFY handling. Doesn't need to be super fancy,
> > > > e.g. do port I/O from L2, inject a triple fault, and verify L1 sees the appropriate
> > > > exit.
> > > >
> > > > Aha! And for the GET path, abuse KVM_X86_SET_MCE with CR4.MCE=0 to coerce KVM into
> > > > making a KVM_REQ_TRIPLE_FAULT, that way there's no need to try and hit a timing
> > > > window to intercept the request.
> > >
> > > Drat, I bet that MCE path means the WARN in nested_vmx_vmexit() can be triggered
> > > by userspace. If so, this patch makes it really, really easy to hit, e.g. queue the
> > > request while L2 is active, then do KVM_SET_NESTED_STATE to force an "exit" without
> > > bouncing through kvm_check_nested_events().
> > >
> > > WARN_ON_ONCE(kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu))
> > >
> > > I don't think SVM has a user-triggerable WARN, but the request should still be
> > > dropped on forced exit from L2, e.g. I believe this is the correct fix:
> >
> > Confirmed the WARN can be triggered by abusing this patch, I'll get a patch out
> > once I figure out why kvm/queue is broken.
> >
> > diff --git a/tools/testing/selftests/kvm/x86_64/state_test.c b/tools/testing/selftests/kvm/x86_64/state_test.c
> > index 2e0a92da8ff5..b7faeae3dcc4 100644
> > --- a/tools/testing/selftests/kvm/x86_64/state_test.c
> > +++ b/tools/testing/selftests/kvm/x86_64/state_test.c
> > @@ -210,6 +210,12 @@ int main(int argc, char *argv[])
> > memset(&regs1, 0, sizeof(regs1));
> > vcpu_regs_get(vm, VCPU_ID, &regs1);
> >
> > + if (stage == 6) {
> > + state->events.flags |= 0x20;
> > + vcpu_events_set(vm, VCPU_ID, &state->events);
> > + vcpu_nested_state_set(vm, VCPU_ID, &state->nested, false);
> > + }
> > +
> > kvm_vm_release(vm);
> >
> > /* Restore state in a new VM. */
>
> Also verified the WARN with this. Then, is it still necessary to add an
> individual selftest about the working flow of save/restore triple fault
> event?

Yeah, the above hack fails the test even on a good kernel. It's not an actual test
of the feature, just a hack to confirm the bug.