2021-05-15 07:45:03

by Tom Lendacky

[permalink] [raw]
Subject: [PATCH] KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure

Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
exit code and parameters fail. Since the VMGEXIT instruction can be issued
from userspace, even though userspace (likely) can't update the GHCB,
don't allow userspace to be able to kill the guest.

Return a #GP request through the GHCB when validation fails, rather than
terminating the guest.

Fixes: 291bd20d5d88 ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
Signed-off-by: Tom Lendacky <[email protected]>
---
arch/x86/kvm/svm/sev.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 5bc887e9a986..bc77f26f0880 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2075,7 +2075,7 @@ static void sev_es_sync_from_ghcb(struct vcpu_svm *svm)
memset(ghcb->save.valid_bitmap, 0, sizeof(ghcb->save.valid_bitmap));
}

-static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
+static bool sev_es_validate_vmgexit(struct vcpu_svm *svm)
{
struct kvm_vcpu *vcpu;
struct ghcb *ghcb;
@@ -2174,7 +2174,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
goto vmgexit_err;
}

- return 0;
+ return true;

vmgexit_err:
vcpu = &svm->vcpu;
@@ -2188,13 +2188,16 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
dump_ghcb(svm);
}

- vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
- vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON;
- vcpu->run->internal.ndata = 2;
- vcpu->run->internal.data[0] = exit_code;
- vcpu->run->internal.data[1] = vcpu->arch.last_vmentry_cpu;
+ /* Clear the valid entries fields */
+ memset(ghcb->save.valid_bitmap, 0, sizeof(ghcb->save.valid_bitmap));

- return -EINVAL;
+ ghcb_set_sw_exit_info_1(ghcb, 1);
+ ghcb_set_sw_exit_info_2(ghcb,
+ X86_TRAP_GP |
+ SVM_EVTINJ_TYPE_EXEPT |
+ SVM_EVTINJ_VALID);
+
+ return false;
}

void sev_es_unmap_ghcb(struct vcpu_svm *svm)
@@ -2459,9 +2462,8 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)

exit_code = ghcb_get_sw_exit_code(ghcb);

- ret = sev_es_validate_vmgexit(svm);
- if (ret)
- return ret;
+ if (!sev_es_validate_vmgexit(svm))
+ return 1;

sev_es_sync_from_ghcb(svm);
ghcb_set_sw_exit_info_1(ghcb, 0);
--
2.31.0



2021-05-21 07:19:21

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure

On Thu, May 20, 2021, Sean Christopherson wrote:
> On Mon, May 17, 2021, Tom Lendacky wrote:
> > On 5/14/21 6:06 PM, Peter Gonda wrote:
> > > On Fri, May 14, 2021 at 1:22 PM Tom Lendacky <[email protected]> wrote:
> > >>
> > >> Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
> > >> exit code and parameters fail. Since the VMGEXIT instruction can be issued
> > >> from userspace, even though userspace (likely) can't update the GHCB,
> > >> don't allow userspace to be able to kill the guest.
> > >>
> > >> Return a #GP request through the GHCB when validation fails, rather than
> > >> terminating the guest.
> > >
> > > Is this a gap in the spec? I don't see anything that details what
> > > should happen if the correct fields for NAE are not set in the first
> > > couple paragraphs of section 4 'GHCB Protocol'.
> >
> > No, I don't think the spec needs to spell out everything like this. The
> > hypervisor is free to determine its course of action in this case.
>
> The hypervisor can decide whether to inject/return an error or kill the guest,
> but what errors can be returned and how they're returned absolutely needs to be
> ABI between guest and host, and to make the ABI vendor agnostic the GHCB spec
> is the logical place to define said ABI.
>
> For example, "injecting" #GP if the guest botched the GHCB on #VMGEXIT(CPUID) is
> completely nonsensical. As is, a Linux guest appears to blindly forward the #GP,
> which means if something does go awry KVM has just made debugging the guest that
> much harder, e.g. imagine the confusion that will ensue if the end result is a
> SIGBUS to userspace on CPUID.
>
> There needs to be an explicit error code for "you gave me bad data", otherwise
> we're signing ourselves up for future pain.

More concretely, I think the best course of action is to define a new return code
in SW_EXITINFO1[31:0], e.g. '2', with additional information in SW_EXITINFO2.

In theory, an old-but-sane guest will interpret the unexpected return code as
fatal to whatever triggered the #VMGEXIT, e.g. SIGBUS to userspace. Unfortunately
Linux isn't sane because sev_es_ghcb_hv_call() assumes any non-'1' result means
success, but that's trivial to fix and IMO should be fixed irrespective of where
this goes.

2021-05-21 07:50:46

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure

On Thu, May 20, 2021, Sean Christopherson wrote:
> On Thu, May 20, 2021, Sean Christopherson wrote:
> > On Mon, May 17, 2021, Tom Lendacky wrote:
> > > On 5/14/21 6:06 PM, Peter Gonda wrote:
> > > > On Fri, May 14, 2021 at 1:22 PM Tom Lendacky <[email protected]> wrote:
> > > >>
> > > >> Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
> > > >> exit code and parameters fail. Since the VMGEXIT instruction can be issued
> > > >> from userspace, even though userspace (likely) can't update the GHCB,
> > > >> don't allow userspace to be able to kill the guest.
> > > >>
> > > >> Return a #GP request through the GHCB when validation fails, rather than
> > > >> terminating the guest.
> > > >
> > > > Is this a gap in the spec? I don't see anything that details what
> > > > should happen if the correct fields for NAE are not set in the first
> > > > couple paragraphs of section 4 'GHCB Protocol'.
> > >
> > > No, I don't think the spec needs to spell out everything like this. The
> > > hypervisor is free to determine its course of action in this case.
> >
> > The hypervisor can decide whether to inject/return an error or kill the guest,
> > but what errors can be returned and how they're returned absolutely needs to be
> > ABI between guest and host, and to make the ABI vendor agnostic the GHCB spec
> > is the logical place to define said ABI.
> >
> > For example, "injecting" #GP if the guest botched the GHCB on #VMGEXIT(CPUID) is
> > completely nonsensical. As is, a Linux guest appears to blindly forward the #GP,
> > which means if something does go awry KVM has just made debugging the guest that
> > much harder, e.g. imagine the confusion that will ensue if the end result is a
> > SIGBUS to userspace on CPUID.
> >
> > There needs to be an explicit error code for "you gave me bad data", otherwise
> > we're signing ourselves up for future pain.
>
> More concretely, I think the best course of action is to define a new return code
> in SW_EXITINFO1[31:0], e.g. '2', with additional information in SW_EXITINFO2.
>
> In theory, an old-but-sane guest will interpret the unexpected return code as
> fatal to whatever triggered the #VMGEXIT, e.g. SIGBUS to userspace. Unfortunately
> Linux isn't sane because sev_es_ghcb_hv_call() assumes any non-'1' result means
> success, but that's trivial to fix and IMO should be fixed irrespective of where
> this goes.

One last thing (hopefully): Erdem pointed out that if the GCHB GPA (or any
derferenced pointers within the GHCB) is invalid or is set to a private GPA
(mostly in the context of SNP) then the VMM will likely have no choice but to
kill the guest in response to #VMGEXIT.

It's probably a good idea to add a blurb in one of the specs explicitly calling
out that #VMGEXIT can be executed from userspace, and that before returning to
uesrspace the guest kernel must always ensure that the GCHB points at a legal
GPA _and_ all primary fields are marked invalid.

2021-05-21 07:58:13

by Tom Lendacky

[permalink] [raw]
Subject: Re: [PATCH] KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure

On 5/20/21 2:16 PM, Sean Christopherson wrote:
> On Mon, May 17, 2021, Tom Lendacky wrote:
>> On 5/14/21 6:06 PM, Peter Gonda wrote:
>>> On Fri, May 14, 2021 at 1:22 PM Tom Lendacky <[email protected]> wrote:
>>>>
>>>> Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
>>>> exit code and parameters fail. Since the VMGEXIT instruction can be issued
>>>> from userspace, even though userspace (likely) can't update the GHCB,
>>>> don't allow userspace to be able to kill the guest.
>>>>
>>>> Return a #GP request through the GHCB when validation fails, rather than
>>>> terminating the guest.
>>>
>>> Is this a gap in the spec? I don't see anything that details what
>>> should happen if the correct fields for NAE are not set in the first
>>> couple paragraphs of section 4 'GHCB Protocol'.
>>
>> No, I don't think the spec needs to spell out everything like this. The
>> hypervisor is free to determine its course of action in this case.
>
> The hypervisor can decide whether to inject/return an error or kill the guest,
> but what errors can be returned and how they're returned absolutely needs to be
> ABI between guest and host, and to make the ABI vendor agnostic the GHCB spec
> is the logical place to define said ABI.

For now, that is all we have for versions 1 and 2 of the spec. We can
certainly extend it in future versions if that is desired.

I would suggest starting a thread on what we would like to see in the next
version of the GHCB spec on the amd-sev-snp mailing list:

[email protected]

>
> For example, "injecting" #GP if the guest botched the GHCB on #VMGEXIT(CPUID) is
> completely nonsensical. As is, a Linux guest appears to blindly forward the #GP,
> which means if something does go awry KVM has just made debugging the guest that
> much harder, e.g. imagine the confusion that will ensue if the end result is a
> SIGBUS to userspace on CPUID.

I see the point you're making, but I would also say that we probably
wouldn't even boot successfully if the kernel can't handle, e.g., a CPUID
#VC properly. A lot of what could go wrong with required inputs, not the
values, but the required state being communicated, should have already
been ironed out during development of whichever OS is providing the SEV-ES
support.

>
> There needs to be an explicit error code for "you gave me bad data", otherwise
> we're signing ourselves up for future pain.

I'll make note of that for the next update to the spec and we can work on
it further during the spec review.

Thanks,
Tom

>
>> I suppose the spec could suggest a course of action, but I don't think the
>> spec should require a specific course of action.
>>
>> Thanks,
>> Tom
>>
>>>

2021-05-21 19:42:32

by Tom Lendacky

[permalink] [raw]
Subject: Re: [PATCH] KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure

On 5/20/21 3:22 PM, Sean Christopherson wrote:
> On Thu, May 20, 2021, Sean Christopherson wrote:
>> On Thu, May 20, 2021, Sean Christopherson wrote:
>>> On Mon, May 17, 2021, Tom Lendacky wrote:
>>>> On 5/14/21 6:06 PM, Peter Gonda wrote:
>>>>> On Fri, May 14, 2021 at 1:22 PM Tom Lendacky <[email protected]> wrote:
>>>>>>
>>>>>> Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
>>>>>> exit code and parameters fail. Since the VMGEXIT instruction can be issued
>>>>>> from userspace, even though userspace (likely) can't update the GHCB,
>>>>>> don't allow userspace to be able to kill the guest.
>>>>>>
>>>>>> Return a #GP request through the GHCB when validation fails, rather than
>>>>>> terminating the guest.
>>>>>
>>>>> Is this a gap in the spec? I don't see anything that details what
>>>>> should happen if the correct fields for NAE are not set in the first
>>>>> couple paragraphs of section 4 'GHCB Protocol'.
>>>>
>>>> No, I don't think the spec needs to spell out everything like this. The
>>>> hypervisor is free to determine its course of action in this case.
>>>
>>> The hypervisor can decide whether to inject/return an error or kill the guest,
>>> but what errors can be returned and how they're returned absolutely needs to be
>>> ABI between guest and host, and to make the ABI vendor agnostic the GHCB spec
>>> is the logical place to define said ABI.
>>>
>>> For example, "injecting" #GP if the guest botched the GHCB on #VMGEXIT(CPUID) is
>>> completely nonsensical. As is, a Linux guest appears to blindly forward the #GP,
>>> which means if something does go awry KVM has just made debugging the guest that
>>> much harder, e.g. imagine the confusion that will ensue if the end result is a
>>> SIGBUS to userspace on CPUID.
>>>
>>> There needs to be an explicit error code for "you gave me bad data", otherwise
>>> we're signing ourselves up for future pain.
>>
>> More concretely, I think the best course of action is to define a new return code
>> in SW_EXITINFO1[31:0], e.g. '2', with additional information in SW_EXITINFO2.
>>
>> In theory, an old-but-sane guest will interpret the unexpected return code as
>> fatal to whatever triggered the #VMGEXIT, e.g. SIGBUS to userspace. Unfortunately
>> Linux isn't sane because sev_es_ghcb_hv_call() assumes any non-'1' result means
>> success, but that's trivial to fix and IMO should be fixed irrespective of where
>> this goes.
>
> One last thing (hopefully): Erdem pointed out that if the GCHB GPA (or any
> derferenced pointers within the GHCB) is invalid or is set to a private GPA
> (mostly in the context of SNP) then the VMM will likely have no choice but to
> kill the guest in response to #VMGEXIT.
>
> It's probably a good idea to add a blurb in one of the specs explicitly calling
> out that #VMGEXIT can be executed from userspace, and that before returning to
> uesrspace the guest kernel must always ensure that the GCHB points at a legal
> GPA _and_ all primary fields are marked invalid.

Yes, the spec can be updated to include a "best practices" section for
OSes and Hypervisors to follow without actually having to update the
version of the GHCB spec, so that should be doable.

Thanks,
Tom

>

2021-05-27 19:02:24

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure

On Thu, May 20, 2021, Tom Lendacky wrote:
> On 5/20/21 2:16 PM, Sean Christopherson wrote:
> > On Mon, May 17, 2021, Tom Lendacky wrote:
> >> On 5/14/21 6:06 PM, Peter Gonda wrote:
> >>> On Fri, May 14, 2021 at 1:22 PM Tom Lendacky <[email protected]> wrote:
> >>>>
> >>>> Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
> >>>> exit code and parameters fail. Since the VMGEXIT instruction can be issued
> >>>> from userspace, even though userspace (likely) can't update the GHCB,
> >>>> don't allow userspace to be able to kill the guest.
> >>>>
> >>>> Return a #GP request through the GHCB when validation fails, rather than
> >>>> terminating the guest.
> >>>
> >>> Is this a gap in the spec? I don't see anything that details what
> >>> should happen if the correct fields for NAE are not set in the first
> >>> couple paragraphs of section 4 'GHCB Protocol'.
> >>
> >> No, I don't think the spec needs to spell out everything like this. The
> >> hypervisor is free to determine its course of action in this case.
> >
> > The hypervisor can decide whether to inject/return an error or kill the guest,
> > but what errors can be returned and how they're returned absolutely needs to be
> > ABI between guest and host, and to make the ABI vendor agnostic the GHCB spec
> > is the logical place to define said ABI.
>
> For now, that is all we have for versions 1 and 2 of the spec. We can
> certainly extend it in future versions if that is desired.
>
> I would suggest starting a thread on what we would like to see in the next
> version of the GHCB spec on the amd-sev-snp mailing list:
>
> [email protected]

Will do, but in the meantime, I don't think we should merge a fix of any kind
until there is consensus on what the VMM behavior will be. IMO, fixing this in
upstream is not urgent; I highly doubt anyone is deploying SEV-ES in production
using a bleeding edge KVM.

> > For example, "injecting" #GP if the guest botched the GHCB on #VMGEXIT(CPUID) is
> > completely nonsensical. As is, a Linux guest appears to blindly forward the #GP,
> > which means if something does go awry KVM has just made debugging the guest that
> > much harder, e.g. imagine the confusion that will ensue if the end result is a
> > SIGBUS to userspace on CPUID.
>
> I see the point you're making, but I would also say that we probably
> wouldn't even boot successfully if the kernel can't handle, e.g., a CPUID
> #VC properly.

I agree that GHCB bugs in the guest will be fatal, but that doesn't give the VMM
carte blanche to do whatever it wants given bad input.

> A lot of what could go wrong with required inputs, not the values, but the
> required state being communicated, should have already been ironed out during
> development of whichever OS is providing the SEV-ES support.

Yes, but better on the kernel never having a regression is a losing proposition.
And it doesn't even necessarily require a regression, e.g. an existing memory
corruption bug elsewhere in the guest kernel (that escaped qualification) could
corrupt the GHCB. If the GHCB is corrupted at runtime, the guest needs
well-defined semantics from the VMM so that the guest at least has a chance of
sanely handling the error. Handling in this case would mean an oops/panic, but
that's far, far better than a random pseudo-#GP that might not even be immediately
logged as a failure.

2021-07-21 13:56:38

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH] KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure

Sean Christopherson <[email protected]> writes:

> On Thu, May 20, 2021, Tom Lendacky wrote:
>> On 5/20/21 2:16 PM, Sean Christopherson wrote:
>> > On Mon, May 17, 2021, Tom Lendacky wrote:
>> >> On 5/14/21 6:06 PM, Peter Gonda wrote:
>> >>> On Fri, May 14, 2021 at 1:22 PM Tom Lendacky <[email protected]> wrote:
>> >>>>
>> >>>> Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
>> >>>> exit code and parameters fail. Since the VMGEXIT instruction can be issued
>> >>>> from userspace, even though userspace (likely) can't update the GHCB,
>> >>>> don't allow userspace to be able to kill the guest.
>> >>>>
>> >>>> Return a #GP request through the GHCB when validation fails, rather than
>> >>>> terminating the guest.
>> >>>
>> >>> Is this a gap in the spec? I don't see anything that details what
>> >>> should happen if the correct fields for NAE are not set in the first
>> >>> couple paragraphs of section 4 'GHCB Protocol'.
>> >>
>> >> No, I don't think the spec needs to spell out everything like this. The
>> >> hypervisor is free to determine its course of action in this case.
>> >
>> > The hypervisor can decide whether to inject/return an error or kill the guest,
>> > but what errors can be returned and how they're returned absolutely needs to be
>> > ABI between guest and host, and to make the ABI vendor agnostic the GHCB spec
>> > is the logical place to define said ABI.
>>
>> For now, that is all we have for versions 1 and 2 of the spec. We can
>> certainly extend it in future versions if that is desired.
>>
>> I would suggest starting a thread on what we would like to see in the next
>> version of the GHCB spec on the amd-sev-snp mailing list:
>>
>> [email protected]
>
> Will do, but in the meantime, I don't think we should merge a fix of any kind
> until there is consensus on what the VMM behavior will be. IMO, fixing this in
> upstream is not urgent; I highly doubt anyone is deploying SEV-ES in production
> using a bleeding edge KVM.

Sorry for resurrecting this old thread but were there any deveopments
here? I may have missed something but last time I've checked a single
"rep; vmmcall" from userspace was still crashing the guest. The issue,
however, doesn't seem to reproduce with Vmware ESXi which probably means
they're just skipping the instruction and not even injecting #GP (AFAIR,
I don't have an environment to re-test handy).

--
Vitaly

2021-07-21 21:12:08

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] KVM: SVM: Do not terminate SEV-ES guests on GHCB validation failure

On Wed, Jul 21, 2021, Vitaly Kuznetsov wrote:
> Sean Christopherson <[email protected]> writes:
>
> > On Thu, May 20, 2021, Tom Lendacky wrote:
> >> On 5/20/21 2:16 PM, Sean Christopherson wrote:
> >> > On Mon, May 17, 2021, Tom Lendacky wrote:
> >> >> On 5/14/21 6:06 PM, Peter Gonda wrote:
> >> >>> On Fri, May 14, 2021 at 1:22 PM Tom Lendacky <[email protected]> wrote:
> >> >>>>
> >> >>>> Currently, an SEV-ES guest is terminated if the validation of the VMGEXIT
> >> >>>> exit code and parameters fail. Since the VMGEXIT instruction can be issued
> >> >>>> from userspace, even though userspace (likely) can't update the GHCB,
> >> >>>> don't allow userspace to be able to kill the guest.
> >> >>>>
> >> >>>> Return a #GP request through the GHCB when validation fails, rather than
> >> >>>> terminating the guest.
> >> >>>
> >> >>> Is this a gap in the spec? I don't see anything that details what
> >> >>> should happen if the correct fields for NAE are not set in the first
> >> >>> couple paragraphs of section 4 'GHCB Protocol'.
> >> >>
> >> >> No, I don't think the spec needs to spell out everything like this. The
> >> >> hypervisor is free to determine its course of action in this case.
> >> >
> >> > The hypervisor can decide whether to inject/return an error or kill the guest,
> >> > but what errors can be returned and how they're returned absolutely needs to be
> >> > ABI between guest and host, and to make the ABI vendor agnostic the GHCB spec
> >> > is the logical place to define said ABI.
> >>
> >> For now, that is all we have for versions 1 and 2 of the spec. We can
> >> certainly extend it in future versions if that is desired.
> >>
> >> I would suggest starting a thread on what we would like to see in the next
> >> version of the GHCB spec on the amd-sev-snp mailing list:
> >>
> >> [email protected]
> >
> > Will do, but in the meantime, I don't think we should merge a fix of any kind
> > until there is consensus on what the VMM behavior will be. IMO, fixing this in
> > upstream is not urgent; I highly doubt anyone is deploying SEV-ES in production
> > using a bleeding edge KVM.
>
> Sorry for resurrecting this old thread but were there any deveopments
> here? I may have missed something but last time I've checked a single
> "rep; vmmcall" from userspace was still crashing the guest.

I don't think it went anywhere, I completely forgot about this. I'll bump this
back to the top of my todo list, unless someone else wants the honors :-)

> The issue, however, doesn't seem to reproduce with Vmware ESXi which probably
> means they're just skipping the instruction and not even injecting #GP
> (AFAIR, I don't have an environment to re-test handy).