2021-01-12 17:39:08

by Sean Christopherson

Subject: Re: [PATCH 1/2] KVM: x86: Add emulation support for #GP triggered by VM instructions

On Tue, Jan 12, 2021, Wei Huang wrote:
> From: Bandan Das <[email protected]>
>
> While running VM related instructions (VMRUN/VMSAVE/VMLOAD), some AMD
> CPUs check EAX against reserved memory regions (e.g. SMM memory on host)
> before checking VMCB's instruction intercept.

It would be very helpful to list exactly which CPUs are/aren't affected, even if
that just means stating something like "all CPUs before XYZ". Given patch 2/2,
I assume it's all CPUs without the new CPUID flag?


2021-01-12 18:04:48

by Sean Christopherson

Subject: Re: [PATCH 1/2] KVM: x86: Add emulation support for #GP triggered by VM instructions

On Tue, Jan 12, 2021, Sean Christopherson wrote:
> On Tue, Jan 12, 2021, Wei Huang wrote:
> > From: Bandan Das <[email protected]>
> >
> > While running VM related instructions (VMRUN/VMSAVE/VMLOAD), some AMD
> > CPUs check EAX against reserved memory regions (e.g. SMM memory on host)
> > before checking VMCB's instruction intercept.
>
> It would be very helpful to list exactly which CPUs are/aren't affected, even if
> that just means stating something like "all CPUs before XYZ". Given patch 2/2,
> I assume it's all CPUs without the new CPUID flag?

Ah, despite calling this an 'errata', the bad behavior is explicitly documented
in the APM, i.e. it's an architecture bug, not a silicon bug.

Can you reword the changelog to make it clear that the premature #GP is the
correct architectural behavior for CPUs without the new CPUID flag?
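
A minimal sketch of the check ordering under discussion, assuming a simplified model with three boolean inputs. The names (`classify_vm_insn`, `cpu_has_fix`, the `OUTCOME_*` values) are hypothetical and are not taken from the patch or from KVM; the behavior on CPUs with the new CPUID flag is likewise an assumption.

```c
#include <stdbool.h>

/* Possible results of a guest executing VMRUN/VMSAVE/VMLOAD (simplified). */
enum vm_insn_outcome {
    OUTCOME_GP,      /* #GP is raised */
    OUTCOME_VMEXIT,  /* the instruction intercept fires */
    OUTCOME_EXECUTE  /* the instruction runs */
};

/*
 * Hypothetical model: cpu_has_fix stands in for the new CPUID flag from
 * patch 2/2. Without it, the reserved-region check on rAX happens before
 * the intercept check, so the hypervisor sees #GP instead of the intercept.
 */
static enum vm_insn_outcome classify_vm_insn(bool cpu_has_fix,
                                             bool rax_in_reserved_region,
                                             bool intercepted)
{
    /* Premature #GP: documented behavior on CPUs without the flag. */
    if (!cpu_has_fix && rax_in_reserved_region)
        return OUTCOME_GP;

    if (intercepted)
        return OUTCOME_VMEXIT;

    /* Assumption: an unintercepted access to a reserved region still faults. */
    if (rax_in_reserved_region)
        return OUTCOME_GP;

    return OUTCOME_EXECUTE;
}
```

In this model, `classify_vm_insn(false, true, true)` yields `OUTCOME_GP`, which is the case the patch has to emulate, while `classify_vm_insn(true, true, true)` yields `OUTCOME_VMEXIT`.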

2021-01-12 19:01:00

by Andy Lutomirski

Subject: Re: [PATCH 1/2] KVM: x86: Add emulation support for #GP triggered by VM instructions

On Tue, Jan 12, 2021 at 9:59 AM Sean Christopherson <[email protected]> wrote:
>
> On Tue, Jan 12, 2021, Sean Christopherson wrote:
> > On Tue, Jan 12, 2021, Wei Huang wrote:
> > > From: Bandan Das <[email protected]>
> > >
> > > While running VM related instructions (VMRUN/VMSAVE/VMLOAD), some AMD
> > > CPUs check EAX against reserved memory regions (e.g. SMM memory on host)
> > > before checking VMCB's instruction intercept.
> >
> > It would be very helpful to list exactly which CPUs are/aren't affected, even if
> > that just means stating something like "all CPUs before XYZ". Given patch 2/2,
> > I assume it's all CPUs without the new CPUID flag?
>
> Ah, despite calling this an 'errata', the bad behavior is explicitly documented
> in the APM, i.e. it's an architecture bug, not a silicon bug.
>
> Can you reword the changelog to make it clear that the premature #GP is the
> correct architectural behavior for CPUs without the new CPUID flag?

Andrew Cooper points out that there may be a nicer workaround. Make
sure that the SMRAM and HT region (FFFD00000000 - FFFFFFFFFFFF) are
marked as reserved in the guest, too.

--Andy
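
A minimal sketch of the range check this workaround implies, assuming the hypervisor knows the host's SMRAM range. The HT bounds are the ones quoted above; the `phys_range` type and the function names are hypothetical, and the SMRAM range is passed in by the caller because it is machine-specific.

```c
#include <stdbool.h>
#include <stdint.h>

/* HT region as quoted in the thread: FFFD00000000 - FFFFFFFFFFFF. */
#define HT_REGION_START 0xFFFD00000000ULL
#define HT_REGION_END   0xFFFFFFFFFFFFULL

/* Hypothetical type: a physical address range with inclusive bounds. */
struct phys_range {
    uint64_t start;
    uint64_t end;
};

static bool range_contains(const struct phys_range *r, uint64_t addr)
{
    return addr >= r->start && addr <= r->end;
}

/*
 * Hypothetical helper: true if addr falls in a region that would have to be
 * marked reserved in the guest (the HT region or the host's SMRAM range).
 */
static bool guest_addr_is_reserved(uint64_t addr,
                                   const struct phys_range *smram)
{
    const struct phys_range ht = { HT_REGION_START, HT_REGION_END };

    return range_contains(&ht, addr) || range_contains(smram, addr);
}
```

A caller would pass the SMRAM range discovered on the host; any guest physical address for which the helper returns true would be reported to the guest as reserved.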

2021-01-13 05:07:11

by Wei Huang

Subject: Re: [PATCH 1/2] KVM: x86: Add emulation support for #GP triggered by VM instructions



On 1/12/21 11:59 AM, Sean Christopherson wrote:
> On Tue, Jan 12, 2021, Sean Christopherson wrote:
>> On Tue, Jan 12, 2021, Wei Huang wrote:
>>> From: Bandan Das <[email protected]>
>>>
>>> While running VM related instructions (VMRUN/VMSAVE/VMLOAD), some AMD
>>> CPUs check EAX against reserved memory regions (e.g. SMM memory on host)
>>> before checking VMCB's instruction intercept.
>>
>> It would be very helpful to list exactly which CPUs are/aren't affected, even if
>> that just means stating something like "all CPUs before XYZ". Given patch 2/2,
>> I assume it's all CPUs without the new CPUID flag?

This behavior dates back to fairly old CPUs. It is fair to assume
that _most_ CPUs without this CPUID bit exhibit it.

>
> Ah, despite calling this an 'errata', the bad behavior is explicitly documented
> in the APM, i.e. it's an architecture bug, not a silicon bug.
>
> Can you reword the changelog to make it clear that the premature #GP is the
> correct architectural behavior for CPUs without the new CPUID flag?

Sure, will do in the next version.


2021-01-13 05:17:49

by Wei Huang

Subject: Re: [PATCH 1/2] KVM: x86: Add emulation support for #GP triggered by VM instructions



On 1/12/21 12:58 PM, Andy Lutomirski wrote:
> Andrew Cooper points out that there may be a nicer workaround. Make
> sure that the SMRAM and HT region (FFFD00000000 - FFFFFFFFFFFF) are
> marked as reserved in the guest, too.

In theory this proposed solution can avoid intercepting #GP. But in
reality SMRAM regions can be different on different machines. So this
solution can break after VM migration.
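
A minimal sketch of the migration concern, assuming the guest's reserved map was built from the source host's SMRAM range only. The `phys_range` type, the function names, and any addresses a caller passes in are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type: a physical address range with inclusive bounds. */
struct phys_range {
    uint64_t start;
    uint64_t end;
};

static bool in_range(const struct phys_range *r, uint64_t addr)
{
    return addr >= r->start && addr <= r->end;
}

/*
 * True if addr is not reserved in the guest (whose reserved map reflects
 * only the source host's SMRAM) but lands in the destination host's SMRAM.
 * Such an address can still trigger the premature #GP after migration.
 */
static bool exposed_after_migration(const struct phys_range *src_smram,
                                    const struct phys_range *dst_smram,
                                    uint64_t addr)
{
    return in_range(dst_smram, addr) && !in_range(src_smram, addr);
}
```

Whenever the two hosts' SMRAM ranges differ, some address satisfies this predicate, so the reservation made on the source host no longer covers the guest on the destination.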

2021-01-13 12:43:32

by Paolo Bonzini

Subject: Re: [PATCH 1/2] KVM: x86: Add emulation support for #GP triggered by VM instructions

On 12/01/21 18:59, Sean Christopherson wrote:
>> It would be very helpful to list exactly which CPUs are/aren't affected, even if
>> that just means stating something like "all CPUs before XYZ". Given patch 2/2,
>> I assume it's all CPUs without the new CPUID flag?
> Ah, despite calling this an 'errata', the bad behavior is explicitly documented
> in the APM, i.e. it's an architecture bug, not a silicon bug.

I would still call it an errata for the case when virtualized
VMSAVE/VMLOAD is enabled (and therefore VMLOAD intercepts are disabled).
In that case, the problem is that the GPA does not go through NPT
before it is checked against *host* reserved memory regions.

In fact I hope that, on processors that have the fix, VMSAVE/VMLOAD
from guest mode _does_ check the GPA after it's been translated!

Paolo

2021-01-14 11:45:27

by Maxim Levitsky

Subject: Re: [PATCH 1/2] KVM: x86: Add emulation support for #GP triggered by VM instructions

On Tue, 2021-01-12 at 23:15 -0600, Wei Huang wrote:
>
> On 1/12/21 12:58 PM, Andy Lutomirski wrote:
> > Andrew Cooper points out that there may be a nicer workaround. Make
> > sure that the SMRAM and HT region (FFFD00000000 - FFFFFFFFFFFF) are
> > marked as reserved in the guest, too.
>
> In theory this proposed solution can avoid intercepting #GP. But in
> reality SMRAM regions can be different on different machines. So this
> solution can break after VM migration.
>
I should add that on my 3970X, I just noticed that the problematic
SMRAM region moved on its own (likely because I recently moved some
PCIe cards around).

Best regards,
Maxim Levitsky