2023-11-23 19:07:05

by Maciej S. Szmigiero

[permalink] [raw]
Subject: [PATCH] KVM: x86: Allow XSAVES on CPUs where host doesn't use it due to an errata

From: "Maciej S. Szmigiero" <[email protected]>

Since commit b0563468eeac ("x86/CPU/AMD: Disable XSAVES on AMD family 0x17")
kernel unconditionally clears the XSAVES CPU feature bit on Zen1/2 CPUs.

Since KVM CPU caps are initialized from the kernel boot CPU features this
makes the XSAVES feature also unavailable for KVM guests in this case, even
though they might want to decide on their own whether they are affected by
this errata.

Allow KVM guests to make such decision by setting the XSAVES KVM CPU
capability bit based on the actual CPU capability.

This fixes booting Hyper-V enabled Windows Server 2016 VMs with more than
one vCPU on Zen1/2 CPUs.

Signed-off-by: Maciej S. Szmigiero <[email protected]>
---
arch/x86/kvm/cpuid.c | 16 ++++++++++++++++
arch/x86/kvm/svm/svm.c | 5 ++++-
2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index dda6fc4cfae8..a8820460163a 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -679,6 +679,22 @@ void kvm_set_cpu_caps(void)
F(AMX_COMPLEX)
);

+ /*
+ * It is possible that CPU supports XSAVES but the host kernel decided
+ * not to use it, for example due to AMD Erratum 1386, and cleared the
+ * relevant CPU feature bit.
+ *
+ * In such case let the guest decide on it own whether to make use of
+ * this feature.
+ */
+ if (boot_cpu_data.cpuid_level >= XSTATE_CPUID) {
+ unsigned int eax, ebx, ecx, edx;
+
+ cpuid_count(XSTATE_CPUID, 1, &eax, &ebx, &ecx, &edx);
+ if (eax & F(XSAVES))
+ kvm_cpu_cap_set(X86_FEATURE_XSAVES);
+ }
+
kvm_cpu_cap_mask(CPUID_D_1_EAX,
F(XSAVEOPT) | F(XSAVEC) | F(XGETBV1) | F(XSAVES) | f_xfd
);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 712146312358..3cc36710eb21 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4306,9 +4306,12 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
* whether it's advertised to the guest so that KVM context switches
* XSS on VM-Enter/VM-Exit. Failure to do so would effectively give
* the guest read/write access to the host's XSS.
+ *
+ * Make sure to check for XSAVES in KVM CPU capabilities, since the
+ * boot CPU feature bit might be disabled due to Erratum 1386.
*/
if (boot_cpu_has(X86_FEATURE_XSAVE) &&
- boot_cpu_has(X86_FEATURE_XSAVES) &&
+ kvm_cpu_cap_has(X86_FEATURE_XSAVES) &&
guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
kvm_governed_feature_set(vcpu, X86_FEATURE_XSAVES);


2023-11-27 17:24:22

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Allow XSAVES on CPUs where host doesn't use it due to an errata

On Thu, Nov 23, 2023, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <[email protected]>
>
> Since commit b0563468eeac ("x86/CPU/AMD: Disable XSAVES on AMD family 0x17")
> kernel unconditionally clears the XSAVES CPU feature bit on Zen1/2 CPUs.
>
> Since KVM CPU caps are initialized from the kernel boot CPU features this
> makes the XSAVES feature also unavailable for KVM guests in this case, even
> though they might want to decide on their own whether they are affected by
> this errata.
>
> Allow KVM guests to make such decision by setting the XSAVES KVM CPU
> capability bit based on the actual CPU capability

This is not generally safe, as the guest can make such a decision if and only if
the Family/Model/Stepping information is reasonably accurate.

> This fixes booting Hyper-V enabled Windows Server 2016 VMs with more than
> one vCPU on Zen1/2 CPUs.

How/why does lack of XSAVES break a multi-vCPU setup? Is Windows blindly doing
XSAVES based on FMS?

2023-11-27 17:47:37

by Maciej S. Szmigiero

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Allow XSAVES on CPUs where host doesn't use it due to an errata

On 27.11.2023 18:24, Sean Christopherson wrote:
> On Thu, Nov 23, 2023, Maciej S. Szmigiero wrote:
>> From: "Maciej S. Szmigiero" <[email protected]>
>>
>> Since commit b0563468eeac ("x86/CPU/AMD: Disable XSAVES on AMD family 0x17")
>> kernel unconditionally clears the XSAVES CPU feature bit on Zen1/2 CPUs.
>>
>> Since KVM CPU caps are initialized from the kernel boot CPU features this
>> makes the XSAVES feature also unavailable for KVM guests in this case, even
>> though they might want to decide on their own whether they are affected by
>> this errata.
>>
>> Allow KVM guests to make such decision by setting the XSAVES KVM CPU
>> capability bit based on the actual CPU capability
>
> This is not generally safe, as the guest can make such a decision if and only if
> the Family/Model/Stepping information is reasonably accurate.

If one lies to the guest about the CPU it is running on then obviously
things may work non-optimally.

>> This fixes booting Hyper-V enabled Windows Server 2016 VMs with more than
>> one vCPU on Zen1/2 CPUs.
>
> How/why does lack of XSAVES break a multi-vCPU setup? Is Windows blindly doing
> XSAVES based on FMS?

The hypercall from L2 Windows to L1 Hyper-V asking to boot the first AP
returns HV_STATUS_CPUID_XSAVE_FEATURE_VALIDATION_ERROR.

It's apparently a "should never happen" scenario for Windows since it
crashes soon after.

That's why uniprocessor configurations aren't broken - the BSP
doesn't need to be specifically booted by the L2 guest.

Unfortunately, Windows Server 2016 mainstream support has ended in
Jan 2022 so it is only getting security updates.
And you can't really break into an OS that you can't even start.

Thanks,
Maciej

2023-11-28 16:48:30

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Allow XSAVES on CPUs where host doesn't use it due to an errata

On Mon, Nov 27, 2023, Maciej S. Szmigiero wrote:
> On 27.11.2023 18:24, Sean Christopherson wrote:
> > On Thu, Nov 23, 2023, Maciej S. Szmigiero wrote:
> > > From: "Maciej S. Szmigiero" <[email protected]>
> > >
> > > Since commit b0563468eeac ("x86/CPU/AMD: Disable XSAVES on AMD family 0x17")
> > > kernel unconditionally clears the XSAVES CPU feature bit on Zen1/2 CPUs.
> > >
> > > Since KVM CPU caps are initialized from the kernel boot CPU features this
> > > makes the XSAVES feature also unavailable for KVM guests in this case, even
> > > though they might want to decide on their own whether they are affected by
> > > this errata.
> > >
> > > Allow KVM guests to make such decision by setting the XSAVES KVM CPU
> > > capability bit based on the actual CPU capability
> >
> > This is not generally safe, as the guest can make such a decision if and only if
> > the Family/Model/Stepping information is reasonably accurate.
>
> If one lies to the guest about the CPU it is running on then obviously
> things may work non-optimally.

But this isn't about running optimally, it's about functional correctness. And
"lying" to the guest about F/M/S is extremely common.

> > > This fixes booting Hyper-V enabled Windows Server 2016 VMs with more than
> > > one vCPU on Zen1/2 CPUs.
> >
> > How/why does lack of XSAVES break a multi-vCPU setup? Is Windows blindly doing
> > XSAVES based on FMS?
>
> The hypercall from L2 Windows to L1 Hyper-V asking to boot the first AP
> returns HV_STATUS_CPUID_XSAVE_FEATURE_VALIDATION_ERROR.

If it's just about CPUID enumeration, then userspace can simply stuff the XSAVES
feature flag. This is not something that belongs in KVM, because this is safe if
and only if F/M/S is accurate and the guest is actually aware of the erratum (or
will not actually use XSAVES for other reasons), neither of which KVM can guarantee.

2023-11-28 18:04:24

by Maciej S. Szmigiero

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Allow XSAVES on CPUs where host doesn't use it due to an errata

On 28.11.2023 17:48, Sean Christopherson wrote:
> On Mon, Nov 27, 2023, Maciej S. Szmigiero wrote:
>> On 27.11.2023 18:24, Sean Christopherson wrote:
>>> On Thu, Nov 23, 2023, Maciej S. Szmigiero wrote:
>>>> From: "Maciej S. Szmigiero" <[email protected]>
>>>>
>>>> Since commit b0563468eeac ("x86/CPU/AMD: Disable XSAVES on AMD family 0x17")
>>>> kernel unconditionally clears the XSAVES CPU feature bit on Zen1/2 CPUs.
>>>>
>>>> Since KVM CPU caps are initialized from the kernel boot CPU features this
>>>> makes the XSAVES feature also unavailable for KVM guests in this case, even
>>>> though they might want to decide on their own whether they are affected by
>>>> this errata.
>>>>
>>>> Allow KVM guests to make such decision by setting the XSAVES KVM CPU
>>>> capability bit based on the actual CPU capability
>>>
>>> This is not generally safe, as the guest can make such a decision if and only if
>>> the Family/Model/Stepping information is reasonably accurate.
>>
>> If one lies to the guest about the CPU it is running on then obviously
>> things may work non-optimally.
>
> But this isn't about running optimally, it's about functional correctness. And
> "lying" to the guest about F/M/S is extremely common.
>
>>>> This fixes booting Hyper-V enabled Windows Server 2016 VMs with more than
>>>> one vCPU on Zen1/2 CPUs.
>>>
>>> How/why does lack of XSAVES break a multi-vCPU setup? Is Windows blindly doing
>>> XSAVES based on FMS?
>>
>> The hypercall from L2 Windows to L1 Hyper-V asking to boot the first AP
>> returns HV_STATUS_CPUID_XSAVE_FEATURE_VALIDATION_ERROR.
>
> If it's just about CPUID enumeration, then userspace can simply stuff the XSAVES
> feature flag. This is not something that belongs in KVM, because this is safe if
> and only if F/M/S is accurate and the guest is actually aware of the erratum (or
> will not actually use XSAVES for other reasons), neither of which KVM can guarantee.

In other words, your suggestion is that QEMU (or other VMM) not KVM
should be the one setting the XSAVES CPUID bit back, correct?

I don't think this would work with the current KVM code since it seems
to make various decisions depending on presence of XSAVES bit in KVM
caps rather than the guest CPUID and on boot_cpu_has(XSAVES) - one of
such code blocks was even modified by this patch.

It even says in the comment above that code that it is not possible to
actually disable XSAVES without disabling all other variants on SVM so
this has to be enabled if CPU supports it to switch the XSS MSR at
guest entry/exit (in this case it looks harmless since Zen1/2
supposedly don't support any supervisor extended states).

So it looks like we would need changes to *both* KVM and QEMU to
restore the XSAVES support this way.

Thanks,
Maciej

2023-11-28 23:43:38

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Allow XSAVES on CPUs where host doesn't use it due to an errata

On Tue, Nov 28, 2023, Maciej S. Szmigiero wrote:
> On 28.11.2023 17:48, Sean Christopherson wrote:
> > On Mon, Nov 27, 2023, Maciej S. Szmigiero wrote:
> > > On 27.11.2023 18:24, Sean Christopherson wrote:
> > > > On Thu, Nov 23, 2023, Maciej S. Szmigiero wrote:
> > > > > From: "Maciej S. Szmigiero" <[email protected]>
> > > > >
> > > > > Since commit b0563468eeac ("x86/CPU/AMD: Disable XSAVES on AMD family 0x17")
> > > > > kernel unconditionally clears the XSAVES CPU feature bit on Zen1/2 CPUs.
> > > > >
> > > > > Since KVM CPU caps are initialized from the kernel boot CPU features this
> > > > > makes the XSAVES feature also unavailable for KVM guests in this case, even
> > > > > though they might want to decide on their own whether they are affected by
> > > > > this errata.
> > > > >
> > > > > Allow KVM guests to make such decision by setting the XSAVES KVM CPU
> > > > > capability bit based on the actual CPU capability
> > > >
> > > > This is not generally safe, as the guest can make such a decision if and only if
> > > > the Family/Model/Stepping information is reasonably accurate.
> > >
> > > If one lies to the guest about the CPU it is running on then obviously
> > > things may work non-optimally.
> >
> > But this isn't about running optimally, it's about functional correctness. And
> > "lying" to the guest about F/M/S is extremely common.
> >
> > > > > This fixes booting Hyper-V enabled Windows Server 2016 VMs with more than
> > > > > one vCPU on Zen1/2 CPUs.
> > > >
> > > > How/why does lack of XSAVES break a multi-vCPU setup? Is Windows blindly doing
> > > > XSAVES based on FMS?
> > >
> > > The hypercall from L2 Windows to L1 Hyper-V asking to boot the first AP
> > > returns HV_STATUS_CPUID_XSAVE_FEATURE_VALIDATION_ERROR.
> >
> > If it's just about CPUID enumeration, then userspace can simply stuff the XSAVES
> > feature flag. This is not something that belongs in KVM, because this is safe if
> > and only if F/M/S is accurate and the guest is actually aware of the erratum (or
> > will not actually use XSAVES for other reasons), neither of which KVM can guarantee.
>
> In other words, your suggestion is that QEMU (or other VMM) not KVM
> should be the one setting the XSAVES CPUID bit back, correct?
>
> I don't think this would work with the current KVM code since it seems
> to make various decisions depending on presence of XSAVES bit in KVM
> caps rather than the guest CPUID and on boot_cpu_has(XSAVES) - one of
> such code blocks was even modified by this patch.
>
> It even says in the comment above that code that it is not possible to
> actually disable XSAVES without disabling all other variants on SVM so
> this has to be enabled if CPU supports it to switch the XSS MSR at
> guest entry/exit (in this case it looks harmless since Zen1/2
> supposedly don't support any supervisor extended states).
>
> So it looks like we would need changes to *both* KVM and QEMU to
> restore the XSAVES support this way.

I'm not suggesting we restore XSAVES support, I'm suggesting that _if_ someone
wants to hack their setup to let the guest use broken hardware, then they should
do that in userspace or in an a private kernel, not in upstream KVM.

2023-11-30 19:15:51

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Allow XSAVES on CPUs where host doesn't use it due to an errata

On Mon, 2023-11-27 at 09:24 -0800, Sean Christopherson wrote:
> On Thu, Nov 23, 2023, Maciej S. Szmigiero wrote:
> > From: "Maciej S. Szmigiero" <[email protected]>
> >
> > Since commit b0563468eeac ("x86/CPU/AMD: Disable XSAVES on AMD family 0x17")
> > kernel unconditionally clears the XSAVES CPU feature bit on Zen1/2 CPUs.
> >
> > Since KVM CPU caps are initialized from the kernel boot CPU features this
> > makes the XSAVES feature also unavailable for KVM guests in this case, even
> > though they might want to decide on their own whether they are affected by
> > this errata.
> >
> > Allow KVM guests to make such decision by setting the XSAVES KVM CPU
> > capability bit based on the actual CPU capability
>
> This is not generally safe, as the guest can make such a decision if and only if
> the Family/Model/Stepping information is reasonably accurate.

Another thing that really worries me is that the XSAVES errata is really nasty one -
AFAIK it silently corrupts some registers.

Is it better to let a broken CPU boot a broken OS (OS which demands XSAVES blindly),
and let a silent data corruption happen than refuse to boot it completely?

I mean I understand that it is technically OS fault in this case (assuming that we
do provide it the correct CPU family info), but still this seems like the wrong thing to do.

I guess this is one of those few cases when it makes sense for the userspace to
override KVM's CPUID caps and force a feature - in this case at least that
won't be KVM's fault.


Best regards,
Maxim Levitsky

>
> > This fixes booting Hyper-V enabled Windows Server 2016 VMs with more than
> > one vCPU on Zen1/2 CPUs.
>
> How/why does lack of XSAVES break a multi-vCPU setup? Is Windows blindly doing
> XSAVES based on FMS?
>


2023-11-30 22:01:20

by Maciej S. Szmigiero

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Allow XSAVES on CPUs where host doesn't use it due to an errata

On 30.11.2023 18:24, Maxim Levitsky wrote:
> On Mon, 2023-11-27 at 09:24 -0800, Sean Christopherson wrote:
>> On Thu, Nov 23, 2023, Maciej S. Szmigiero wrote:
>>> From: "Maciej S. Szmigiero" <[email protected]>
>>>
>>> Since commit b0563468eeac ("x86/CPU/AMD: Disable XSAVES on AMD family 0x17")
>>> kernel unconditionally clears the XSAVES CPU feature bit on Zen1/2 CPUs.
>>>
>>> Since KVM CPU caps are initialized from the kernel boot CPU features this
>>> makes the XSAVES feature also unavailable for KVM guests in this case, even
>>> though they might want to decide on their own whether they are affected by
>>> this errata.
>>>
>>> Allow KVM guests to make such decision by setting the XSAVES KVM CPU
>>> capability bit based on the actual CPU capability
>>
>> This is not generally safe, as the guest can make such a decision if and only if
>> the Family/Model/Stepping information is reasonably accurate.
>
> Another thing that really worries me is that the XSAVES errata is really nasty one -
> AFAIK it silently corrupts some registers.

It's not unconditional state corruption, but corruption in specific set
of conditions, all of which have to be true for it to occur [1]:
* All XMM registers were restored to the initialization value by the most
recent XRSTORS instruction because the XSTATE_BV[SSE] bit was clear.
* The state save area for the XMM registers does not contain the
initialization state.
* The value in the XMM registers match the initialization value when the
XSAVES instruction is executed.
* The MXCSR register has been modified to a value different from the
initialization value since the most recent XRSTORS instruction.

According to [2] this issue was fixed in the microcode update released on
2022-08-09.
[2] also says it is not present anymore in (at least) version 0x08301055.

> Is it better to let a broken CPU boot a broken OS (OS which demands XSAVES blindly),
> and let a silent data corruption happen than refuse to boot it completely?

It is possible that, for example, Windows only uses safe subset of this
instruction or just verifies its presence but doesn't actually use it -
it's Hyper-V (L1) that throws this HV_STATUS_CPUID_XSAVE_FEATURE_VALIDATION_ERROR
but I presume it's Windows (L2) kernel which chooses which XSAVE-family
variant to actually use.

At least in the Linux guest case the guest won't use XSAVES anyway due
to this errata.

For other guests we also don't make situation any worse than on bare metal
- if they would use XSAVES anyway in KVM they would do it when running on
bare metal too.

> I mean I understand that it is technically OS fault in this case (assuming that we
> do provide it the correct CPU family info), but still this seems like the wrong thing to do.
>
> I guess this is one of those few cases when it makes sense for the userspace to
> override KVM's CPUID caps and force a feature - in this case at least that
> won't be KVM's fault.

I am not against making the decision in QEMU instead of doing this in KVM,
but as I said to Sean it looks like this will still require some KVM
changes since KVM seems to make various decisions depending on presence
of XSAVES bit in KVM caps and boot_cpu_has(XSAVES) rather that exclusively
based on what VMM has set in CPUID.

That's why some KVM changes will be necessary even if the actual decision
logic will be in QEMU.

My point is to make this work out-of-the box for QEMU "-cpu host" and
similar CPU models that have support for XSAVES.

The reason for this is simple:
It's not like such Windows guests throw a big error screen saying
"Please enable XSAVES or disable XSAVEC for successful boot".

Instead, they simply hang at boot leaving the user wondering what could
be wrong.

Users can get very frustrated with situation like this since they don't
know what to do - on Intel side of things look for example how people
are unable to boot recent Windows versions (both server and client) on
Intel Core 12th gen or later in KVM and how people still try random
things to fix it [3] (it's Hyper-V being picky about extended topology
information in CPUID btw).

Cloud providers often know which guest type is going to be launched in
a VM so there's no problem adding a extra QEMU "-cpu" flag for
particular cloud VM.

But for individual users having to guess which magic flags are necessary
to make particular KVM guest work makes for miserable user experience.

I think that if particular guest would work on bare metal it should
work on "-cpu host" too - no tinkering should be required for such
basic functionality as being able to successfully finish booting.

>
> Best regards,
> Maxim Levitsky

Thanks,
Maciej

[1]: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/revision-guides/56323-PUB_1_01.pdf
[2]: https://google.github.io/security-research/pocs/cpus/errata/amd/1386/
[3]: https://lore.kernel.org/kvm/MN2PR12MB3023F67FF37889AB3E8885F2A0849@MN2PR12MB3023.namprd12.prod.outlook.com/
https://bugzilla.kernel.org/show_bug.cgi?id=217307
https://forums.unraid.net/topic/131838-windows-11-virtual-machine-platform-wsl2-boot-loop/

2023-11-30 23:58:06

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Allow XSAVES on CPUs where host doesn't use it due to an errata

On Thu, Nov 30, 2023 at 2:00 PM Maciej S. Szmigiero
<[email protected]> wrote:
> I think that if particular guest would work on bare metal it should
> work on "-cpu host" too - no tinkering should be required for such
> basic functionality as being able to successfully finish booting.

I disagree. Let's not focus on one particular erratum. If, for
whatever reason, the host kernel is booted with "noxsaves," I don't
think KVM should allow a guest to bypass that directive.

2023-12-01 16:05:36

by Maciej S. Szmigiero

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Allow XSAVES on CPUs where host doesn't use it due to an errata

On 1.12.2023 00:57, Jim Mattson wrote:
> On Thu, Nov 30, 2023 at 2:00 PM Maciej S. Szmigiero
> <[email protected]> wrote:
>> I think that if particular guest would work on bare metal it should
>> work on "-cpu host" too - no tinkering should be required for such
>> basic functionality as being able to successfully finish booting.
>
> I disagree. Let's not focus on one particular erratum. If, for
> whatever reason, the host kernel is booted with "noxsaves," I don't
> think KVM should allow a guest to bypass that directive.

Good point, I agree that if user explicitly disabled XSAVES on
the host via this parameter then it should remain disabled for
the whole host kernel, including KVM.

This could be achieved by either adding special "noxsaves" flag
or by setting X86_BUG_XSAVES_AVOID instead of clearing
X86_FEATURE_XSAVES on these CPUs.

Then the core kernel XSAVES code would check for lack of
X86_BUG_XSAVES_AVOID (in addition to checking for
presence of X86_FEATURE_XSAVES) while KVM would keep using
only X86_FEATURE_XSAVES.

Thanks,
Maciej

2023-12-06 22:45:58

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Allow XSAVES on CPUs where host doesn't use it due to an errata

On Fri, Dec 1, 2023 at 5:05 PM Maciej S. Szmigiero
<[email protected]> wrote:
>
> On 1.12.2023 00:57, Jim Mattson wrote:
> > On Thu, Nov 30, 2023 at 2:00 PM Maciej S. Szmigiero
> > <[email protected]> wrote:
> >> I think that if particular guest would work on bare metal it should
> >> work on "-cpu host" too - no tinkering should be required for such
> >> basic functionality as being able to successfully finish booting.
> >
> > I disagree. Let's not focus on one particular erratum. If, for
> > whatever reason, the host kernel is booted with "noxsaves," I don't
> > think KVM should allow a guest to bypass that directive.
>
> This could be achieved by either adding special "noxsaves" flag
> or by setting X86_BUG_XSAVES_AVOID instead of clearing
> X86_FEATURE_XSAVES on these CPUs.
>
> Then the core kernel XSAVES code would check for lack of
> X86_BUG_XSAVES_AVOID (in addition to checking for
> presence of X86_FEATURE_XSAVES) while KVM would keep using
> only X86_FEATURE_XSAVES.

This is feasible, on the other hand the erratum is pretty bad. Since
the workaround is easy (just disable XSAVEC; and maybe XGETBV1 as
well?), you could just do a printk_ratelimited() on the write to
HV_X64_MSR_GUEST_OS_ID, in particular if:

1) guest CPUID has XSAVEC and SVM

2) host CPUID does not have XSAVES

3) guest OS id indicates Windows Server 10.x (which is 2016 to 2022),
which should be 0x0001040A00??????.

Paolo