On Thu, Jul 28, 2022 at 08:44:30AM -0500, Michael Roth wrote:
> Hi Sean,
>
> With this patch applied, AMD processors that support 52-bit physical
Sorry, threading got messed up. This is in reference to:
https://lore.kernel.org/lkml/[email protected]/#r
commit 8b9e74bfbf8c7020498a9ea600bd4c0f1915134d
Author: Sean Christopherson <[email protected]>
Date: Wed Apr 20 00:27:47 2022 +0000
KVM: x86/mmu: Use enable_mmio_caching to track if MMIO caching is enabled
> addresses will result in MMIO caching being disabled. This ends up
> breaking SEV-ES and SNP, since they rely on the MMIO reserved bit to
> generate the appropriate NAE MMIO exit event.
>
> This failure can also be reproduced on Milan by disabling mmio_caching
> via KVM module parameter.
>
> In the case of AMD, guests use a separate physical address range, and
> so there are still reserved bits available to make use of MMIO
> caching. This adjustment happens in svm_adjust_mmio_mask(), but since
> the mmio_caching_enabled flag is 0, any attempts to update the masks get
> ignored by kvm_mmu_set_mmio_spte_mask().
>
> Would adding a 'force' parameter to kvm_mmu_set_mmio_spte_mask() that
> svm_adjust_mmio_mask() can set to ignore enable_mmio_caching be a
> reasonable fix, or should we take a different approach?
>
> Thanks!
>
> -Mike
On Thu, Jul 28, 2022, Michael Roth wrote:
> On Thu, Jul 28, 2022 at 08:44:30AM -0500, Michael Roth wrote:
> > Hi Sean,
> >
> > With this patch applied, AMD processors that support 52-bit physical
>
> Sorry, threading got messed up. This is in reference to:
>
> https://lore.kernel.org/lkml/[email protected]/#r
>
> commit 8b9e74bfbf8c7020498a9ea600bd4c0f1915134d
> Author: Sean Christopherson <[email protected]>
> Date: Wed Apr 20 00:27:47 2022 +0000
>
> KVM: x86/mmu: Use enable_mmio_caching to track if MMIO caching is enabled
Oh crud. I suspect I also broke EPT with MAXPHYADDR=52; the initial
kvm_mmu_reset_all_pte_masks() will clear the flag, and it won't get set back to
true even though EPT can generate a reserved bit fault.
> > addresses will result in MMIO caching being disabled. This ends up
> > breaking SEV-ES and SNP, since they rely on the MMIO reserved bit to
> > generate the appropriate NAE MMIO exit event.
> >
> > This failure can also be reproduced on Milan by disabling mmio_caching
> > via KVM module parameter.
Hrm, this is a separate bug of sorts. SEV-ES (and later) needs an explicit
check that MMIO caching is enabled, e.g. my bug aside, if KVM can't use MMIO
caching due to the location of the C-bit, then SEV-ES must be disabled.
Speaking of which, what prevents hardware (firmware?) from configuring the C-bit
position to be bit 51 and thus preventing KVM from generating the reserved #NPF?
> > In the case of AMD, guests use a separate physical address range, and
> > so there are still reserved bits available to make use of MMIO
> > caching. This adjustment happens in svm_adjust_mmio_mask(), but since
> > the mmio_caching_enabled flag is 0, any attempts to update the masks get
> > ignored by kvm_mmu_set_mmio_spte_mask().
> >
> > Would adding a 'force' parameter to kvm_mmu_set_mmio_spte_mask() that
> > svm_adjust_mmio_mask() can set to ignore enable_mmio_caching be a
> > reasonable fix, or should we take a different approach?
Different approach. To fix the bug with enable_mmio_caching not being set back to
true when a vendor-specific mask allows caching, I believe the below will do the
trick.
The SEV-ES dependency is easy to solve, but will require a few patches in order
to get the necessary ordering; svm_adjust_mmio_mask() is currently called _after_
SEV-ES is configured.
I'll test (as much as I can, I don't think we have platforms with MAXPHYADDR=52)
and get a series sent out later today.
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 7314d27d57a4..a57add994b8d 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -19,8 +19,9 @@
#include <asm/memtype.h>
#include <asm/vmx.h>
-bool __read_mostly enable_mmio_caching = true;
-module_param_named(mmio_caching, enable_mmio_caching, bool, 0444);
+bool __read_mostly enable_mmio_caching;
+static bool __read_mostly __enable_mmio_caching = true;
+module_param_named(mmio_caching, __enable_mmio_caching, bool, 0444);
u64 __read_mostly shadow_host_writable_mask;
u64 __read_mostly shadow_mmu_writable_mask;
@@ -340,6 +341,8 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
BUG_ON((u64)(unsigned)access_mask != access_mask);
WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);
+ enable_mmio_caching = __enable_mmio_caching;
+
if (!enable_mmio_caching)
mmio_value = 0;
On Thu, Jul 28, 2022, Sean Christopherson wrote:
> On Thu, Jul 28, 2022, Michael Roth wrote:
> > On Thu, Jul 28, 2022 at 08:44:30AM -0500, Michael Roth wrote:
> Different approach. To fix the bug with enable_mmio_caching not being set back to
> true when a vendor-specific mask allows caching, I believe the below will do the
> trick.
...
> diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
> index 7314d27d57a4..a57add994b8d 100644
> --- a/arch/x86/kvm/mmu/spte.c
> +++ b/arch/x86/kvm/mmu/spte.c
> @@ -19,8 +19,9 @@
> #include <asm/memtype.h>
> #include <asm/vmx.h>
>
> -bool __read_mostly enable_mmio_caching = true;
> -module_param_named(mmio_caching, enable_mmio_caching, bool, 0444);
> +bool __read_mostly enable_mmio_caching;
> +static bool __read_mostly __enable_mmio_caching = true;
> +module_param_named(mmio_caching, __enable_mmio_caching, bool, 0444);
>
> u64 __read_mostly shadow_host_writable_mask;
> u64 __read_mostly shadow_mmu_writable_mask;
> @@ -340,6 +341,8 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
> BUG_ON((u64)(unsigned)access_mask != access_mask);
> WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);
>
> + enable_mmio_caching = __enable_mmio_caching;
This isn't ideal as the value used by KVM won't be reflected in the module param.
The basic approach is sound, but KVM should snapshot the original value of the module
param and "reset" to that.
> +
> if (!enable_mmio_caching)
> mmio_value = 0;
>
>
On 7/28/22 09:56, Sean Christopherson wrote:
> On Thu, Jul 28, 2022, Michael Roth wrote:
>> On Thu, Jul 28, 2022 at 08:44:30AM -0500, Michael Roth wrote:
>>> Hi Sean,
>>>
>>> With this patch applied, AMD processors that support 52-bit physical
>>
>> Sorry, threading got messed up. This is in reference to:
>>
>> https://lore.kernel.org/lkml/[email protected]/#r
>>
>> commit 8b9e74bfbf8c7020498a9ea600bd4c0f1915134d
>> Author: Sean Christopherson <[email protected]>
>> Date: Wed Apr 20 00:27:47 2022 +0000
>>
>> KVM: x86/mmu: Use enable_mmio_caching to track if MMIO caching is enabled
>
> Oh crud. I suspect I also broke EPT with MAXPHYADDR=52; the initial
> kvm_mmu_reset_all_pte_masks() will clear the flag, and it won't get set back to
> true even though EPT can generate a reserved bit fault.
>
>>> addresses will result in MMIO caching being disabled. This ends up
>>> breaking SEV-ES and SNP, since they rely on the MMIO reserved bit to
>>> generate the appropriate NAE MMIO exit event.
>>>
>>> This failure can also be reproduced on Milan by disabling mmio_caching
>>> via KVM module parameter.
>
> Hrm, this is a separate bug of sorts. SEV-ES (and later) needs an explicit
> check that MMIO caching is enabled, e.g. my bug aside, if KVM can't use MMIO
> caching due to the location of the C-bit, then SEV-ES must be disabled.
>
> Speaking of which, what prevents hardware (firmware?) from configuring the C-bit
> position to be bit 51 and thus preventing KVM from generating the reserved #NPF?
On the hypervisor side, there is more than a single bit of physical
addressing reduction when memory encryption is enabled. So even when the
C-bit position is bit 51, some number of bits below 51 are reserved and
will cause the reserved #NPF.
Thanks,
Tom
>
>>> In the case of AMD, guests use a separate physical address range, and
>>> so there are still reserved bits available to make use of MMIO
>>> caching. This adjustment happens in svm_adjust_mmio_mask(), but since
>>> the mmio_caching_enabled flag is 0, any attempts to update the masks get
>>> ignored by kvm_mmu_set_mmio_spte_mask().
>>>
>>> Would adding a 'force' parameter to kvm_mmu_set_mmio_spte_mask() that
>>> svm_adjust_mmio_mask() can set to ignore enable_mmio_caching be a
>>> reasonable fix, or should we take a different approach?
>
> Different approach. To fix the bug with enable_mmio_caching not being set back to
> true when a vendor-specific mask allows caching, I believe the below will do the
> trick.
>
> The SEV-ES dependency is easy to solve, but will require a few patches in order
> to get the necessary ordering; svm_adjust_mmio_mask() is currently called _after_
> SEV-ES is configured.
>
> I'll test (as much as I can, I don't think we have platforms with MAXPHYADDR=52)
> and get a series sent out later today.
>
> diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
> index 7314d27d57a4..a57add994b8d 100644
> --- a/arch/x86/kvm/mmu/spte.c
> +++ b/arch/x86/kvm/mmu/spte.c
> @@ -19,8 +19,9 @@
> #include <asm/memtype.h>
> #include <asm/vmx.h>
>
> -bool __read_mostly enable_mmio_caching = true;
> -module_param_named(mmio_caching, enable_mmio_caching, bool, 0444);
> +bool __read_mostly enable_mmio_caching;
> +static bool __read_mostly __enable_mmio_caching = true;
> +module_param_named(mmio_caching, __enable_mmio_caching, bool, 0444);
>
> u64 __read_mostly shadow_host_writable_mask;
> u64 __read_mostly shadow_mmu_writable_mask;
> @@ -340,6 +341,8 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
> BUG_ON((u64)(unsigned)access_mask != access_mask);
> WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);
>
> + enable_mmio_caching = __enable_mmio_caching;
> +
> if (!enable_mmio_caching)
> mmio_value = 0;
>
>
On Thu, Jul 28, 2022 at 02:56:50PM +0000, Sean Christopherson wrote:
> On Thu, Jul 28, 2022, Michael Roth wrote:
> > On Thu, Jul 28, 2022 at 08:44:30AM -0500, Michael Roth wrote:
> > > Hi Sean,
> > >
> > > With this patch applied, AMD processors that support 52-bit physical
> >
> > Sorry, threading got messed up. This is in reference to:
> >
> > https://lore.kernel.org/lkml/20220420002747.3287931-1-seanjc@google.com/#r
> >
> > commit 8b9e74bfbf8c7020498a9ea600bd4c0f1915134d
> > Author: Sean Christopherson <[email protected]>
> > Date: Wed Apr 20 00:27:47 2022 +0000
> >
> > KVM: x86/mmu: Use enable_mmio_caching to track if MMIO caching is enabled
>
> Oh crud. I suspect I also broke EPT with MAXPHYADDR=52; the initial
> kvm_mmu_reset_all_pte_masks() will clear the flag, and it won't get set back to
> true even though EPT can generate a reserved bit fault.
>
> > > addresses will result in MMIO caching being disabled. This ends up
> > > breaking SEV-ES and SNP, since they rely on the MMIO reserved bit to
> > > generate the appropriate NAE MMIO exit event.
> > >
> > > This failure can also be reproduced on Milan by disabling mmio_caching
> > > via KVM module parameter.
>
> Hrm, this is a separate bug of sorts. SEV-ES (and later) needs an explicit
> check that MMIO caching is enabled, e.g. my bug aside, if KVM can't use MMIO
> caching due to the location of the C-bit, then SEV-ES must be disabled.
Ok, makes sense.
>
> Speaking of which, what prevents hardware (firmware?) from configuring the C-bit
> position to be bit 51 and thus preventing KVM from generating the reserved #NPF?
I'm not sure if there's a way to change this: the related PPR documents
CPUID leaf 0x8000001F as read-only along with the expected value, but
it's not documented as 'fixed', so maybe there is some way.
However, in this case, just as with Milan, the C-bit position actually
is already 51, but since for guests we rely on the value from
boot_cpu_data.x86_phys_bits, which is less than 51, any bits in between
can be used to generate the RSVD bit in the exit field.
So more problematic would be if boot_cpu_data.x86_phys_bits could be set
to 51 or higher, in which case we would silently break SEV-ES/SNP in a
similar manner. That should probably just print an error and disable
SEV-ES, similar to what should be done if mmio_caching is disabled via
the KVM module parameter.
>
> > > In the case of AMD, guests use a separate physical address range, and
> > > so there are still reserved bits available to make use of MMIO
> > > caching. This adjustment happens in svm_adjust_mmio_mask(), but since
> > > the mmio_caching_enabled flag is 0, any attempts to update the masks get
> > > ignored by kvm_mmu_set_mmio_spte_mask().
> > >
> > > Would adding a 'force' parameter to kvm_mmu_set_mmio_spte_mask() that
> > > svm_adjust_mmio_mask() can set to ignore enable_mmio_caching be a
> > > reasonable fix, or should we take a different approach?
>
> Different approach. To fix the bug with enable_mmio_caching not being set back to
> true when a vendor-specific mask allows caching, I believe the below will do the
> trick.
>
> The SEV-ES dependency is easy to solve, but will require a few patches in order
> to get the necessary ordering; svm_adjust_mmio_mask() is currently called _after_
> SEV-ES is configured.
>
> I'll test (as much as I can, I don't think we have platforms with MAXPHYADDR=52)
> and get a series sent out later today.
Will make sure to give this a test on my system as soon as it goes out.
Thanks!
-Mike
On Thu, Jul 28, 2022, Michael Roth wrote:
> On Thu, Jul 28, 2022 at 02:56:50PM +0000, Sean Christopherson wrote:
> > On Thu, Jul 28, 2022, Michael Roth wrote:
> > Speaking of which, what prevents hardware (firmware?) from configuring the C-bit
> > position to be bit 51 and thus preventing KVM from generating the reserved #NPF?
>
> I'm not sure if there's a way to change this: the related PPR documents
> CPUID leaf 0x8000001F as read-only along with the expected value, but
> it's not documented as 'fixed', so maybe there is some way.
>
> However, in this case, just as with Milan, the C-bit position actually
> is already 51, but since for guests we rely on the value from
> boot_cpu_data.x86_phys_bits, which is less than 51, any bits in between
> can be used to generate the RSVD bit in the exit field.
Ya, I forgot to include the "and MAXPHYADDR >= 50" clause.
> So more problematic would be if boot_cpu_data.x86_phys_bits could be set
> to 51 or higher, in which case we would silently break SEV-ES/SNP in a
> similar manner. That should probably just print an error and disable
> SEV-ES, similar to what should be done if mmio_caching is disabled via
> the KVM module parameter.
This is the scenario I'm curious about. It's mostly a future problem, so I guess
I'm just wondering if there's a plan for making things work if/when this collision
occurs.