2021-03-17 18:35:49

by Sean Christopherson

Subject: [PATCH] x86/cpu/AMD: Adjust x86_phys_bits to account for reduced PA in SEV-* guests

Always reduce x86_phys_bits per CPUID.0x8000001f[11:6] for SEV-* guests;
the existing flow that queries X86_FEATURE_SEV may or may not trigger
depending on what the VMM emulates, e.g. the VMM likely does not emulate
MSR_K8_SYSCFG.
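
For reference, the enumeration in question boils down to something like the
below (sketch only; field layout per the APM, variable names are mine):

	/*
	 * CPUID 0x8000001F: EBX[5:0] enumerates the C-bit position in the
	 * page tables, EBX[11:6] enumerates how many physical address bits
	 * are lost when memory encryption is enabled.
	 */
	unsigned int ebx = cpuid_ebx(0x8000001f);
	unsigned int c_bit_loc = ebx & 0x3f;
	unsigned int pa_reduction = (ebx >> 6) & 0x3f;

	c->x86_phys_bits -= pa_reduction;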

Print a somewhat scary message and override x86_phys_bits if the VMM
doesn't omit the C-bit from MAXPHYADDR, which can be done either by
enumerating a lower MAXPHYADDR or by enumerating a non-zero
PhysAddrReduction.

Failure to adjust x86_phys_bits results in a false positive for
phys_addr_valid() if the address sets the C-bit, and may also result in
false positives for virt_addr_valid(). This is likely benign for a well-
functioning kernel+drivers, but it's nearly impossible to confidently
audit all users of the *_addr_valid() helpers, so who knows.
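
The failure mode in a nutshell (sketch; roughly what the kernel's
phys_addr_valid() boils down to on 64-bit):

	static inline int phys_addr_valid(resource_size_t addr)
	{
		return !(addr >> boot_cpu_data.x86_phys_bits);
	}

E.g. with the C-bit at bit 47 and x86_phys_bits left at 48, an encrypted
address (C-bit set) still shifts down to zero and is deemed "valid", even
though no guest physical memory exists at that address.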

Opportunistically force clearing of SME, SEV, and SEV_ES in this case,
as the kernel and KVM treat those feature flags as host capabilities, not
guest capabilities. This is likely a nop for most deployments, e.g. KVM
doesn't emulate MSR_K8_SYSCFG.

Note, early kernel boot code for SEV-*, e.g. get_sev_encryption_bit(),
_requires_ the SEV feature flag to be set in CPUID in order to identify
SEV (this requirement comes from the SEV-ES GHCB standard). But, that
requirement does not mean the kernel must also "advertise" SEV in its own
CPU features array.

Fixes: d8aa7eea78a1 ("x86/mm: Add Secure Encrypted Virtualization (SEV) support")
Cc: [email protected]
Cc: Joerg Roedel <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Brijesh Singh <[email protected]>
Cc: Peter Gonda <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---

Regarding clearing SME, SEV, SEV_ES, etc..., it's obviously not required,
but to avoid false positives, identifying "SEV guest" within the kernel
must be done with sev_active(). And if we want to display support in
/proc/cpuinfo, IMO it should be a separate synthetic feature so that
userspace sees "sev_guest" instead of "sev".
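
If we go that route, I'd imagine something along these lines (purely
illustrative; X86_FEATURE_SEV_GUEST doesn't exist today, the name and the
synthetic bit would need to be added to cpufeatures.h):

	/* Hypothetical synthetic flag, shown to userspace as "sev_guest". */
	if (sev_active())
		setup_force_cpu_cap(X86_FEATURE_SEV_GUEST);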

arch/x86/kernel/cpu/amd.c | 32 ++++++++++++++++++++++++++++----
1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 2d11384dc9ab..0f7f8c905226 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -15,6 +15,7 @@
 #include <asm/cpu.h>
 #include <asm/spec-ctrl.h>
 #include <asm/smp.h>
+#include <asm/mem_encrypt.h>
 #include <asm/numa.h>
 #include <asm/pci-direct.h>
 #include <asm/delay.h>
@@ -575,10 +576,33 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
 	resctrl_cpu_detect(c);
 }

+#define SEV_CBIT_MSG "SEV: C-bit (bit %d) overlaps MAXPHYADDR (%d bits). VMM is buggy or malicious, overriding MAXPHYADDR to %d.\n"
+
 static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
 {
 	u64 msr;
 
+	/*
+	 * When running as an SEV guest of any flavor, update the physical
+	 * address width to account for the C-bit and clear all of the SME/SEV
+	 * feature flags. As far as the kernel is concerned, the SEV flags
+	 * enumerate what features can be used by the kernel/KVM, not what
+	 * features have been activated by the VMM.
+	 */
+	if (sev_active()) {
+		int c_bit = ilog2(sme_me_mask);
+
+		BUG_ON(!sme_me_mask);
+
+		c->x86_phys_bits -= (cpuid_ebx(0x8000001f) >> 6) & 0x3f;
+
+		if (c_bit < c->x86_phys_bits) {
+			pr_crit_once(SEV_CBIT_MSG, c_bit, c->x86_phys_bits, c_bit);
+			c->x86_phys_bits = c_bit;
+		}
+		goto clear_all;
+	}
+
 	/*
 	 * BIOS support is required for SME and SEV.
 	 *   For SME: If BIOS has enabled SME then adjust x86_phys_bits by
@@ -612,13 +636,13 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
 		rdmsrl(MSR_K7_HWCR, msr);
 		if (!(msr & MSR_K7_HWCR_SMMLOCK))
 			goto clear_sev;
 
 		return;
+	}
 
 clear_all:
-		setup_clear_cpu_cap(X86_FEATURE_SME);
+	setup_clear_cpu_cap(X86_FEATURE_SME);
 clear_sev:
-		setup_clear_cpu_cap(X86_FEATURE_SEV);
-		setup_clear_cpu_cap(X86_FEATURE_SEV_ES);
-	}
+	setup_clear_cpu_cap(X86_FEATURE_SEV);
+	setup_clear_cpu_cap(X86_FEATURE_SEV_ES);
 }

static void early_init_amd(struct cpuinfo_x86 *c)
--
2.31.0.rc2.261.g7f71774620-goog


2021-03-17 19:13:59

by Borislav Petkov

Subject: Re: [PATCH] x86/cpu/AMD: Adjust x86_phys_bits to account for reduced PA in SEV-* guests

On Wed, Mar 17, 2021 at 11:32:43AM -0700, Sean Christopherson wrote:
> Always reduce x86_phys_bits per CPUID.0x8000001f[11:6] for SEV-* guests;
> the existing flow that queries X86_FEATURE_SEV may or may not trigger
> depending on what the VMM emulates, e.g. the VMM likely does not emulate
> MSR_K8_SYSCFG.
>
> Print a somewhat scary message and override x86_phys_bits if the VMM
> doesn't omit the C-bit from MAXPHYADDR, which can be done either by
> enumerating a lower MAXPHYADDR or by enumerating a non-zero
> PhysAddrReduction.
>
> Failure to adjust x86_phys_bits results in a false positive for
> phys_addr_valid() if the address sets the C-bit, and may also result in
> false positives for virt_addr_valid(). This is likely benign for a well-
> functioning kernel+drivers, but it's nearly impossible to confidently
> audit all users of the *_addr_valid() helpers, so who knows.
>
> Opportunistically force clearing of SME, SEV, and SEV_ES in this case,
> as the kernel and KVM treat those feature flags as host capabilities, not
> guest capabilities. This is likely a nop for most deployments, e.g. KVM
> doesn't emulate MSR_K8_SYSCFG.
>
> Note, early kernel boot code for SEV-*, e.g. get_sev_encryption_bit(),
> _requires_ the SEV feature flag to be set in CPUID in order to identify
> SEV (this requirement comes from the SEV-ES GHCB standard). But, that
> requirement does not mean the kernel must also "advertise" SEV in its own
> CPU features array.

Sure it does - /proc/cpuinfo contains feature bits of stuff which has
been enabled in the kernel. And when it comes to SEV, yeah, that was a
lot of enablement. :-)

>
> Fixes: d8aa7eea78a1 ("x86/mm: Add Secure Encrypted Virtualization (SEV) support")
> Cc: [email protected]
> Cc: Joerg Roedel <[email protected]>
> Cc: Tom Lendacky <[email protected]>
> Cc: Brijesh Singh <[email protected]>
> Cc: Peter Gonda <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
>
> Regarding clearing SME, SEV, SEV_ES, etc..., it's obviously not required,
> but to avoid false positives, identifying "SEV guest" within the kernel
> must be done with sev_active(). And if we want to display support in
> /proc/cpuinfo, IMO it should be a separate synthetic feature so that
> userspace sees "sev_guest" instead of "sev".

I'm on the fence here, frankly. We issue capabilities in the guest dmesg
in print_mem_encrypt_feature_info(). However, if someone wants to query
SEV* status in the guest, then I don't have a good suggestion where to
put it. cpuinfo is probably ok-ish, a new /sys/devices/system/cpu/caps/
or so, should work too, considering the vuln stuff we stuck there so we
can extend that. We'll see.

>
> arch/x86/kernel/cpu/amd.c | 32 ++++++++++++++++++++++++++++----
> 1 file changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 2d11384dc9ab..0f7f8c905226 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -15,6 +15,7 @@
> #include <asm/cpu.h>
> #include <asm/spec-ctrl.h>
> #include <asm/smp.h>
> +#include <asm/mem_encrypt.h>
> #include <asm/numa.h>
> #include <asm/pci-direct.h>
> #include <asm/delay.h>
> @@ -575,10 +576,33 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
> resctrl_cpu_detect(c);
> }
>
> +#define SEV_CBIT_MSG "SEV: C-bit (bit %d) overlaps MAXPHYADDR (%d bits). VMM is buggy or malicious, overriding MAXPHYADDR to %d.\n"

Not sure about that. This will make a lot of users run scared, not
knowing what's going on and open bugzillas.

> +
> static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
> {
> u64 msr;
>
> + /*
> + * When running as an SEV guest of any flavor, update the physical
> + * address width to account for the C-bit and clear all of the SME/SEV
> + * feature flags. As far as the kernel is concerned, the SEV flags
> + * enumerate what features can be used by the kernel/KVM, not what
> + * features have been activated by the VMM.
> + */
> + if (sev_active()) {
> + int c_bit = ilog2(sme_me_mask);
> +
> + BUG_ON(!sme_me_mask);
> +
> + c->x86_phys_bits -= (cpuid_ebx(0x8000001f) >> 6) & 0x3f;

Well, if that leaf is intercepted, how do you wanna trust this at all?

IOW, you have c_bit so your valid address space is [0 .. c_bit-1] no?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-03-17 21:44:50

by Sean Christopherson

Subject: Re: [PATCH] x86/cpu/AMD: Adjust x86_phys_bits to account for reduced PA in SEV-* guests

On Wed, Mar 17, 2021, Borislav Petkov wrote:
> On Wed, Mar 17, 2021 at 11:32:43AM -0700, Sean Christopherson wrote:
> > Note, early kernel boot code for SEV-*, e.g. get_sev_encryption_bit(),
> > _requires_ the SEV feature flag to be set in CPUID in order to identify
> > SEV (this requirement comes from the SEV-ES GHCB standard). But, that
> > requirement does not mean the kernel must also "advertise" SEV in its own
> > CPU features array.
>
> Sure it does - /proc/cpuinfo contains feature bits of stuff which has
> been enabled in the kernel. And when it comes to SEV, yeah, that was a
> lot of enablement. :-)

Ha, all I'm saying is that /proc/cpuinfo doesn't have to match the GHCB spec.

> > Fixes: d8aa7eea78a1 ("x86/mm: Add Secure Encrypted Virtualization (SEV) support")
> > Cc: [email protected]
> > Cc: Joerg Roedel <[email protected]>
> > Cc: Tom Lendacky <[email protected]>
> > Cc: Brijesh Singh <[email protected]>
> > Cc: Peter Gonda <[email protected]>
> > Signed-off-by: Sean Christopherson <[email protected]>
> > ---
> >
> > Regarding clearing SME, SEV, SEV_ES, etc..., it's obviously not required,
> > but to avoid false positives, identifying "SEV guest" within the kernel
> > must be done with sev_active(). And if we want to display support in
> > /proc/cpuinfo, IMO it should be a separate synthetic feature so that
> > userspace sees "sev_guest" instead of "sev".
>
> I'm on the fence here, frankly. We issue capabilities in the guest dmesg
> in print_mem_encrypt_feature_info(). However, if someone wants to query
> SEV* status in the guest, then I don't have a good suggestion where to
> put it. cpuinfo is probably ok-ish, a new /sys/devices/system/cpu/caps/
> or so, should work too, considering the vuln stuff we stuck there so we
> can extend that. We'll see.
>
> >
> > arch/x86/kernel/cpu/amd.c | 32 ++++++++++++++++++++++++++++----
> > 1 file changed, 28 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> > index 2d11384dc9ab..0f7f8c905226 100644
> > --- a/arch/x86/kernel/cpu/amd.c
> > +++ b/arch/x86/kernel/cpu/amd.c
> > @@ -15,6 +15,7 @@
> > #include <asm/cpu.h>
> > #include <asm/spec-ctrl.h>
> > #include <asm/smp.h>
> > +#include <asm/mem_encrypt.h>
> > #include <asm/numa.h>
> > #include <asm/pci-direct.h>
> > #include <asm/delay.h>
> > @@ -575,10 +576,33 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
> > resctrl_cpu_detect(c);
> > }
> >
> > +#define SEV_CBIT_MSG "SEV: C-bit (bit %d) overlaps MAXPHYADDR (%d bits). VMM is buggy or malicious, overriding MAXPHYADDR to %d.\n"
>
> Not sure about that. This will make a lot of users run scared, not
> knowing what's going on and open bugzillas.

Yeah, I'm not too sure about it either. I would not object to dropping it to
a pr_info or pr_warn, and/or removing the "VMM is buggy or malicious" snippet.

> > +
> > static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
> > {
> > u64 msr;
> >
> > + /*
> > + * When running as an SEV guest of any flavor, update the physical
> > + * address width to account for the C-bit and clear all of the SME/SEV
> > + * feature flags. As far as the kernel is concerned, the SEV flags
> > + * enumerate what features can be used by the kernel/KVM, not what
> > + * features have been activated by the VMM.
> > + */
> > + if (sev_active()) {
> > + int c_bit = ilog2(sme_me_mask);
> > +
> > + BUG_ON(!sme_me_mask);
> > +
> > + c->x86_phys_bits -= (cpuid_ebx(0x8000001f) >> 6) & 0x3f;
>
> Well, if that leaf is intercepted, how do you wanna trust this at all?

That's a good question for the AMD folks. CPUID.0x80000008 and thus the original
x86_phys_bits is also untrusted.

> IOW, you have c_bit so your valid address space is [0 .. c_bit-1] no?

I haven't found anything in the GHCB that dictates that MAXPHYADDR == C_BIT-1,
or more specifically that MAXPHYADDR == C_BIT - PhysAddrReduction. E.g. AFAICT,
a VMM could do C_BIT=47, MAXPHYADDR=36, PhysAddrReduction=0, and that would be
allowed by the GHCB.

Forcing "c->x86_phys_bits = c_bit - 1" doesn't seem like it would break anything,
but it's also technically wrong.
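
For concreteness, the alternative would look something like this (sketch
only; whether the clamp should land at c_bit or c_bit - 1 is exactly the
ambiguity above):

	if (sev_active()) {
		unsigned int c_bit = ilog2(sme_me_mask);

		/* Treat the C-bit and everything above it as invalid guest PA bits. */
		c->x86_phys_bits = min_t(u8, c->x86_phys_bits, c_bit);
	}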

2021-03-17 21:47:08

by Sean Christopherson

Subject: Re: [PATCH] x86/cpu/AMD: Adjust x86_phys_bits to account for reduced PA in SEV-* guests

On Wed, Mar 17, 2021, Sean Christopherson wrote:
> On Wed, Mar 17, 2021, Borislav Petkov wrote:
> > IOW, you have c_bit so your valid address space is [0 .. c_bit-1] no?
>
> I haven't found anything in the GHCB that dictates that MAXPHYADDR == C_BIT-1,
> or more specifically that MAXPHYADDR == C_BIT - PhysAddrReduction. E.g. AFAICT,
> a VMM could do C_BIT=47, MAXPHYADDR=36, PhysAddrReduction=0, and that would be
> allowed by the GHCB.
>
> Forcing "c->x86_phys_bits = c_bit - 1" doesn't seem like it would break anything,
> but it's also technically wrong.

On the other hand, "C_BIT=47, MAXPHYADDR=36, PhysAddrReduction=0" would mean the
C-bit is an illegal PA bit from the guest's perspective. That's rather
nonsensical, but also not technically disallowed by the APM or GHCB specs.

2021-03-17 21:55:54

by Peter Gonda

Subject: Re: [PATCH] x86/cpu/AMD: Adjust x86_phys_bits to account for reduced PA in SEV-* guests

On Wed, Mar 17, 2021 at 3:19 PM Sean Christopherson <[email protected]> wrote:
>
> On Wed, Mar 17, 2021, Sean Christopherson wrote:
> > On Wed, Mar 17, 2021, Borislav Petkov wrote:
> > > IOW, you have c_bit so your valid address space is [0 .. c_bit-1] no?
> >
> > I haven't found anything in the GHCB that dictates that MAXPHYADDR == C_BIT-1,
> > or more specifically that MAXPHYADDR == C_BIT - PhysAddrReduction. E.g. AFAICT,
> > a VMM could do C_BIT=47, MAXPHYADDR=36, PhysAddrReduction=0, and that would be
> > allowed by the GHCB.
> >
> > Forcing "c->x86_phys_bits = c_bit - 1" doesn't seem like it would break anything,
> > but it's also technically wrong.
>
> On the other hand, "C_BIT=47, MAXPHYADDR=36, PhysAddrReduction=0" would mean the
> C-bit is an illegal PA bit from the guest's perspective. That's rather
> nonsensical, but also not technically disallowed by the APM or GHCB specs.

The C-bit location on Rome is 47 but it's 51 on Milan. So we already
have a C-bit that is an illegal PA bit.