2021-03-10 08:49:06

by Joerg Roedel

[permalink] [raw]
Subject: [PATCH v2 0/7] x86/seves: Support 32-bit boot path and other updates

From: Joerg Roedel <[email protected]>

Hi,

these patches add support for the 32-bit boot in the decompressor
code. This is needed to boot an SEV-ES guest on some firmware and grub
versions. The patches also add the necessary CPUID sanity checks and a
32-bit version of the C-bit check.

Other updates included here:

1. Add code to shut down exception handling in the
decompressor code before jumping to the real kernel.
Once in the real kernel it is not safe anymore to jump
back to the decompressor code via exceptions.

2. Replace open-coded hlt loops with proper calls to
sev_es_terminate().

Please review.

Thanks,

Joerg

Changes to v1:

- Addressed Boris' review comments.
- Fixed a bug which caused the cbit-check to never be
executed even in an SEV guest.

Joerg Roedel (7):
x86/boot/compressed/64: Cleanup exception handling before booting
kernel
x86/boot/compressed/64: Reload CS in startup_32
x86/boot/compressed/64: Setup IDT in startup_32 boot path
x86/boot/compressed/64: Add 32-bit boot #VC handler
x86/boot/compressed/64: Add CPUID sanity check to 32-bit boot-path
x86/boot/compressed/64: Check SEV encryption in 32-bit boot-path
x86/sev-es: Replace open-coded hlt-loops with sev_es_terminate()

arch/x86/boot/compressed/head_64.S | 170 ++++++++++++++++++++++++-
arch/x86/boot/compressed/idt_64.c | 14 ++
arch/x86/boot/compressed/mem_encrypt.S | 132 ++++++++++++++++++-
arch/x86/boot/compressed/misc.c | 7 +-
arch/x86/boot/compressed/misc.h | 6 +
arch/x86/boot/compressed/sev-es.c | 12 +-
arch/x86/kernel/sev-es-shared.c | 10 +-
7 files changed, 328 insertions(+), 23 deletions(-)

--
2.30.1


2021-03-10 08:49:09

by Joerg Roedel

[permalink] [raw]
Subject: [PATCH v2 6/7] x86/boot/compressed/64: Check SEV encryption in 32-bit boot-path

From: Joerg Roedel <[email protected]>

Check whether the hypervisor reported the correct C-bit when running as
an SEV guest. Using a wrong C-bit position could be used to leak
sensitive data from the guest to the hypervisor.

Signed-off-by: Joerg Roedel <[email protected]>
---
arch/x86/boot/compressed/head_64.S | 83 ++++++++++++++++++++++++++++++
1 file changed, 83 insertions(+)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index ee448aedb8b0..7c5c2698a96e 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -183,11 +183,21 @@ SYM_FUNC_START(startup_32)
*/
call get_sev_encryption_bit
xorl %edx, %edx
+#ifdef CONFIG_AMD_MEM_ENCRYPT
testl %eax, %eax
jz 1f
subl $32, %eax /* Encryption bit is always above bit 31 */
bts %eax, %edx /* Set encryption mask for page tables */
+ /*
+ * Mark SEV as active in sev_status so that startup32_check_sev_cbit()
+ * will do a check. The sev_status memory will be fully initialized
+ * with the contents of MSR_AMD_SEV_STATUS later in
+ * set_sev_encryption_mask(). For now it is sufficient to know that SEV
+ * is active.
+ */
+ movl $1, rva(sev_status)(%ebp)
1:
+#endif

/* Initialize Page tables to 0 */
leal rva(pgtable)(%ebx), %edi
@@ -272,6 +282,9 @@ SYM_FUNC_START(startup_32)
movl %esi, %edx
1:
#endif
+ /* Check if the C-bit position is correct when SEV is active */
+ call startup32_check_sev_cbit
+
pushl $__KERNEL_CS
pushl %eax

@@ -871,6 +884,76 @@ SYM_FUNC_START(startup32_load_idt)
ret
SYM_FUNC_END(startup32_load_idt)

+/*
+ * Check for the correct C-bit position when the startup_32 boot-path is used.
+ *
+ * The check makes use of the fact that all memory is encrypted when paging is
+ * disabled. The function creates 64 bits of random data using the RDRAND
+ * instruction. RDRAND is mandatory for SEV guests, so always available. If the
+ * hypervisor violates that the kernel will crash right here.
+ *
+ * The 64 bits of random data are stored to a memory location and at the same
+ * time kept in the %eax and %ebx registers. Since encryption is always active
+ * when paging is off the random data will be stored encrypted in main memory.
+ *
+ * Then paging is enabled. When the C-bit position is correct all memory is
+ * still mapped encrypted and comparing the register values with memory will
+ * succeed. An incorrect C-bit position will map all memory unencrypted, so that
+ * the compare will use the encrypted random data and fail.
+ */
+SYM_FUNC_START(startup32_check_sev_cbit)
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ pushl %eax
+ pushl %ebx
+ pushl %ecx
+ pushl %edx
+
+ /* Check for non-zero sev_status */
+ movl rva(sev_status)(%ebp), %eax
+ testl %eax, %eax
+ jz 4f
+
+ /*
+ * Get two 32-bit random values - Don't bail out if RDRAND fails
+ * because it is better to prevent forward progress if no random value
+ * can be gathered.
+ */
+1: rdrand %eax
+ jnc 1b
+2: rdrand %ebx
+ jnc 2b
+
+ /* Store to memory and keep it in the registers */
+ movl %eax, rva(sev_check_data)(%ebp)
+ movl %ebx, rva(sev_check_data+4)(%ebp)
+
+ /* Enable paging to see if encryption is active */
+ movl %cr0, %edx /* Backup %cr0 in %edx */
+ movl $(X86_CR0_PG | X86_CR0_PE), %ecx /* Enable Paging and Protected mode */
+ movl %ecx, %cr0
+
+ cmpl %eax, rva(sev_check_data)(%ebp)
+ jne 3f
+ cmpl %ebx, rva(sev_check_data+4)(%ebp)
+ jne 3f
+
+ movl %edx, %cr0 /* Restore previous %cr0 */
+
+ jmp 4f
+
+3: /* Check failed - hlt the machine */
+ hlt
+ jmp 3b
+
+4:
+ popl %edx
+ popl %ecx
+ popl %ebx
+ popl %eax
+#endif
+ ret
+SYM_FUNC_END(startup32_check_sev_cbit)
+
/*
* Stack and heap for uncompression
*/
--
2.30.1

2021-03-10 08:49:09

by Joerg Roedel

[permalink] [raw]
Subject: [PATCH v2 5/7] x86/boot/compressed/64: Add CPUID sanity check to 32-bit boot-path

From: Joerg Roedel <[email protected]>

The 32-bit #VC handler has no GHCB and can only handle CPUID exit codes.
It is needed by the early boot code to handle #VC exceptions raised in
verify_cpu() and to get the position of the C bit.

But the CPUID information comes from the hypervisor, which is untrusted
and might return results which trick the guest into the no-SEV boot path
with no C bit set in the page-tables. All data written to memory would
then be unencrypted and could leak sensitive data to the hypervisor.

Add sanity checks to the 32-bit boot #VC handler to make sure the
hypervisor does not pretend that SEV is not enabled.

Signed-off-by: Joerg Roedel <[email protected]>
---
arch/x86/boot/compressed/mem_encrypt.S | 36 ++++++++++++++++++++++++++
1 file changed, 36 insertions(+)

diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S
index 2ca056a3707c..8941c3a8ff8a 100644
--- a/arch/x86/boot/compressed/mem_encrypt.S
+++ b/arch/x86/boot/compressed/mem_encrypt.S
@@ -145,6 +145,34 @@ SYM_CODE_START(startup32_vc_handler)
jnz .Lfail
movl %edx, 0(%esp) # Store result

+ /*
+ * Sanity check CPUID results from the Hypervisor. See comment in
+ * do_vc_no_ghcb() for more details on why this is necessary.
+ */
+
+ /* Fail if Hypervisor bit not set in CPUID[1].ECX[31] */
+ cmpl $1, %ebx
+ jne .Lcheck_leaf
+ btl $31, 4(%esp)
+ jnc .Lfail
+ jmp .Ldone
+
+.Lcheck_leaf:
+ /* Fail if SEV leaf not available in CPUID[0x80000000].EAX */
+ cmpl $0x80000000, %ebx
+ jne .Lcheck_sev
+ cmpl $0x8000001f, 12(%esp)
+ jb .Lfail
+ jmp .Ldone
+
+.Lcheck_sev:
+ /* Fail if SEV bit not set in CPUID[0x8000001f].EAX[1] */
+ cmpl $0x8000001f, %ebx
+ jne .Ldone
+ btl $1, 12(%esp)
+ jnc .Lfail
+
+.Ldone:
popl %edx
popl %ecx
popl %ebx
@@ -158,6 +186,14 @@ SYM_CODE_START(startup32_vc_handler)

iret
.Lfail:
+ /* Send terminate request to Hypervisor */
+ movl $0x100, %eax
+ xorl %edx, %edx
+ movl $MSR_AMD64_SEV_ES_GHCB, %ecx
+ wrmsr
+ rep; vmmcall
+
+ /* If request fails, go to hlt loop */
hlt
jmp .Lfail
SYM_CODE_END(startup32_vc_handler)
--
2.30.1

2021-03-10 16:10:27

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] x86/boot/compressed/64: Add CPUID sanity check to 32-bit boot-path

On Wed, Mar 10, 2021, Joerg Roedel wrote:
> From: Joerg Roedel <[email protected]>
>
> The 32-bit #VC handler has no GHCB and can only handle CPUID exit codes.
> It is needed by the early boot code to handle #VC exceptions raised in
> verify_cpu() and to get the position of the C bit.
>
> But the CPUID information comes from the hypervisor, which is untrusted
> and might return results which trick the guest into the no-SEV boot path
> with no C bit set in the page-tables. All data written to memory would
> then be unencrypted and could leak sensitive data to the hypervisor.
>
> Add sanity checks to the 32-bit boot #VC handler to make sure the
> hypervisor does not pretend that SEV is not enabled.
>
> Signed-off-by: Joerg Roedel <[email protected]>
> ---
> arch/x86/boot/compressed/mem_encrypt.S | 36 ++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
> diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S
> index 2ca056a3707c..8941c3a8ff8a 100644
> --- a/arch/x86/boot/compressed/mem_encrypt.S
> +++ b/arch/x86/boot/compressed/mem_encrypt.S
> @@ -145,6 +145,34 @@ SYM_CODE_START(startup32_vc_handler)
> jnz .Lfail
> movl %edx, 0(%esp) # Store result
>
> + /*
> + * Sanity check CPUID results from the Hypervisor. See comment in
> + * do_vc_no_ghcb() for more details on why this is necessary.
> + */
> +
> + /* Fail if Hypervisor bit not set in CPUID[1].ECX[31] */

This check is flawed, as is the existing check in 64-bit boot. Or I guess more
accurately, the check in get_sev_encryption_bit() is flawed. AIUI, SEV-ES
doesn't require the hypervisor to intercept CPUID. A malicious hypervisor can
temporarily pass-through CPUID to bypass the CPUID[1].ECX[31] check. The
hypervisor likely has access to the guest firmware source, so it wouldn't be
difficult for the hypervisor to disable CPUID interception once it detects that
firmware is handing over control to the kernel.

> + cmpl $1, %ebx
> + jne .Lcheck_leaf
> + btl $31, 4(%esp)
> + jnc .Lfail
> + jmp .Ldone
> +
> +.Lcheck_leaf:
> + /* Fail if SEV leaf not available in CPUID[0x80000000].EAX */
> + cmpl $0x80000000, %ebx
> + jne .Lcheck_sev
> + cmpl $0x8000001f, 12(%esp)
> + jb .Lfail
> + jmp .Ldone
> +
> +.Lcheck_sev:
> + /* Fail if SEV bit not set in CPUID[0x8000001f].EAX[1] */
> + cmpl $0x8000001f, %ebx
> + jne .Ldone
> + btl $1, 12(%esp)
> + jnc .Lfail
> +
> +.Ldone:
> popl %edx
> popl %ecx
> popl %ebx
> @@ -158,6 +186,14 @@ SYM_CODE_START(startup32_vc_handler)
>
> iret
> .Lfail:
> + /* Send terminate request to Hypervisor */
> + movl $0x100, %eax
> + xorl %edx, %edx
> + movl $MSR_AMD64_SEV_ES_GHCB, %ecx
> + wrmsr
> + rep; vmmcall
> +
> + /* If request fails, go to hlt loop */
> hlt
> jmp .Lfail
> SYM_CODE_END(startup32_vc_handler)
> --
> 2.30.1
>

2021-03-10 17:29:40

by Martin Radev

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] x86/boot/compressed/64: Add CPUID sanity check to 32-bit boot-path

On Wed, Mar 10, 2021 at 08:08:37AM -0800, Sean Christopherson wrote:
> On Wed, Mar 10, 2021, Joerg Roedel wrote:
> > From: Joerg Roedel <[email protected]>
> >
> > The 32-bit #VC handler has no GHCB and can only handle CPUID exit codes.
> > It is needed by the early boot code to handle #VC exceptions raised in
> > verify_cpu() and to get the position of the C bit.
> >
> > But the CPUID information comes from the hypervisor, which is untrusted
> > and might return results which trick the guest into the no-SEV boot path
> > with no C bit set in the page-tables. All data written to memory would
> > then be unencrypted and could leak sensitive data to the hypervisor.
> >
> > Add sanity checks to the 32-bit boot #VC handler to make sure the
> > hypervisor does not pretend that SEV is not enabled.
> >
> > Signed-off-by: Joerg Roedel <[email protected]>
> > ---
> > arch/x86/boot/compressed/mem_encrypt.S | 36 ++++++++++++++++++++++++++
> > 1 file changed, 36 insertions(+)
> >
> > diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S
> > index 2ca056a3707c..8941c3a8ff8a 100644
> > --- a/arch/x86/boot/compressed/mem_encrypt.S
> > +++ b/arch/x86/boot/compressed/mem_encrypt.S
> > @@ -145,6 +145,34 @@ SYM_CODE_START(startup32_vc_handler)
> > jnz .Lfail
> > movl %edx, 0(%esp) # Store result
> >
> > + /*
> > + * Sanity check CPUID results from the Hypervisor. See comment in
> > + * do_vc_no_ghcb() for more details on why this is necessary.
> > + */
> > +
> > + /* Fail if Hypervisor bit not set in CPUID[1].ECX[31] */
>
> This check is flawed, as is the existing check in 64-bit boot. Or I guess more
> accurately, the check in get_sev_encryption_bit() is flawed. AIUI, SEV-ES
> doesn't require the hypervisor to intercept CPUID. A malicious hypervisor can
> temporarily pass-through CPUID to bypass the CPUID[1].ECX[31] check.

If erroneous information is provided, either through interception or without, there's
this check which is performed every time a new page table is set in the early linux stages:
https://elixir.bootlin.com/linux/v5.12-rc2/source/arch/x86/kernel/sev_verify_cbit.S#L22

This should lead to a halt if corruption is detected, unless I'm overlooking something.
Please share more info.


> The
> hypervisor likely has access to the guest firmware source, so it wouldn't be
> difficult for the hypervisor to disable CPUID interception once it detects that
> firmware is handing over control to the kernel.
>

You probably don't even need to know the firmware for that. There the option to set CR* changes to cause
#AE which probably gives away enough information.

> > + cmpl $1, %ebx
> > + jne .Lcheck_leaf
> > + btl $31, 4(%esp)
> > + jnc .Lfail
> > + jmp .Ldone
> > +
> > +.Lcheck_leaf:
> > + /* Fail if SEV leaf not available in CPUID[0x80000000].EAX */
> > + cmpl $0x80000000, %ebx
> > + jne .Lcheck_sev
> > + cmpl $0x8000001f, 12(%esp)
> > + jb .Lfail
> > + jmp .Ldone
> > +
> > +.Lcheck_sev:
> > + /* Fail if SEV bit not set in CPUID[0x8000001f].EAX[1] */
> > + cmpl $0x8000001f, %ebx
> > + jne .Ldone
> > + btl $1, 12(%esp)
> > + jnc .Lfail
> > +
> > +.Ldone:
> > popl %edx
> > popl %ecx
> > popl %ebx
> > @@ -158,6 +186,14 @@ SYM_CODE_START(startup32_vc_handler)
> >
> > iret
> > .Lfail:
> > + /* Send terminate request to Hypervisor */
> > + movl $0x100, %eax
> > + xorl %edx, %edx
> > + movl $MSR_AMD64_SEV_ES_GHCB, %ecx
> > + wrmsr
> > + rep; vmmcall
> > +
> > + /* If request fails, go to hlt loop */
> > hlt
> > jmp .Lfail
> > SYM_CODE_END(startup32_vc_handler)
> > --
> > 2.30.1
> >

2021-03-10 17:53:41

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] x86/boot/compressed/64: Add CPUID sanity check to 32-bit boot-path

On Wed, Mar 10, 2021, Martin Radev wrote:
> On Wed, Mar 10, 2021 at 08:08:37AM -0800, Sean Christopherson wrote:
> > On Wed, Mar 10, 2021, Joerg Roedel wrote:
> > > + /*
> > > + * Sanity check CPUID results from the Hypervisor. See comment in
> > > + * do_vc_no_ghcb() for more details on why this is necessary.
> > > + */
> > > +
> > > + /* Fail if Hypervisor bit not set in CPUID[1].ECX[31] */
> >
> > This check is flawed, as is the existing check in 64-bit boot. Or I guess more
> > accurately, the check in get_sev_encryption_bit() is flawed. AIUI, SEV-ES
> > doesn't require the hypervisor to intercept CPUID. A malicious hypervisor can
> > temporarily pass-through CPUID to bypass the CPUID[1].ECX[31] check.
>
> If erroneous information is provided, either through interception or without, there's
> this check which is performed every time a new page table is set in the early linux stages:
> https://elixir.bootlin.com/linux/v5.12-rc2/source/arch/x86/kernel/sev_verify_cbit.S#L22
>
> This should lead to a halt if corruption is detected, unless I'm overlooking something.
> Please share more info.

That check is predicated on sme_me_mask != 0, sme_me_mask is set based on the
result of get_sev_encryption_bit(), and that returns '0' if CPUID[1].ECX[31] is
'0'.

sme_enable() also appears to have the same issue, as CPUID[1].ECX[31]=0 would
cause it to check for SME instead of SEV, and the hypervisor can simply return
0 for a VMGEXIT to get MSR_K8_SYSCFG.

I've no idea if the guest would actually survive with a bogus sme_me_mask, but
relying on CPUID[1] to #VC is flawed.

Since MSR_AMD64_SEV is non-interceptable, that seems like it should be the
canonical way to detect SEV/SEV-ES. The only complication seems to be handling
#GP faults on the RDMSR in early boot.

> > The hypervisor likely has access to the guest firmware source, so it
> > wouldn't be difficult for the hypervisor to disable CPUID interception once
> > it detects that firmware is handing over control to the kernel.
> >
>
> You probably don't even need to know the firmware for that. There the option
> to set CR* changes to cause #AE which probably gives away enough information.

2021-03-10 18:14:36

by Martin Radev

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] x86/boot/compressed/64: Add CPUID sanity check to 32-bit boot-path

On Wed, Mar 10, 2021 at 09:51:48AM -0800, Sean Christopherson wrote:
> On Wed, Mar 10, 2021, Martin Radev wrote:
> > On Wed, Mar 10, 2021 at 08:08:37AM -0800, Sean Christopherson wrote:
> > > On Wed, Mar 10, 2021, Joerg Roedel wrote:
> > > > + /*
> > > > + * Sanity check CPUID results from the Hypervisor. See comment in
> > > > + * do_vc_no_ghcb() for more details on why this is necessary.
> > > > + */
> > > > +
> > > > + /* Fail if Hypervisor bit not set in CPUID[1].ECX[31] */
> > >
> > > This check is flawed, as is the existing check in 64-bit boot. Or I guess more
> > > accurately, the check in get_sev_encryption_bit() is flawed. AIUI, SEV-ES
> > > doesn't require the hypervisor to intercept CPUID. A malicious hypervisor can
> > > temporarily pass-through CPUID to bypass the CPUID[1].ECX[31] check.
> >
> > If erroneous information is provided, either through interception or without, there's
> > this check which is performed every time a new page table is set in the early linux stages:
> > https://elixir.bootlin.com/linux/v5.12-rc2/source/arch/x86/kernel/sev_verify_cbit.S#L22
> >
> > This should lead to a halt if corruption is detected, unless I'm overlooking something.
> > Please share more info.
>
> That check is predicated on sme_me_mask != 0, sme_me_mask is set based on the
> result of get_sev_encryption_bit(), and that returns '0' if CPUID[1].ECX[31] is
> '0'.
>
> sme_enable() also appears to have the same issue, as CPUID[1].ECX[31]=0 would
> cause it to check for SME instead of SEV, and the hypervisor can simply return
> 0 for a VMGEXIT to get MSR_K8_SYSCFG.
>
> I've no idea if the guest would actually survive with a bogus sme_me_mask, but
> relying on CPUID[1] to #VC is flawed.
>
> Since MSR_AMD64_SEV is non-interceptable, that seems like it should be the
> canonical way to detect SEV/SEV-ES. The only complication seems to be handling
> #GP faults on the RDMSR in early boot.
>
> > > The hypervisor likely has access to the guest firmware source, so it
> > > wouldn't be difficult for the hypervisor to disable CPUID interception once
> > > it detects that firmware is handing over control to the kernel.
> > >
> >
> > You probably don't even need to know the firmware for that. There the option
> > to set CR* changes to cause #AE which probably gives away enough information.

I see what you mean but I never tried out disabling interception for cpuid.
There was the idea of checking for bogus information in the VC handler, but what
you suggested would bypass it, I guess.

If the C-bit is not set and memory gets interpreted as unencrypted, then the HV
can gain code execution easily by means of ROP and then switch to the OVMF page
table to easily do proper payload injection.

If interested, check video at https://fosdem.org/2021/schedule/event/tee_sev_es/
on minute 15.