2024-04-26 23:48:24

by Pawan Gupta

Subject: [PATCH] x86/entry_32: Move CLEAR_CPU_BUFFERS before CR3 switch

As the mitigation for MDS and RFDS, the CLEAR_CPU_BUFFERS macro executes the
VERW instruction to clear the CPU buffers before returning to user space.
Currently, VERW is executed after the user CR3 is restored. This causes
vm86() to fault, because VERW takes a memory operand that is not mapped in
the user page tables when the vm86() syscall returns. This issue affects
32-bit kernels only, as 64-bit kernels do not support vm86().

Move the VERW before the CR3 switch for 32-bit kernels as a workaround. This
is slightly less secure, because data loaded into the registers after the
VERW may be sensitive and does not get cleared from the CPU buffers. As
32-bit kernels haven't received some of the other transient execution
mitigations, this is a reasonable trade-off to ensure that the vm86()
syscall works.

Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
Closes: https://lore.kernel.org/all/[email protected]/
Reported-by: Robert Gill <[email protected]>
Signed-off-by: Pawan Gupta <[email protected]>
---
arch/x86/entry/entry_32.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index d3a814efbff6..1b9c1587f06e 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -837,6 +837,7 @@ SYM_FUNC_START(entry_SYSENTER_32)
jz .Lsyscall_32_done

STACKLEAK_ERASE
+ CLEAR_CPU_BUFFERS

/* Opportunistic SYSEXIT */

@@ -881,7 +882,6 @@ SYM_FUNC_START(entry_SYSENTER_32)
BUG_IF_WRONG_CR3 no_user_check=1
popfl
popl %eax
- CLEAR_CPU_BUFFERS

/*
* Return back to the vDSO, which will pop ecx and edx.
@@ -941,6 +941,7 @@ SYM_FUNC_START(entry_INT80_32)
STACKLEAK_ERASE

restore_all_switch_stack:
+ CLEAR_CPU_BUFFERS
SWITCH_TO_ENTRY_STACK
CHECK_AND_APPLY_ESPFIX

@@ -951,7 +952,6 @@ restore_all_switch_stack:

/* Restore user state */
RESTORE_REGS pop=4 # skip orig_eax/error_code
- CLEAR_CPU_BUFFERS
.Lirq_return:
/*
* ARCH_HAS_MEMBARRIER_SYNC_CORE rely on IRET core serialization

---
base-commit: 0bbac3facb5d6cc0171c45c9873a2dc96bea9680
change-id: 20240426-fix-dosemu-vm86-dd111a01737e



2024-05-09 12:19:29

by Thorsten Leemhuis

Subject: Re: [PATCH] x86/entry_32: Move CLEAR_CPU_BUFFERS before CR3 switch

On 27.04.24 01:48, Pawan Gupta wrote:
> As the mitigation for MDS and RFDS, the CLEAR_CPU_BUFFERS macro executes the
> VERW instruction to clear the CPU buffers before returning to user space.
> Currently, VERW is executed after the user CR3 is restored. This causes
> vm86() to fault, because VERW takes a memory operand that is not mapped in
> the user page tables when the vm86() syscall returns. This issue affects
> 32-bit kernels only, as 64-bit kernels do not support vm86().
>
> Move the VERW before the CR3 switch for 32-bit kernels as a workaround. This
> is slightly less secure, because data loaded into the registers after the
> VERW may be sensitive and does not get cleared from the CPU buffers. As
> 32-bit kernels haven't received some of the other transient execution
> mitigations, this is a reasonable trade-off to ensure that the vm86()
> syscall works.
>
> Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
> Closes: https://lore.kernel.org/all/[email protected]/
> Reported-by: Robert Gill <[email protected]>
> Signed-off-by: Pawan Gupta <[email protected]>

Did this fall through the cracks? Just wondering, as from here it looks
like nothing has happened for about two weeks now to fix the regression
linked above. But I might have missed something.

Ciao, Thorsten

> ---
> arch/x86/entry/entry_32.S | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index d3a814efbff6..1b9c1587f06e 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -837,6 +837,7 @@ SYM_FUNC_START(entry_SYSENTER_32)
> jz .Lsyscall_32_done
>
> STACKLEAK_ERASE
> + CLEAR_CPU_BUFFERS
>
> /* Opportunistic SYSEXIT */
>
> @@ -881,7 +882,6 @@ SYM_FUNC_START(entry_SYSENTER_32)
> BUG_IF_WRONG_CR3 no_user_check=1
> popfl
> popl %eax
> - CLEAR_CPU_BUFFERS
>
> /*
> * Return back to the vDSO, which will pop ecx and edx.
> @@ -941,6 +941,7 @@ SYM_FUNC_START(entry_INT80_32)
> STACKLEAK_ERASE
>
> restore_all_switch_stack:
> + CLEAR_CPU_BUFFERS
> SWITCH_TO_ENTRY_STACK
> CHECK_AND_APPLY_ESPFIX
>
> @@ -951,7 +952,6 @@ restore_all_switch_stack:
>
> /* Restore user state */
> RESTORE_REGS pop=4 # skip orig_eax/error_code
> - CLEAR_CPU_BUFFERS
> .Lirq_return:
> /*
> * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on IRET core serialization
>
> ---
> base-commit: 0bbac3facb5d6cc0171c45c9873a2dc96bea9680
> change-id: 20240426-fix-dosemu-vm86-dd111a01737e
>
>

2024-05-09 16:14:46

by Dave Hansen

Subject: Re: [PATCH] x86/entry_32: Move CLEAR_CPU_BUFFERS before CR3 switch

On 4/26/24 16:48, Pawan Gupta wrote:
> As the mitigation for MDS and RFDS, the CLEAR_CPU_BUFFERS macro executes the
> VERW instruction to clear the CPU buffers before returning to user space.
> Currently, VERW is executed after the user CR3 is restored. This causes
> vm86() to fault, because VERW takes a memory operand that is not mapped in
> the user page tables when the vm86() syscall returns. This issue affects
> 32-bit kernels only, as 64-bit kernels do not support vm86().

entry.S has this handy comment:

/*
* Define the VERW operand that is disguised as entry code so that
* it can be referenced with KPTI enabled. This ensure VERW can be
* used late in exit-to-user path after page tables are switched.
*/

Why isn't that working?

> Move the VERW before the CR3 switch for 32-bit kernels as a workaround. This
> is slightly less secure, because data loaded into the registers after the
> VERW may be sensitive and does not get cleared from the CPU buffers. As
> 32-bit kernels haven't received some of the other transient execution
> mitigations, this is a reasonable trade-off to ensure that the vm86()
> syscall works.
>
> Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
> Closes: https://lore.kernel.org/all/[email protected]/
> Reported-by: Robert Gill <[email protected]>
> Signed-off-by: Pawan Gupta <[email protected]>
> ---
> arch/x86/entry/entry_32.S | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index d3a814efbff6..1b9c1587f06e 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -837,6 +837,7 @@ SYM_FUNC_START(entry_SYSENTER_32)
> jz .Lsyscall_32_done
>
> STACKLEAK_ERASE
> + CLEAR_CPU_BUFFERS
>
> /* Opportunistic SYSEXIT */
>
> @@ -881,7 +882,6 @@ SYM_FUNC_START(entry_SYSENTER_32)
> BUG_IF_WRONG_CR3 no_user_check=1
> popfl
> popl %eax
> - CLEAR_CPU_BUFFERS

Right now, this code basically does:

STACKLEAK_ERASE
/* Restore user registers and segments */
movl PT_EIP(%esp), %edx
...
SWITCH_TO_USER_CR3 scratch_reg=%eax
...
CLEAR_CPU_BUFFERS

The proposed patch is:

STACKLEAK_ERASE
+ CLEAR_CPU_BUFFERS
/* Restore user registers and segments */
movl PT_EIP(%esp), %edx
...
SWITCH_TO_USER_CR3 scratch_reg=%eax
...
- CLEAR_CPU_BUFFERS

That's a bit confusing to me. I would have expected the
CLEAR_CPU_BUFFERS to go _just_ before the SWITCH_TO_USER_CR3 and after
the user register restore.

Is there a reason it can't go there? I think only %eax is "live" with
kernel state at that point and it's only an entry stack pointer, so not
a secret.

> /*
> * Return back to the vDSO, which will pop ecx and edx.
> @@ -941,6 +941,7 @@ SYM_FUNC_START(entry_INT80_32)
> STACKLEAK_ERASE
>
> restore_all_switch_stack:
> + CLEAR_CPU_BUFFERS
> SWITCH_TO_ENTRY_STACK
> CHECK_AND_APPLY_ESPFIX
>
> @@ -951,7 +952,6 @@ restore_all_switch_stack:
>
> /* Restore user state */
> RESTORE_REGS pop=4 # skip orig_eax/error_code
> - CLEAR_CPU_BUFFERS
> .Lirq_return:
> /*
> * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on IRET core serialization

There is a working stack here, on both sides of the CR3 switch. It's
annoying to do another push/pop which won't get patched out, but this
_could_ just do:

RESTORE_REGS pop=4
CLEAR_CPU_BUFFERS

pushl %eax
SWITCH_TO_USER_CR3 scratch_reg=%eax
popl %eax

right?

That would only expose the CR3 value, which isn't a secret.