2023-10-27 14:40:49

by Pawan Gupta

[permalink] [raw]
Subject: [PATCH v4 0/6] Delay VERW

v4:
- Fill unused part of mds_verw_sel cacheline with int3. (Andrew)
- Fix the formatting in documentation (0-day CI).
- s/inspite/in spite/ (Sean).
- Explicitly skip FB_CLEAR optimization when MDS affected (Sean).

v3: https://lore.kernel.org/r/[email protected]
- Use .entry.text section for VERW memory operand. (Andrew/PeterZ)
- Fix the duplicate header inclusion. (Chao)

v2: https://lore.kernel.org/r/[email protected]
- Removed the extra EXEC_VERW macro layers. (Sean)
- Move NOPL before VERW. (Sean)
- s/USER_CLEAR_CPU_BUFFERS/CLEAR_CPU_BUFFERS/. (Josh/Dave)
- Removed the comments before CLEAR_CPU_BUFFERS. (Josh)
- Remove CLEAR_CPU_BUFFERS from NMI returning to kernel and document the
reason. (Josh/Dave)
- Reformat comment in md_clear_update_mitigation(). (Josh)
- Squash "x86/bugs: Cleanup mds_user_clear" patch. (Nikolay)
- s/GUEST_CLEAR_CPU_BUFFERS/CLEAR_CPU_BUFFERS/. (Josh)
- Added a patch from Sean to use CFLAGS.CF for VMLAUNCH/VMRESUME
selection. This facilitates a single CLEAR_CPU_BUFFERS location for both
VMLAUNCH and VMRESUME. (Sean)

v1: https://lore.kernel.org/r/[email protected]

Hi,

Legacy instruction VERW was overloaded by some processors to clear
micro-architectural CPU buffers as a mitigation of CPU bugs. This series
moves VERW execution to a later point in exit-to-user path. This is
needed because in some cases it may be possible for kernel data to be
accessed after VERW in arch_exit_to_user_mode(). Such accesses may put
data into MDS affected CPU buffers, for example:

1. Kernel data accessed by an NMI between VERW and return-to-user can
remain in CPU buffers (since NMI returning to kernel does not
execute VERW to clear CPU buffers).
2. Alyssa reported that after VERW is executed,
CONFIG_GCC_PLUGIN_STACKLEAK=y scrubs the stack used by a system
call. Memory accesses during stack scrubbing can move kernel stack
contents into CPU buffers.
3. When caller saved registers are restored after a return from
function executing VERW, the kernel stack accesses can remain in
CPU buffers(since they occur after VERW).

Although these cases are less practical to exploit, moving VERW closer
to ring transition reduces the attack surface.

Overview of the series:

Patch 1: Prepares VERW macros for use in asm.
Patch 2: Adds macros to 64-bit entry/exit points.
Patch 3: Adds macros to 32-bit entry/exit points.
Patch 4: Enables the new macros.
Patch 5: Uses CFLAGS.CF for VMLAUNCH/VMRESUME selection.
Patch 6: Adds macro to VMenter.

Below is some performance data collected on a Skylake client
compared with previous implementation:

Baseline: v6.6-rc5

| Test | Configuration | v1 | v3 |
| ------------------ | ---------------------- | ---- | ---- |
| build-linux-kernel | defconfig | 1.00 | 1.00 |
| hackbench | 32 - Process | 1.02 | 1.06 |
| nginx | Short Connection - 500 | 1.01 | 1.04 |

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Alyssa Milburn <[email protected]>
Cc: Daniel Sneddon <[email protected]>
Cc: [email protected]
Cc: Greg Kroah-Hartman <[email protected]>
To: Thomas Gleixner <[email protected]>
To: Ingo Molnar <[email protected]>
To: Borislav Petkov <[email protected]>
To: Dave Hansen <[email protected]>
To: [email protected]
To: "H. Peter Anvin" <[email protected]>
To: Peter Zijlstra <[email protected]>
To: Josh Poimboeuf <[email protected]>
To: Andy Lutomirski <[email protected]>
To: Jonathan Corbet <[email protected]>
To: Sean Christopherson <[email protected]>
To: Paolo Bonzini <[email protected]>
To: [email protected]
To: [email protected]
To: [email protected]
To: Andrew Cooper <[email protected]>
To: Nikolay Borisov <[email protected]>

Signed-off-by: Pawan Gupta <[email protected]>
---
Pawan Gupta (5):
x86/bugs: Add asm helpers for executing VERW
x86/entry_64: Add VERW just before userspace transition
x86/entry_32: Add VERW just before userspace transition
x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
KVM: VMX: Move VERW closer to VMentry for MDS mitigation

Sean Christopherson (1):
KVM: VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH

Documentation/arch/x86/mds.rst | 38 +++++++++++++++++++++++++-----------
arch/x86/entry/entry.S | 17 ++++++++++++++++
arch/x86/entry/entry_32.S | 3 +++
arch/x86/entry/entry_64.S | 11 +++++++++++
arch/x86/entry/entry_64_compat.S | 1 +
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/entry-common.h | 1 -
arch/x86/include/asm/nospec-branch.h | 27 +++++++++++++------------
arch/x86/kernel/cpu/bugs.c | 15 ++++++--------
arch/x86/kernel/nmi.c | 2 --
arch/x86/kvm/vmx/run_flags.h | 7 +++++--
arch/x86/kvm/vmx/vmenter.S | 9 ++++++---
arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++-----
13 files changed, 106 insertions(+), 46 deletions(-)
---
base-commit: 05d3ef8bba77c1b5f98d941d8b2d4aeab8118ef1
change-id: 20231011-delay-verw-d0474986b2c3

Best regards,
--
Thanks,
Pawan



2023-10-27 14:40:55

by Pawan Gupta

[permalink] [raw]
Subject: [PATCH v4 6/6] KVM: VMX: Move VERW closer to VMentry for MDS mitigation

During VMentry VERW is executed to mitigate MDS. After VERW, any memory
access like register push onto stack may put host data in MDS affected
CPU buffers. A guest can then use MDS to sample host data.

Although likelihood of secrets surviving in registers at current VERW
callsite is less, but it can't be ruled out. Harden the MDS mitigation
by moving the VERW mitigation late in VMentry path.

Note that VERW for MMIO Stale Data mitigation is unchanged because of
the complexity of per-guest conditional VERW which is not easy to handle
that late in asm with no GPRs available. If the CPU is also affected by
MDS, VERW is unconditionally executed late in asm regardless of guest
having MMIO access.

Signed-off-by: Pawan Gupta <[email protected]>
---
arch/x86/kvm/vmx/vmenter.S | 3 +++
arch/x86/kvm/vmx/vmx.c | 19 ++++++++++++++-----
2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index b3b13ec04bac..139960deb736 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -161,6 +161,9 @@ SYM_FUNC_START(__vmx_vcpu_run)
/* Load guest RAX. This kills the @regs pointer! */
mov VCPU_RAX(%_ASM_AX), %_ASM_AX

+ /* Clobbers EFLAGS.ZF */
+ CLEAR_CPU_BUFFERS
+
/* Check EFLAGS.CF from the VMX_RUN_VMRESUME bit test above. */
jnc .Lvmlaunch

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 24e8694b83fc..a05c6b80b06c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7226,16 +7226,24 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,

guest_state_enter_irqoff();

- /* L1D Flush includes CPU buffer clear to mitigate MDS */
+ /*
+ * L1D Flush includes CPU buffer clear to mitigate MDS, but VERW
+ * mitigation for MDS is done late in VMentry and is still
+ * executed in spite of L1D Flush. This is because an extra VERW
+ * should not matter much after the big hammer L1D Flush.
+ */
if (static_branch_unlikely(&vmx_l1d_should_flush))
vmx_l1d_flush(vcpu);
- else if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF))
- mds_clear_cpu_buffers();
else if (static_branch_unlikely(&mmio_stale_data_clear) &&
kvm_arch_has_assigned_device(vcpu->kvm))
mds_clear_cpu_buffers();

- vmx_disable_fb_clear(vmx);
+ /*
+ * Optimize the latency of VERW in guests for MMIO mitigation. Skip
+ * the optimization when MDS mitigation(later in asm) is enabled.
+ */
+ if (!cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF))
+ vmx_disable_fb_clear(vmx);

if (vcpu->arch.cr2 != native_read_cr2())
native_write_cr2(vcpu->arch.cr2);
@@ -7248,7 +7256,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,

vmx->idt_vectoring_info = 0;

- vmx_enable_fb_clear(vmx);
+ if (!cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF))
+ vmx_enable_fb_clear(vmx);

if (unlikely(vmx->fail)) {
vmx->exit_reason.full = 0xdead;

--
2.34.1


2023-10-27 14:49:28

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 0/6] Delay VERW

On Fri, Oct 27, 2023 at 07:38:34AM -0700, Pawan Gupta wrote:
> v4:

Why are you spamming people with your patchset? You've sent it 4 times
in a week:

Oct 20 Pawan Gupta ( : 75|) [PATCH 0/6] Delay VERW
Oct 24 Pawan Gupta ( :7.3K|) [PATCH v2 0/6] Delay VERW
Oct 25 Pawan Gupta ( :7.5K|) [PATCH v3 0/6] Delay VERW
Oct 27 Pawan Gupta ( :8.8K|) [PATCH v4 0/6] Delay VERW

Is this something urgent or can you take your time like everyone else?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-10-27 15:08:00

by Pawan Gupta

[permalink] [raw]
Subject: Re: [PATCH v4 0/6] Delay VERW

On Fri, Oct 27, 2023 at 04:48:48PM +0200, Borislav Petkov wrote:
> On Fri, Oct 27, 2023 at 07:38:34AM -0700, Pawan Gupta wrote:
> > v4:
>
> Why are you spamming people with your patchset? You've sent it 4 times
> in a week:
>
> Oct 20 Pawan Gupta ( : 75|) [PATCH 0/6] Delay VERW
> Oct 24 Pawan Gupta ( :7.3K|) [PATCH v2 0/6] Delay VERW
> Oct 25 Pawan Gupta ( :7.5K|) [PATCH v3 0/6] Delay VERW
> Oct 27 Pawan Gupta ( :8.8K|) [PATCH v4 0/6] Delay VERW
>
> Is this something urgent or can you take your time like everyone else?

I am going on a long vacation next week, I won't be working for the rest
of the year. So I wanted to get this in a good shape quickly. This
patchset addresses some security issues (although theoretical). So there
is some sense of urgency. Sorry for spamming, I'll take you off the To:
list.

2023-10-27 15:13:23

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 0/6] Delay VERW

On Fri, Oct 27, 2023 at 08:05:35AM -0700, Pawan Gupta wrote:
> I am going on a long vacation next week, I won't be working for the rest
> of the year. So I wanted to get this in a good shape quickly. This
> patchset addresses some security issues (although theoretical). So there
> is some sense of urgency. Sorry for spamming, I'll take you off the To:
> list.

Even if you're leaving for vacation, I'm sure some colleague of yours or
dhansen will take over this for you. So there's no need to keep sending
this every day. Imagine everyone who leaves for vacation would start
doing that...

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-10-27 15:33:02

by Pawan Gupta

[permalink] [raw]
Subject: Re: [PATCH v4 0/6] Delay VERW

On Fri, Oct 27, 2023 at 05:12:26PM +0200, Borislav Petkov wrote:
> On Fri, Oct 27, 2023 at 08:05:35AM -0700, Pawan Gupta wrote:
> > I am going on a long vacation next week, I won't be working for the rest
> > of the year. So I wanted to get this in a good shape quickly. This
> > patchset addresses some security issues (although theoretical). So there
> > is some sense of urgency. Sorry for spamming, I'll take you off the To:
> > list.
>
> Even if you're leaving for vacation, I'm sure some colleague of yours or
> dhansen will take over this for you. So there's no need to keep sending
> this every day. Imagine everyone who leaves for vacation would start
> doing that...

I can imagine the amount emails maintainers get. I'll take care of this
in future. But, its good to get some idea on how much is too much,
specially for a security issue?

2023-10-27 15:37:28

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 0/6] Delay VERW

On Fri, Oct 27, 2023 at 08:32:42AM -0700, Pawan Gupta wrote:
> I can imagine the amount emails maintainers get. I'll take care of this
> in future. But, its good to get some idea on how much is too much,
> specially for a security issue?

If it ain't really urgent, once a week like every other patchset. We
have all this documented in

Documentation/process/submitting-patches.rst

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-10-27 15:38:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v4 0/6] Delay VERW

On Fri, Oct 27, 2023 at 08:32:42AM -0700, Pawan Gupta wrote:
> On Fri, Oct 27, 2023 at 05:12:26PM +0200, Borislav Petkov wrote:
> > On Fri, Oct 27, 2023 at 08:05:35AM -0700, Pawan Gupta wrote:
> > > I am going on a long vacation next week, I won't be working for the rest
> > > of the year. So I wanted to get this in a good shape quickly. This
> > > patchset addresses some security issues (although theoretical). So there
> > > is some sense of urgency. Sorry for spamming, I'll take you off the To:
> > > list.
> >
> > Even if you're leaving for vacation, I'm sure some colleague of yours or
> > dhansen will take over this for you. So there's no need to keep sending
> > this every day. Imagine everyone who leaves for vacation would start
> > doing that...
>
> I can imagine the amount emails maintainers get. I'll take care of this
> in future. But, its good to get some idea on how much is too much,
> specially for a security issue?

You said it wasn't a security issue (theoretical?)

And are we supposed to drop everything for such things? Again, think of
the people who are on the other end of your patches please...

greg k-h

2023-12-01 20:03:27

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v4 6/6] KVM: VMX: Move VERW closer to VMentry for MDS mitigation

On Fri, Oct 27, 2023 at 07:39:12AM -0700, Pawan Gupta wrote:
> - vmx_disable_fb_clear(vmx);
> + /*
> + * Optimize the latency of VERW in guests for MMIO mitigation. Skip
> + * the optimization when MDS mitigation(later in asm) is enabled.
> + */
> + if (!cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF))
> + vmx_disable_fb_clear(vmx);
>
> if (vcpu->arch.cr2 != native_read_cr2())
> native_write_cr2(vcpu->arch.cr2);
> @@ -7248,7 +7256,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
>
> vmx->idt_vectoring_info = 0;
>
> - vmx_enable_fb_clear(vmx);
> + if (!cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF))
> + vmx_enable_fb_clear(vmx);
>

It may be cleaner to instead check X86_FEATURE_CLEAR_CPU_BUF when
setting vmx->disable_fb_clear in the first place, in
vmx_update_fb_clear_dis().

--
Josh

2023-12-20 01:26:20

by Pawan Gupta

[permalink] [raw]
Subject: Re: [PATCH v4 6/6] KVM: VMX: Move VERW closer to VMentry for MDS mitigation

On Fri, Dec 01, 2023 at 12:02:47PM -0800, Josh Poimboeuf wrote:
> On Fri, Oct 27, 2023 at 07:39:12AM -0700, Pawan Gupta wrote:
> > - vmx_disable_fb_clear(vmx);
> > + /*
> > + * Optimize the latency of VERW in guests for MMIO mitigation. Skip
> > + * the optimization when MDS mitigation(later in asm) is enabled.
> > + */
> > + if (!cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF))
> > + vmx_disable_fb_clear(vmx);
> >
> > if (vcpu->arch.cr2 != native_read_cr2())
> > native_write_cr2(vcpu->arch.cr2);
> > @@ -7248,7 +7256,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
> >
> > vmx->idt_vectoring_info = 0;
> >
> > - vmx_enable_fb_clear(vmx);
> > + if (!cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF))
> > + vmx_enable_fb_clear(vmx);
> >
>
> It may be cleaner to instead check X86_FEATURE_CLEAR_CPU_BUF when
> setting vmx->disable_fb_clear in the first place, in
> vmx_update_fb_clear_dis().

Right. Thanks for the review.