v5:
- Update comment in patch 1.
- Minor doc update and code tweak in patch 4 as suggested by Peter and
Randy.
v4:
- Add a new __update_spec_ctrl() helper in patch 1.
- Rebased to the latest linux kernel.
v3:
- Drop patches 1 ("x86/speculation: Provide a debugfs file to dump
SPEC_CTRL MSRs") and 5 ("x86/idle: Disable IBRS entering mwait idle
and enable it on wakeup") for now.
- Drop the MSR restoration code in ("x86/idle: Disable IBRS when cpu
is offline") as native_play_dead() does not return.
- For patch ("intel_idle: Add ibrs_off module parameter to force
disable IBRS"), change the name from "no_ibrs" to "ibrs_off" and
document the new parameter in intel_idle.rst.
For Intel processors that need to turn on IBRS to protect against
Spectre v2 and Retbleed, the IBRS bit in the SPEC_CTRL MSR affects
the performance of the whole core even if only one thread is turning
it on when running in the kernel. For user space heavy applications,
the performance impact of occasionally turning IBRS on during syscalls
shouldn't be significant. Unfortunately, that is not the case when the
sibling thread is idling in the kernel. In that case, the performance
impact can be significant.
Consider DPDK running on an isolated CPU thread, processing network
packets in user space while its sibling thread is idle. The performance
of the busy DPDK thread with IBRS on and off in the idle sibling thread is:
                           IBRS on     IBRS off
                           -------     --------
  packets/second:            7.8M       10.4M
  avg tsc cycles/packet:   282.26      209.86
This is a 25% performance degradation. The test system is an Intel Xeon
4114 CPU @ 2.20GHz.
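As a quick check of the figures in the table above, the relative changes can be recomputed with a small userspace sketch. The helper names (pct_drop, pct_increase, report) are hypothetical and not part of this series; only the four input numbers come from the measurement.

```c
#include <assert.h>
#include <stdio.h>

/* Relative drop, in percent, when going from `base` down to `degraded`. */
static double pct_drop(double base, double degraded)
{
	assert(base > 0.0);
	return (base - degraded) / base * 100.0;
}

/* Relative increase, in percent, when going from `base` up to `degraded`. */
static double pct_increase(double base, double degraded)
{
	assert(base > 0.0);
	return (degraded - base) / base * 100.0;
}

/* Print both figures using the numbers from the table (hypothetical helper). */
static void report(void)
{
	/* Throughput: 10.4M packets/s with IBRS off vs 7.8M with IBRS on. */
	printf("throughput drop: %.1f%%\n", pct_drop(10.4, 7.8));
	/* Cost: 209.86 cycles/packet with IBRS off vs 282.26 with IBRS on. */
	printf("cycles/packet increase: %.1f%%\n",
	       pct_increase(209.86, 282.26));
}
```

The throughput drop works out to the 25% quoted above. Measured the other way around, each packet costs roughly a third more TSC cycles when the idle sibling keeps IBRS on.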
Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the CPU enters long idle (C6 or below). However, there
are existing users out there who have set "intel_idle.max_cstate=1"
to decrease latency. Those users won't be able to benefit from this
commit. This patch series extends this commit by providing a new
"intel_idle.ibrs_off" module parameter to force disable IBRS even when
"intel_idle.max_cstate=1" at the expense of increased IRQ response
latency. It also includes a commit to allow the disabling of IBRS when
a CPU becomes offline.
Waiman Long (4):
x86/speculation: Add __update_spec_ctrl() helper
x86/idle: Disable IBRS when cpu is offline
intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs()
intel_idle: Add ibrs_off module parameter to force disable IBRS
Documentation/admin-guide/pm/intel_idle.rst | 17 ++++++++++++++++-
arch/x86/include/asm/nospec-branch.h | 12 +++++++++++-
arch/x86/kernel/smpboot.c | 8 ++++++++
drivers/idle/intel_idle.c | 15 ++++++++++++---
4 files changed, 47 insertions(+), 5 deletions(-)
--
2.31.1
Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the CPU enters long idle. However, when a CPU
becomes offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS
is enabled. That will impact the performance of a sibling CPU. Mitigate
this performance impact by clearing all the mitigation bits in SPEC_CTRL
MSR when offline. When the CPU is online again, it will be re-initialized
and so restoring the SPEC_CTRL value isn't needed.
To avoid confusion about the missing MSR restoration code, add a comment
saying that native_play_dead() is effectively a __noreturn function even
though it can't be marked as such.
Signed-off-by: Waiman Long <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
---
arch/x86/kernel/smpboot.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index e1aa2cd7734b..68e2e044ab8b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -87,6 +87,7 @@
#include <asm/hw_irq.h>
#include <asm/stackprotector.h>
#include <asm/sev.h>
+#include <asm/nospec-branch.h>
/* representing HT siblings of each logical CPU */
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
@@ -1743,8 +1744,15 @@ void __noreturn hlt_play_dead(void)
native_halt();
}
+/*
+ * native_play_dead() is essentially a __noreturn function, but it can't
+ * be marked as such as the compiler may complain about it.
+ */
void native_play_dead(void)
{
+ if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS))
+ __update_spec_ctrl(0);
+
play_dead_common();
tboot_shutdown(TB_SHUTDOWN_WFS);
--
2.31.1
Add a new __update_spec_ctrl() helper which is a variant of
update_spec_ctrl() that can be used in a noinstr function.
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Waiman Long <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
---
arch/x86/include/asm/nospec-branch.h | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 55388c9f7601..06ceacfd1fe2 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -9,7 +9,7 @@
#include <asm/alternative.h>
#include <asm/cpufeatures.h>
-#include <asm/msr-index.h>
+#include <asm/msr.h>
#include <asm/unwind_hints.h>
#include <asm/percpu.h>
#include <asm/current.h>
@@ -488,6 +488,16 @@ DECLARE_PER_CPU(u64, x86_spec_ctrl_current);
extern void update_spec_ctrl_cond(u64 val);
extern u64 spec_ctrl_current(void);
+/*
+ * This can be used in noinstr function & should only be called in bare
+ * metal context.
+ */
+static __always_inline void __update_spec_ctrl(u64 val)
+{
+ __this_cpu_write(x86_spec_ctrl_current, val);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, val);
+}
+
/*
* With retpoline, we must use IBRS to restrict branch prediction
* before calling into firmware.
--
2.31.1
When intel_idle_ibrs() is called, it sets the SPEC_CTRL MSR to 0 in
order to disable IBRS. However, the new MSR value isn't reflected in
x86_spec_ctrl_current, which is at odds with the other code that keeps
track of the MSR state in that percpu variable. Use the new
__update_spec_ctrl() helper to keep the x86_spec_ctrl_current percpu
value properly updated.
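
The difference between the two write patterns can be illustrated with a userspace model. This is only a sketch: the fake_* variables and model_* functions are hypothetical stand-ins for the SPEC_CTRL MSR and the x86_spec_ctrl_current percpu variable, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hardware register and its cached copy. */
static uint64_t fake_spec_ctrl_msr;     /* models the SPEC_CTRL MSR */
static uint64_t fake_spec_ctrl_current; /* models x86_spec_ctrl_current */

/* Old pattern: write the register only, so the cached copy goes stale. */
static void model_raw_write(uint64_t val)
{
	fake_spec_ctrl_msr = val;
}

/* New pattern: update the cached copy and the register together. */
static void model_update_spec_ctrl(uint64_t val)
{
	fake_spec_ctrl_current = val;
	fake_spec_ctrl_msr = val;
}

/* Returns nonzero when the cached copy matches the register. */
static int model_in_sync(void)
{
	return fake_spec_ctrl_current == fake_spec_ctrl_msr;
}

/* Models one idle entry/exit: clear on entry, restore on exit. */
static void model_idle_cycle(uint64_t saved)
{
	model_update_spec_ctrl(0);
	assert(model_in_sync()); /* cached copy stays consistent while idle */
	model_update_spec_ctrl(saved);
}
```

In this model, a call to model_raw_write(0) after the cached copy holds a nonzero value leaves model_in_sync() returning 0, which mirrors the inconsistency described above. model_idle_cycle() keeps both values in step for the whole idle period.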
Signed-off-by: Waiman Long <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
---
drivers/idle/intel_idle.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index b930036edbbe..c9479f089037 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -182,12 +182,12 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
int ret;
if (smt_active)
- native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+ __update_spec_ctrl(0);
ret = __intel_idle(dev, drv, index);
if (smt_active)
- native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
+ __update_spec_ctrl(spec_ctrl);
return ret;
}
--
2.31.1