2018-06-29 14:39:38

by Fenghua Yu

Subject: [PATCH v2 0/4] x86/split_lock: Enable #AC exception for split locked accesses

==Introduction==

A split lock is any atomic operation whose operand crosses two cache
lines. Since the operand spans two cache lines and the operation must
be atomic, the system locks the bus while the CPU accesses the two cache
lines.

During bus locking, requests from other CPUs or bus agents for control
of the bus are blocked. Blocking bus access from other CPUs, plus the
overhead of setting up the bus locking protocol, degrades not only the
performance of the one CPU but overall system performance.

If the operand is cacheable and completely contained in one cache line,
the atomic operation is optimized by less expensive cache locking on Intel
P6 and more recent processors. If a split lock is detected and the operand
can be rearranged to fit into a single cache line, cache locking can be
used for the atomic operation instead of the more expensive bus locking.
Removing split locks can therefore improve overall performance.

Instructions that may cause a split lock include lock add, lock btc,
xchg, lsl, far call, ltr, etc.

More information about split lock, bus locking, and cache locking can be
found in the latest Intel 64 and IA-32 Architectures Software Developer's
Manual.

==#AC for split lock==

Currently the split lock event counter sq_misc.split_lock can be traced
for debugging purposes. But for systems deployed in the field, an
after-the-fact event counter is insufficient. We need a mechanism that
detects a split lock before it happens, to ensure that a bus lock is never
incurred due to a split lock.

On Tremont and other future processors, Intel introduces a mechanism that
detects split locks via the Alignment Check (#AC) exception, before badly
aligned atomic instructions can impact whole-system performance.

This capability is critical for real time system designers who build
consolidated real time systems. These systems run hard real time
code on some cores and run "untrusted" user processes on some
other cores. The hard real time code cannot afford any bus lock from
the untrusted processes hurting real time performance. To date the
designers have been unable to deploy these solutions because they have
no way to prevent the "untrusted" user code from generating split locks
and bus locks that block the hard real time code from accessing memory
during bus locking.

This capability may also find use in the cloud. A user process with a
split lock running in one guest can block other cores from accessing
shared memory during its split locked memory access, which may degrade
overall system performance.

Split locks may also open a security hole: malicious user code can slow
down the overall system by executing instructions with split locks.

==Detect Split Lock==

To detect split locks, a new control bit (bit 29) in TEST_CTL MSR 0x33
will be introduced in future x86 processors. When bit 29 is set, the
processor raises the #AC exception for split locked accesses at all CPLs.

The specification of bit 29 in MSR TEST_CTL is published in the latest
Intel Architecture Instruction Set Extensions and Future Features
Programming Reference.

==Handle Split Lock==

There may be different considerations for handling split lock, e.g. how
to handle a split lock issue in firmware after the kernel enables #AC for
split lock.

But we use a simple way to handle split lock:

- If a split lock happens in the kernel, a warning is issued and #AC for
split lock is disabled on the local CPU. The split lock issue should then
be fixed in the kernel.

Please note: after enabling this feature, we did find a kernel split
lock issue, which is fixed in patch 2. Patch 2 comes before patch 3 in
the series because the WARN in patch 3 uses x86_capability bits, which
cannot be set up without the fix from patch 2.

- If a split lock happens in a user process, the process is killed by
SIGBUS. Unless the issue is fixed, the process cannot run on the
system.

- If a split lock happens in firmware, the system may hang in firmware.
The issue should be fixed in the firmware.

- The kernel parameter "ac_split_lock_off" opts out of the feature.

==Changelog==
v2:
- Remove the code that handles split lock issues in firmware and fix the
x86_capability issue, mainly based on comments from Thomas Gleixner and
Peter Zijlstra.

In previous version:
Comments from Dave Hansen:
- Enumerate feature in X86_FEATURE_SPLIT_LOCK_AC
- Separate #AC handler from do_error_trap
- Use a CONFIG option to choose between inheriting the BIOS setting,
enabling, or disabling split lock. Remove kernel parameter "split_lock_ac="
- Change config interface to debugfs from sysfs
- Fix a few bisectable issues
- Other changes.

Comment from Tony Luck and Dave Hansen:
- Dump right information in #AC handler

Comment from Alan Cox and Dave Hansen:
- Description of split lock in patch 0

Others:
- Remove tracing because split lock can already be traced via the existing
sq_misc.split_lock event.
- Add CONFIG to configure either panic or re-execute faulting instruction
for split lock in kernel.
- other minor changes.

Fenghua Yu (4):
x86/split_lock: Enumerate #AC exception for split locked access
feature
x86/split_lock: Align x86_capability to unsigned long to avoid split
locked access
x86/split_lock: Handle #AC exception for split lock
x86/split_lock: Disable #AC for split locked accesses

Documentation/admin-guide/kernel-parameters.txt | 4 ++
arch/x86/include/asm/cpu.h | 3 +
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 4 ++
arch/x86/include/asm/processor.h | 3 +-
arch/x86/kernel/cpu/Makefile | 1 +
arch/x86/kernel/cpu/test_ctl.c | 96 +++++++++++++++++++++++++
arch/x86/kernel/setup.c | 4 ++
arch/x86/kernel/smpboot.c | 3 +
arch/x86/kernel/traps.c | 32 ++++++++-
10 files changed, 149 insertions(+), 2 deletions(-)
create mode 100644 arch/x86/kernel/cpu/test_ctl.c

--
2.5.0



2018-06-29 14:39:12

by Fenghua Yu

Subject: [PATCH v2 4/4] x86/split_lock: Disable #AC for split locked accesses

By default, #AC for split lock is enabled. In some cases (e.g. a system
hang when firmware hits a split lock), the user may want to opt out of
the feature.

The kernel parameter "ac_split_lock_off" disables the feature at boot time.

Signed-off-by: Fenghua Yu <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 4 ++++
arch/x86/kernel/cpu/test_ctl.c | 9 +++++++++
2 files changed, 13 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index efc7aa7a0670..7bf3b37dccdb 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4086,6 +4086,10 @@
spia_pedr=
spia_peddr=

+ ac_split_lock_off [X86]
+ Disable #AC exception for split locked accesses.
+ By default, this feature is enabled in kernel.
+
srcutree.counter_wrap_check [KNL]
Specifies how frequently to check for
grace-period sequence counter wrap for the
diff --git a/arch/x86/kernel/cpu/test_ctl.c b/arch/x86/kernel/cpu/test_ctl.c
index f12e8b24215d..4e6ed267e8b7 100644
--- a/arch/x86/kernel/cpu/test_ctl.c
+++ b/arch/x86/kernel/cpu/test_ctl.c
@@ -85,3 +85,12 @@ void setup_ac_split_lock(void)
pr_info_once("#AC for split lock is disabled\n");
}
}
+
+static __init int ac_split_lock_off(char *str)
+{
+ enable_ac_split_lock = false;
+
+ return 0;
+}
+
+early_param("ac_split_lock_off", ac_split_lock_off);
--
2.5.0


2018-06-29 14:39:25

by Fenghua Yu

Subject: [PATCH v2 3/4] x86/split_lock: Handle #AC exception for split lock

There may be different considerations on how to handle #AC for split lock,
e.g. how to handle a system hang caused by a split lock in firmware,
whether the faulting instruction needs to be emulated, etc. We use a
simple method to handle user and kernel split locks and may extend it in
the future.

When the #AC exception for split lock is triggered by a user process, the
process is killed by SIGBUS. To run the process properly, the application
developer needs to fix the split lock issue.

When the #AC exception for split lock is triggered by a kernel
instruction, #AC for split lock is disabled on the local CPU and a warning
is issued for the split lock. After the exception, the faulting
instruction is executed again and kernel execution continues. #AC for
split lock is only disabled on the local CPU, not globally. It will be
re-enabled if the CPU goes offline and then comes back online.

Kernel developers should check the warnings and fix the split lock issues
one by one. Further split locks can then be captured and fixed.

After the kernel sets bit 29 in MSR_TEST_CTL, firmware inherits the
setting when it executes in S4, S5, runtime services, SMI, etc. A split
lock in firmware then triggers #AC and may hang the system, depending on
how the firmware handles the #AC. It is up to firmware developers to fix
split lock issues in firmware.

Signed-off-by: Fenghua Yu <[email protected]>
---
arch/x86/include/asm/cpu.h | 2 ++
arch/x86/kernel/cpu/test_ctl.c | 41 +++++++++++++++++++++++++++++++++++++++++
arch/x86/kernel/setup.c | 2 ++
arch/x86/kernel/smpboot.c | 3 +++
arch/x86/kernel/traps.c | 32 +++++++++++++++++++++++++++++++-
5 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index bd64380d598b..836d7e3f70c8 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -41,4 +41,6 @@ unsigned int x86_family(unsigned int sig);
unsigned int x86_model(unsigned int sig);
unsigned int x86_stepping(unsigned int sig);
void detect_ac_split_lock(void);
+void setup_ac_split_lock(void);
+bool do_ac_split_lock(struct pt_regs *regs);
#endif /* _ASM_X86_CPU_H */
diff --git a/arch/x86/kernel/cpu/test_ctl.c b/arch/x86/kernel/cpu/test_ctl.c
index af1822469c94..f12e8b24215d 100644
--- a/arch/x86/kernel/cpu/test_ctl.c
+++ b/arch/x86/kernel/cpu/test_ctl.c
@@ -7,10 +7,17 @@
* Author:
* Fenghua Yu <[email protected]>
*/
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/printk.h>
#include <linux/cpufeature.h>
+#include <linux/cpu.h>
#include <asm/msr.h>

+/* By default, #AC for split lock is enabled. */
+static bool enable_ac_split_lock = true;
+
/* Detect feature of #AC for split lock by probing bit 29 in MSR_TEST_CTL. */
void detect_ac_split_lock(void)
{
@@ -44,3 +51,37 @@ void detect_ac_split_lock(void)
*/
wrmsrl(MSR_TEST_CTL, orig_val);
}
+
+/*
+ * #AC handler for split lock is called by generic #AC handler.
+ *
+ * On split lock in kernel, warn and disable #AC for split lock on current CPU.
+ *
+ * On split lock in user process, send SIGBUS in the generic #AC handler.
+ */
+bool do_ac_split_lock(struct pt_regs *regs)
+{
+ /* Generic #AC handler will handle split lock in user. */
+ if (user_mode(regs))
+ return false;
+
+ /* Clear the split lock bit to disable the feature on local CPU. */
+ msr_clear_bit(MSR_TEST_CTL, MSR_TEST_CTL_ENABLE_AC_SPLIT_LOCK_SHIFT);
+
+ WARN_ONCE(1, "A split lock issue is detected. Please FIX it\n");
+
+ return true;
+}
+
+void setup_ac_split_lock(void)
+{
+ if (enable_ac_split_lock) {
+ msr_set_bit(MSR_TEST_CTL,
+ MSR_TEST_CTL_ENABLE_AC_SPLIT_LOCK_SHIFT);
+ pr_info_once("#AC for split lock is enabled\n");
+ } else {
+ msr_clear_bit(MSR_TEST_CTL,
+ MSR_TEST_CTL_ENABLE_AC_SPLIT_LOCK_SHIFT);
+ pr_info_once("#AC for split lock is disabled\n");
+ }
+}
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 18de4e35a4e5..ca4ef8325dfe 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -954,6 +954,8 @@ void __init setup_arch(char **cmdline_p)
parse_early_param();

detect_ac_split_lock();
+ /* Set up #AC for split lock at the earliest phase. */
+ setup_ac_split_lock();

if (efi_enabled(EFI_BOOT))
efi_memblock_x86_reserve_range();
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c2f7d1d2a5c3..d6b224e6284f 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -225,6 +225,9 @@ static void notrace start_secondary(void *unused)
#endif
load_current_idt();
cpu_init();
+
+ setup_ac_split_lock();
+
x86_cpuinit.early_percpu_clock_init();
preempt_disable();
smp_callin();
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index e6db475164ed..dd309a7b46bd 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -61,6 +61,7 @@
#include <asm/mpx.h>
#include <asm/vm86.h>
#include <asm/umip.h>
+#include <asm/cpu.h>

#ifdef CONFIG_X86_64
#include <asm/x86_init.h>
@@ -318,7 +319,36 @@ DO_ERROR(X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun",coprocessor_seg
DO_ERROR(X86_TRAP_TS, SIGSEGV, "invalid TSS", invalid_TSS)
DO_ERROR(X86_TRAP_NP, SIGBUS, "segment not present", segment_not_present)
DO_ERROR(X86_TRAP_SS, SIGBUS, "stack segment", stack_segment)
-DO_ERROR(X86_TRAP_AC, SIGBUS, "alignment check", alignment_check)
+
+dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code)
+{
+ unsigned int trapnr = X86_TRAP_AC;
+ char str[] = "alignment check";
+ int signr = SIGBUS;
+ siginfo_t info;
+ int ret;
+
+ RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
+
+ if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) !=
+ NOTIFY_STOP) {
+ /* #AC exception could be handled by split lock handler. */
+ ret = do_ac_split_lock(regs);
+ if (ret) {
+ cond_local_irq_enable(regs);
+
+ return;
+ }
+
+ cond_local_irq_enable(regs);
+ /*
+ * If not processed by split lock handler, go to generic
+ * #AC handler.
+ */
+ do_trap(trapnr, signr, str, regs, error_code,
+ fill_trap_info(regs, signr, trapnr, &info));
+ }
+}

#ifdef CONFIG_VMAP_STACK
__visible void __noreturn handle_stack_overflow(const char *message,
--
2.5.0


2018-06-29 14:41:11

by Fenghua Yu

Subject: [PATCH v2 1/4] x86/split_lock: Enumerate #AC exception for split locked access feature

The Alignment Check (#AC) exception for split lock is supported on Tremont
and future processors. We need to enumerate the feature on these processors.

Bit 29 in MSR TEST_CTL 0x33 can only be set on processors that support
the feature. On processors that do not support it, the bit is reserved
(i.e. cannot be set to one) or the MSR doesn't exist.

To detect the feature, attempt to set the bit in the MSR. If the write
succeeds, the feature is available. Otherwise, the feature is not
supported on this platform.

test_ctl.c is created to contain the majority of the split lock code.
Hopefully more features related to MSR_TEST_CTL will be added to this
file and will share code with the split lock support in the future.

More information on bit 29 and MSR TEST_CTL can be found in the latest
Intel Architecture Instruction Set Extensions and Future Features
Programming Reference.

Signed-off-by: Fenghua Yu <[email protected]>
---
arch/x86/include/asm/cpu.h | 1 +
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 4 ++++
arch/x86/kernel/cpu/Makefile | 1 +
arch/x86/kernel/cpu/test_ctl.c | 46 ++++++++++++++++++++++++++++++++++++++
arch/x86/kernel/setup.c | 2 ++
6 files changed, 55 insertions(+)
create mode 100644 arch/x86/kernel/cpu/test_ctl.c

diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index adc6cc86b062..bd64380d598b 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -40,4 +40,5 @@ int mwait_usable(const struct cpuinfo_x86 *);
unsigned int x86_family(unsigned int sig);
unsigned int x86_model(unsigned int sig);
unsigned int x86_stepping(unsigned int sig);
+void detect_ac_split_lock(void);
#endif /* _ASM_X86_CPU_H */
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 5701f5cecd31..c1bc06542e78 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -219,6 +219,7 @@
#define X86_FEATURE_IBPB ( 7*32+26) /* Indirect Branch Prediction Barrier */
#define X86_FEATURE_STIBP ( 7*32+27) /* Single Thread Indirect Branch Predictors */
#define X86_FEATURE_ZEN ( 7*32+28) /* "" CPU is AMD family 0x17 (Zen) */
+#define X86_FEATURE_AC_SPLIT_LOCK ( 7*32+29) /* #AC exception for split locked access */

/* Virtualization flags: Linux defined, word 8 */
#define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 68b2c3150de1..7b9496850370 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -39,6 +39,10 @@

/* Intel MSRs. Some also available on other CPUs */

+#define MSR_TEST_CTL 0x00000033
+#define MSR_TEST_CTL_ENABLE_AC_SPLIT_LOCK_SHIFT 29
+#define MSR_TEST_CTL_ENABLE_AC_SPLIT_LOCK BIT(29)
+
#define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */
#define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */
#define SPEC_CTRL_STIBP (1 << 1) /* Single Thread Indirect Branch Predictors */
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 347137e80bf5..22a3fa26bfd4 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_MICROCODE) += microcode/
obj-$(CONFIG_X86_LOCAL_APIC) += perfctr-watchdog.o

obj-$(CONFIG_HYPERVISOR_GUEST) += vmware.o hypervisor.o mshyperv.o
+obj-y += test_ctl.o

ifdef CONFIG_X86_FEATURE_NAMES
quiet_cmd_mkcapflags = MKCAP $@
diff --git a/arch/x86/kernel/cpu/test_ctl.c b/arch/x86/kernel/cpu/test_ctl.c
new file mode 100644
index 000000000000..af1822469c94
--- /dev/null
+++ b/arch/x86/kernel/cpu/test_ctl.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Enable #AC exception for split locked accesses in TEST_CTL MSR
+ *
+ * Copyright (C) 2018 Intel Corporation
+ *
+ * Author:
+ * Fenghua Yu <[email protected]>
+ */
+#include <linux/printk.h>
+#include <linux/cpufeature.h>
+#include <asm/msr.h>
+
+/* Detect feature of #AC for split lock by probing bit 29 in MSR_TEST_CTL. */
+void detect_ac_split_lock(void)
+{
+ u64 val, orig_val;
+ int ret;
+
+ /* Attempt to read the MSR. If the MSR doesn't exist, reading fails. */
+ ret = rdmsrl_safe(MSR_TEST_CTL, &val);
+ if (ret)
+ return;
+
+ orig_val = val;
+
+ /* Turn on the split lock bit */
+ val |= MSR_TEST_CTL_ENABLE_AC_SPLIT_LOCK;
+
+ /*
+ * Attempt to set bit 29 in the MSR. The bit is set successfully
+ * only on processors that support #AC for split lock.
+ */
+ ret = wrmsrl_safe(MSR_TEST_CTL, val);
+ if (ret)
+ return;
+
+ /* The feature is supported on CPU. */
+ setup_force_cpu_cap(X86_FEATURE_AC_SPLIT_LOCK);
+
+ /*
+ * Need to restore split lock setting to original firmware setting
+ * before leaving.
+ */
+ wrmsrl(MSR_TEST_CTL, orig_val);
+}
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 2f86d883dd95..18de4e35a4e5 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -953,6 +953,8 @@ void __init setup_arch(char **cmdline_p)

parse_early_param();

+ detect_ac_split_lock();
+
if (efi_enabled(EFI_BOOT))
efi_memblock_x86_reserve_range();
#ifdef CONFIG_MEMORY_HOTPLUG
--
2.5.0


2018-06-29 14:57:43

by Dave Hansen

Subject: Re: [PATCH v2 1/4] x86/split_lock: Enumerate #AC exception for split locked access feature

On 06/29/2018 07:33 AM, Fenghua Yu wrote:
> +/* Detect feature of #AC for split lock by probing bit 29 in MSR_TEST_CTL. */
> +void detect_ac_split_lock(void)
> +{
> + u64 val, orig_val;
> + int ret;
> +
> + /* Attempt to read the MSR. If the MSR doesn't exist, reading fails. */
> + ret = rdmsrl_safe(MSR_TEST_CTL, &val);
> + if (ret)
> + return;

This is a bit fast and loose with how the feature is detected, which
might be OK, but can we call out why we are doing this, please?

Is this MSR not really model-specific? Is it OK to go poking at it on
all x86 variants? Or, do we at _least_ need a check for Intel cpus in here?

2018-06-29 16:28:11

by Fenghua Yu

Subject: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

set_cpu_cap() uses a locked BTS and clear_cpu_cap() uses a locked BTR to
operate on the bitmap defined in x86_capability.

A locked BTS/BTR accesses a single unsigned long location. In 64-bit mode,
the location is at:
base address of x86_capability + (bit offset in x86_capability / 64) * 8

Since the base address of x86_capability may not be aligned to unsigned
long, that single unsigned long location may cross two cache lines, and
accessing the location with locked BTS/BTR instructions will trigger #AC.

To fix the split lock issue, align x86_capability to the size of unsigned
long so that the location is always within one cache line.

Signed-off-by: Fenghua Yu <[email protected]>
---
arch/x86/include/asm/processor.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index cfd29ee8c3da..5ad6ba4657a2 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -105,7 +105,8 @@ struct cpuinfo_x86 {
__u32 extended_cpuid_level;
/* Maximum supported CPUID level, -1=no CPUID: */
int cpuid_level;
- __u32 x86_capability[NCAPINTS + NBUGINTS];
+ __u32 x86_capability[NCAPINTS + NBUGINTS]
+ __aligned(sizeof(unsigned long));
char x86_vendor_id[16];
char x86_model_id[64];
/* in KB - valid for CPUS which support this call: */
--
2.5.0


2018-06-29 17:22:36

by Dave Hansen

Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On 06/29/2018 07:33 AM, Fenghua Yu wrote:
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -105,7 +105,8 @@ struct cpuinfo_x86 {
> __u32 extended_cpuid_level;
> /* Maximum supported CPUID level, -1=no CPUID: */
> int cpuid_level;
> - __u32 x86_capability[NCAPINTS + NBUGINTS];
> + __u32 x86_capability[NCAPINTS + NBUGINTS]
> + __aligned(sizeof(unsigned long));
> char x86_vendor_id[16];
> char x86_model_id[64];
> /* in KB - valid for CPUS which support this call: */

This is begging for comments.

2018-06-29 17:33:04

by Thomas Gleixner

Subject: Re: [PATCH v2 1/4] x86/split_lock: Enumerate #AC exception for split locked access feature

On Fri, 29 Jun 2018, Dave Hansen wrote:
> On 06/29/2018 07:33 AM, Fenghua Yu wrote:
> > +/* Detect feature of #AC for split lock by probing bit 29 in MSR_TEST_CTL. */
> > +void detect_ac_split_lock(void)
> > +{
> > + u64 val, orig_val;
> > + int ret;
> > +
> > + /* Attempt to read the MSR. If the MSR doesn't exist, reading fails. */
> > + ret = rdmsrl_safe(MSR_TEST_CTL, &val);
> > + if (ret)
> > + return;
>
> This is a bit fast and loose with how the feature is detected, which
> might be OK, but can we call out why we are doing this, please?
>
> Is this MSR not really model-specific? Is it OK to go poking at it on
> all x86 variants? Or, do we at _least_ need a check for Intel cpus in here?

That definitely needs a vendor check. Also the whole code needs to be
compiled out if CONFIG_INTEL=n.

Aside of that this wants to be enumerated. CPUID or MISC_FEATURES and not
this guess work detection logic. Why do I have to ask for that for every
other new feature thingy?

And pretty please, can we either stick that stuff into cpu/intel.c or if it
really needs a separate file avoid that horrible test_ctl.c file name?

Thanks,

tglx


2018-06-29 17:33:14

by Dave Hansen

Subject: Re: [PATCH v2 3/4] x86/split_lock: Handle #AC exception for split lock

> +/*
> + * #AC handler for split lock is called by generic #AC handler.
> + *
> + * On split lock in kernel, warn and disable #AC for split lock on current CPU.
> + *
> + * On split lock in user process, send SIGBUS in the generic #AC handler.
> + */

Don't comment the function, comment the code, please. The thing that
needs to be here that is missing is what the return values mean.

> +bool do_ac_split_lock(struct pt_regs *regs)
> +{
> + /* Generic #AC handler will handle split lock in user. */
> + if (user_mode(regs))
> + return false;
> +
> + /* Clear the split lock bit to disable the feature on local CPU. */
> + msr_clear_bit(MSR_TEST_CTL, MSR_TEST_CTL_ENABLE_AC_SPLIT_LOCK_SHIFT);
> +
> + WARN_ONCE(1, "A split lock issue is detected. Please FIX it\n");

I think folks understand that warnings need to get fixed. They don't
need to be urged to FIX IT IN CAPS, or asked nicely. This can simply be:

"lock split across cacheline boundary"

But, warning here is also not super useful. Shouldn't we be dumping out
the info in 'regs' instead of the current context? We don't care about
the state in the #AC handler, we care about 'regs'.


> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 18de4e35a4e5..ca4ef8325dfe 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -954,6 +954,8 @@ void __init setup_arch(char **cmdline_p)
> parse_early_param();
>
> detect_ac_split_lock();

This ^ needs:

/* Do detection only on the boot cpu. */

> + /* Set up #AC for split lock at the earliest phase. */
> + setup_ac_split_lock();
>
> if (efi_enabled(EFI_BOOT))
> efi_memblock_x86_reserve_range();
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index c2f7d1d2a5c3..d6b224e6284f 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -225,6 +225,9 @@ static void notrace start_secondary(void *unused)
> #endif
> load_current_idt();
> cpu_init();
> +
> + setup_ac_split_lock();

and:

/* Feature detection was done on the boot cpu, only do setup */

> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index e6db475164ed..dd309a7b46bd 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -61,6 +61,7 @@
> #include <asm/mpx.h>
> #include <asm/vm86.h>
> #include <asm/umip.h>
> +#include <asm/cpu.h>
>
> #ifdef CONFIG_X86_64
> #include <asm/x86_init.h>
> @@ -318,7 +319,36 @@ DO_ERROR(X86_TRAP_OLD_MF, SIGFPE, "coprocessor segment overrun",coprocessor_seg
> DO_ERROR(X86_TRAP_TS, SIGSEGV, "invalid TSS", invalid_TSS)
> DO_ERROR(X86_TRAP_NP, SIGBUS, "segment not present", segment_not_present)
> DO_ERROR(X86_TRAP_SS, SIGBUS, "stack segment", stack_segment)
> -DO_ERROR(X86_TRAP_AC, SIGBUS, "alignment check", alignment_check)
> +
> +dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code)
> +{
> + unsigned int trapnr = X86_TRAP_AC;
> + char str[] = "alignment check";
> + int signr = SIGBUS;
> + siginfo_t info;
> + int ret;
> +
> + RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> +
> + if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) !=
> + NOTIFY_STOP) {

There does not seem to be a _lot_ of consistency for whether
notify_die() gets called before or after kernel exceptions are handled.
Why did you choose to do it this way?

Also, please unindent this block and just return if notify_die() returns
false.

> + /* #AC exception could be handled by split lock handler. */
> + ret = do_ac_split_lock(regs);
> + if (ret) {
> + cond_local_irq_enable(regs);
> +
> + return;
> + }
> +
> + cond_local_irq_enable(regs);

FWIW, you can consolidate the cond_local_irq_enable() calls:

ret = do_ac_split_lock(regs);
cond_local_irq_enable(regs);
if (ret)
return;

...


2018-06-29 17:33:33

by Dave Hansen

Subject: Re: [PATCH v2 4/4] x86/split_lock: Disable #AC for split locked accesses

On 06/29/2018 07:33 AM, Fenghua Yu wrote:
> + ac_split_lock_off [X86]
> + Disable #AC exception for split locked accesses.
> + By default, this feature is enabled in kernel.

FWIW, I detest the "ac_" nomenclature this patch spreads around. It's
marginally passable for in-kernel use, but I'd rather see it stopped
where it reflects out to user-visible places. Let's give this a logical
name, like:

split_lock_detect=off


2018-06-29 17:33:52

by Dave Hansen

Subject: Re: [PATCH v2 1/4] x86/split_lock: Enumerate #AC exception for split locked access feature

On 06/29/2018 09:23 AM, Thomas Gleixner wrote:
> And pretty please, can we either stick that stuff into cpu/intel.c or if it
> really needs a separate file avoid that horrible test_ctl.c file name?

Fenghua, I also asked for this earlier. It would be really awesome if
you could go through and double-check that you have not missed any more
review comments before you post this again.

2018-06-29 17:39:47

by Fenghua Yu

Subject: Re: [PATCH v2 3/4] x86/split_lock: Handle #AC exception for split lock

On Fri, Jun 29, 2018 at 09:33:54AM -0700, Luck, Tony wrote:
> >> + WARN_ONCE(1, "A split lock issue is detected. Please FIX it\n");
> >
> > But, warning here is also not super useful. Shouldn't we be dumping out
> > the info in 'regs' instead of the current context? We don't care about
> > the state in the #AC handler, we care about 'regs'.

But WARN dumps not only the state in the #AC handler, but also the regs
in the current context. And WARN dumps the stack.

>
> Maybe:
>
> WARN_ONCE(1, "split lock detected at %pF\n", regs[EIP]);

Should we dump redundant regs info while WARN shows them all already?

Thanks.

-Fenghua

2018-06-29 17:41:01

by Dave Hansen

Subject: Re: [PATCH v2 3/4] x86/split_lock: Handle #AC exception for split lock

On 06/29/2018 10:16 AM, Fenghua Yu wrote:
> On Fri, Jun 29, 2018 at 09:33:54AM -0700, Luck, Tony wrote:
>>>> + WARN_ONCE(1, "A split lock issue is detected. Please FIX it\n");
>>>
>>> But, warning here is also not super useful. Shouldn't we be dumping out
>>> the info in 'regs' instead of the current context? We don't care about
>>> the state in the #AC handler, we care about 'regs'.
>
> But WARN dumps not only the state in the #AC handler, but also the regs
> in the current context. And WARN dumps the stack.

Oh, I forgot about the fancy stack following. That might give us useful
output, although mixed with useless output about the #AC handler.

But, in any case, could you please at least confirm that this does what
you think it does? *Actually* generate #AC inside the kernel, with this
code, and share the output?

>> Maybe:
>>
>> WARN_ONCE(1, "split lock detected at %pF\n", regs[EIP]);
>
> Should we dump redundant regs info while WARN shows them all already?

I bet it actually makes it easier to read the output and locate the real
source of the problem. It's especially important if you're going to do
the WARN_() from the #AC handler with all the #AC information as noise
in the warning.

2018-06-29 17:43:54

by Fenghua Yu

Subject: Re: [PATCH v2 3/4] x86/split_lock: Handle #AC exception for split lock

On Fri, Jun 29, 2018 at 10:29:08AM -0700, Dave Hansen wrote:
> On 06/29/2018 10:16 AM, Fenghua Yu wrote:
> > On Fri, Jun 29, 2018 at 09:33:54AM -0700, Luck, Tony wrote:
> >>>> + WARN_ONCE(1, "A split lock issue is detected. Please FIX it\n");
> >>>
> >>> But, warning here is also not super useful. Shouldn't we be dumping out
> >>> the info in 'regs' instead of the current context? We don't care about
> >>> the state in the #AC handler, we care about 'regs'.
> >
> > But WARN dumps not only the state in the #AC handler, but also the regs
> > in the current context. And WARN dumps the stack.
>
> Oh, I forgot about the fancy stack following. That might give us useful
> output, although mixed with useless output about the #AC handler.

The useful split lock info follows useless #AC handler info.

>
> But, in any case, could you please at least confirm that this does what
> you think it does? *Actually* generate #AC inside the kernel, with this
> code, and share the output?

Actually I did that already. Below is the dumped info for an actual kernel
split lock:

[41652.528398] ------------[ cut here ]------------
[41652.528408] A split lock issue is detected. Please FIX it
[41652.528422] WARNING: CPU: 5 PID: 6873 at arch/x86/kernel/cpu/test_ctl.c:71 do_ac_split_lock+0x3a/0x50
[41652.528432] Modules linked in: test_split_lock_drv(PO+) efivarfs
[41652.528445] CPU: 5 PID: 6873 Comm: insmod Tainted: P W O 4.18.0-rc2+ #60
[41652.528456] Hardware name: Intel Corporation WHITLEY/WHITLEY, BIOS WLYDCRB1.86B.0005.D60.1802040214 02/04/2018
[41652.528469] RIP: 0010:do_ac_split_lock+0x3a/0x50
[41652.528479] Code: 00 00 bf 33 00 00 00 e8 44 8e 36 00 0f b6 05 6e 32 4f 01 84 c0 75 e4 48 c7 c7 68 52 a0 97 c6 05 5c 32 4f 01 01 e8 16 0f 02 00 <0f> 0b b8 01 00 00 00 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00
[41652.528518] RSP: 0018:ffffa979c13e7a60 EFLAGS: 00010082
[41652.528530] RAX: 0000000000000000 RBX: ffffa979c13e7b18 RCX: 0000000000000006
[41652.528541] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffffa0323b755420
[41652.528552] RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000000056d
[41652.528563] R10: 0000000000000000 R11: 000000000000056d R12: 0000000000000000
[41652.528574] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[41652.528586] FS: 00007fa1ee559700(0000) GS:ffffa0323b740000(0000) knlGS:0000000000000000
[41652.528597] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[41652.528608] CR2: 000055ba38138518 CR3: 0000000177138006 CR4: 00000000007606e0
[41652.528619] PKRU: 55555554
[41652.528629] Call Trace:
[41652.528642] do_alignment_check+0x79/0x100
[41652.528654] ? sched_clock_cpu+0xc/0xa0
[41652.528665] ? up+0xd/0x50
[41652.528678] alignment_check+0x1e/0x30
[41652.528690] RIP: 0010:init_module+0x46/0x9a [test_split_lock_drv]
[41652.528701] Code: b5 4d ff ff ff 48 81 ec e8 00 00 00 c7 85 4d ff ff ff 00 00 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 c8 31 c0 e8 23 05 6b d6 <f0> 0f ba ad 4d ff ff ff 00 83 bd 4d ff ff ff 01 74 2f 48 c7 c7 77
[41652.528740] RSP: 0018:ffffa979c13e7bc0 EFLAGS: 00010282
[41652.528752] RAX: 0000000000000036 RBX: ffffffffc01fd000 RCX: 0000000000000006
[41652.528763] RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffffa0323b755420
[41652.528774] RBP: ffffa979c13e7cb0 R08: 0000000000000001 R09: 000000000000056b
[41652.528785] R10: ffffa979c13e7cd0 R11: 000000000000056b R12: ffffa03237b29920
[41652.528796] R13: 0000000000000000 R14: ffffa979c13e7ea0 R15: ffffa032351984e0
[41652.528808] ? 0xffffffffc01fd000
[41652.528821] ? __wake_up_common_lock+0x84/0xb0
[41652.528834] ? netlink_broadcast_filtered+0x132/0x3f0
[41652.528846] ? do_one_initcall+0x45/0x1b6
[41652.528857] ? 0xffffffffc01fd000
[41652.528869] do_one_initcall+0x45/0x1b6
[41652.528881] ? _cond_resched+0x11/0x40
[41652.528893] ? kmem_cache_alloc_trace+0x36/0x170
[41652.528905] do_init_module+0x56/0x1f0
[41652.528918] load_module+0x206a/0x2830
[41652.528930] ? vfs_read+0x10f/0x130
[41652.528941] ? vfs_read+0x10f/0x130
[41652.528954] ? __do_sys_finit_module+0xce/0xe0
[41652.528966] __do_sys_finit_module+0xce/0xe0
[41652.528979] do_syscall_64+0x39/0xe0
[41652.528991] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[41652.529003] RIP: 0033:0x7fa1ee075c19
[41652.529013] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4f 22 2c 00 f7 d8 64 89 01 48
[41652.529052] RSP: 002b:00007ffdd9bc74c8 EFLAGS: 00000202 ORIG_RAX: 0000000000000139
[41652.529064] RAX: ffffffffffffffda RBX: 000055ba381381f0 RCX: 00007fa1ee075c19
[41652.529075] RDX: 0000000000000000 RSI: 000055ba371d1246 RDI: 0000000000000003
[41652.529086] RBP: 000055ba371d1246 R08: 0000000000000000 R09: 00007fa1ee33af00
[41652.529098] R10: 0000000000000003 R11: 0000000000000202 R12: 0000000000000000
[41652.529109] R13: 000055ba38137130 R14: 0000000000000000 R15: 0000000000000000
[41652.529120] ---[ end trace ae353afe85d7514d ]---
[41652.529205] split lock kernel test passes

>
> >> Maybe:
> >>
> >> WARN_ONCE(1, "split lock detected at %pF\n", regs[EIP]);
> >
> > Should we dump redundant regs info while WARN shows them all already?
>
> I bet it actually makes it easier to read the output and locate the real
> source of the problem. It's especially important if you're going to do
> the WARN_() from the #AC handler with all the #AC information as noise
> in the warning.

The WARN info above is sufficient for people to know that a split lock
happened and where it happened.

Maybe remove "please FIX it" and just:
WARN_ONCE(1, "split lock detected\n");
?

Thanks.

-Fenghua

2018-06-29 17:49:00

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] x86/split_lock: Handle #AC exception for split lock

On 06/29/2018 10:39 AM, Fenghua Yu wrote:
> On Fri, Jun 29, 2018 at 10:29:08AM -0700, Dave Hansen wrote:
>> On 06/29/2018 10:16 AM, Fenghua Yu wrote:
>>> On Fri, Jun 29, 2018 at 09:33:54AM -0700, Luck, Tony wrote:
>>>>>> + WARN_ONCE(1, "A split lock issue is detected. Please FIX it\n");
>>>>> But, warning here is also not super useful. Shouldn't we be dumping out
>>>>> the info in 'regs' instead of the current context? We don't care about
>>>>> the state in the #AC handler, we care about 'regs'.
>>> But WARN dump not only the state in the #AC handler, but also dump the regs
>>> in the current context. And WARN dumps stack.
>> Oh, I forgot about the fancy stack following. That might give us useful
>> output, although mixed with useless output about the #AC handler.
> The useful split lock info follows useless #AC handler info.

Can we do better than this?

Why don't I see the page fault handler's stack in oopses that result
from a page fault, for instance?


2018-06-29 19:17:24

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On Fri, Jun 29, 2018 at 06:35:39PM +0200, Thomas Gleixner wrote:
> On Fri, 29 Jun 2018, Dave Hansen wrote:
>
> Plus what enforces proper alignment for the other capability related
> u32 arrays?

Do you want me to enforce unsigned long alignment for everything that is
accessed by locked BTS/BTR? There are quite a few such places.
x86_capability is one that is hit at boot time. I tried to find split
lock issues in (very limited) tests, but so far I have only found the
issue in x86_capability.

Or do you think we can push the patches upstream to allow broader testing
to find and fix the issues?

Thanks.

-Fenghua



2018-06-29 19:30:22

by Luck, Tony

[permalink] [raw]
Subject: RE: [PATCH v2 3/4] x86/split_lock: Handle #AC exception for split lock

>> + WARN_ONCE(1, "A split lock issue is detected. Please FIX it\n");
>
> But, warning here is also not super useful. Shouldn't we be dumping out
> the info in 'regs' instead of the current context? We don't care about
> the state in the #AC handler, we care about 'regs'.

Maybe:

WARN_ONCE(1, "split lock detected at %pF\n", regs[EIP]);

-Tony

2018-06-29 19:30:30

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On Fri, 29 Jun 2018, Dave Hansen wrote:

> On 06/29/2018 07:33 AM, Fenghua Yu wrote:
> > --- a/arch/x86/include/asm/processor.h
> > +++ b/arch/x86/include/asm/processor.h
> > @@ -105,7 +105,8 @@ struct cpuinfo_x86 {
> > __u32 extended_cpuid_level;
> > /* Maximum supported CPUID level, -1=no CPUID: */
> > int cpuid_level;
> > - __u32 x86_capability[NCAPINTS + NBUGINTS];
> > + __u32 x86_capability[NCAPINTS + NBUGINTS]
> > + __aligned(sizeof(unsigned long));
> > char x86_vendor_id[16];
> > char x86_model_id[64];
> > /* in KB - valid for CPUS which support this call: */
>
> This is begging for comments.

Right and this patch wants to be the first in the series as it fixes an
existing issue and can be picked up independently of the rest.

Plus what enforces proper alignment for the other capability related
u32 arrays?

Thanks,

tglx

2018-06-29 21:16:26

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On Fri, 29 Jun 2018, Fenghua Yu wrote:

> On Fri, Jun 29, 2018 at 06:35:39PM +0200, Thomas Gleixner wrote:
> > On Fri, 29 Jun 2018, Dave Hansen wrote:
> >
> > Plus what enforces proper alignment for the other capability related
> > u32 arrays?
>
> Do you want me to enforce unsigned long alignment for all that are used by
> locked BTS/BTR?

If there are variables which might be unaligned and accessed with locked
instructions and you have them identified, then why are you asking whether
they should be fixed?

Ignoring them because they do not trigger #AC right now, is only the
correct answer if you are a follower of the 'works by chance' cult.

Yeah, I know that most of this industry just works by chance....

> Or you think we can push the patches upstream to allow broad test to find
> and fix the issues?

And all testers have access to the emulator running the design of the
silicon with that new feature which will be released in a year from now?

Aside of that we merge the patches when they are ready and done. And AFAICT
there is enough homework to be finished before that.

Thanks,

tglx



2018-06-29 21:20:04

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On 06/29/2018 01:38 PM, Fenghua Yu wrote:
> How to handle data that is used in generic code which can be used on
> non-Intel platform? For exmple, if I do this change for struct efi in
> include/linux/efi.h because set_bit() sets bits in efi.flags:
> - unsigned long flags;
> + unsigned long flags __aligned(unsigned long);
> } efi;
>
> People may argue that the alignment unnecessarily increases size of 'efi'
> on non-Intel platform which doesn't have split lock issue. Do we care this
> argument?

Unaligned memory accesses are bad, pretty much universally. This is a
general good practice that we should have been doing anyway. Let folks
complain. Don't let it stop you.

Also, look at the size of that structure. Look at how many pointers it
has. Do you think *anyone* is going to complain about an extra 4 bytes
in a 400-byte structure?

> Another question, there will be a bunch of one-line changes for
> the alignment (i.e. adding __aligned(unsigned long)) in various files.
> Will the changes be put in one big patch or in separate one-liner patches?

Just group them logically.

2018-06-29 22:41:32

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On Fri, Jun 29, 2018 at 10:08:59PM +0200, Thomas Gleixner wrote:
> On Fri, 29 Jun 2018, Fenghua Yu wrote:
>
> > On Fri, Jun 29, 2018 at 06:35:39PM +0200, Thomas Gleixner wrote:
> > > On Fri, 29 Jun 2018, Dave Hansen wrote:
> > >
> > > Plus what enforces proper alignment for the other capability related
> > > u32 arrays?
> >
> > Do you want me to enforce unsigned long alignment for all that are used by
> > locked BTS/BTR?
>
> If there are variables which might be unaligned and accessed with locked
> instructions and you have them identified, then why are you asking whether
> they should be fixed?
>
> Ignoring them because they do not trigger #AC right now, is only the
> correct answer if you are a follower of the 'works by chance' cult.
>
> Yeah, I know that most of this industry just works by chance....
>

Ok. I can work on fixing alignment for these instructions in next version.

How should I handle data that is used in generic code which can run on
non-Intel platforms? For example, if I do this change for struct efi in
include/linux/efi.h, because set_bit() sets bits in efi.flags:
- unsigned long flags;
+ unsigned long flags __aligned(unsigned long);
} efi;

People may argue that the alignment unnecessarily increases the size of
'efi' on non-Intel platforms, which don't have the split lock issue. Do
we care about this argument?

Another question: there will be a bunch of one-line changes for
the alignment (i.e. adding __aligned(unsigned long)) in various files.
Should the changes be put in one big patch or in separate one-liner patches?

Thanks.

-Fenghua

2018-06-29 22:44:16

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On Fri, Jun 29, 2018 at 01:48:45PM -0700, Dave Hansen wrote:
> On 06/29/2018 01:38 PM, Fenghua Yu wrote:
> > How to handle data that is used in generic code which can be used on
> > non-Intel platform? For exmple, if I do this change for struct efi in
> > include/linux/efi.h because set_bit() sets bits in efi.flags:
> > - unsigned long flags;
> > + unsigned long flags __aligned(unsigned long);
> > } efi;
> >
> > People may argue that the alignment unnecessarily increases size of 'efi'
> > on non-Intel platform which doesn't have split lock issue. Do we care this
> > argument?
>
> Unaligned memory accesses are bad, pretty much universally. This is a
> general good practice that we should have been doing anyway. Let folks
> complain. Don't let it stop you.
>
> Also, look at the size of that structure. Look at how many pointers it
> has. Do you think *anyone* is going to complain about an extra 4 bytes
> in a 400-byte structure?
>
> > Another question, there will be a bunch of one-line changes for
> > the alignment (i.e. adding __aligned(unsigned long)) in various files.
> > Will the changes be put in one big patch or in separate one-liner patches?
>
> Just group them logically.

Sure.

Thanks.

-Fenghua

2018-06-30 00:35:58

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On Fri, 29 Jun 2018, Dave Hansen wrote:

> On 06/29/2018 01:38 PM, Fenghua Yu wrote:
> > How to handle data that is used in generic code which can be used on
> > non-Intel platform? For exmple, if I do this change for struct efi in
> > include/linux/efi.h because set_bit() sets bits in efi.flags:
> > - unsigned long flags;
> > + unsigned long flags __aligned(unsigned long);
> > } efi;
> >
> > People may argue that the alignment unnecessarily increases size of 'efi'
> > on non-Intel platform which doesn't have split lock issue. Do we care this
> > argument?
>
> Unaligned memory accesses are bad, pretty much universally. This is a
> general good practice that we should have been doing anyway. Let folks
> complain. Don't let it stop you.
>
> Also, look at the size of that structure. Look at how many pointers it
> has. Do you think *anyone* is going to complain about an extra 4 bytes
> in a 400-byte structure?

But in the above case the compiler already does the right thing. Why?
Because struct members are aligned to their natural alignment unless the
struct is explicitly marked 'packed'; only then does the programmer have
to take care of the alignment.

Just look at it with pahole:

struct efi_memory_map memmap; /* 280 56 */

/* XXX last struct has 7 bytes of padding */

/* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
long unsigned int flags; /* 336 8 */

The issue with the capability arrays is that their data type is u32, which
has a natural alignment of 4 bytes, while unsigned long has 8 bytes on 64-bit.

So just slapping blindly aligned(unsigned long) to anything which is
accessed by locked instructions is pointless.

Thanks,

tglx





2018-06-30 00:59:38

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On Fri, Jun 29, 2018 at 05:00:51PM -0700, Fenghua Yu wrote:
> On Fri, Jun 29, 2018 at 11:44:44PM +0200, Thomas Gleixner wrote:
> > On Fri, 29 Jun 2018, Dave Hansen wrote:
> >
> > > On 06/29/2018 01:38 PM, Fenghua Yu wrote:
> > > > How to handle data that is used in generic code which can be used on
> > > > non-Intel platform? For exmple, if I do this change for struct efi in
> > > > include/linux/efi.h because set_bit() sets bits in efi.flags:
> > > > - unsigned long flags;
> > > > + unsigned long flags __aligned(unsigned long);
> > > > } efi;
> > > >
> > > > People may argue that the alignment unnecessarily increases size of 'efi'
> > > > on non-Intel platform which doesn't have split lock issue. Do we care this
> > > > argument?
> > >
> > > Unaligned memory accesses are bad, pretty much universally. This is a
> > > general good practice that we should have been doing anyway. Let folks
> > > complain. Don't let it stop you.
> > >
> > > Also, look at the size of that structure. Look at how many pointers it
> > > has. Do you think *anyone* is going to complain about an extra 4 bytes
> > > in a 400-byte structure?
> >
> > But in the above case the compiler does already the right thing. Why?
> > Because struct members are aligned to their natural alignment unless the
> > struct is explicitely marked 'packed'. In that case the programmer has to
> > take care of the alignment.
> >
> > Just look at it with pahole:
> >
> > struct efi_memory_map memmap; /* 280 56 */
> >
> > /* XXX last struct has 7 bytes of padding */
> >
> > /* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
> > long unsigned int flags; /* 336 8 */
> >
> > The issue with the capability arrays is that the data type is u32 which has
> > the natural alignment of 4 byte, while unsigned long has 8 byte on 64bit.
> >
> > So just slapping blindly aligned(unsigned long) to anything which is
> > accessed by locked instructions is pointless.
> >
>
> Thank you for you education!
>
> Below is part of the future patches that are supposed to fix more potential
> split lock issues.
>
> Could you please take a look and see if the changes are in the
> right direction before I move further?
>
> diff --git a/arch/x86/boot/cpuflags.h b/arch/x86/boot/cpuflags.h

Please ignore the patch in my last email because of some obvious stupid
mistakes. Sorry about that.

Instead, could you please take a look at the following patch and see if
the changes are in the right direction before I move further?

diff --git a/arch/x86/boot/cpuflags.h b/arch/x86/boot/cpuflags.h
index 2e20814d3ce3..29de0ff74351 100644
--- a/arch/x86/boot/cpuflags.h
+++ b/arch/x86/boot/cpuflags.h
@@ -9,7 +9,7 @@ struct cpu_features {
int level; /* Family, or 64 for x86-64 */
int family; /* Family, always */
int model;
- u32 flags[NCAPINTS];
+ u32 flags[NCAPINTS] __aligned(sizeof(unsigned long));
};

extern struct cpu_features cpu;
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 8c7b3e5a2d01..444a2275c1f8 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -133,7 +133,7 @@ struct mce_log_buffer {
char signature[12]; /* "MACHINECHECK" */
unsigned len; /* = MCE_LOG_LEN */
unsigned next;
- unsigned flags;
+ unsigned flags __aligned(sizeof(unsigned long));
unsigned recordlen; /* length of struct mce */
struct mce entry[MCE_LOG_LEN];
};
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index eb4cb3efd20e..e6a28163e905 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -488,8 +488,8 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c)
return NULL; /* Not found */
}

-__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS];
-__u32 cpu_caps_set[NCAPINTS + NBUGINTS];
+__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
+__u32 cpu_caps_set[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));

void load_percpu_segment(int cpu)
{

Thanks.

-Fenghua

2018-06-30 01:53:43

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On Fri, Jun 29, 2018 at 11:44:44PM +0200, Thomas Gleixner wrote:
> On Fri, 29 Jun 2018, Dave Hansen wrote:
>
> > On 06/29/2018 01:38 PM, Fenghua Yu wrote:
> > > How to handle data that is used in generic code which can be used on
> > > non-Intel platform? For exmple, if I do this change for struct efi in
> > > include/linux/efi.h because set_bit() sets bits in efi.flags:
> > > - unsigned long flags;
> > > + unsigned long flags __aligned(unsigned long);
> > > } efi;
> > >
> > > People may argue that the alignment unnecessarily increases size of 'efi'
> > > on non-Intel platform which doesn't have split lock issue. Do we care this
> > > argument?
> >
> > Unaligned memory accesses are bad, pretty much universally. This is a
> > general good practice that we should have been doing anyway. Let folks
> > complain. Don't let it stop you.
> >
> > Also, look at the size of that structure. Look at how many pointers it
> > has. Do you think *anyone* is going to complain about an extra 4 bytes
> > in a 400-byte structure?
>
> But in the above case the compiler does already the right thing. Why?
> Because struct members are aligned to their natural alignment unless the
> struct is explicitely marked 'packed'. In that case the programmer has to
> take care of the alignment.
>
> Just look at it with pahole:
>
> struct efi_memory_map memmap; /* 280 56 */
>
> /* XXX last struct has 7 bytes of padding */
>
> /* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
> long unsigned int flags; /* 336 8 */
>
> The issue with the capability arrays is that the data type is u32 which has
> the natural alignment of 4 byte, while unsigned long has 8 byte on 64bit.
>
> So just slapping blindly aligned(unsigned long) to anything which is
> accessed by locked instructions is pointless.
>

Thank you for the education!

Below is part of the future patches that are supposed to fix more potential
split lock issues.

Could you please take a look and see if the changes are in the
right direction before I move further?

diff --git a/arch/x86/boot/cpuflags.h b/arch/x86/boot/cpuflags.h
index 2e20814d3ce3..ca62c3784f9a 100644
--- a/arch/x86/boot/cpuflags.h
+++ b/arch/x86/boot/cpuflags.h
@@ -9,7 +9,7 @@ struct cpu_features {
int level; /* Family, or 64 for x86-64 */
int family; /* Family, always */
int model;
- u32 flags[NCAPINTS];
+ u32 flags[NCAPINTS] __aligned(unsigned long);
};

extern struct cpu_features cpu;
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 8c7b3e5a2d01..24eac32b039d 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -133,7 +133,7 @@ struct mce_log_buffer {
char signature[12]; /* "MACHINECHECK" */
unsigned len; /* = MCE_LOG_LEN */
unsigned next;
- unsigned flags;
+ unsigned flags __aligned(unsigned long);
unsigned recordlen; /* length of struct mce */
struct mce entry[MCE_LOG_LEN];
};
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index eb4cb3efd20e..fe681c695638 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -488,8 +488,8 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c)
return NULL; /* Not found */
}

-__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS];
-__u32 cpu_caps_set[NCAPINTS + NBUGINTS];
+__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS] __aligned(unsigned long);
+__u32 cpu_caps_set[NCAPINTS + NBUGINTS] __aligned(unsigned long);

void load_percpu_segment(int cpu)
{
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 56add823f190..e1a3c17945b5 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -963,7 +963,7 @@ extern struct efi {
efi_reset_system_t *reset_system;
efi_set_virtual_address_map_t *set_virtual_address_map;
struct efi_memory_map memmap;
- unsigned long flags;
+ unsigned long flags __aligned(unsigned long);
} efi;

extern struct mm_struct efi_mm;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5c91108846db..30b1f173d3ca 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -885,7 +885,7 @@ struct file {
void *f_security;
#endif
/* needed for tty driver, and maybe others */
- void *private_data;
+ void *private_data __aligned(unsigned long);

#ifdef CONFIG_EPOLL
/* Used by fs/eventpoll.c to link all the hooks to this file */


2018-06-30 06:32:16

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On Fri, 29 Jun 2018, Fenghua Yu wrote:
> On Fri, Jun 29, 2018 at 05:00:51PM -0700, Fenghua Yu wrote:

> diff --git a/arch/x86/boot/cpuflags.h b/arch/x86/boot/cpuflags.h
> index 2e20814d3ce3..29de0ff74351 100644
> --- a/arch/x86/boot/cpuflags.h
> +++ b/arch/x86/boot/cpuflags.h
> @@ -9,7 +9,7 @@ struct cpu_features {
> int level; /* Family, or 64 for x86-64 */
> int family; /* Family, always */
> int model;
> - u32 flags[NCAPINTS];
> + u32 flags[NCAPINTS] __aligned(sizeof(unsigned long));

Lacks a comment WHY this needs the aligned() as Dave already said.

> };
>
> extern struct cpu_features cpu;
> diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
> index 8c7b3e5a2d01..444a2275c1f8 100644
> --- a/arch/x86/include/asm/mce.h
> +++ b/arch/x86/include/asm/mce.h
> @@ -133,7 +133,7 @@ struct mce_log_buffer {
> char signature[12]; /* "MACHINECHECK" */
> unsigned len; /* = MCE_LOG_LEN */
> unsigned next;
> - unsigned flags;
> + unsigned flags __aligned(sizeof(unsigned long));

And what's wrong with just making that flags field unsigned long and
moving it behind recordlen?

flags is at offset 20, so forcing alignment puts it at offset 24, creating
a 4-byte hole. Reordering it and making its type unsigned long just
fills the already existing hole and works without the stray aligned():

struct mce_log_buffer {
char signature[12]; /* 0 12 */
unsigned int len; /* 12 4 */
unsigned int next; /* 16 4 */
unsigned int flags; /* 20 4 */
unsigned int recordlen; /* 24 4 */

/* XXX 4 bytes hole, try to pack */

struct mce entry[32]; /* 32 3840 */

The point is that there is no requirement for flags to be unsigned int,
while the u32 type of the capability array is there for a reason.

So you need to analyse first whether the data type is required or whether
it can be changed to unsigned long, which is the right thing to do.

See?

Thanks,

tglx

2018-07-02 12:20:20

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On Fri, Jun 29, 2018 at 01:38:44PM -0700, Fenghua Yu wrote:
> include/linux/efi.h because set_bit() sets bits in efi.flags:
> - unsigned long flags;
> + unsigned long flags __aligned(unsigned long);
> } efi;

Help me out here; how is the above change not a complete no-op?

2018-07-02 14:14:01

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access

On Mon, Jul 02, 2018 at 02:18:59PM +0200, Peter Zijlstra wrote:
> On Fri, Jun 29, 2018 at 01:38:44PM -0700, Fenghua Yu wrote:
> > include/linux/efi.h because set_bit() sets bits in efi.flags:
> > - unsigned long flags;
> > + unsigned long flags __aligned(unsigned long);
> > } efi;
>
> Help me out here; how is the above change not a complete no-op?

You are right. Thomas pointed out that 'flags' is already naturally
aligned to unsigned long because its type is unsigned long, so the
__aligned() is a no-op.

Thanks.

-Fenghua


2018-07-04 20:08:39

by Eduardo Habkost

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86/split_lock: Enumerate #AC exception for split locked access feature

On Fri, Jun 29, 2018 at 06:23:35PM +0200, Thomas Gleixner wrote:
> On Fri, 29 Jun 2018, Dave Hansen wrote:
> > On 06/29/2018 07:33 AM, Fenghua Yu wrote:
> > > +/* Detect feature of #AC for split lock by probing bit 29 in MSR_TEST_CTL. */
> > > +void detect_ac_split_lock(void)
> > > +{
> > > + u64 val, orig_val;
> > > + int ret;
> > > +
> > > + /* Attempt to read the MSR. If the MSR doesn't exist, reading fails. */
> > > + ret = rdmsrl_safe(MSR_TEST_CTL, &val);
> > > + if (ret)
> > > + return;
> >
> > This is a bit fast and loose with how the feature is detected, which
> > might be OK, but can we call out why we are doing this, please?
> >
> > Is this MSR not really model-specific? Is it OK to go poking at it on
> > all x86 variants? Or, do we at _least_ need a check for Intel cpus in here?
>
> That definitely needs a vendor check. Also the whole code needs to be
> compiled out if CONFIG_INTEL=n.
>
> Aside of that this wants to be enumerated. CPUID or MISC_FEATURES and not
> this guess work detection logic. Why do I have to ask for that for every
> other new feature thingy?

Yes, please. KVM hosts normally expect guests to not touch MSRs
unless we explicitly tell them the MSR is available (normally
through CPUID). This is important to ensure live migration
between different host kernel versions works reliably.

--
Eduardo

2018-07-10 18:48:39

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86/split_lock: Enumerate #AC exception for split locked access feature

On Wed, Jul 04, 2018 at 05:07:42PM -0300, Eduardo Habkost wrote:
> On Fri, Jun 29, 2018 at 06:23:35PM +0200, Thomas Gleixner wrote:
> > On Fri, 29 Jun 2018, Dave Hansen wrote:
> > > On 06/29/2018 07:33 AM, Fenghua Yu wrote:
> > > > +/* Detect feature of #AC for split lock by probing bit 29 in MSR_TEST_CTL. */
> > > > +void detect_ac_split_lock(void)
> > > > +{
> > > > + u64 val, orig_val;
> > > > + int ret;
> > > > +
> > > > + /* Attempt to read the MSR. If the MSR doesn't exist, reading fails. */
> > > > + ret = rdmsrl_safe(MSR_TEST_CTL, &val);
> > > > + if (ret)
> > > > + return;
> > >
> > > This is a bit fast and loose with how the feature is detected, which
> > > might be OK, but can we call out why we are doing this, please?
> > >
> > > Is this MSR not really model-specific? Is it OK to go poking at it on
> > > all x86 variants? Or, do we at _least_ need a check for Intel cpus in here?
> >
> > That definitely needs a vendor check. Also the whole code needs to be
> > compiled out if CONFIG_INTEL=n.
> >
> > Aside of that this wants to be enumerated. CPUID or MISC_FEATURES and not
> > this guess work detection logic. Why do I have to ask for that for every
> > other new feature thingy?
>
> Yes, please. KVM hosts normally expect guests to not touch MSRs
> unless we explicitly tell them the MSR is available (normally
> through CPUID). This is important to ensure live migration
> between different host kernel versions works reliably.

The problem is that the hardware design for the feature is complete. The
hardware designers cannot change the feature enumeration to CPUID or
MISC_FEATURES. In the future, the designers will put model-specific
feature bits in a processor feature MSR, and hopefully we will have a
nice standard enumeration for all future model-specific features. But
split lock is too late (or too early) to be in that feature MSR.

Considering the current situation, can we handle split lock as follows?

By default, #AC for split lock is not enabled by the kernel. It is
disabled by BIOS by default (bit 29 = 0) and the kernel doesn't clear
bit 29.

IoT/real-time users know the feature is in the processor and know the
consequences of enabling it (e.g. some apps will be killed because of
split lock issues). They can enable #AC for split lock via the kernel
parameter "split_lock_detect=on". The kernel then calls wrmsr_safe(),
which fails on processors that don't support the feature, to enable it.

There is no enumeration and no /proc/cpuinfo flag for the feature.

Is this the right way to go? Please advise.

Thanks.

-Fenghua

2018-07-10 18:55:25

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86/split_lock: Enumerate #AC exception for split locked access feature

On 07/10/2018 11:45 AM, Fenghua Yu wrote:
> On Wed, Jul 04, 2018 at 05:07:42PM -0300, Eduardo Habkost wrote:
>> On Fri, Jun 29, 2018 at 06:23:35PM +0200, Thomas Gleixner wrote:
>>> On Fri, 29 Jun 2018, Dave Hansen wrote:
>>>> Is this MSR not really model-specific? Is it OK to go poking at it on
>>>> all x86 variants? Or, do we at _least_ need a check for Intel cpus in here?
>>>
>>> That definitely needs a vendor check. Also the whole code needs to be
>>> compiled out if CONFIG_INTEL=n.
>>>
>>> Aside of that this wants to be enumerated. CPUID or MISC_FEATURES and not
>>> this guess work detection logic. Why do I have to ask for that for every
>>> other new feature thingy?
>>
>> Yes, please. KVM hosts normally expect guests to not touch MSRs
>> unless we explicitly tell them the MSR is available (normally
>> through CPUID). This is important to ensure live migration
>> between different host kernel versions works reliably.
>
> The problem is the hardware design for the feature is complete. The
> hardware designer cannot change the feature enumeration to CPUID or
> MISC_FEATURES.

Let's be honest, though. That's not *hardware* design; that is a
microcode update. We've seen what microcode updates can do _very_
clearly with all the security issues. We (Intel) can surely fix this if
sufficiently motivated. No?

> There is no enumeration and no /proc/cpuinfo flag for the feature.

Huh? /proc/cpuinfo has tons of non-CPUID-instruction-based features.
There's a retpoline one and a PTI one for goodness sake.

2018-07-10 19:59:19

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86/split_lock: Enumerate #AC exception for split locked access feature

On Tue, 10 Jul 2018, Dave Hansen wrote:
> On 07/10/2018 11:45 AM, Fenghua Yu wrote:
> > On Wed, Jul 04, 2018 at 05:07:42PM -0300, Eduardo Habkost wrote:
> >> On Fri, Jun 29, 2018 at 06:23:35PM +0200, Thomas Gleixner wrote:
> >>> On Fri, 29 Jun 2018, Dave Hansen wrote:
> >>>> Is this MSR not really model-specific? Is it OK to go poking at it on
> >>>> all x86 variants? Or, do we at _least_ need a check for Intel cpus in here?
> >>>
> >>> That definitely needs a vendor check. Also the whole code needs to be
> >>> compiled out if CONFIG_INTEL=n.
> >>>
> >>> Aside of that this wants to be enumerated. CPUID or MISC_FEATURES and not
> >>> this guess work detection logic. Why do I have to ask for that for every
> >>> other new feature thingy?
> >>
> >> Yes, please. KVM hosts normally expect guests to not touch MSRs
> >> unless we explicitly tell them the MSR is available (normally
> >> through CPUID). This is important to ensure live migration
> >> between different host kernel versions works reliably.
> >
> > The problem is the hardware design for the feature is complete. The
> > hardware designer cannot change the feature enumeration to CPUID or
> > MISC_FEATURES.

Setting a fricking bit in a CPUID leaf or in a MSR cannot be done anymore?
That's just hilarious.

> Let's be honest, though. That's not *hardware* design; that is a
> microcode update. We've seen what microcode updates can do _very_
> clearly with all the security issues. We (Intel) can surely fix this if
> sufficiently motivated. No?

Amen to that.

And please tell your hardware people that they should stop creating
features which are not enumerated in one way or the other. That's just a
pain all over the place. Boot code, kernel, virt, tools ....

Thanks,

tglx

2018-07-12 02:03:21

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86/split_lock: Enumerate #AC exception for split locked access feature

On 07/10/2018 12:47 PM, Thomas Gleixner wrote:
> And please tell your hardware people that they should stop creating
> features which are not enumerated in one way or the other. That's just a
> pain all over the place. Boot code, kernel, virt, tools ....

Assuming it's too late to get CPUID or MSR enumeration... (I haven't
given up on this yet, btw)

How about we #define a Linux-defined X86_FEATURE_SPLIT_LOCK_AC (or
whatever) for now. When the hardware that has this bit for real shows
up, we move the #define to the "correct" place, like if it is a part of
an existing cpufeature leaf.

We also do a new setcpuid= boot option that behaves like the existing
clearcpuid=. That option is used by the obscure, specialized split lock
detection users to set the software cpufeature bit.

That way, we can merge this code *with* CPUID detection, and the only
thing we have to do in the future is move the X86_FEATURE_SPLIT_LOCK_AC
define to a better place.

It might be nice to make the setcpuid= thing use strings instead of
numbers that _can_ change from kernel-to-kernel, but that's kinda an
implementation detail. If we have strings, maybe we can start using
clearcpuid=pku instead of our endless list of (new) boot parameters like
nopku, nosmap, nosmep, etc...

2018-07-12 20:03:59

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] x86/split_lock: Enumerate #AC exception for split locked access feature

On Wed, 11 Jul 2018, Dave Hansen wrote:

> On 07/10/2018 12:47 PM, Thomas Gleixner wrote:
> > And please tell your hardware people that they should stop creating
> > features which are not enumerated in one way or the other. That's just a
> > pain all over the place. Boot code, kernel, virt, tools ....
>
> Assuming it's too late to get CPUID or MSR enumeration... (I haven't
> given up on this yet, btw)

Given the fact that they can add CPUID bits and MSRs and MSR bits in no
time via ucode, assuming it's too late is the wrong approach.

As this is not yet existing silicon, the ucode patch space should be more
than sufficient for that. Or is it already reserved for the next
speculation disaster?

Btw, when is this going to be real silicon?

> How about we #define a Linux-defined X86_FEATURE_SPLIT_LOCK_AC (or
> whatever) for now. When the hardware that has this bit for real shows
> up, we move the #define to the "correct" place, like if it is a part of
> an existing cpufeature leaf.
>
> We also do a new setcpuid= boot option that behaves like the existing
> clearcpuid=. That option is used by the obscure, specialized split lock
> detection users to set the software cpufeature bit.
>
> That way, we can merge this code *with* CPUID detection, and the only
> thing we have to do in the future is move the X86_FEATURE_SPLIT_LOCK_AC
> define to a better place.
>
> It might be nice to make the setcpuid= thing use strings instead of
> numbers that _can_ change from kernel-to-kernel, but that's kinda an
> implementation detail. If we have strings, maybe we can start using
> clearcpuid=pku instead of our endless list of (new) boot parameters like
> nopku, nosmap, nosmep, etc...

I might be persuaded to accept the setcpuid= magic if, and only if, the
CPUID leaf and bit are documented upfront and going to stay the same.

If that's not the case we just open the door for the HW folks to ignore
proper enumeration forever.

Thanks,

tglx