Control-flow Enforcement (CET) is a new Intel processor feature that blocks
return/jump-oriented programming attacks. Details are in "Intel 64 and
IA-32 Architectures Software Developer's Manual" [1].
This is the second part of CET and enables Indirect Branch Tracking (IBT).
It is built on top of the shadow stack series.
Changes from v9:
- Remove the legacy bitmap arch_prctl() as it is not used by GLIBC anymore.
Should it be needed in the future, I will re-post the patch separately.
[1] Intel 64 and IA-32 Architectures Software Developer's Manual:
https://software.intel.com/en-us/download/intel-64-and-ia-32-
architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4
[2] Previous IBT patches v9:
https://lkml.kernel.org/r/[email protected]/
There have been no major changes since v9, and there is no IBT patches
v10.
H.J. Lu (4):
x86/cet/ibt: Update arch_prctl functions for Indirect Branch Tracking
x86/vdso/32: Add ENDBR32 to __kernel_vsyscall entry point
x86/vdso: Insert endbr32/endbr64 to vDSO
x86: Disallow vsyscall emulation when CET is enabled
Yu-cheng Yu (5):
x86/cet/ibt: Add Kconfig option for user-mode Indirect Branch Tracking
x86/cet/ibt: User-mode Indirect Branch Tracking support
x86/cet/ibt: Handle signals for Indirect Branch Tracking
x86/cet/ibt: ELF header parsing for Indirect Branch Tracking
x86/cet: Add PTRACE interface for CET
arch/x86/Kconfig | 26 +++++++-
arch/x86/entry/vdso/Makefile | 4 ++
arch/x86/entry/vdso/vdso32/system_call.S | 3 +
arch/x86/include/asm/cet.h | 3 +
arch/x86/include/asm/disabled-features.h | 8 ++-
arch/x86/include/asm/fpu/regset.h | 7 ++-
arch/x86/kernel/cet.c | 60 ++++++++++++++++++-
arch/x86/kernel/cet_prctl.c | 8 ++-
arch/x86/kernel/cpu/common.c | 17 ++++++
arch/x86/kernel/fpu/regset.c | 44 ++++++++++++++
arch/x86/kernel/fpu/signal.c | 8 ++-
arch/x86/kernel/process_64.c | 8 +++
arch/x86/kernel/ptrace.c | 16 +++++
include/uapi/linux/elf.h | 1 +
.../arch/x86/include/asm/disabled-features.h | 8 ++-
15 files changed, 208 insertions(+), 13 deletions(-)
--
2.21.0
From: "H.J. Lu" <[email protected]>
Emulation of the legacy vsyscall page is required by some programs built
before 2013. Newer programs after 2013 don't use it. Disallow vsyscall
emulation when Control-flow Enforcement (CET) is enabled to enhance
security.
Signed-off-by: H.J. Lu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
---
arch/x86/Kconfig | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5bd6d6a10047..bbc68ecfae2b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1210,7 +1210,7 @@ config X86_ESPFIX64
config X86_VSYSCALL_EMULATION
bool "Enable vsyscall emulation" if EXPERT
default y
- depends on X86_64
+ depends on X86_64 && !X86_INTEL_CET
help
This enables emulation of the legacy vsyscall page. Disabling
it is roughly equivalent to booting with vsyscall=none, except
@@ -1225,6 +1225,8 @@ config X86_VSYSCALL_EMULATION
Disabling this option saves about 7K of kernel size and
possibly 4K of additional runtime pagetable memory.
+ This option is disabled when Intel CET is enabled.
+
config X86_IOPL_IOPERM
bool "IOPERM and IOPL Emulation"
default y
@@ -2361,7 +2363,7 @@ config COMPAT_VDSO
choice
prompt "vsyscall table for legacy applications"
- depends on X86_64
+ depends on X86_64 && !X86_INTEL_CET
default LEGACY_VSYSCALL_XONLY
help
Legacy user code that does not know how to find the vDSO expects
@@ -2378,6 +2380,8 @@ choice
If unsure, select "Emulate execution only".
+ This option is not enabled when Intel CET is enabled.
+
config LEGACY_VSYSCALL_EMULATE
bool "Full emulation"
help
--
2.21.0
Introduce Kconfig option X86_INTEL_BRANCH_TRACKING_USER.
Indirect Branch Tracking (IBT) provides protection against CALL-/JMP-
oriented programming attacks. It is active when the kernel has this
feature enabled, and the processor and the application support it.
When this feature is enabled, legacy non-IBT applications continue to
work, but without IBT protection.
Signed-off-by: Yu-cheng Yu <[email protected]>
---
v10:
- Change build-time CET check to config depends on.
arch/x86/Kconfig | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6b6dad011763..b047e0a8d1c2 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1963,6 +1963,22 @@ config X86_INTEL_SHADOW_STACK_USER
If unsure, say y.
+config X86_INTEL_BRANCH_TRACKING_USER
+ prompt "Intel Indirect Branch Tracking for user-mode"
+ def_bool n
+ depends on CPU_SUP_INTEL && X86_64
+ depends on $(cc-option,-fcf-protection)
+ select X86_INTEL_CET
+ help
+ Indirect Branch Tracking (IBT) provides protection against
+ CALL-/JMP-oriented programming attacks. It is active when
+ the kernel has this feature enabled, and the processor and
+ the application support it. When this feature is enabled,
+ legacy non-IBT applications continue to work, but without
+ IBT protection.
+
+ If unsure, say y
+
config EFI
bool "EFI runtime service support"
depends on ACPI
--
2.21.0
Add REGSET_CET64/REGSET_CET32 to get/set CET MSRs:
IA32_U_CET (user-mode CET settings) and
IA32_PL3_SSP (user-mode Shadow Stack)
Signed-off-by: Yu-cheng Yu <[email protected]>
---
arch/x86/include/asm/fpu/regset.h | 7 ++---
arch/x86/kernel/fpu/regset.c | 44 +++++++++++++++++++++++++++++++
arch/x86/kernel/ptrace.c | 16 +++++++++++
include/uapi/linux/elf.h | 1 +
4 files changed, 65 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/fpu/regset.h b/arch/x86/include/asm/fpu/regset.h
index 4f928d6a367b..8622184d87f5 100644
--- a/arch/x86/include/asm/fpu/regset.h
+++ b/arch/x86/include/asm/fpu/regset.h
@@ -7,11 +7,12 @@
#include <linux/regset.h>
-extern user_regset_active_fn regset_fpregs_active, regset_xregset_fpregs_active;
+extern user_regset_active_fn regset_fpregs_active, regset_xregset_fpregs_active,
+ cetregs_active;
extern user_regset_get2_fn fpregs_get, xfpregs_get, fpregs_soft_get,
- xstateregs_get;
+ xstateregs_get, cetregs_get;
extern user_regset_set_fn fpregs_set, xfpregs_set, fpregs_soft_set,
- xstateregs_set;
+ xstateregs_set, cetregs_set;
/*
* xstateregs_active == regset_fpregs_active. Please refer to the comment
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index c413756ba89f..8860d57eed35 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -149,6 +149,50 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
return ret;
}
+int cetregs_active(struct task_struct *target, const struct user_regset *regset)
+{
+#ifdef CONFIG_X86_INTEL_CET
+ if (target->thread.cet.shstk_size || target->thread.cet.ibt_enabled)
+ return regset->n;
+#endif
+ return 0;
+}
+
+int cetregs_get(struct task_struct *target, const struct user_regset *regset,
+ struct membuf to)
+{
+ struct fpu *fpu = &target->thread.fpu;
+ struct cet_user_state *cetregs;
+
+ if (!boot_cpu_has(X86_FEATURE_SHSTK))
+ return -ENODEV;
+
+ fpu__prepare_read(fpu);
+ cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER);
+ if (!cetregs)
+ return -EFAULT;
+
+ return membuf_write(&to, cetregs, sizeof(struct cet_user_state));
+}
+
+int cetregs_set(struct task_struct *target, const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
+{
+ struct fpu *fpu = &target->thread.fpu;
+ struct cet_user_state *cetregs;
+
+ if (!boot_cpu_has(X86_FEATURE_SHSTK))
+ return -ENODEV;
+
+ fpu__prepare_write(fpu);
+ cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER);
+ if (!cetregs)
+ return -EFAULT;
+
+ return user_regset_copyin(&pos, &count, &kbuf, &ubuf, cetregs, 0, -1);
+}
+
#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
/*
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 5679aa3fdcb8..ea54317f087e 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -52,7 +52,9 @@ enum x86_regset {
REGSET_IOPERM64 = REGSET_XFP,
REGSET_XSTATE,
REGSET_TLS,
+ REGSET_CET64 = REGSET_TLS,
REGSET_IOPERM32,
+ REGSET_CET32,
};
struct pt_regs_offset {
@@ -1229,6 +1231,13 @@ static struct user_regset x86_64_regsets[] __ro_after_init = {
.size = sizeof(long), .align = sizeof(long),
.active = ioperm_active, .regset_get = ioperm_get
},
+ [REGSET_CET64] = {
+ .core_note_type = NT_X86_CET,
+ .n = sizeof(struct cet_user_state) / sizeof(u64),
+ .size = sizeof(u64), .align = sizeof(u64),
+ .active = cetregs_active, .regset_get = cetregs_get,
+ .set = cetregs_set
+ },
};
static const struct user_regset_view user_x86_64_view = {
@@ -1284,6 +1293,13 @@ static struct user_regset x86_32_regsets[] __ro_after_init = {
.size = sizeof(u32), .align = sizeof(u32),
.active = ioperm_active, .regset_get = ioperm_get
},
+ [REGSET_CET32] = {
+ .core_note_type = NT_X86_CET,
+ .n = sizeof(struct cet_user_state) / sizeof(u64),
+ .size = sizeof(u64), .align = sizeof(u64),
+ .active = cetregs_active, .regset_get = cetregs_get,
+ .set = cetregs_set
+ },
};
static const struct user_regset_view user_x86_32_view = {
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index ca5875f384f6..d2a895369bcc 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -402,6 +402,7 @@ typedef struct elf64_shdr {
#define NT_386_TLS 0x200 /* i386 TLS slots (struct user_desc) */
#define NT_386_IOPERM 0x201 /* x86 io permission bitmap (1=deny) */
#define NT_X86_XSTATE 0x202 /* x86 extended state using xsave */
+#define NT_X86_CET 0x203 /* x86 cet state */
#define NT_S390_HIGH_GPRS 0x300 /* s390 upper register halves */
#define NT_S390_TIMER 0x301 /* s390 timer register */
#define NT_S390_TODCMP 0x302 /* s390 TOD clock comparator register */
--
2.21.0
From: "H.J. Lu" <[email protected]>
When Indirect Branch Tracking (IBT) is enabled, vDSO functions may be
called indirectly, and must have ENDBR32 or ENDBR64 as the first
instruction. The compiler must support -fcf-protection=branch so that it
can be used to compile vDSO.
Signed-off-by: H.J. Lu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
---
arch/x86/entry/vdso/Makefile | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index 215376d975a2..82f8e25e139f 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -130,6 +130,10 @@ $(obj)/%-x32.o: $(obj)/%.o FORCE
targets += vdsox32.lds $(vobjx32s-y)
+ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER
+ $(obj)/vclock_gettime.o $(obj)/vgetcpu.o $(obj)/vdso32/vclock_gettime.o: KBUILD_CFLAGS += -fcf-protection=branch
+endif
+
$(obj)/%.so: OBJCOPYFLAGS := -S
$(obj)/%.so: $(obj)/%.so.dbg FORCE
$(call if_changed,objcopy)
--
2.21.0
From: "H.J. Lu" <[email protected]>
Update ARCH_X86_CET_STATUS and ARCH_X86_CET_DISABLE for Indirect Branch
Tracking.
Signed-off-by: H.J. Lu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
---
arch/x86/kernel/cet_prctl.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cet_prctl.c b/arch/x86/kernel/cet_prctl.c
index cc49eef08ab0..2cd089e1542c 100644
--- a/arch/x86/kernel/cet_prctl.c
+++ b/arch/x86/kernel/cet_prctl.c
@@ -22,6 +22,9 @@ static int copy_status_to_user(struct cet_status *cet, u64 arg2)
buf[2] = (u64)cet->shstk_size;
}
+ if (cet->ibt_enabled)
+ buf[0] |= GNU_PROPERTY_X86_FEATURE_1_IBT;
+
return copy_to_user((u64 __user *)arg2, buf, sizeof(buf));
}
@@ -72,7 +75,8 @@ int prctl_cet(int option, u64 arg2)
if (option == ARCH_X86_CET_STATUS)
return copy_status_to_user(cet, arg2);
- if (!static_cpu_has(X86_FEATURE_SHSTK))
+ if (!static_cpu_has(X86_FEATURE_SHSTK) &&
+ !static_cpu_has(X86_FEATURE_IBT))
return -EOPNOTSUPP;
switch (option) {
@@ -83,6 +87,8 @@ int prctl_cet(int option, u64 arg2)
return -EINVAL;
if (arg2 & GNU_PROPERTY_X86_FEATURE_1_SHSTK)
cet_disable_free_shstk(current);
+ if (arg2 & GNU_PROPERTY_X86_FEATURE_1_IBT)
+ cet_disable_ibt();
return 0;
case ARCH_X86_CET_LOCK:
--
2.21.0
An indirect CALL/JMP moves the indirect branch tracking (IBT) state machine
to WAIT_ENDBR status until the instruction reaches an ENDBR opcode. If the
CALL/JMP does not reach an ENDBR opcode, the processor raises a control-
protection fault. WAIT_ENDBR status can be read from MSR_IA32_U_CET.
WAIT_ENDBR is cleared for signal handling, and restored for sigreturn.
IBT state machine is described in Intel SDM Vol. 1, Sec. 18.3.
Signed-off-by: Yu-cheng Yu <[email protected]>
---
v9:
- Fix missing WAIT_ENDBR in signal handling.
arch/x86/kernel/cet.c | 27 +++++++++++++++++++++++++--
arch/x86/kernel/fpu/signal.c | 8 +++++---
2 files changed, 30 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
index b1c122a5aef4..f783229460b6 100644
--- a/arch/x86/kernel/cet.c
+++ b/arch/x86/kernel/cet.c
@@ -309,6 +309,13 @@ void cet_restore_signal(struct sc_ext *sc_ext)
msr_val |= CET_SHSTK_EN;
}
+ if (cet->ibt_enabled) {
+ msr_val |= (CET_ENDBR_EN | CET_NO_TRACK_EN);
+
+ if (sc_ext->wait_endbr)
+ msr_val |= CET_WAIT_ENDBR;
+ }
+
if (test_thread_flag(TIF_NEED_FPU_LOAD))
cet_user_state->user_cet = msr_val;
else
@@ -349,9 +356,25 @@ int cet_setup_signal(bool ia32, unsigned long rstor_addr, struct sc_ext *sc_ext)
sc_ext->ssp = new_ssp;
}
- if (ssp) {
+ if (ssp || cet->ibt_enabled) {
+
start_update_msrs();
- wrmsrl(MSR_IA32_PL3_SSP, ssp);
+
+ if (ssp)
+ wrmsrl(MSR_IA32_PL3_SSP, ssp);
+
+ if (cet->ibt_enabled) {
+ u64 r;
+
+ rdmsrl(MSR_IA32_U_CET, r);
+
+ if (r & CET_WAIT_ENDBR) {
+ sc_ext->wait_endbr = 1;
+ r &= ~CET_WAIT_ENDBR;
+ wrmsrl(MSR_IA32_U_CET, r);
+ }
+ }
+
end_update_msrs();
}
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index d02ea8c11128..a4d66fa69c1c 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -57,7 +57,8 @@ int save_cet_to_sigframe(int ia32, void __user *fp, unsigned long restorer)
{
int err = 0;
- if (!current->thread.cet.shstk_size)
+ if (!current->thread.cet.shstk_size &&
+ !current->thread.cet.ibt_enabled)
return 0;
if (fp) {
@@ -89,7 +90,8 @@ static int get_cet_from_sigframe(int ia32, void __user *fp, struct sc_ext *ext)
memset(ext, 0, sizeof(*ext));
- if (!current->thread.cet.shstk_size)
+ if (!current->thread.cet.shstk_size &&
+ !current->thread.cet.ibt_enabled)
return 0;
if (fp) {
@@ -577,7 +579,7 @@ static unsigned long fpu__alloc_sigcontext_ext(unsigned long sp)
* sigcontext_ext is at: fpu + fpu_user_xstate_size +
* FP_XSTATE_MAGIC2_SIZE, then aligned to 8.
*/
- if (cet->shstk_size)
+ if (cet->shstk_size || cet->ibt_enabled)
sp -= (sizeof(struct sc_ext) + 8);
return sp;
--
2.21.0
Introduce user-mode Indirect Branch Tracking (IBT) support. Update setup
routines to include IBT.
Signed-off-by: Yu-cheng Yu <[email protected]>
---
v10:
- Change no_cet_ibt to no_user_ibt.
v9:
- Change cpu_feature_enabled() to static_cpu_has().
v2:
- Change noibt to no_cet_ibt.
arch/x86/include/asm/cet.h | 3 ++
arch/x86/include/asm/disabled-features.h | 8 ++++-
arch/x86/kernel/cet.c | 33 +++++++++++++++++++
arch/x86/kernel/cpu/common.c | 17 ++++++++++
.../arch/x86/include/asm/disabled-features.h | 8 ++++-
5 files changed, 67 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h
index f7eb197998ad..916ac2a0404c 100644
--- a/arch/x86/include/asm/cet.h
+++ b/arch/x86/include/asm/cet.h
@@ -15,6 +15,7 @@ struct cet_status {
unsigned long shstk_base;
unsigned long shstk_size;
unsigned int locked:1;
+ unsigned int ibt_enabled:1;
};
#ifdef CONFIG_X86_INTEL_CET
@@ -26,6 +27,8 @@ void cet_disable_free_shstk(struct task_struct *p);
int cet_verify_rstor_token(bool ia32, unsigned long ssp, unsigned long *new_ssp);
void cet_restore_signal(struct sc_ext *sc);
int cet_setup_signal(bool ia32, unsigned long rstor, struct sc_ext *sc);
+int cet_setup_ibt(void);
+void cet_disable_ibt(void);
#else
static inline int prctl_cet(int option, u64 arg2) { return -EINVAL; }
static inline int cet_setup_thread_shstk(struct task_struct *p) { return 0; }
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index a0e1b24cfa02..52c9c07cfacc 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -62,6 +62,12 @@
#define DISABLE_SHSTK (1<<(X86_FEATURE_SHSTK & 31))
#endif
+#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER
+#define DISABLE_IBT 0
+#else
+#define DISABLE_IBT (1<<(X86_FEATURE_IBT & 31))
+#endif
+
/*
* Make sure to add features to the correct mask
*/
@@ -83,7 +89,7 @@
#define DISABLED_MASK15 0
#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP|DISABLE_SHSTK)
#define DISABLED_MASK17 0
-#define DISABLED_MASK18 0
+#define DISABLED_MASK18 (DISABLE_IBT)
#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
#endif /* _ASM_X86_DISABLED_FEATURES_H */
diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
index 2bf1a6b6abb6..b1c122a5aef4 100644
--- a/arch/x86/kernel/cet.c
+++ b/arch/x86/kernel/cet.c
@@ -13,6 +13,8 @@
#include <linux/uaccess.h>
#include <linux/sched/signal.h>
#include <linux/compat.h>
+#include <linux/vmalloc.h>
+#include <linux/bitops.h>
#include <asm/msr.h>
#include <asm/user.h>
#include <asm/fpu/internal.h>
@@ -355,3 +357,34 @@ int cet_setup_signal(bool ia32, unsigned long rstor_addr, struct sc_ext *sc_ext)
return 0;
}
+
+int cet_setup_ibt(void)
+{
+ u64 msr_val;
+
+ if (!static_cpu_has(X86_FEATURE_IBT))
+ return -EOPNOTSUPP;
+
+ start_update_msrs();
+ rdmsrl(MSR_IA32_U_CET, msr_val);
+ msr_val |= (CET_ENDBR_EN | CET_NO_TRACK_EN);
+ wrmsrl(MSR_IA32_U_CET, msr_val);
+ end_update_msrs();
+ current->thread.cet.ibt_enabled = 1;
+ return 0;
+}
+
+void cet_disable_ibt(void)
+{
+ u64 msr_val;
+
+ if (!static_cpu_has(X86_FEATURE_IBT))
+ return;
+
+ start_update_msrs();
+ rdmsrl(MSR_IA32_U_CET, msr_val);
+ msr_val &= CET_SHSTK_EN;
+ wrmsrl(MSR_IA32_U_CET, msr_val);
+ end_update_msrs();
+ current->thread.cet.ibt_enabled = 0;
+}
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 5f60ddaabc46..43666b1f50a2 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -536,6 +536,23 @@ static __init int setup_disable_shstk(char *s)
__setup("no_user_shstk", setup_disable_shstk);
#endif
+#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER
+static __init int setup_disable_ibt(char *s)
+{
+ /* require an exact match without trailing characters */
+ if (s[0] != '\0')
+ return 0;
+
+ if (!boot_cpu_has(X86_FEATURE_IBT))
+ return 1;
+
+ setup_clear_cpu_cap(X86_FEATURE_IBT);
+ pr_info("x86: 'no_user_ibt' specified, disabling user Branch Tracking\n");
+ return 1;
+}
+__setup("no_user_ibt", setup_disable_ibt);
+#endif
+
/*
* Some CPU features depend on higher CPUID levels, which may not always
* be available due to CPUID level capping or broken virtualization
diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
index a0e1b24cfa02..52c9c07cfacc 100644
--- a/tools/arch/x86/include/asm/disabled-features.h
+++ b/tools/arch/x86/include/asm/disabled-features.h
@@ -62,6 +62,12 @@
#define DISABLE_SHSTK (1<<(X86_FEATURE_SHSTK & 31))
#endif
+#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER
+#define DISABLE_IBT 0
+#else
+#define DISABLE_IBT (1<<(X86_FEATURE_IBT & 31))
+#endif
+
/*
* Make sure to add features to the correct mask
*/
@@ -83,7 +89,7 @@
#define DISABLED_MASK15 0
#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP|DISABLE_SHSTK)
#define DISABLED_MASK17 0
-#define DISABLED_MASK18 0
+#define DISABLED_MASK18 (DISABLE_IBT)
#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
#endif /* _ASM_X86_DISABLED_FEATURES_H */
--
2.21.0
On Mon, Aug 24, 2020 at 5:30 PM Yu-cheng Yu <[email protected]> wrote:
>
> From: "H.J. Lu" <[email protected]>
>
> Emulation of the legacy vsyscall page is required by some programs built
> before 2013. Newer programs after 2013 don't use it. Disallow vsyscall
> emulation when Control-flow Enforcement (CET) is enabled to enhance
> security.
NAK.
By all means disable execute emulation if CET-IBT is enabled at the
time emulation is attempted, and maybe even disable the vsyscall page
entirely if you can magically tell that CET-IBT will be enabled when a
process starts, but you don't get to just disable it outright on a
CET-enabled kernel.
From: "H.J. Lu" <[email protected]>
Add ENDBR32 to __kernel_vsyscall entry point.
Signed-off-by: H.J. Lu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
---
arch/x86/entry/vdso/vdso32/system_call.S | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/entry/vdso/vdso32/system_call.S b/arch/x86/entry/vdso/vdso32/system_call.S
index de1fff7188aa..5cf74ebd4746 100644
--- a/arch/x86/entry/vdso/vdso32/system_call.S
+++ b/arch/x86/entry/vdso/vdso32/system_call.S
@@ -14,6 +14,9 @@
ALIGN
__kernel_vsyscall:
CFI_STARTPROC
+#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER
+ endbr32
+#endif
/*
* Reshuffle regs so that all of any of the entry instructions
* will preserve enough state.
--
2.21.0
Update arch_setup_elf_property() for Indirect Branch Tracking.
Signed-off-by: Yu-cheng Yu <[email protected]>
---
v9:
- Change cpu_feature_enabled() to static_cpu_has().
arch/x86/Kconfig | 2 ++
arch/x86/kernel/process_64.c | 8 ++++++++
2 files changed, 10 insertions(+)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b047e0a8d1c2..5bd6d6a10047 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1969,6 +1969,8 @@ config X86_INTEL_BRANCH_TRACKING_USER
depends on CPU_SUP_INTEL && X86_64
depends on $(cc-option,-fcf-protection)
select X86_INTEL_CET
+ select ARCH_USE_GNU_PROPERTY
+ select ARCH_BINFMT_ELF_STATE
help
Indirect Branch Tracking (IBT) provides protection against
CALL-/JMP-oriented programming attacks. It is active when
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index fd4644865a3b..c084e1a37d11 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -866,6 +866,14 @@ int arch_setup_elf_property(struct arch_elf_state *state)
r = cet_setup_shstk();
}
+ if (r < 0)
+ return r;
+
+ if (static_cpu_has(X86_FEATURE_IBT)) {
+ if (state->gnu_property & GNU_PROPERTY_X86_FEATURE_1_IBT)
+ r = cet_setup_ibt();
+ }
+
return r;
}
#endif
--
2.21.0
On Mon, Aug 24, 2020 at 5:30 PM Yu-cheng Yu <[email protected]> wrote:
>
> From: "H.J. Lu" <[email protected]>
>
> When Indirect Branch Tracking (IBT) is enabled, vDSO functions may be
> called indirectly, and must have ENDBR32 or ENDBR64 as the first
> instruction. The compiler must support -fcf-protection=branch so that it
> can be used to compile vDSO.
>
> Signed-off-by: H.J. Lu <[email protected]>
> Signed-off-by: Yu-cheng Yu <[email protected]>
> Acked-by: Andy Lutomirski <[email protected]>
I revoke my Ack. Please don't repeat the list of object files. Maybe
add the option to CFL?
--Andy
* Andy Lutomirski:
> On Mon, Aug 24, 2020 at 5:30 PM Yu-cheng Yu <[email protected]> wrote:
>>
>> From: "H.J. Lu" <[email protected]>
>>
>> Emulation of the legacy vsyscall page is required by some programs built
>> before 2013. Newer programs after 2013 don't use it. Disallow vsyscall
>> emulation when Control-flow Enforcement (CET) is enabled to enhance
>> security.
>
> NAK.
>
> By all means disable execute emulation if CET-IBT is enabled at the
> time emulation is attempted, and maybe even disable the vsyscall page
> entirely if you can magically tell that CET-IBT will be enabled when a
> process starts, but you don't get to just disable it outright on a
> CET-enabled kernel.
Yeah, we definitely would have to revert/avoid this downstream. People
definitely want to run glibc-2.12-era workloads on current kernels.
Thanks for catching it.
Florian
On 8/25/2020 2:14 AM, Florian Weimer wrote:
> * Andy Lutomirski:
>
>> On Mon, Aug 24, 2020 at 5:30 PM Yu-cheng Yu <[email protected]> wrote:
>>>
>>> From: "H.J. Lu" <[email protected]>
>>>
>>> Emulation of the legacy vsyscall page is required by some programs built
>>> before 2013. Newer programs after 2013 don't use it. Disallow vsyscall
>>> emulation when Control-flow Enforcement (CET) is enabled to enhance
>>> security.
>>
>> NAK.
>>
>> By all means disable execute emulation if CET-IBT is enabled at the
>> time emulation is attempted, and maybe even disable the vsyscall page
>> entirely if you can magically tell that CET-IBT will be enabled when a
>> process starts, but you don't get to just disable it outright on a
>> CET-enabled kernel.
>
> Yeah, we definitely would have to revert/avoid this downstream. People
> definitely want to run glibc-2.12-era workloads on current kernels.
> Thanks for catching it.
>
That makes sense. I will update the patch.
Thanks,
Yu-cheng
On 8/24/2020 5:33 PM, Andy Lutomirski wrote:
> On Mon, Aug 24, 2020 at 5:30 PM Yu-cheng Yu <[email protected]> wrote:
>>
>> From: "H.J. Lu" <[email protected]>
>>
>> When Indirect Branch Tracking (IBT) is enabled, vDSO functions may be
>> called indirectly, and must have ENDBR32 or ENDBR64 as the first
>> instruction. The compiler must support -fcf-protection=branch so that it
>> can be used to compile vDSO.
>>
>> Signed-off-by: H.J. Lu <[email protected]>
>> Signed-off-by: Yu-cheng Yu <[email protected]>
>> Acked-by: Andy Lutomirski <[email protected]>
>
> I revoke my Ack. Please don't repeat the list of object files. Maybe
> add the option to CFL?
I will update the patch.
Yu-cheng
On Tue, Aug 25, 2020 at 2:30 AM Yu-cheng Yu <[email protected]> wrote:
> Add REGSET_CET64/REGSET_CET32 to get/set CET MSRs:
>
> IA32_U_CET (user-mode CET settings) and
> IA32_PL3_SSP (user-mode Shadow Stack)
[...]
> diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
[...]
> +int cetregs_get(struct task_struct *target, const struct user_regset *regset,
> + struct membuf to)
> +{
> + struct fpu *fpu = &target->thread.fpu;
> + struct cet_user_state *cetregs;
> +
> + if (!boot_cpu_has(X86_FEATURE_SHSTK))
> + return -ENODEV;
> +
> + fpu__prepare_read(fpu);
> + cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER);
> + if (!cetregs)
> + return -EFAULT;
Can this branch ever be hit without a kernel bug? If yes, I think
-EFAULT is probably a weird error code to choose here. If no, this
should probably use WARN_ON(). Same thing in cetregs_set().
> + return membuf_write(&to, cetregs, sizeof(struct cet_user_state));
> +}
[...]
> diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
[...]
> @@ -52,7 +52,9 @@ enum x86_regset {
> REGSET_IOPERM64 = REGSET_XFP,
> REGSET_XSTATE,
> REGSET_TLS,
> + REGSET_CET64 = REGSET_TLS,
> REGSET_IOPERM32,
> + REGSET_CET32,
> };
[...]
> @@ -1229,6 +1231,13 @@ static struct user_regset x86_64_regsets[] __ro_after_init = {
[...]
> + [REGSET_CET64] = {
> + .core_note_type = NT_X86_CET,
> + .n = sizeof(struct cet_user_state) / sizeof(u64),
> + .size = sizeof(u64), .align = sizeof(u64),
> + .active = cetregs_active, .regset_get = cetregs_get,
> + .set = cetregs_set
> + },
> };
[...]
> @@ -1284,6 +1293,13 @@ static struct user_regset x86_32_regsets[] __ro_after_init = {
[...]
> + [REGSET_CET32] = {
> + .core_note_type = NT_X86_CET,
> + .n = sizeof(struct cet_user_state) / sizeof(u64),
> + .size = sizeof(u64), .align = sizeof(u64),
> + .active = cetregs_active, .regset_get = cetregs_get,
> + .set = cetregs_set
> + },
> };
Why are there different identifiers for 32-bit CET and 64-bit CET when
they operate on the same structs and have the same handlers? If
there's a good reason for that, the commit message should probably
point that out.
On 9/2/2020 1:03 PM, Jann Horn wrote:
> On Tue, Aug 25, 2020 at 2:30 AM Yu-cheng Yu <[email protected]> wrote:
>> Add REGSET_CET64/REGSET_CET32 to get/set CET MSRs:
>>
>> IA32_U_CET (user-mode CET settings) and
>> IA32_PL3_SSP (user-mode Shadow Stack)
> [...]
>> diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
> [...]
>> +int cetregs_get(struct task_struct *target, const struct user_regset *regset,
>> + struct membuf to)
>> +{
>> + struct fpu *fpu = &target->thread.fpu;
>> + struct cet_user_state *cetregs;
>> +
>> + if (!boot_cpu_has(X86_FEATURE_SHSTK))
>> + return -ENODEV;
>> +
>> + fpu__prepare_read(fpu);
>> + cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER);
>> + if (!cetregs)
>> + return -EFAULT;
>
> Can this branch ever be hit without a kernel bug? If yes, I think
> -EFAULT is probably a weird error code to choose here. If no, this
> should probably use WARN_ON(). Same thing in cetregs_set().
>
When a thread is not CET-enabled, its CET state does not exist. I
looked at EFAULT, and it means "Bad address". Maybe this can be ENODEV,
which means "No such device"?
[...]
>> @@ -1284,6 +1293,13 @@ static struct user_regset x86_32_regsets[] __ro_after_init = {
> [...]
>> + [REGSET_CET32] = {
>> + .core_note_type = NT_X86_CET,
>> + .n = sizeof(struct cet_user_state) / sizeof(u64),
>> + .size = sizeof(u64), .align = sizeof(u64),
>> + .active = cetregs_active, .regset_get = cetregs_get,
>> + .set = cetregs_set
>> + },
>> };
>
> Why are there different identifiers for 32-bit CET and 64-bit CET when
> they operate on the same structs and have the same handlers? If
> there's a good reason for that, the commit message should probably
> point that out.
>
Yes, the reason for two regsets is that fill_note_info() does not expect
any holes in a regsets. I will put this in the commit log.
Thanks,
Yu-cheng
> On Sep 2, 2020, at 3:13 PM, Yu, Yu-cheng <[email protected]> wrote:
>
> On 9/2/2020 1:03 PM, Jann Horn wrote:
>>> On Tue, Aug 25, 2020 at 2:30 AM Yu-cheng Yu <[email protected]> wrote:
>>> Add REGSET_CET64/REGSET_CET32 to get/set CET MSRs:
>>>
>>> IA32_U_CET (user-mode CET settings) and
>>> IA32_PL3_SSP (user-mode Shadow Stack)
>> [...]
>>> diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
>> [...]
>>> +int cetregs_get(struct task_struct *target, const struct user_regset *regset,
>>> + struct membuf to)
>>> +{
>>> + struct fpu *fpu = &target->thread.fpu;
>>> + struct cet_user_state *cetregs;
>>> +
>>> + if (!boot_cpu_has(X86_FEATURE_SHSTK))
>>> + return -ENODEV;
>>> +
>>> + fpu__prepare_read(fpu);
>>> + cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER);
>>> + if (!cetregs)
>>> + return -EFAULT;
>> Can this branch ever be hit without a kernel bug? If yes, I think
>> -EFAULT is probably a weird error code to choose here. If no, this
>> should probably use WARN_ON(). Same thing in cetregs_set().
>
> When a thread is not CET-enabled, its CET state does not exist. I looked at EFAULT, and it means "Bad address". Maybe this can be ENODEV, which means "No such device"?
>
> [...]
>
>>> @@ -1284,6 +1293,13 @@ static struct user_regset x86_32_regsets[] __ro_after_init = {
>> [...]
>>> + [REGSET_CET32] = {
>>> + .core_note_type = NT_X86_CET,
>>> + .n = sizeof(struct cet_user_state) / sizeof(u64),
>>> + .size = sizeof(u64), .align = sizeof(u64),
>>> + .active = cetregs_active, .regset_get = cetregs_get,
>>> + .set = cetregs_set
>>> + },
>>> };
>> Why are there different identifiers for 32-bit CET and 64-bit CET when
>> they operate on the same structs and have the same handlers? If
>> there's a good reason for that, the commit message should probably
>> point that out.
>
> Yes, the reason for two regsets is that fill_note_info() does not expect any holes in a regsets. I will put this in the commit log.
>
>
Perhaps we could fix that instead?
On Thu, Sep 3, 2020 at 12:13 AM Yu, Yu-cheng <[email protected]> wrote:
> On 9/2/2020 1:03 PM, Jann Horn wrote:
> > On Tue, Aug 25, 2020 at 2:30 AM Yu-cheng Yu <[email protected]> wrote:
> >> Add REGSET_CET64/REGSET_CET32 to get/set CET MSRs:
> >>
> >> IA32_U_CET (user-mode CET settings) and
> >> IA32_PL3_SSP (user-mode Shadow Stack)
> > [...]
> >> diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
> > [...]
> >> +int cetregs_get(struct task_struct *target, const struct user_regset *regset,
> >> + struct membuf to)
> >> +{
> >> + struct fpu *fpu = &target->thread.fpu;
> >> + struct cet_user_state *cetregs;
> >> +
> >> + if (!boot_cpu_has(X86_FEATURE_SHSTK))
> >> + return -ENODEV;
> >> +
> >> + fpu__prepare_read(fpu);
> >> + cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER);
> >> + if (!cetregs)
> >> + return -EFAULT;
> >
> > Can this branch ever be hit without a kernel bug? If yes, I think
> > -EFAULT is probably a weird error code to choose here. If no, this
> > should probably use WARN_ON(). Same thing in cetregs_set().
> >
>
> When a thread is not CET-enabled, its CET state does not exist. I
> looked at EFAULT, and it means "Bad address". Maybe this can be ENODEV,
> which means "No such device"?
Yeah, I guess ENODEV might fit reasonably well.
On 9/2/2020 4:50 PM, Andy Lutomirski wrote:
>
>
>> On Sep 2, 2020, at 3:13 PM, Yu, Yu-cheng <[email protected]> wrote:
>>
>> On 9/2/2020 1:03 PM, Jann Horn wrote:
>>>> On Tue, Aug 25, 2020 at 2:30 AM Yu-cheng Yu <[email protected]> wrote:
>>>> Add REGSET_CET64/REGSET_CET32 to get/set CET MSRs:
>>>>
>>>> IA32_U_CET (user-mode CET settings) and
>>>> IA32_PL3_SSP (user-mode Shadow Stack)
>>> [...]
>>>> diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
>>> [...]
>>>> +int cetregs_get(struct task_struct *target, const struct user_regset *regset,
>>>> + struct membuf to)
>>>> +{
>>>> + struct fpu *fpu = &target->thread.fpu;
>>>> + struct cet_user_state *cetregs;
>>>> +
>>>> + if (!boot_cpu_has(X86_FEATURE_SHSTK))
>>>> + return -ENODEV;
>>>> +
>>>> + fpu__prepare_read(fpu);
>>>> + cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER);
>>>> + if (!cetregs)
>>>> + return -EFAULT;
>>> Can this branch ever be hit without a kernel bug? If yes, I think
>>> -EFAULT is probably a weird error code to choose here. If no, this
>>> should probably use WARN_ON(). Same thing in cetregs_set().
>>
>> When a thread is not CET-enabled, its CET state does not exist. I looked at EFAULT, and it means "Bad address". Maybe this can be ENODEV, which means "No such device"?
>>
>> [...]
>>
>>>> @@ -1284,6 +1293,13 @@ static struct user_regset x86_32_regsets[] __ro_after_init = {
>>> [...]
>>>> + [REGSET_CET32] = {
>>>> + .core_note_type = NT_X86_CET,
>>>> + .n = sizeof(struct cet_user_state) / sizeof(u64),
>>>> + .size = sizeof(u64), .align = sizeof(u64),
>>>> + .active = cetregs_active, .regset_get = cetregs_get,
>>>> + .set = cetregs_set
>>>> + },
>>>> };
>>> Why are there different identifiers for 32-bit CET and 64-bit CET when
>>> they operate on the same structs and have the same handlers? If
>>> there's a good reason for that, the commit message should probably
>>> point that out.
>>
>> Yes, the reason for two regsets is that fill_note_info() does not expect any holes in a regsets. I will put this in the commit log.
>>
>>
>
> Perhaps we could fix that instead?
>
As long as we understand the root cause, leaving it as-is may be OK.
I had a patch in the past, but did not follow up on it.
https://lore.kernel.org/lkml/[email protected]/
Yu-cheng
On 9/2/2020 5:33 PM, Jann Horn wrote:
> On Thu, Sep 3, 2020 at 12:13 AM Yu, Yu-cheng <[email protected]> wrote:
>> On 9/2/2020 1:03 PM, Jann Horn wrote:
>>> On Tue, Aug 25, 2020 at 2:30 AM Yu-cheng Yu <[email protected]> wrote:
>>>> Add REGSET_CET64/REGSET_CET32 to get/set CET MSRs:
>>>>
>>>> IA32_U_CET (user-mode CET settings) and
>>>> IA32_PL3_SSP (user-mode Shadow Stack)
>>> [...]
>>>> diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
>>> [...]
>>>> +int cetregs_get(struct task_struct *target, const struct user_regset *regset,
>>>> + struct membuf to)
>>>> +{
>>>> + struct fpu *fpu = &target->thread.fpu;
>>>> + struct cet_user_state *cetregs;
>>>> +
>>>> + if (!boot_cpu_has(X86_FEATURE_SHSTK))
>>>> + return -ENODEV;
>>>> +
>>>> + fpu__prepare_read(fpu);
>>>> + cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER);
>>>> + if (!cetregs)
>>>> + return -EFAULT;
>>>
>>> Can this branch ever be hit without a kernel bug? If yes, I think
>>> -EFAULT is probably a weird error code to choose here. If no, this
>>> should probably use WARN_ON(). Same thing in cetregs_set().
>>>
>>
>> When a thread is not CET-enabled, its CET state does not exist. I
>> looked at EFAULT, and it means "Bad address". Maybe this can be ENODEV,
>> which means "No such device"?
>
> Yeah, I guess ENODEV might fit reasonably well.
>
I will update it. Thanks!
> On Sep 2, 2020, at 7:53 PM, Yu, Yu-cheng <[email protected]> wrote:
>
> On 9/2/2020 4:50 PM, Andy Lutomirski wrote:
>>>> On Sep 2, 2020, at 3:13 PM, Yu, Yu-cheng <[email protected]> wrote:
>>>
>>> On 9/2/2020 1:03 PM, Jann Horn wrote:
>>>>> On Tue, Aug 25, 2020 at 2:30 AM Yu-cheng Yu <[email protected]> wrote:
>>>>> Add REGSET_CET64/REGSET_CET32 to get/set CET MSRs:
>>>>>
>>>>> IA32_U_CET (user-mode CET settings) and
>>>>> IA32_PL3_SSP (user-mode Shadow Stack)
>>>> [...]
>>>>> diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
>>>> [...]
>>>>> +int cetregs_get(struct task_struct *target, const struct user_regset *regset,
>>>>> + struct membuf to)
>>>>> +{
>>>>> + struct fpu *fpu = &target->thread.fpu;
>>>>> + struct cet_user_state *cetregs;
>>>>> +
>>>>> + if (!boot_cpu_has(X86_FEATURE_SHSTK))
>>>>> + return -ENODEV;
>>>>> +
>>>>> + fpu__prepare_read(fpu);
>>>>> + cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER);
>>>>> + if (!cetregs)
>>>>> + return -EFAULT;
>>>> Can this branch ever be hit without a kernel bug? If yes, I think
>>>> -EFAULT is probably a weird error code to choose here. If no, this
>>>> should probably use WARN_ON(). Same thing in cetregs_set().
>>>
>>> When a thread is not CET-enabled, its CET state does not exist. I looked at EFAULT, and it means "Bad address". Maybe this can be ENODEV, which means "No such device"?
Having read the code, I’m unconvinced. It looks like a get_xsave_addr() failure means “state not saved; task sees INIT state”. So *maybe* it’s reasonable -ENODEV this, but I’m not really convinced. I tend to think we should return the actual INIT state and that we should permit writes and handle them correctly.
Dave, what do you think?
>>>
>>> [...]
>>>
>>>>> @@ -1284,6 +1293,13 @@ static struct user_regset x86_32_regsets[] __ro_after_init = {
>>>> [...]
>>>>> + [REGSET_CET32] = {
>>>>> + .core_note_type = NT_X86_CET,
>>>>> + .n = sizeof(struct cet_user_state) / sizeof(u64),
>>>>> + .size = sizeof(u64), .align = sizeof(u64),
>>>>> + .active = cetregs_active, .regset_get = cetregs_get,
>>>>> + .set = cetregs_set
>>>>> + },
>>>>> };
>>>> Why are there different identifiers for 32-bit CET and 64-bit CET when
>>>> they operate on the same structs and have the same handlers? If
>>>> there's a good reason for that, the commit message should probably
>>>> point that out.
>>>
>>> Yes, the reason for two regsets is that fill_note_info() does not expect any holes in a regsets. I will put this in the commit log.
>>>
>>>
>> Perhaps we could fix that instead?
>
> As long as we understand the root cause, leaving it as-is may be OK.
The regset mechanism’s interactions with compat are awful. Let’s please not make it worse. One CET regret is good; two is not good.
>
> I had a patch in the past, but did not follow up on it.
>
> https://lore.kernel.org/lkml/[email protected]/
>
> Yu-cheng
On 9/2/20 9:35 PM, Andy Lutomirski wrote:
>>>>>> + fpu__prepare_read(fpu);
>>>>>> + cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER);
>>>>>> + if (!cetregs)
>>>>>> + return -EFAULT;
>>>>> Can this branch ever be hit without a kernel bug? If yes, I think
>>>>> -EFAULT is probably a weird error code to choose here. If no, this
>>>>> should probably use WARN_ON(). Same thing in cetregs_set().
>>>> When a thread is not CET-enabled, its CET state does not exist. I looked at EFAULT, and it means "Bad address". Maybe this can be ENODEV, which means "No such device"?
> Having read the code, I’m unconvinced. It looks like a get_xsave_addr() failure means “state not saved; task sees INIT state”. So *maybe* it’s reasonable -ENODEV this, but I’m not really convinced. I tend to think we should return the actual INIT state and that we should permit writes and handle them correctly.
PTRACE is asking for access to the values in the *registers*, not for
the value in the kernel XSAVE buffer. We just happen to only have the
kernel XSAVE buffer around.
If we want to really support PTRACE we have to allow the registers to be
get/set, regardless of what state they are in, INIT state or not. So,
yeah I agree with Andy.
On Thu, Sep 3, 2020 at 7:27 AM Dave Hansen <[email protected]> wrote:
>
> On 9/2/20 9:35 PM, Andy Lutomirski wrote:
> >>>>>> + fpu__prepare_read(fpu);
> >>>>>> + cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER);
> >>>>>> + if (!cetregs)
> >>>>>> + return -EFAULT;
> >>>>> Can this branch ever be hit without a kernel bug? If yes, I think
> >>>>> -EFAULT is probably a weird error code to choose here. If no, this
> >>>>> should probably use WARN_ON(). Same thing in cetregs_set().
> >>>> When a thread is not CET-enabled, its CET state does not exist. I looked at EFAULT, and it means "Bad address". Maybe this can be ENODEV, which means "No such device"?
> > Having read the code, I’m unconvinced. It looks like a get_xsave_addr() failure means “state not saved; task sees INIT state”. So *maybe* it’s reasonable -ENODEV this, but I’m not really convinced. I tend to think we should return the actual INIT state and that we should permit writes and handle them correctly.
>
> PTRACE is asking for access to the values in the *registers*, not for
> the value in the kernel XSAVE buffer. We just happen to only have the
> kernel XSAVE buffer around.
>
> If we want to really support PTRACE we have to allow the registers to be
> get/set, regardless of what state they are in, INIT state or not. So,
> yeah I agree with Andy.
I think the core dump code gets here, too, so the values might be in
registers as well. I hope that fpu__prepare_read() does the right
thing in this case.
--Andy
On 9/3/2020 7:26 AM, Dave Hansen wrote:
> On 9/2/20 9:35 PM, Andy Lutomirski wrote:
>>>>>>> + fpu__prepare_read(fpu);
>>>>>>> + cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_CET_USER);
>>>>>>> + if (!cetregs)
>>>>>>> + return -EFAULT;
>>>>>> Can this branch ever be hit without a kernel bug? If yes, I think
>>>>>> -EFAULT is probably a weird error code to choose here. If no, this
>>>>>> should probably use WARN_ON(). Same thing in cetregs_set().
>>>>> When a thread is not CET-enabled, its CET state does not exist. I looked at EFAULT, and it means "Bad address". Maybe this can be ENODEV, which means "No such device"?
>> Having read the code, I’m unconvinced. It looks like a get_xsave_addr() failure means “state not saved; task sees INIT state”. So *maybe* it’s reasonable -ENODEV this, but I’m not really convinced. I tend to think we should return the actual INIT state and that we should permit writes and handle them correctly.
>
> PTRACE is asking for access to the values in the *registers*, not for
> the value in the kernel XSAVE buffer. We just happen to only have the
> kernel XSAVE buffer around.
When get_xsave_addr() returns NULL, there are three possibilities:
- XSAVE is not enabled or not supported;
- The kernel does not support the requested feature;
- The requested feature is in INIT state.
If the debugger is going to write an MSR, only in the third case would
this make a slight sense. For example, if the system has CET enabled,
but the task does not have CET enabled, and GDB is writing to a CET MSR.
But still, this is strange to me.
>
> If we want to really support PTRACE we have to allow the registers to be
> get/set, regardless of what state they are in, INIT state or not. So,
> yeah I agree with Andy.
>
GDB does not have a WRMSR mechanism. If GDB is going to write an MSR,
it will call arch_prctl or an assembly routine in memory.
Yu-cheng
On 9/3/20 9:09 AM, Yu, Yu-cheng wrote:
> If the debugger is going to write an MSR, only in the third case would
> this make a slight sense. For example, if the system has CET enabled,
> but the task does not have CET enabled, and GDB is writing to a CET MSR.
> But still, this is strange to me.
If this is strange, then why do we even _implement_ writes?
On Thu, Sep 3, 2020 at 9:12 AM Dave Hansen <[email protected]> wrote:
>
> On 9/3/20 9:09 AM, Yu, Yu-cheng wrote:
> > If the debugger is going to write an MSR, only in the third case would
> > this make a slight sense. For example, if the system has CET enabled,
> > but the task does not have CET enabled, and GDB is writing to a CET MSR.
> > But still, this is strange to me.
>
> If this is strange, then why do we even _implement_ writes?
Well, if gdb wants to force a tracee to return early from a function,
wouldn't it want the ability to modify SSP?
--Andy
On 9/3/2020 9:11 AM, Dave Hansen wrote:
> On 9/3/20 9:09 AM, Yu, Yu-cheng wrote:
>> If the debugger is going to write an MSR, only in the third case would
>> this make a slight sense. For example, if the system has CET enabled,
>> but the task does not have CET enabled, and GDB is writing to a CET MSR.
>> But still, this is strange to me.
>
> If this is strange, then why do we even _implement_ writes?
>
For example, if the task has CET enabled, and it is in a control
protection fault, the debugger can clear the task's IBT state, or unwind
the shadow stack, etc. But if the task does not have CET enabled (its
CET MSRs are in INIT state), it would make sense for the PTRACE call to
return failure than doing something else, right?
On 9/3/20 9:15 AM, Andy Lutomirski wrote:
> On Thu, Sep 3, 2020 at 9:12 AM Dave Hansen <[email protected]> wrote:
>>
>> On 9/3/20 9:09 AM, Yu, Yu-cheng wrote:
>>> If the debugger is going to write an MSR, only in the third case would
>>> this make a slight sense. For example, if the system has CET enabled,
>>> but the task does not have CET enabled, and GDB is writing to a CET MSR.
>>> But still, this is strange to me.
>>
>> If this is strange, then why do we even _implement_ writes?
>
> Well, if gdb wants to force a tracee to return early from a function,
> wouldn't it want the ability to modify SSP?
That's true.
Yu-cheng, can you take a look through and see for the other setregs
users what they do for logically crazy, strange things? Is there
precedent for refusing a write which is possible but illogical or
strange? If so, which error code do they use?
Taking the config register out of the init state is illogical, as is
writing to SSP while the config register is in its init state.
On Thu, Sep 3, 2020 at 9:21 AM Yu, Yu-cheng <[email protected]> wrote:
>
> On 9/3/2020 9:11 AM, Dave Hansen wrote:
> > On 9/3/20 9:09 AM, Yu, Yu-cheng wrote:
> >> If the debugger is going to write an MSR, only in the third case would
> >> this make a slight sense. For example, if the system has CET enabled,
> >> but the task does not have CET enabled, and GDB is writing to a CET MSR.
> >> But still, this is strange to me.
> >
> > If this is strange, then why do we even _implement_ writes?
> >
>
> For example, if the task has CET enabled, and it is in a control
> protection fault, the debugger can clear the task's IBT state, or unwind
> the shadow stack, etc. But if the task does not have CET enabled (its
> CET MSRs are in INIT state), it would make sense for the PTRACE call to
> return failure than doing something else, right?
What do you mean "something else"? I assume that, if GDB tells
ptrace() to write some value to the CET MSR, then it should get that
value. If GDB writes to it on a task that is not currently using CET,
I don't see why it should fail.
--Andy
On Thu, Sep 3, 2020 at 9:25 AM Dave Hansen <[email protected]> wrote:
>
> On 9/3/20 9:15 AM, Andy Lutomirski wrote:
> > On Thu, Sep 3, 2020 at 9:12 AM Dave Hansen <[email protected]> wrote:
> >>
> >> On 9/3/20 9:09 AM, Yu, Yu-cheng wrote:
> >>> If the debugger is going to write an MSR, only in the third case would
> >>> this make a slight sense. For example, if the system has CET enabled,
> >>> but the task does not have CET enabled, and GDB is writing to a CET MSR.
> >>> But still, this is strange to me.
> >>
> >> If this is strange, then why do we even _implement_ writes?
> >
> > Well, if gdb wants to force a tracee to return early from a function,
> > wouldn't it want the ability to modify SSP?
>
> That's true.
>
> Yu-cheng, can you take a look through and see for the other setregs
> users what they do for logically crazy, strange things? Is there
> precedent for refusing a write which is possible but illogical or
> strange? If so, which error code do they use?
>
> Taking the config register out of the init state is illogical, as is
> writing to SSP while the config register is in its init state.
What's so special about the INIT state? It's optimized by XSAVES, but
it's just a number, right? So taking the register out of the INIT
state is kind of like saying "gdb wanted to set xmm0 to (0,0,0,1), but
it was in the INIT state to begin with", right?
On 9/3/20 9:32 AM, Andy Lutomirski wrote:
>> Taking the config register out of the init state is illogical, as is
>> writing to SSP while the config register is in its init state.
> What's so special about the INIT state? It's optimized by XSAVES, but
> it's just a number, right? So taking the register out of the INIT
> state is kind of like saying "gdb wanted to set xmm0 to (0,0,0,1), but
> it was in the INIT state to begin with", right?
Yeah, that's a good point. The init state shouldn't be special, as the
hardware is within its right to choose not to use the init optimization
at any time.
On 9/3/2020 9:42 AM, Dave Hansen wrote:
> On 9/3/20 9:32 AM, Andy Lutomirski wrote:
>>> Taking the config register out of the init state is illogical, as is
>>> writing to SSP while the config register is in its init state.
>> What's so special about the INIT state? It's optimized by XSAVES, but
>> it's just a number, right? So taking the register out of the INIT
>> state is kind of like saying "gdb wanted to set xmm0 to (0,0,0,1), but
>> it was in the INIT state to begin with", right?
>
> Yeah, that's a good point. The init state shouldn't be special, as the
> hardware is within its right to choose not to use the init optimization
> at any time.
>
Then, I would suggest changing get_xsave_addr() to return non-null for
the INIT state case. For the other two cases, it still returns NULL.
But this also requires any write to INIT states to set xstate_bv bits
properly. This would be a pitfall for any code addition later on.
Looking at this another way. Would it be better for the debugger to get
an error and then to set the MSR directly first (vs. changing the XSAVES
INIT state first)?