2018-01-04 02:01:19

by Andi Kleen

Subject: Avoid speculative indirect calls in kernel

This is a fix for Variant 2 in
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html

Any speculative indirect call in the kernel can be tricked
into executing any kernel code, which may allow side channel
attacks that leak arbitrary kernel data.

So we want to avoid speculative indirect calls in the kernel.

There's a special code sequence called a retpoline that can
do indirect calls without speculation. We use a new compiler
option, -mindirect-branch=thunk-extern (the gcc patch will be released
separately), to recompile the kernel with this new sequence.

We also patch all the assembler code in the kernel to use
the new sequence.
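
To give a rough idea of the transformation: an indirect jump such as

	jmp	*%rax			# may be speculated through

becomes, in a simplified sketch based on the NOSPEC_JMP macro added in
patch 01 of this series (the exact code emitted by gcc may differ),

	push	%rax			# push the real branch target
	jmp	__x86.indirect_thunk	# the thunk returns to the pushed
					# target without going through the
					# indirect branch predictor

Indirect calls get an additional small trampoline around this sequence
(see NOSPEC_CALL in patch 01) so that the return address of the original
call site is still pushed first.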

The patches were originally from David Woodhouse and Tim Chen,
but then reworked and enhanced by me.

No performance numbers at this point. 32-bit is only boot-tested.

Git tree available in
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc spec/retpoline-415-2

v1: Initial post.
v2:
Add CONFIG_RETPOLINE to allow building the kernel without it.
Change warning messages.
Hide modpost warning message.


2018-01-04 02:00:49

by Andi Kleen

Subject: [PATCH v2 12/12] retpoline: Attempt to quieten objtool warning for unreachable code

From: Andi Kleen <[email protected]>

The speculative jump trampoline has to contain unreachable code.
objtool keeps complaining

arch/x86/lib/retpoline.o: warning: objtool: __x86.indirect_thunk()+0x8: unreachable instruction

I tried to fix it here by adding an ASM_UNREACHABLE annotation (after
adding support for it in pure assembler), but it still complains.
Seems like an objtool bug?

So it doesn't actually fix the warning yet.
Of course it's just a warning, so the kernel will still work fine.

Perhaps Josh can figure it out.

Cc: [email protected]
Not-Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/lib/retpoline.S | 3 +++
include/linux/compiler.h | 10 +++++++++-
2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index cb40781adbfe..f4ed556a90ee 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -11,6 +11,7 @@
*/

#include <linux/linkage.h>
+#include <linux/compiler.h>
#include <asm/dwarf2.h>
#include <asm/export.h>

@@ -21,7 +22,9 @@ ENTRY(__x86.indirect_thunk)
call retpoline_call_target
2:
lfence /* stop speculation */
+ ASM_UNREACHABLE
jmp 2b
+ ASM_UNREACHABLE
retpoline_call_target:
#ifdef CONFIG_64BIT
lea 8(%rsp), %rsp
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 52e611ab9a6c..cfba91acc79a 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -269,7 +269,15 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s

#endif /* __KERNEL__ */

-#endif /* __ASSEMBLY__ */
+#else /* __ASSEMBLY__ */
+
+#define ASM_UNREACHABLE \
+ 999: \
+ .pushsection .discard.unreachable; \
+ .long 999b - .; \
+ .popsection
+
+#endif /* !__ASSEMBLY__ */

/* Compile time object size, -1 for unknown */
#ifndef __compiletime_object_size
--
2.14.3

2018-01-04 02:00:57

by Andi Kleen

Subject: [PATCH v2 10/12] retpoline/taint: Taint kernel for missing retpoline in compiler

From: Andi Kleen <[email protected]>

When the kernel or a module hasn't been compiled with a retpoline-aware
compiler, print a warning and set a taint flag.

For modules this is checked at compile time; however, it cannot cover
assembler or other non-compiled objects used in the module link.

For lack of a better letter, it uses taint flag 'Z'.

v2: Change warning message
Signed-off-by: Andi Kleen <[email protected]>
---
Documentation/admin-guide/tainted-kernels.rst | 3 +++
arch/x86/kernel/setup.c | 6 ++++++
include/linux/kernel.h | 4 +++-
kernel/module.c | 11 ++++++++++-
kernel/panic.c | 1 +
scripts/mod/modpost.c | 9 +++++++++
6 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst
index 1df03b5cb02f..800261b6bd6f 100644
--- a/Documentation/admin-guide/tainted-kernels.rst
+++ b/Documentation/admin-guide/tainted-kernels.rst
@@ -52,6 +52,9 @@ characters, each representing a particular tainted value.

16) ``K`` if the kernel has been live patched.

+ 17) ``Z`` if the x86 kernel or a module hasn't been compiled with
+ a retpoline aware compiler and may be vulnerable to data leaks.
+
The primary reason for the **'Tainted: '** string is to tell kernel
debuggers if this is a clean kernel or if anything unusual has
occurred. Tainting is permanent: even if an offending module is
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 8af2e8d0c0a1..cc880b46b756 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1296,6 +1296,12 @@ void __init setup_arch(char **cmdline_p)
#endif

unwind_init();
+
+#ifndef RETPOLINE
+ add_taint(TAINT_NO_RETPOLINE, LOCKDEP_STILL_OK);
+ pr_warn("No support for retpoline in kernel compiler\n");
+ pr_warn("System may be vulnerable to data leaks.\n");
+#endif
}

#ifdef CONFIG_X86_32
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index ce51455e2adf..fbb4d3baffcc 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -550,7 +550,9 @@ extern enum system_states {
#define TAINT_SOFTLOCKUP 14
#define TAINT_LIVEPATCH 15
#define TAINT_AUX 16
-#define TAINT_FLAGS_COUNT 17
+#define TAINT_NO_RETPOLINE 17
+
+#define TAINT_FLAGS_COUNT 18

struct taint_flag {
char c_true; /* character printed when tainted */
diff --git a/kernel/module.c b/kernel/module.c
index dea01ac9cb74..92db3f59a29a 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3028,7 +3028,16 @@ static int check_modinfo(struct module *mod, struct load_info *info, int flags)
mod->name);
add_taint_module(mod, TAINT_OOT_MODULE, LOCKDEP_STILL_OK);
}
-
+#ifdef RETPOLINE
+ if (!get_modinfo(info, "retpoline")) {
+ if (!test_taint(TAINT_NO_RETPOLINE)) {
+ pr_warn("%s: loading module not compiled with retpoline compiler.\n",
+ mod->name);
+ pr_warn("Kernel may be vulnerable to data leaks.\n");
+ }
+ add_taint_module(mod, TAINT_NO_RETPOLINE, LOCKDEP_STILL_OK);
+ }
+#endif
if (get_modinfo(info, "staging")) {
add_taint_module(mod, TAINT_CRAP, LOCKDEP_STILL_OK);
pr_warn("%s: module is from the staging directory, the quality "
diff --git a/kernel/panic.c b/kernel/panic.c
index 2cfef408fec9..6686c67b6e4b 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -325,6 +325,7 @@ const struct taint_flag taint_flags[TAINT_FLAGS_COUNT] = {
{ 'L', ' ', false }, /* TAINT_SOFTLOCKUP */
{ 'K', ' ', true }, /* TAINT_LIVEPATCH */
{ 'X', ' ', true }, /* TAINT_AUX */
+ { 'Z', ' ', true }, /* TAINT_NO_RETPOLINE */
};

/**
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index f51cf977c65b..6510536c06df 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -2165,6 +2165,14 @@ static void add_intree_flag(struct buffer *b, int is_intree)
buf_printf(b, "\nMODULE_INFO(intree, \"Y\");\n");
}

+/* Cannot check for assembler */
+static void add_retpoline(struct buffer *b)
+{
+ buf_printf(b, "\n#ifdef RETPOLINE\n");
+ buf_printf(b, "MODULE_INFO(retpoline, \"Y\");\n");
+ buf_printf(b, "#endif\n");
+}
+
static void add_staging_flag(struct buffer *b, const char *name)
{
static const char *staging_dir = "drivers/staging";
@@ -2506,6 +2514,7 @@ int main(int argc, char **argv)
err |= check_modname_len(mod);
add_header(&buf, mod);
add_intree_flag(&buf, !external_module);
+ add_retpoline(&buf);
add_staging_flag(&buf, mod->name);
err |= add_versions(&buf, mod);
add_depends(&buf, mod, modules);
--
2.14.3

2018-01-04 02:00:55

by Andi Kleen

Subject: [PATCH v2 04/12] x86/retpoline/ftrace: Convert ftrace assembler indirect jumps

From: Andi Kleen <[email protected]>

Convert all indirect jumps in ftrace assembler code to use
non-speculative sequences.

Based on code from David Woodhouse and Tim Chen

Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/ftrace_32.S | 3 ++-
arch/x86/kernel/ftrace_64.S | 6 +++---
2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
index b6c6468e10bc..c0c10ba53375 100644
--- a/arch/x86/kernel/ftrace_32.S
+++ b/arch/x86/kernel/ftrace_32.S
@@ -8,6 +8,7 @@
#include <asm/segment.h>
#include <asm/export.h>
#include <asm/ftrace.h>
+#include <asm/jump-asm.h>

#ifdef CC_USING_FENTRY
# define function_hook __fentry__
@@ -241,5 +242,5 @@ return_to_handler:
movl %eax, %ecx
popl %edx
popl %eax
- jmp *%ecx
+ NOSPEC_JMP %ecx
#endif
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index c832291d948a..4814c4b30b8d 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -7,7 +7,7 @@
#include <asm/ptrace.h>
#include <asm/ftrace.h>
#include <asm/export.h>
-
+#include <asm/jump-asm.h>

.code64
.section .entry.text, "ax"
@@ -286,7 +286,7 @@ trace:
* ip and parent ip are used and the list function is called when
* function tracing is enabled.
*/
- call *ftrace_trace_function
+ NOSPEC_CALL ftrace_trace_function

restore_mcount_regs

@@ -329,5 +329,5 @@ GLOBAL(return_to_handler)
movq 8(%rsp), %rdx
movq (%rsp), %rax
addq $24, %rsp
- jmp *%rdi
+ NOSPEC_JMP %rdi
#endif
--
2.14.3

2018-01-04 02:01:20

by Andi Kleen

Subject: [PATCH v2 09/12] x86/retpoline: Finally enable retpoline for C code

From: Dave Hansen <[email protected]>

From: David Woodhouse <[email protected]>

Add the retpoline compile option to the Makefile.

Update the Makefile with the retpoline compile options. This requires a gcc with
the retpoline compiler patches enabled.

Print a warning when the compiler doesn't support retpoline.

[Originally from David and Tim, but hacked by AK]

v2: Use CONFIG option to enable
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Tim Chen <[email protected]>
---
arch/x86/Makefile | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 3e73bc255e4e..b02e35350244 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -230,6 +230,17 @@ KBUILD_CFLAGS += -Wno-sign-compare
#
KBUILD_CFLAGS += -fno-asynchronous-unwind-tables

+#
+ifdef CONFIG_RETPOLINE
+ RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern)
+ ifneq ($(RETPOLINE_CFLAGS),)
+ KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
+ KBUILD_AFLAGS += -DRETPOLINE
+ else
+ $(warning Retpoline not supported in compiler. System may be insecure.)
+ endif
+endif
+
archscripts: scripts_basic
$(Q)$(MAKE) $(build)=arch/x86/tools relocs

--
2.14.3

2018-01-04 02:00:54

by Andi Kleen

Subject: [PATCH v2 11/12] retpoline/objtool: Disable some objtool warnings

From: Andi Kleen <[email protected]>

With a compiler that has the indirect call thunk enabled, two objtool
warnings are triggered very frequently and make the build
very noisy.

I don't see a good way to avoid them, so just disable them
for now.

Signed-off-by: Andi Kleen <[email protected]>
---
tools/objtool/check.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 9b341584eb1b..435c71f944dc 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -503,8 +503,13 @@ static int add_call_destinations(struct objtool_file *file)
insn->call_dest = find_symbol_by_offset(insn->sec,
dest_off);
if (!insn->call_dest) {
+#if 0
+ /* Compilers with -mindirect-branch=thunk-extern trigger
+ * this everywhere on x86. Disable for now.
+ */
WARN_FUNC("can't find call dest symbol at offset 0x%lx",
insn->sec, insn->offset, dest_off);
+#endif
return -1;
}
} else if (rela->sym->type == STT_SECTION) {
@@ -1716,8 +1721,14 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,
return 1;

} else if (func && has_modified_stack_frame(&state)) {
+#if 0
+ /* Compilers with -mindirect-branch=thunk-extern trigger
+ * this everywhere on x86. Disable for now.
+ */
+
WARN_FUNC("sibling call from callable instruction with modified stack frame",
sec, insn->offset);
+#endif
return 1;
}

--
2.14.3

2018-01-04 02:00:53

by Andi Kleen

Subject: [PATCH v2 07/12] x86/retpoline/checksum32: Convert assembler indirect jumps

From: Andi Kleen <[email protected]>

Convert all indirect jumps in 32-bit checksum assembler code to use
non-speculative sequences.

Based on code from David Woodhouse and Tim Chen

Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/lib/checksum_32.S | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S
index 4d34bb548b41..d19862183d4b 100644
--- a/arch/x86/lib/checksum_32.S
+++ b/arch/x86/lib/checksum_32.S
@@ -29,6 +29,7 @@
#include <asm/errno.h>
#include <asm/asm.h>
#include <asm/export.h>
+#include <asm/jump-asm.h>

/*
* computes a partial checksum, e.g. for TCP/UDP fragments
@@ -156,7 +157,7 @@ ENTRY(csum_partial)
negl %ebx
lea 45f(%ebx,%ebx,2), %ebx
testl %esi, %esi
- jmp *%ebx
+ NOSPEC_JMP %ebx

# Handle 2-byte-aligned regions
20: addw (%esi), %ax
@@ -439,7 +440,7 @@ ENTRY(csum_partial_copy_generic)
andl $-32,%edx
lea 3f(%ebx,%ebx), %ebx
testl %esi, %esi
- jmp *%ebx
+ NOSPEC_JMP %ebx
1: addl $64,%esi
addl $64,%edi
SRC(movb -32(%edx),%bl) ; SRC(movb (%edx),%bl)
--
2.14.3

2018-01-04 02:00:51

by Andi Kleen

Subject: [PATCH v2 08/12] x86/retpoline/irq32: Convert assembler indirect jumps

From: Andi Kleen <[email protected]>

Convert all indirect jumps in 32-bit irq inline asm code to use
non-speculative sequences.

Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/irq_32.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c
index a83b3346a0e1..2dce2fdd2c4e 100644
--- a/arch/x86/kernel/irq_32.c
+++ b/arch/x86/kernel/irq_32.c
@@ -20,6 +20,7 @@
#include <linux/mm.h>

#include <asm/apic.h>
+#include <asm/jump-asm.h>

#ifdef CONFIG_DEBUG_STACKOVERFLOW

@@ -55,7 +56,7 @@ DEFINE_PER_CPU(struct irq_stack *, softirq_stack);
static void call_on_stack(void *func, void *stack)
{
asm volatile("xchgl %%ebx,%%esp \n"
- "call *%%edi \n"
+ NOSPEC_CALL("%%edi")
"movl %%ebx,%%esp \n"
: "=b" (stack)
: "0" (stack),
@@ -95,7 +96,7 @@ static inline int execute_on_irq_stack(int overflow, struct irq_desc *desc)
call_on_stack(print_stack_overflow, isp);

asm volatile("xchgl %%ebx,%%esp \n"
- "call *%%edi \n"
+ NOSPEC_CALL("%%edi")
"movl %%ebx,%%esp \n"
: "=a" (arg1), "=b" (isp)
: "0" (desc), "1" (isp),
--
2.14.3

2018-01-04 02:00:50

by Andi Kleen

Subject: [PATCH v2 02/12] x86/retpoline/crypto: Convert crypto assembler indirect jumps

From: Andi Kleen <[email protected]>

Convert all indirect jumps in crypto assembler code to use
non-speculative sequences.

Based on code from David Woodhouse and Tim Chen

Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/crypto/aesni-intel_asm.S | 5 +++--
arch/x86/crypto/camellia-aesni-avx-asm_64.S | 3 ++-
arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 2 +-
arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 3 ++-
4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index 16627fec80b2..f6c964f30ede 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -32,6 +32,7 @@
#include <linux/linkage.h>
#include <asm/inst.h>
#include <asm/frame.h>
+#include <asm/jump-asm.h>

/*
* The following macros are used to move an (un)aligned 16 byte value to/from
@@ -2884,7 +2885,7 @@ ENTRY(aesni_xts_crypt8)
pxor INC, STATE4
movdqu IV, 0x30(OUTP)

- call *%r11
+ NOSPEC_CALL %r11

movdqu 0x00(OUTP), INC
pxor INC, STATE1
@@ -2929,7 +2930,7 @@ ENTRY(aesni_xts_crypt8)
_aesni_gf128mul_x_ble()
movups IV, (IVP)

- call *%r11
+ NOSPEC_CALL %r11

movdqu 0x40(OUTP), INC
pxor INC, STATE1
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index f7c495e2863c..9006543227b2 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -17,6 +17,7 @@

#include <linux/linkage.h>
#include <asm/frame.h>
+#include <asm/jump-asm.h>

#define CAMELLIA_TABLE_BYTE_LEN 272

@@ -1227,7 +1228,7 @@ camellia_xts_crypt_16way:
vpxor 14 * 16(%rax), %xmm15, %xmm14;
vpxor 15 * 16(%rax), %xmm15, %xmm15;

- call *%r9;
+ NOSPEC_CALL %r9;

addq $(16 * 16), %rsp;

diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index eee5b3982cfd..1743e6850e00 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -1343,7 +1343,7 @@ camellia_xts_crypt_32way:
vpxor 14 * 32(%rax), %ymm15, %ymm14;
vpxor 15 * 32(%rax), %ymm15, %ymm15;

- call *%r9;
+ NOSPEC_CALL %r9;

addq $(16 * 32), %rsp;

diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index 7a7de27c6f41..969000c64456 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -45,6 +45,7 @@

#include <asm/inst.h>
#include <linux/linkage.h>
+#include <asm/jump-asm.h>

## ISCSI CRC 32 Implementation with crc32 and pclmulqdq Instruction

@@ -172,7 +173,7 @@ continue_block:
movzxw (bufp, %rax, 2), len
lea crc_array(%rip), bufp
lea (bufp, len, 1), bufp
- jmp *bufp
+ NOSPEC_JMP bufp

################################################################
## 2a) PROCESS FULL BLOCKS:
--
2.14.3

2018-01-04 02:00:48

by Andi Kleen

Subject: [PATCH v2 03/12] x86/retpoline/entry: Convert entry assembler indirect jumps

From: Andi Kleen <[email protected]>

Convert all indirect jumps in core 32/64-bit entry assembler code to use
non-speculative sequences.

Based on code from David Woodhouse and Tim Chen

Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/entry/entry_32.S | 5 +++--
arch/x86/entry/entry_64.S | 12 +++++++-----
2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index ace8f321a5a1..a4b88260d6f7 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -44,6 +44,7 @@
#include <asm/asm.h>
#include <asm/smap.h>
#include <asm/frame.h>
+#include <asm/jump-asm.h>

.section .entry.text, "ax"

@@ -290,7 +291,7 @@ ENTRY(ret_from_fork)

/* kernel thread */
1: movl %edi, %eax
- call *%ebx
+ NOSPEC_CALL %ebx
/*
* A kernel thread is allowed to return here after successfully
* calling do_execve(). Exit to userspace to complete the execve()
@@ -919,7 +920,7 @@ common_exception:
movl %ecx, %es
TRACE_IRQS_OFF
movl %esp, %eax # pt_regs pointer
- call *%edi
+ NOSPEC_CALL %edi
jmp ret_from_exception
END(common_exception)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index f048e384ff54..486990fb3e4d 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -37,6 +37,7 @@
#include <asm/pgtable_types.h>
#include <asm/export.h>
#include <asm/frame.h>
+#include <asm/jump-asm.h>
#include <linux/err.h>

#include "calling.h"
@@ -191,7 +192,7 @@ ENTRY(entry_SYSCALL_64_trampoline)
*/
pushq %rdi
movq $entry_SYSCALL_64_stage2, %rdi
- jmp *%rdi
+ NOSPEC_JMP %rdi
END(entry_SYSCALL_64_trampoline)

.popsection
@@ -269,8 +270,9 @@ entry_SYSCALL_64_fastpath:
* This call instruction is handled specially in stub_ptregs_64.
* It might end up jumping to the slow path. If it jumps, RAX
* and all argument registers are clobbered.
- */
- call *sys_call_table(, %rax, 8)
+ */
+ movq sys_call_table(, %rax, 8), %rax
+ NOSPEC_CALL %rax
.Lentry_SYSCALL_64_after_fastpath_call:

movq %rax, RAX(%rsp)
@@ -442,7 +444,7 @@ ENTRY(stub_ptregs_64)
jmp entry_SYSCALL64_slow_path

1:
- jmp *%rax /* Called from C */
+ NOSPEC_JMP %rax /* Called from C */
END(stub_ptregs_64)

.macro ptregs_stub func
@@ -521,7 +523,7 @@ ENTRY(ret_from_fork)
1:
/* kernel thread */
movq %r12, %rdi
- call *%rbx
+ NOSPEC_CALL %rbx
/*
* A kernel thread is allowed to return here after successfully
* calling do_execve(). Exit to userspace to complete the execve()
--
2.14.3

2018-01-04 02:02:54

by Andi Kleen

Subject: [PATCH v2 01/12] x86/retpoline: Define retpoline indirect thunk and macros

From: Dave Hansen <[email protected]>

From: David Woodhouse <[email protected]>

A retpoline is a special code sequence on Intel CPUs that stops speculation
for indirect branches.

Provide assembler infrastructure to use retpolines from the compiler
and from hand-written assembler. We add the out-of-line trampoline used by the
compiler, and NOSPEC_JMP / NOSPEC_CALL macros for assembler.

[Originally from David and Tim, heavily hacked by AK]

v2: Add CONFIG_RETPOLINE option
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Tim Chen <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/Kconfig | 8 +++++
arch/x86/include/asm/jump-asm.h | 70 +++++++++++++++++++++++++++++++++++++++++
arch/x86/kernel/vmlinux.lds.S | 1 +
arch/x86/lib/Makefile | 1 +
arch/x86/lib/retpoline.S | 35 +++++++++++++++++++++
5 files changed, 115 insertions(+)
create mode 100644 arch/x86/include/asm/jump-asm.h
create mode 100644 arch/x86/lib/retpoline.S

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d4fc98c50378..8b0facfa35be 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -429,6 +429,14 @@ config GOLDFISH
def_bool y
depends on X86_GOLDFISH

+config RETPOLINE
+ bool "Avoid speculative indirect branches in kernel"
+ default y
+ help
+ Compile kernel with the retpoline compiler options to guard against
+ kernel to user data leaks by avoiding speculative indirect
+ branches. Requires a new enough compiler. The kernel may run slower.
+
config INTEL_RDT
bool "Intel Resource Director Technology support"
default n
diff --git a/arch/x86/include/asm/jump-asm.h b/arch/x86/include/asm/jump-asm.h
new file mode 100644
index 000000000000..936fa620f346
--- /dev/null
+++ b/arch/x86/include/asm/jump-asm.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef JUMP_ASM_H
+#define JUMP_ASM_H 1
+
+#ifdef __ASSEMBLY__
+
+#ifdef CONFIG_RETPOLINE
+
+/*
+ * Jump to an indirect pointer without speculation.
+ *
+ * The out of line __x86.indirect_thunk has special code sequences
+ * to stop speculation.
+ */
+
+.macro NOSPEC_JMP target
+ push \target
+ jmp __x86.indirect_thunk
+.endm
+
+
+/*
+ * Call an indirect pointer without speculation.
+ */
+
+.macro NOSPEC_CALL target
+ jmp 1221f
+1222:
+ push \target
+ jmp __x86.indirect_thunk
+1221:
+ call 1222b
+.endm
+
+#else /* CONFIG_RETPOLINE */
+
+.macro NOSPEC_JMP target
+ jmp *\target
+.endm
+
+.macro NOSPEC_CALL target
+ call *\target
+.endm
+
+#endif /* !CONFIG_RETPOLINE */
+
+#else /* __ASSEMBLY__ */
+
+#ifdef CONFIG_RETPOLINE
+
+#define NOSPEC_JMP(t) \
+ "push " t "; " \
+ "jmp __x86.indirect_thunk; "
+
+#define NOSPEC_CALL(t) \
+ " jmp 1221f; " \
+ "1222: push " t ";" \
+ " jmp __x86.indirect_thunk;" \
+ "1221: call 1222b;"
+
+#else /* CONFIG_RETPOLINE */
+
+#define NOSPEC_JMP(t) "jmp *" t "; "
+#define NOSPEC_CALL(t) "call *" t "; "
+
+#endif /* !CONFIG_RETPOLINE */
+
+#endif /* !__ASSEMBLY */
+
+#endif
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 1e413a9326aa..2e64241a6664 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -103,6 +103,7 @@ SECTIONS
/* bootstrapping code */
HEAD_TEXT
. = ALIGN(8);
+ *(.text.__x86.indirect_thunk)
TEXT_TEXT
SCHED_TEXT
CPUIDLE_TEXT
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 7b181b61170e..f23934bbaf4e 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -26,6 +26,7 @@ lib-y += memcpy_$(BITS).o
lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o insn-eval.o
lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
+lib-$(CONFIG_RETPOLINE) += retpoline.o

obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
new file mode 100644
index 000000000000..cb40781adbfe
--- /dev/null
+++ b/arch/x86/lib/retpoline.S
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Out of line jump trampoline for calls that disable speculation.
+ *
+ * This is a special sequence that prevents the CPU speculating
+ * for indirect calls.
+ *
+ * This can be called by gcc generated code, or with the asm macros
+ * in asm/jump-asm.h
+ */
+
+#include <linux/linkage.h>
+#include <asm/dwarf2.h>
+#include <asm/export.h>
+
+ .section .text.__x86.indirect_thunk,"ax"
+
+ENTRY(__x86.indirect_thunk)
+ CFI_STARTPROC
+ call retpoline_call_target
+2:
+ lfence /* stop speculation */
+ jmp 2b
+retpoline_call_target:
+#ifdef CONFIG_64BIT
+ lea 8(%rsp), %rsp
+#else
+ lea 4(%esp), %esp
+#endif
+ ret
+ CFI_ENDPROC
+ENDPROC(__x86.indirect_thunk)
+
+ EXPORT_SYMBOL(__x86.indirect_thunk)
--
2.14.3

2018-01-04 02:02:53

by Andi Kleen

Subject: [PATCH v2 06/12] x86/retpoline/crypto: Convert xen assembler indirect jumps

From: Andi Kleen <[email protected]>

Convert all indirect jumps in xen inline assembler code to use
non-speculative sequences.

Based on code from David Woodhouse and Tim Chen

Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 1 +
arch/x86/include/asm/xen/hypercall.h | 3 ++-
2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index 1743e6850e00..9cd8450a2050 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -12,6 +12,7 @@

#include <linux/linkage.h>
#include <asm/frame.h>
+#include <asm/jump-asm.h>

#define CAMELLIA_TABLE_BYTE_LEN 272

diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
index 7cb282e9e587..91de35bcce5e 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -44,6 +44,7 @@
#include <asm/page.h>
#include <asm/pgtable.h>
#include <asm/smap.h>
+#include <asm/jump-asm.h>

#include <xen/interface/xen.h>
#include <xen/interface/sched.h>
@@ -217,7 +218,7 @@ privcmd_call(unsigned call,
__HYPERCALL_5ARG(a1, a2, a3, a4, a5);

stac();
- asm volatile("call *%[call]"
+ asm volatile(NOSPEC_CALL("%[call]")
: __HYPERCALL_5PARAM
: [call] "a" (&hypercall_page[call])
: __HYPERCALL_CLOBBER5);
--
2.14.3

2018-01-04 02:03:19

by Andi Kleen

Subject: [PATCH v2 05/12] x86/retpoline/hyperv: Convert assembler indirect jumps

From: Andi Kleen <[email protected]>

Convert all indirect jumps in hyperv inline asm code to use
non-speculative sequences.

Based on code from David Woodhouse and Tim Chen

Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/include/asm/mshyperv.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 5400add2885b..2dfcdd401d4c 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -7,6 +7,7 @@
#include <linux/nmi.h>
#include <asm/io.h>
#include <asm/hyperv.h>
+#include <asm/jump-asm.h>

/*
* The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent
@@ -186,7 +187,7 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
return U64_MAX;

__asm__ __volatile__("mov %4, %%r8\n"
- "call *%5"
+ NOSPEC_CALL("%5")
: "=a" (hv_status), ASM_CALL_CONSTRAINT,
"+c" (control), "+d" (input_address)
: "r" (output_address), "m" (hv_hypercall_pg)
@@ -200,7 +201,7 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
if (!hv_hypercall_pg)
return U64_MAX;

- __asm__ __volatile__("call *%7"
+ __asm__ __volatile__(NOSPEC_CALL("%7")
: "=A" (hv_status),
"+c" (input_address_lo), ASM_CALL_CONSTRAINT
: "A" (control),
@@ -227,7 +228,7 @@ static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)

#ifdef CONFIG_X86_64
{
- __asm__ __volatile__("call *%4"
+ __asm__ __volatile__(NOSPEC_CALL("%4")
: "=a" (hv_status), ASM_CALL_CONSTRAINT,
"+c" (control), "+d" (input1)
: "m" (hv_hypercall_pg)
@@ -238,7 +239,7 @@ static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
u32 input1_hi = upper_32_bits(input1);
u32 input1_lo = lower_32_bits(input1);

- __asm__ __volatile__ ("call *%5"
+ __asm__ __volatile__ (NOSPEC_CALL("%5")
: "=A"(hv_status),
"+c"(input1_lo),
ASM_CALL_CONSTRAINT
--
2.14.3

2018-01-04 02:15:10

by Brian Gerst

Subject: Re: [PATCH v2 01/12] x86/retpoline: Define retpoline indirect thunk and macros

On Wed, Jan 3, 2018 at 9:00 PM, Andi Kleen <[email protected]> wrote:
> From: Dave Hansen <[email protected]>
>
> From: David Woodhouse <[email protected]>
>
> retpoline is a special sequence on Intel CPUs to stop speculation for
> indirect branches.
>
> Provide assembler infrastructure to use retpoline by the compiler
> and for assembler. We add the out of line trampoline used by the
> compiler, and NOSPEC_JUMP / NOSPEC_CALL macros for assembler
>
> [Originally from David and Tim, heavily hacked by AK]
>
> v2: Add CONFIG_RETPOLINE option
> Signed-off-by: David Woodhouse <[email protected]>
> Signed-off-by: Tim Chen <[email protected]>
> Signed-off-by: Andi Kleen <[email protected]>
> ---
> arch/x86/Kconfig | 8 +++++
> arch/x86/include/asm/jump-asm.h | 70 +++++++++++++++++++++++++++++++++++++++++
> arch/x86/kernel/vmlinux.lds.S | 1 +
> arch/x86/lib/Makefile | 1 +
> arch/x86/lib/retpoline.S | 35 +++++++++++++++++++++
> 5 files changed, 115 insertions(+)
> create mode 100644 arch/x86/include/asm/jump-asm.h
> create mode 100644 arch/x86/lib/retpoline.S
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index d4fc98c50378..8b0facfa35be 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -429,6 +429,14 @@ config GOLDFISH
> def_bool y
> depends on X86_GOLDFISH
>
> +config RETPOLINE
> + bool "Avoid speculative indirect branches in kernel"
> + default y
> + help
> + Compile kernel with the retpoline compiler options to guard against
> + kernel to user data leaks by avoiding speculative indirect
> + branches. Requires a new enough compiler. The kernel may run slower.
> +
> config INTEL_RDT
> bool "Intel Resource Director Technology support"
> default n
> diff --git a/arch/x86/include/asm/jump-asm.h b/arch/x86/include/asm/jump-asm.h
> new file mode 100644
> index 000000000000..936fa620f346
> --- /dev/null
> +++ b/arch/x86/include/asm/jump-asm.h
> @@ -0,0 +1,70 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef JUMP_ASM_H
> +#define JUMP_ASM_H 1
> +
> +#ifdef __ASSEMBLY__
> +
> +#ifdef CONFIG_RETPOLINE
> +
> +/*
> + * Jump to an indirect pointer without speculation.
> + *
> + * The out of line __x86.indirect_thunk has special code sequences
> + * to stop speculation.
> + */
> +
> +.macro NOSPEC_JMP target
> + push \target
> + jmp __x86.indirect_thunk
> +.endm
> +
> +
> +/*
> + * Call an indirect pointer without speculation.
> + */
> +
> +.macro NOSPEC_CALL target
> + jmp 1221f
> +1222:
> + push \target
> + jmp __x86.indirect_thunk
> +1221:
> + call 1222b
> +.endm
> +
> +#else /* CONFIG_RETPOLINE */
> +
> +.macro NOSPEC_JMP target
> + jmp *\target
> +.endm
> +
> +.macro NOSPEC_CALL target
> + call *\target
> +.endm
> +
> +#endif /* !CONFIG_RETPOLINE */
> +
> +#else /* __ASSEMBLY__ */
> +
> +#ifdef CONFIG_RETPOLINE
> +
> +#define NOSPEC_JMP(t) \
> + "push " t "; " \
> + "jmp __x86.indirect_thunk; "
> +
> +#define NOSPEC_CALL(t) \
> + " jmp 1221f; " \
> + "1222: push " t ";" \
> + " jmp __x86.indirect_thunk;" \
> + "1221: call 1222b;"
> +
> +#else /* CONFIG_RETPOLINE */
> +
> +#define NOSPEC_JMP(t) "jmp *" t "; "
> +#define NOSPEC_CALL(t) "call *" t "; "
> +
> +#endif /* !CONFIG_RETPOLINE */
> +
> +#endif /* !__ASSEMBLY */
> +
> +#endif
> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 1e413a9326aa..2e64241a6664 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -103,6 +103,7 @@ SECTIONS
> /* bootstrapping code */
> HEAD_TEXT
> . = ALIGN(8);
> + *(.text.__x86.indirect_thunk)
> TEXT_TEXT
> SCHED_TEXT
> CPUIDLE_TEXT
> diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
> index 7b181b61170e..f23934bbaf4e 100644
> --- a/arch/x86/lib/Makefile
> +++ b/arch/x86/lib/Makefile
> @@ -26,6 +26,7 @@ lib-y += memcpy_$(BITS).o
> lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
> lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o insn-eval.o
> lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
> +lib-$(CONFIG_RETPOLINE) += retpoline.o
>
> obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o
>
> diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
> new file mode 100644
> index 000000000000..cb40781adbfe
> --- /dev/null
> +++ b/arch/x86/lib/retpoline.S
> @@ -0,0 +1,35 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * Out of line jump trampoline for calls that disable speculation.
> + *
> + * This is a special sequence that prevents the CPU speculating
> + * for indirect calls.
> + *
> + * This can be called by gcc generated code, or with the asm macros
> + * in asm/jump-asm.h
> + */
> +
> +#include <linux/linkage.h>
> +#include <asm/dwarf2.h>
> +#include <asm/export.h>
> +
> + .section .text.__x86.indirect_thunk,"ax"
> +
> +ENTRY(__x86.indirect_thunk)
> + CFI_STARTPROC
> + call retpoline_call_target
> +2:
> + lfence /* stop speculation */
> + jmp 2b
> +retpoline_call_target:
> +#ifdef CONFIG_64BIT
> + lea 8(%rsp), %rsp
> +#else
> + lea 4(%esp), %esp
> +#endif
> + ret
> + CFI_ENDPROC
> +ENDPROC(__x86.indirect_thunk)
> +
> + EXPORT_SYMBOL(__x86.indirect_thunk)
> --
> 2.14.3
>

Can someone actually explain WTF this mess is trying to accomplish?

--
Brian Gerst

2018-01-04 02:32:56

by Alan Cox

Subject: Re: [PATCH v2 01/12] x86/retpoline: Define retpoline indirect thunk and macros

> > +ENTRY(__x86.indirect_thunk)
> > + CFI_STARTPROC
> > + call retpoline_call_target
> > +2:
> > + lfence /* stop speculation */
> > + jmp 2b
> > +retpoline_call_target:
> > +#ifdef CONFIG_64BIT
> > + lea 8(%rsp), %rsp
> > +#else
> > + lea 4(%esp), %esp
> > +#endif
> > + ret
> > + CFI_ENDPROC
> > +ENDPROC(__x86.indirect_thunk)
> > +
> > + EXPORT_SYMBOL(__x86.indirect_thunk)
> > --
> > 2.14.3
> >
>
> Can someone actually explain WTF this mess is trying to accomplish?

Think of it as an 'indirect call that doesn't speculate' instruction.
There isn't one in the processor but this specific sequence happens to
make the micro-architecture do just that as efficiently as possible.

What it's actually doing on the non-speculated path (i.e. the reachable
code) is to call, put the address we want to hit over the existing return
address, and then return to the address we want to indirectly go to.

It's faster than doing a far branch or flushing branch predictors and the
like.
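
To spell out the mechanism, here is the same thunk with annotations
added for illustration (the comments are not part of the patch; the
64-bit path is shown and the CFI directives are omitted):

ENTRY(__x86.indirect_thunk)
	# The caller (NOSPEC_JMP / NOSPEC_CALL, or compiler-generated
	# code) has already pushed the real branch target on the stack.
	call	retpoline_call_target	# pushes the address of 2: as a
					# return address and trains the
					# return predictor to expect it
2:
	lfence				# speculative execution of the ret
	jmp	2b			# below lands here and spins
					# harmlessly
retpoline_call_target:
	lea	8(%rsp), %rsp		# drop the return address pushed by
					# the call above, exposing the real
					# target on top of the stack
	ret				# architecturally jumps to the real
					# target; the predictor thinks we
					# return to 2: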

Alan

2018-01-04 06:48:20

by Juergen Gross

Subject: Re: [PATCH v2 06/12] x86/retpoline/crypto: Convert xen assembler indirect jumps

On 04/01/18 03:00, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
>
> Convert all indirect jumps in xen inline assembler code to use
> non speculative sequences.
>
> Based on code from David Woodhouse and Tim Chen
>
> Signed-off-by: Andi Kleen <[email protected]>
> ---
> arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 1 +
> arch/x86/include/asm/xen/hypercall.h | 3 ++-
> 2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
> index 1743e6850e00..9cd8450a2050 100644
> --- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
> +++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
> @@ -12,6 +12,7 @@
>
> #include <linux/linkage.h>
> #include <asm/frame.h>
> +#include <asm/jump-asm.h>

I fail to connect this change to the patch title.

Maybe should be part of the crypto patch?

>
> #define CAMELLIA_TABLE_BYTE_LEN 272
>
> diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
> index 7cb282e9e587..91de35bcce5e 100644
> --- a/arch/x86/include/asm/xen/hypercall.h
> +++ b/arch/x86/include/asm/xen/hypercall.h
> @@ -44,6 +44,7 @@
> #include <asm/page.h>
> #include <asm/pgtable.h>
> #include <asm/smap.h>
> +#include <asm/jump-asm.h>
>
> #include <xen/interface/xen.h>
> #include <xen/interface/sched.h>
> @@ -217,7 +218,7 @@ privcmd_call(unsigned call,
> __HYPERCALL_5ARG(a1, a2, a3, a4, a5);
>
> stac();
> - asm volatile("call *%[call]"
> + asm volatile(NOSPEC_CALL("%[call]")
> : __HYPERCALL_5PARAM
> : [call] "a" (&hypercall_page[call])
> : __HYPERCALL_CLOBBER5);
>

Acked-by: Juergen Gross <[email protected]>


Juergen

2018-01-04 06:50:12

by Andi Kleen

Subject: Re: [PATCH v2 06/12] x86/retpoline/crypto: Convert xen assembler indirect jumps

> > diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
> > index 1743e6850e00..9cd8450a2050 100644
> > --- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
> > +++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
> > @@ -12,6 +12,7 @@
> >
> > #include <linux/linkage.h>
> > #include <asm/frame.h>
> > +#include <asm/jump-asm.h>
>
> I fail to connect this change to the patch title.
>
> Maybe should be part of the crypto patch?

Right I moved the hunk into the wrong patch. Will fix.

-Andi

2018-01-04 08:43:10

by David Woodhouse

Subject: Re: [PATCH v2 03/12] x86/retpoline/entry: Convert entry assembler indirect jumps

On Wed, 2018-01-03 at 18:00 -0800, Andi Kleen wrote:
> @@ -269,8 +270,9 @@ entry_SYSCALL_64_fastpath:
>          * This call instruction is handled specially in stub_ptregs_64.
>          * It might end up jumping to the slow path.  If it jumps, RAX
>          * and all argument registers are clobbered.
> -        */
> -       call    *sys_call_table(, %rax, 8)
> +       */
> +       movq    sys_call_table(, %rax, 8), %rax
> +       NOSPEC_CALL     %rax
>  .Lentry_SYSCALL_64_after_fastpath_call:
>  
>         movq    %rax, RAX(%rsp)

Now I *know* you're working from an older version of my patches and
haven't just deliberately reverted the CET support (and alternatives?).

I fixed that mis-indentation of the closing */ and Intel were even
shipping that version in the latest pre-embargo patch tarball.

Should I post a new series and call it 'v3'? Was there anything
specific you changed other than the cosmetics like
s/CALL_THUNK/NOSPEC_CALL/ ? And obviously the fact that it stands alone
instead of being based on the IBRS microcode support?



2018-01-04 11:49:22

by Pavel Machek

Subject: Re: Avoid speculative indirect calls in kernel

Hi!

> This is a fix for Variant 2 in
> https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
>
> Any speculative indirect calls in the kernel can be tricked
> to execute any kernel code, which may allow side channel
> attacks that can leak arbitrary kernel data.

Ok.

> So we want to avoid speculative indirect calls in the kernel.
>
> There's a special code sequence called a retpoline that can
> do indirect calls without speculation. We use a new compiler
> option -mindirect-branch=thunk-extern (gcc patch will be released
> separately) to recompile the kernel with this new sequence.

So... this "retpoline" code is quite tricky; I guess it does the right
thing on recent Intel CPUs. Does it also do the right thing on all the AMD,
Cyrix, ... variants?

Is it necessary on all the CPUs? I guess a 486 does not need this?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-01-04 12:09:34

by Alan Cox

Subject: Re: Avoid speculative indirect calls in kernel

On Thu, 4 Jan 2018 12:49:17 +0100
Pavel Machek <[email protected]> wrote:

> Hi!
>
> > This is a fix for Variant 2 in
> > https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
> >
> > Any speculative indirect calls in the kernel can be tricked
> > to execute any kernel code, which may allow side channel
> > attacks that can leak arbitrary kernel data.
>
> Ok.
>
> > So we want to avoid speculative indirect calls in the kernel.
> >
> > There's a special code sequence called a retpoline that can
> > do indirect calls without speculation. We use a new compiler
> > option -mindirect-branch=thunk-extern (gcc patch will be released
> > separately) to recompile the kernel with this new sequence.
>
> So... this "retpoline" code is quite tricky; I guess it does the right
> on recent Intel CPUs. Does it also do the right thing on all the AMD,
> Cyrix, ... variants?
>
> Is it neccessary on all the CPUs? I guess 486 does not need this?

It's architecturally valid x86 code so it should work on any x86-32
processor.

You are correct, older processors and some of the newer ones don't need
it. AMD and VIA I don't know about, but for Intel we can probably avoid it
on older Atom, on Quark, and the really old CPUs nobody actually runs any
more.

That's all an optimization once we have correctness.

Alan


2018-01-04 13:32:17

by Pavel Machek

Subject: Re: Avoid speculative indirect calls in kernel

On Thu 2018-01-04 12:09:14, Alan Cox wrote:
> On Thu, 4 Jan 2018 12:49:17 +0100
> Pavel Machek <[email protected]> wrote:
>
> > Hi!
> >
> > > This is a fix for Variant 2 in
> > > https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
> > >
> > > Any speculative indirect calls in the kernel can be tricked
> > > to execute any kernel code, which may allow side channel
> > > attacks that can leak arbitrary kernel data.
> >
> > Ok.
> >
> > > So we want to avoid speculative indirect calls in the kernel.
> > >
> > > There's a special code sequence called a retpoline that can
> > > do indirect calls without speculation. We use a new compiler
> > > option -mindirect-branch=thunk-extern (gcc patch will be released
> > > separately) to recompile the kernel with this new sequence.
> >
> > So... this "retpoline" code is quite tricky; I guess it does the right
> > on recent Intel CPUs. Does it also do the right thing on all the AMD,
> > Cyrix, ... variants?
> >
> > Is it neccessary on all the CPUs? I guess 486 does not need this?
>
> It's architecturally valid x86 code so it should work on any x86-32
> processor.
>
> You are correct older processors and some of the newer ones don't need
> it. AMD and VIA I don't know about but for Intel we can probably avoid it
> on older Atom, on Quark, and the really old CPUs nobody actually runs any
> more.

Ok. I guess I'd like to see a comment in the file explaining that.

> That's all an optimization once we have correctness.

Well, correctness first sounds like a good idea. OTOH for the Spectre
problem, it seems like a fix would be a microcode update to flush branch
predictor caches on privilege and context switches. That should make
it impractical to exploit on all the systems, not just the latest Linux
compiled by the latest gcc, right?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-01-04 14:38:20

by Josh Poimboeuf

Subject: Re: [PATCH v2 11/12] retpoline/objtool: Disable some objtool warnings

On Wed, Jan 03, 2018 at 06:00:18PM -0800, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
>
> With the indirect call thunk enabled compiler two objtool
> warnings are triggered very frequently and make the build
> very noisy.
>
> I don't see a good way to avoid them, so just disable them
> for now.
>
> Signed-off-by: Andi Kleen <[email protected]>
>
> ---
> tools/objtool/check.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> index 9b341584eb1b..435c71f944dc 100644
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -503,8 +503,13 @@ static int add_call_destinations(struct objtool_file *file)
> insn->call_dest = find_symbol_by_offset(insn->sec,
> dest_off);
> if (!insn->call_dest) {
> +#if 0
> + /* Compilers with -mindirect-branch=thunk-extern trigger
> + * this everywhere on x86. Disable for now.
> + */
> WARN_FUNC("can't find call dest symbol at offset 0x%lx",
> insn->sec, insn->offset, dest_off);
> +#endif
> return -1;
> }
> } else if (rela->sym->type == STT_SECTION) {
> @@ -1716,8 +1721,14 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,
> return 1;
>
> } else if (func && has_modified_stack_frame(&state)) {
> +#if 0
> + /* Compilers with -mindirect-branch=thunk-extern trigger
> + * this everywhere on x86. Disable for now.
> + */
> +
> WARN_FUNC("sibling call from callable instruction with modified stack frame",
> sec, insn->offset);
> +#endif
> return 1;
> }
>
> --
> 2.14.3

NAK. We can't blindly disable objtool warnings, that will break
livepatch and the ORC unwinder. If you share a .o file (or the GCC
code) I can look at adding retpoline support.

--
Josh

2018-01-04 14:46:38

by David Woodhouse

Subject: Re: [PATCH v2 11/12] retpoline/objtool: Disable some objtool warnings

On Thu, 2018-01-04 at 08:38 -0600, Josh Poimboeuf wrote:
>
> NAK.  We can't blindly disable objtool warnings, that will break
> livepatch and the ORC unwinder.  If you share a .o file (or the GCC
> code) I can look at adding retpoline support.

http://git.infradead.org/users/dwmw2/gcc-retpoline.git/shortlog/refs/heads/gcc-7_2_0-retpoline-20171219

I put packages for Fedora 27 at ftp://ftp.infradead.org/pub/retpoline/



2018-01-04 15:59:27

by Andi Kleen

Subject: Re: [PATCH v2 11/12] retpoline/objtool: Disable some objtool warnings

> NAK. We can't blindly disable objtool warnings, that will break
> livepatch and the ORC unwinder. If you share a .o file (or the GCC
> code) I can look at adding retpoline support.

I don't think we can wait for that. We can disable livepatch and the
unwinder for now. They are not essential. Frame pointers should work
well enough for unwinding and afaik nobody can use livepatch in mainline anyways.

-Andi

2018-01-04 16:06:09

by Josh Poimboeuf

Subject: Re: [PATCH v2 11/12] retpoline/objtool: Disable some objtool warnings

On Thu, Jan 04, 2018 at 07:59:14AM -0800, Andi Kleen wrote:
> > NAK. We can't blindly disable objtool warnings, that will break
> > livepatch and the ORC unwinder. If you share a .o file (or the GCC
> > code) I can look at adding retpoline support.
>
> I don't think we can wait for that. We can disable livepatch and the
> unwinder for now. They are not essential. Frame pointers should work
> well enough for unwinding

If you want to make this feature conflict with livepatch and ORC,
silencing objtool warnings is not the way to do it.

> and afaik nobody can use livepatch in mainline anyways.

Why not? The patch creation tooling is still out-of-tree, but livepatch
itself is fully supported in mainline.

--
Josh

2018-01-04 16:13:11

by Andi Kleen

Subject: Re: [PATCH v2 11/12] retpoline/objtool: Disable some objtool warnings

On Thu, Jan 04, 2018 at 10:06:01AM -0600, Josh Poimboeuf wrote:
> On Thu, Jan 04, 2018 at 07:59:14AM -0800, Andi Kleen wrote:
> > > NAK. We can't blindly disable objtool warnings, that will break
> > > livepatch and the ORC unwinder. If you share a .o file (or the GCC
> > > code) I can look at adding retpoline support.
> >
> > I don't think we can wait for that. We can disable livepatch and the
> > unwinder for now. They are not essential. Frame pointers should work
> > well enough for unwinding
>
> If you want to make this feature conflict with livepatch and ORC,
> silencing objtool warnings is not the way to do it.

I don't see why it would conflict with the unwinder anyways?

It doesn't change the long term stack state, so it should be invisible to the
unwinder (unless you crash in the thunk, which is very unlikely)

I actually got some unwinder backtraces during development and they seemed
to work.

>
> > and afaik nobody can use livepatch in mainline anyways.
>
> Why not? The patch creation tooling is still out-of-tree, but livepatch
> itself is fully supported in mainline.

Ok.

Still doesn't seem critical at this point if it's some out of tree
thing.

-Andi

2018-01-04 16:33:01

by Josh Poimboeuf

Subject: Re: [PATCH v2 11/12] retpoline/objtool: Disable some objtool warnings

On Thu, Jan 04, 2018 at 08:13:08AM -0800, Andi Kleen wrote:
> On Thu, Jan 04, 2018 at 10:06:01AM -0600, Josh Poimboeuf wrote:
> > On Thu, Jan 04, 2018 at 07:59:14AM -0800, Andi Kleen wrote:
> > > > NAK. We can't blindly disable objtool warnings, that will break
> > > > livepatch and the ORC unwinder. If you share a .o file (or the GCC
> > > > code) I can look at adding retpoline support.
> > >
> > > I don't think we can wait for that. We can disable livepatch and the
> > > unwinder for now. They are not essential. Frame pointers should work
> > > well enough for unwinding
> >
> > If you want to make this feature conflict with livepatch and ORC,
> > silencing objtool warnings is not the way to do it.
>
> I don't see why it would conflict with the unwinder anyways?
>
> It doesn't change the long term stack state, so it should be invisible to the
> unwinder (unless you crash in the thunk, which is very unlikely)
>
> I actually got some unwinder backtraces during development and they seemed
> to work.

Those objtool warnings are places where ORC annotations are either
missing or wrong.

At the very least, this needs to conflict with HAVE_RELIABLE_STACKTRACE
and HAVE_STACK_VALIDATION until objtool can understand the new code.
Currently ORC relies on HAVE_STACK_VALIDATION, so CONFIG_UNWINDER_ORC
would need to be disabled as well.

> > > and afaik nobody can use livepatch in mainline anyways.
> >
> > Why not? The patch creation tooling is still out-of-tree, but livepatch
> > itself is fully supported in mainline.
>
> Ok.
>
> Still doesn't seem critical at this point if it's some out of tree
> thing.

There are many livepatch users out there who would disagree. The
out-of-tree bits aren't in kernel space. If your patches are ready
before objtool supports them, then fine, make them conflict with
objtool. But please don't introduce silent breakage.

Either way we'll need to figure out a way to get objtool support ASAP.

--
Josh

2018-01-04 17:35:39

by Josh Poimboeuf

Subject: Re: [PATCH v2 11/12] retpoline/objtool: Disable some objtool warnings

On Thu, Jan 04, 2018 at 10:32:52AM -0600, Josh Poimboeuf wrote:
> Either way we'll need to figure out a way to get objtool support ASAP.

BTW, I got dwmw2's GCC patches but I'm about to disappear for a few days
so it'll probably be next week before I get a chance to look at this.

--
Josh

2018-01-05 12:29:46

by James Harvey

Subject: Re: Avoid speculative indirect calls in kernel

On Fri, Jan 5, 2018 at 5:40 AM, Woodhouse, David <[email protected]> wrote:
> On Thu, 2018-01-04 at 21:01 -0500, james harvey wrote:
>>
>>
>> I understand the GCC patches being discussed will fix the
>> vulnerability because newly compiled kernels will be compiled with a
>> GCC with these patches.
>
> The GCC patches work by eliminating all indirect branches, thus
> avoiding 'variant 2' of the three problems which have been discovered.
>
> Note that we also need to eliminate all the indirect branches which
> occurred in native assembler code too, and provide the 'thunks' that
> GCC uses instead, which is why there's a series of kernel patches to go
> with it.
>
> But building a kernel this way is *only* sufficient to protect the
> kernel. Attacks between userspace processes are still possible — you
> need the updated microcode, with branch-predictor flushes/restrictions,
> to protect existing userspace processes from each other.
>
>> But, are the GCC patches being discussed also expected to fix the
>> vulnerability because user binaries will be compiled using them?
>
> It would be possible to do that. Sensitive userspace processes could be
> built this way, rendering them invulnerable to 'variant 2' attacks
> without the kernel having to use the microcode features.
>
>> In such case, a binary could be maliciously changed back, or a custom GCC
>> made with the patches reverted.
>
> If the attacker can replace the sensitive binary, or replace the
> compiler with which the sensitive binary was compiled, then we have
> other problems. I'm not going to lose sleep over that.
>
>
> Note that *none* of this addresses 'variant 1'. There's a separate
> patch series which addresses likely 'variant 1 gadgets' in the kernel,
> which I haven't seen posted in public yet. And I'm not sure what we do
> about that for userspace except extending the existing Coverity ruleset
> and teaching GCC to emit barriers automatically in the right places,
> which is a bit far-fetched right now. Elena?

I could have written more clearly. What I'm getting at is whether any of
the GCC patches are intended to prevent an exploit from even being
attempted, rather than making binaries immune from attacks. So I
don't mean being able to modify a sensitive binary or the system's
compiler. I mean someone who has non-root SSH access, compiles GCC with
whatever patches end up fixing variants 1-3 reverted, or uses an old
version, leaves their custom-compiled GCC in their user directory to
compile malicious code, and executes the compiled malicious code from
their user directory. (Or compiles a malicious program on their own
system, compatible in architecture, kernel and library versions, using
a GCC without the new patches, and just copies the binary over to their
user directory to execute.) Or a malicious customer with a VM on a shared
machine, who has root access within their VM, reverting the GCC patches
while attempting to see into the host or other VMs on the machine.

2018-01-12 08:20:43

by Dr. Greg Wettstein

Subject: Re: Avoid speculative indirect calls in kernel

On Jan 5, 12:12pm, Alan Cox wrote:
} Subject: Re: Avoid speculative indirect calls in kernel

Good morning to everyone, a bit behind on mail given everything which
has been going on.

> On Fri, 5 Jan 2018 01:54:13 +0100 (CET)
> Thomas Gleixner <[email protected]> wrote:
>
> > On Thu, 4 Jan 2018, Jon Masters wrote:
> > > P.S. I've an internal document where I've been tracking "nice to haves"
> > > for later, and one of them is whether it makes sense to tag binaries as
> > > "trusted" (e.g. extended attribute, label, whatever). It was something I
> > > wanted to bring up at some point as potentially worth considering.
> >
> > Scratch that. There is no such thing as a trusted binary.

> There is if you are using signing and the like. I'm sure SELinux and
> friends will grow the ability to set per process policy but that's
> certainly not a priority.
>
> However the question is wrong. 'trusted' is a binary operator not a
> unary one.

Alan's observations are correct.

In our autonomous introspection work we apply the notion that
'trusted' is a binary characteristic of a context of execution (COE).
Its value is an expression of whether or not the information exchange
events it has been involved in have deviated from the desired
execution trajectory path of the system.

It is a decidedly different way of thinking about things. Most
importantly it is a namespaceable characteristic.

We have already written the futuristic LSM that Alan alludes to in
order to implement per-COE security policies and forensics for
actors/COEs that have gone over to the 'dark side'.

> Alan

Have a good weekend.

Dr. Greg

}-- End of excerpt from Alan Cox

As always,
Dr. G.W. Wettstein, Ph.D. Enjellic Systems Development, LLC.
4206 N. 19th Ave. Specializing in information infra-structure
Fargo, ND 58102 development.
PH: 701-281-1686
FAX: 701-281-3949 EMAIL: [email protected]
------------------------------------------------------------------------------
"Given a choice between a complex, difficult-to-understand,
disconcerting explanation and a simplistic, comforting one, many
prefer simplistic comfort if it's remotely plausible, especially if it
involves blaming someone else for their problems."
-- Bob Lewis
_Infoworld_

2018-02-23 21:12:18

by Ywe Cærlyn

Subject: Re: Avoid speculative indirect calls in kernel

Patchmeister Torvalds:

"Or is Intel basically saying "we are committed to selling you shit
forever and ever, and never fixing anything"?"

Back in the Celeron days, Intel was popular because you could clock the
lesser-cached Celeron 300MHz to ~500MHz.

Everybody knew then not to get anything pricier. But still Intel sells
Xeons at 3x the price, for little noticeable gain?

Basically I did research on philosophy, and indeed the main concept of a
culture is what determines a culture's behaviour. Even the internal
design of the CPU seems to be inspired by "God".

I have tried the zén-realized version "Zün" instead, God of absolute
reality. Because regressions in philosophy are regressions in computing,
which is ultimately Bill Gates talking about fecal water.

--
Fredelige hilsener,
Ywe Cærlyn,