2018-01-26 12:13:38

by Borislav Petkov

Subject: [PATCH 0/4] x86: Some cleanups

From: Borislav Petkov <[email protected]>

Here's some of the stuff we were talking about yesterday in the form of
patches. It builds and boots fine with the ratpoison compiler.

Please check whether the alignment directives are correct in the macro
in the third patch.

Thx.

Borislav Petkov (4):
x86/alternative: Print unadorned pointers
x86/nospec: Fix header guards names
x86/retpoline: Simplify vmexit_fill_RSB()
x86/bugs: Drop one "mitigation" from dmesg

arch/x86/include/asm/nospec-branch.h | 45 ++++++++++++++++++++++++++----------
arch/x86/include/asm/processor.h | 5 ++++
arch/x86/kernel/alternative.c | 14 +++++------
arch/x86/kernel/cpu/bugs.c | 2 +-
arch/x86/lib/Makefile | 1 +
arch/x86/lib/retpoline.S | 5 ++++
6 files changed, 52 insertions(+), 20 deletions(-)

--
2.13.0



2018-01-26 12:12:41

by Borislav Petkov

Subject: [PATCH 3/4] x86/retpoline: Simplify vmexit_fill_RSB()

From: Borislav Petkov <[email protected]>

Simplify it to call an asm-function instead of pasting 41 insn bytes at
every call site. Also, add alignment to the macro as suggested here:

https://support.google.com/faqs/answer/7625886

Signed-off-by: Borislav Petkov <[email protected]>
Cc: David Woodhouse <[email protected]>
---
arch/x86/include/asm/nospec-branch.h | 39 +++++++++++++++++++++++++++---------
arch/x86/include/asm/processor.h | 5 +++++
arch/x86/lib/Makefile | 1 +
arch/x86/lib/retpoline.S | 5 +++++
4 files changed, 41 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 33a35daf6c4d..61d4d7033758 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -53,6 +53,8 @@

#ifdef __ASSEMBLY__

+#include <asm/bitsperlong.h>
+
/*
* This should be used immediately before a retpoline alternative. It tells
* objtool where the retpolines are so that it can make sense of the control
@@ -121,6 +123,30 @@
#endif
.endm

+/* Same as above but with alignment additionally */
+.macro ___FILL_RETURN_BUFFER reg:req nr:req sp:req
+ mov $(\nr / 2), \reg
+ .align 16
+771:
+ call 772f
+773: /* speculation trap */
+ pause
+ lfence
+ jmp 773b
+ .align 16
+772:
+ call 774f
+775: /* speculation trap */
+ pause
+ lfence
+ jmp 775b
+ .align 16
+774:
+ dec \reg
+ jnz 771b
+ add $((BITS_PER_LONG/8) * \nr), \sp
+.endm
+
/*
* A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
* monstrosity above, manually.
@@ -206,15 +232,10 @@ extern char __indirect_thunk_end[];
static inline void vmexit_fill_RSB(void)
{
#ifdef CONFIG_RETPOLINE
- unsigned long loops;
-
- asm volatile (ANNOTATE_NOSPEC_ALTERNATIVE
- ALTERNATIVE("jmp 910f",
- __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)),
- X86_FEATURE_RETPOLINE)
- "910:"
- : "=r" (loops), ASM_CALL_CONSTRAINT
- : : "memory" );
+ alternative_input("",
+ "call __fill_rsb_clobber_ax",
+ X86_FEATURE_RETPOLINE,
+ ASM_NO_INPUT_CLOBBER(_ASM_AX, "memory"));
#endif
}

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index d3a67fba200a..491f6e0be66e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -971,4 +971,9 @@ bool xen_set_default_idle(void);

void stop_this_cpu(void *dummy);
void df_debug(struct pt_regs *regs, long error_code);
+
+#ifdef CONFIG_RETPOLINE
+asmlinkage void __fill_rsb_clobber_ax(void);
+#endif
+
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index f23934bbaf4e..69a473919260 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -27,6 +27,7 @@ lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o insn-eval.o
lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
lib-$(CONFIG_RETPOLINE) += retpoline.o
+OBJECT_FILES_NON_STANDARD_retpoline.o :=y

obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index c909961e678a..297b0fd2ad10 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -46,3 +46,8 @@ GENERATE_THUNK(r13)
GENERATE_THUNK(r14)
GENERATE_THUNK(r15)
#endif
+
+ENTRY(__fill_rsb_clobber_ax)
+ ___FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, %_ASM_SP
+END(__fill_rsb_clobber_ax)
+EXPORT_SYMBOL_GPL(__fill_rsb_clobber_ax)
--
2.13.0
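
(A worked reading of the ___FILL_RETURN_BUFFER stack fixup above,
assuming x86-64 where BITS_PER_LONG is 64:)

    /*
     * Each pass through the loop issues two calls, and the loop runs
     * \nr/2 times, so \nr return addresses of 8 bytes each pile up on
     * the real stack -- neither call target ever returns, both being
     * speculation traps. The final
     *
     *     add $((BITS_PER_LONG/8) * \nr), \sp
     *
     * advances the stack pointer by 8 * 32 = 256 bytes for
     * RSB_CLEAR_LOOPS, discarding them so the caller's return address
     * is back on top of the stack when the macro ends.
     */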


2018-01-26 12:13:02

by Borislav Petkov

Subject: [PATCH 4/4] x86/bugs: Drop one "mitigation" from dmesg

From: Borislav Petkov <[email protected]>

Make

[ 0.031118] Spectre V2 mitigation: Mitigation: Full generic retpoline

into

[ 0.031118] Spectre V2: Mitigation: Full generic retpoline

to get rid of the doubled "mitigation" in the string.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/cpu/bugs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 390b3dc3d438..94ac6f0165c0 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -90,7 +90,7 @@ static const char *spectre_v2_strings[] = {
};

#undef pr_fmt
-#define pr_fmt(fmt) "Spectre V2 mitigation: " fmt
+#define pr_fmt(fmt) "Spectre V2: " fmt

static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;

--
2.13.0


2018-01-26 12:13:40

by Borislav Petkov

Subject: [PATCH 2/4] x86/nospec: Fix header guards names

From: Borislav Petkov <[email protected]>

... to adhere to the _ASM_X86_ naming scheme.

No functional change.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/nospec-branch.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 4ad41087ce0e..33a35daf6c4d 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */

-#ifndef __NOSPEC_BRANCH_H__
-#define __NOSPEC_BRANCH_H__
+#ifndef _ASM_X86_NOSPEC_BRANCH_H_
+#define _ASM_X86_NOSPEC_BRANCH_H_

#include <asm/alternative.h>
#include <asm/alternative-asm.h>
@@ -219,4 +219,4 @@ static inline void vmexit_fill_RSB(void)
}

#endif /* __ASSEMBLY__ */
-#endif /* __NOSPEC_BRANCH_H__ */
+#endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
--
2.13.0


2018-01-26 12:14:42

by Borislav Petkov

Subject: [PATCH 1/4] x86/alternative: Print unadorned pointers

From: Borislav Petkov <[email protected]>

After commit

ad67b74d2469 ("printk: hash addresses printed with %p")

pointers are being hashed when printed. However, this makes the
alternative debug output completely useless. Switch to %px in order to
see the unadorned kernel pointers.
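
For illustration, a sketch of the difference (the pointer values are
made up; %p prints a per-boot hash, %px the raw address):

    pr_info("%p\n",  addr);   /* e.g. 00000000d6a61ff4 -- hashed    */
    pr_info("%px\n", addr);   /* e.g. ffffffff81234567 -- unadorned */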

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/alternative.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 4817d743c263..30571fdaaf6f 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -298,7 +298,7 @@ recompute_jump(struct alt_instr *a, u8 *orig_insn, u8 *repl_insn, u8 *insnbuf)
tgt_rip = next_rip + o_dspl;
n_dspl = tgt_rip - orig_insn;

- DPRINTK("target RIP: %p, new_displ: 0x%x", tgt_rip, n_dspl);
+ DPRINTK("target RIP: %px, new_displ: 0x%x", tgt_rip, n_dspl);

if (tgt_rip - orig_insn >= 0) {
if (n_dspl - 2 <= 127)
@@ -355,7 +355,7 @@ static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *ins
add_nops(instr + (a->instrlen - a->padlen), a->padlen);
local_irq_restore(flags);

- DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
+ DUMP_BYTES(instr, a->instrlen, "%px: [%d:%d) optimized NOPs: ",
instr, a->instrlen - a->padlen, a->padlen);
}

@@ -376,7 +376,7 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
u8 *instr, *replacement;
u8 insnbuf[MAX_PATCH_LEN];

- DPRINTK("alt table %p -> %p", start, end);
+ DPRINTK("alt table %px, -> %px", start, end);
/*
* The scan order should be from start to end. A later scanned
* alternative code can overwrite previously scanned alternative code.
@@ -400,14 +400,14 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
continue;
}

- DPRINTK("feat: %d*32+%d, old: (%p, len: %d), repl: (%p, len: %d), pad: %d",
+ DPRINTK("feat: %d*32+%d, old: (%px len: %d), repl: (%px, len: %d), pad: %d",
a->cpuid >> 5,
a->cpuid & 0x1f,
instr, a->instrlen,
replacement, a->replacementlen, a->padlen);

- DUMP_BYTES(instr, a->instrlen, "%p: old_insn: ", instr);
- DUMP_BYTES(replacement, a->replacementlen, "%p: rpl_insn: ", replacement);
+ DUMP_BYTES(instr, a->instrlen, "%px: old_insn: ", instr);
+ DUMP_BYTES(replacement, a->replacementlen, "%px: rpl_insn: ", replacement);

memcpy(insnbuf, replacement, a->replacementlen);
insnbuf_sz = a->replacementlen;
@@ -433,7 +433,7 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
a->instrlen - a->replacementlen);
insnbuf_sz += a->instrlen - a->replacementlen;
}
- DUMP_BYTES(insnbuf, insnbuf_sz, "%p: final_insn: ", instr);
+ DUMP_BYTES(insnbuf, insnbuf_sz, "%px: final_insn: ", instr);

text_poke_early(instr, insnbuf, insnbuf_sz);
}
--
2.13.0


2018-01-26 12:34:38

by David Woodhouse

Subject: Re: [PATCH 3/4] x86/retpoline: Simplify vmexit_fill_RSB()

On Fri, 2018-01-26 at 13:11 +0100, Borislav Petkov wrote:
>
> +ENTRY(__fill_rsb_clobber_ax)
> +       ___FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, %_ASM_SP
> +END(__fill_rsb_clobber_ax)
> +EXPORT_SYMBOL_GPL(__fill_rsb_clobber_ax)

You still have clear vs. fill confusion there.

How about making it take the loop count in %eax? That would allow us to
drop the ___FILL_RETURN_BUFFER macro entirely.

Or does that make us depend on your other fixes to accept jumps in
places other than the first instruction of altinstr? 

Even if you give us separate __clear_rsb_clobber_ax vs.
__fill_rsb_clobber_ax functions, we could still kill the macro in
nospec-branch.h and use a .macro in retpoline.S for the actual
implementation, couldn't we?
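
(A sketch of that suggestion -- the function name and calling
convention here are hypothetical, and this is not what was merged:)

    mov     $RSB_CLEAR_LOOPS, %eax  /* caller picks fill vs. clear count */
    call    __stuff_rsb_clobber_ax  /* loops on %eax internally */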

> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -971,4 +971,9 @@ bool xen_set_default_idle(void);
>  
>  void stop_this_cpu(void *dummy);
>  void df_debug(struct pt_regs *regs, long error_code);
> +
> +#ifdef CONFIG_RETPOLINE
> +asmlinkage void __fill_rsb_clobber_ax(void);
> +#endif
> +
>  #endif /* _ASM_X86_PROCESSOR_H */

Doesn't that live in asm-prototypes.h? Don't make it visible to any C
code; it *isn't* a C function.



2018-01-26 13:25:58

by Borislav Petkov

Subject: Re: [PATCH 3/4] x86/retpoline: Simplify vmexit_fill_RSB()

On Fri, Jan 26, 2018 at 12:33:31PM +0000, David Woodhouse wrote:
> On Fri, 2018-01-26 at 13:11 +0100, Borislav Petkov wrote:
> >
> > +ENTRY(__fill_rsb_clobber_ax)
> > +       ___FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, %_ASM_SP
> > +END(__fill_rsb_clobber_ax)
> > +EXPORT_SYMBOL_GPL(__fill_rsb_clobber_ax)
>
> You still have clear vs. fill confusion there.

I just took what was there originally:

- __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)),

RSB_CLEAR_LOOPS

>
> How about making it take the loop count in %eax? That would allow us to
> drop the ___FILL_RETURN_BUFFER macro entirely.
>
> Or does that make us depend on your other fixes to accept jumps in
> places other than the first instruction of altinstr? 
>
> Even if you give us separate __clear_rsb_clobber_ax vs.
> __fill_rsb_clobber_ax functions, we could still kill the macro in
> nospec-branch.h and use a .macro in retpoline.S for the actual
> implementation, couldn't we?

All good ideas. So how about the below diff on top?

It builds and boots in a vm here. I need to go to the store but will
play with it more when I get back.

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 60c4c342316c..f7823a5a8714 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -252,7 +252,7 @@ ENTRY(__switch_to_asm)
* exist, overwrite the RSB with entries which capture
* speculative execution to prevent attack.
*/
- FILL_RETURN_BUFFER %ebx, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
+ FILL_RETURN_BUFFER RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
#endif

/* restore callee-saved registers */
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ff6f8022612c..7a190ff524e2 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -499,7 +499,7 @@ ENTRY(__switch_to_asm)
* exist, overwrite the RSB with entries which capture
* speculative execution to prevent attack.
*/
- FILL_RETURN_BUFFER %r12, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
+ FILL_RETURN_BUFFER RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
#endif

/* restore callee-saved registers */
diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
index 1908214b9125..b889705f995a 100644
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -38,4 +38,7 @@ INDIRECT_THUNK(dx)
INDIRECT_THUNK(si)
INDIRECT_THUNK(di)
INDIRECT_THUNK(bp)
+asmlinkage void __fill_rsb_clobber_ax(void);
+asmlinkage void __clr_rsb_clobber_ax(void);
+
#endif /* CONFIG_RETPOLINE */
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 61d4d7033758..3049433687c8 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -27,34 +27,8 @@
#define RSB_CLEAR_LOOPS 32 /* To forcibly overwrite all entries */
#define RSB_FILL_LOOPS 16 /* To avoid underflow */

-/*
- * Google experimented with loop-unrolling and this turned out to be
- * the optimal version — two calls, each with their own speculation
- * trap should their return address end up getting used, in a loop.
- */
-#define __FILL_RETURN_BUFFER(reg, nr, sp) \
- mov $(nr/2), reg; \
-771: \
- call 772f; \
-773: /* speculation trap */ \
- pause; \
- lfence; \
- jmp 773b; \
-772: \
- call 774f; \
-775: /* speculation trap */ \
- pause; \
- lfence; \
- jmp 775b; \
-774: \
- dec reg; \
- jnz 771b; \
- add $(BITS_PER_LONG/8) * nr, sp;
-
#ifdef __ASSEMBLY__

-#include <asm/bitsperlong.h>
-
/*
* This should be used immediately before a retpoline alternative. It tells
* objtool where the retpolines are so that it can make sense of the control
@@ -123,40 +97,9 @@
#endif
.endm

-/* Same as above but with alignment additionally */
-.macro ___FILL_RETURN_BUFFER reg:req nr:req sp:req
- mov $(\nr / 2), \reg
- .align 16
-771:
- call 772f
-773: /* speculation trap */
- pause
- lfence
- jmp 773b
- .align 16
-772:
- call 774f
-775: /* speculation trap */
- pause
- lfence
- jmp 775b
- .align 16
-774:
- dec \reg
- jnz 771b
- add $((BITS_PER_LONG/8) * \nr), \sp
-.endm
-
- /*
- * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
- * monstrosity above, manually.
- */
-.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
+.macro FILL_RETURN_BUFFER nr:req ftr:req
#ifdef CONFIG_RETPOLINE
- ANNOTATE_NOSPEC_ALTERNATIVE
- ALTERNATIVE "jmp .Lskip_rsb_\@", \
- __stringify(__FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP)) \
- \ftr
+ ALTERNATIVE "jmp .Lskip_rsb_\@", "call __clr_rsb_clobber_ax", \ftr
.Lskip_rsb_\@:
#endif
.endm
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 491f6e0be66e..d3a67fba200a 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -971,9 +971,4 @@ bool xen_set_default_idle(void);

void stop_this_cpu(void *dummy);
void df_debug(struct pt_regs *regs, long error_code);
-
-#ifdef CONFIG_RETPOLINE
-asmlinkage void __fill_rsb_clobber_ax(void);
-#endif
-
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 297b0fd2ad10..522d92bd3176 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -7,6 +7,7 @@
#include <asm/alternative-asm.h>
#include <asm/export.h>
#include <asm/nospec-branch.h>
+#include <asm/bitsperlong.h>

.macro THUNK reg
.section .text.__x86.indirect_thunk
@@ -18,6 +19,32 @@ ENTRY(__x86_indirect_thunk_\reg)
ENDPROC(__x86_indirect_thunk_\reg)
.endm

+.macro BOINK_RSB nr:req sp:req
+ push %_ASM_AX
+ mov $(\nr / 2), %_ASM_AX
+ .align 16
+771:
+ call 772f
+773: /* speculation trap */
+ pause
+ lfence
+ jmp 773b
+ .align 16
+772:
+ call 774f
+775: /* speculation trap */
+ pause
+ lfence
+ jmp 775b
+ .align 16
+774:
+ dec %_ASM_AX
+ jnz 771b
+ add $((BITS_PER_LONG/8) * \nr), \sp
+ pop %_ASM_AX
+.endm
+
+
/*
* Despite being an assembler file we can't just use .irp here
* because __KSYM_DEPS__ only uses the C preprocessor and would
@@ -48,6 +75,13 @@ GENERATE_THUNK(r15)
#endif

ENTRY(__fill_rsb_clobber_ax)
- ___FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, %_ASM_SP
+ BOINK_RSB RSB_FILL_LOOPS, %_ASM_SP
+ ret
END(__fill_rsb_clobber_ax)
EXPORT_SYMBOL_GPL(__fill_rsb_clobber_ax)
+
+ENTRY(__clr_rsb_clobber_ax)
+ BOINK_RSB RSB_CLEAR_LOOPS, %_ASM_SP
+ ret
+END(__clr_rsb_clobber_ax)
+EXPORT_SYMBOL_GPL(__clr_rsb_clobber_ax)

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-26 13:36:46

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 4/4] x86/bugs: Drop one "mitigation" from dmesg

On Fri, Jan 26, 2018 at 01:11:39PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov <[email protected]>
>
> Make
>
> [ 0.031118] Spectre V2 mitigation: Mitigation: Full generic retpoline
>
> into
>
> [ 0.031118] Spectre V2: Mitigation: Full generic retpoline
>
> to get rid of the doubled "mitigation" in the string.
>
> Signed-off-by: Borislav Petkov <[email protected]>

Reviewed-by: Greg Kroah-Hartman <[email protected]>

Subject: [tip:x86/pti] x86/nospec: Fix header guards names

Commit-ID: 7a32fc51ca938e67974cbb9db31e1a43f98345a9
Gitweb: https://git.kernel.org/tip/7a32fc51ca938e67974cbb9db31e1a43f98345a9
Author: Borislav Petkov <[email protected]>
AuthorDate: Fri, 26 Jan 2018 13:11:37 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 26 Jan 2018 15:53:19 +0100

x86/nospec: Fix header guards names

... to adhere to the _ASM_X86_ naming scheme.

No functional change.

Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: David Woodhouse <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Josh Poimboeuf <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

---
arch/x86/include/asm/nospec-branch.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 34e384c..865192a 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */

-#ifndef __NOSPEC_BRANCH_H__
-#define __NOSPEC_BRANCH_H__
+#ifndef _ASM_X86_NOSPEC_BRANCH_H_
+#define _ASM_X86_NOSPEC_BRANCH_H_

#include <asm/alternative.h>
#include <asm/alternative-asm.h>
@@ -232,4 +232,4 @@ static inline void indirect_branch_prediction_barrier(void)
}

#endif /* __ASSEMBLY__ */
-#endif /* __NOSPEC_BRANCH_H__ */
+#endif /* _ASM_X86_NOSPEC_BRANCH_H_ */

Subject: [tip:x86/pti] x86/alternative: Print unadorned pointers

Commit-ID: 0e6c16c652cadaffd25a6bb326ec10da5bcec6b4
Gitweb: https://git.kernel.org/tip/0e6c16c652cadaffd25a6bb326ec10da5bcec6b4
Author: Borislav Petkov <[email protected]>
AuthorDate: Fri, 26 Jan 2018 13:11:36 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 26 Jan 2018 15:53:19 +0100

x86/alternative: Print unadorned pointers

After commit ad67b74d2469 ("printk: hash addresses printed with %p")
pointers are being hashed when printed. However, this makes the alternative
debug output completely useless. Switch to %px in order to see the
unadorned kernel pointers.

Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: David Woodhouse <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Josh Poimboeuf <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

---
arch/x86/kernel/alternative.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index e0b97e4..14a52c7 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -298,7 +298,7 @@ recompute_jump(struct alt_instr *a, u8 *orig_insn, u8 *repl_insn, u8 *insnbuf)
tgt_rip = next_rip + o_dspl;
n_dspl = tgt_rip - orig_insn;

- DPRINTK("target RIP: %p, new_displ: 0x%x", tgt_rip, n_dspl);
+ DPRINTK("target RIP: %px, new_displ: 0x%x", tgt_rip, n_dspl);

if (tgt_rip - orig_insn >= 0) {
if (n_dspl - 2 <= 127)
@@ -355,7 +355,7 @@ static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *ins
add_nops(instr + (a->instrlen - a->padlen), a->padlen);
local_irq_restore(flags);

- DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
+ DUMP_BYTES(instr, a->instrlen, "%px: [%d:%d) optimized NOPs: ",
instr, a->instrlen - a->padlen, a->padlen);
}

@@ -376,7 +376,7 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
u8 *instr, *replacement;
u8 insnbuf[MAX_PATCH_LEN];

- DPRINTK("alt table %p -> %p", start, end);
+ DPRINTK("alt table %px, -> %px", start, end);
/*
* The scan order should be from start to end. A later scanned
* alternative code can overwrite previously scanned alternative code.
@@ -400,14 +400,14 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
continue;
}

- DPRINTK("feat: %d*32+%d, old: (%p, len: %d), repl: (%p, len: %d), pad: %d",
+ DPRINTK("feat: %d*32+%d, old: (%px len: %d), repl: (%px, len: %d), pad: %d",
a->cpuid >> 5,
a->cpuid & 0x1f,
instr, a->instrlen,
replacement, a->replacementlen, a->padlen);

- DUMP_BYTES(instr, a->instrlen, "%p: old_insn: ", instr);
- DUMP_BYTES(replacement, a->replacementlen, "%p: rpl_insn: ", replacement);
+ DUMP_BYTES(instr, a->instrlen, "%px: old_insn: ", instr);
+ DUMP_BYTES(replacement, a->replacementlen, "%px: rpl_insn: ", replacement);

memcpy(insnbuf, replacement, a->replacementlen);
insnbuf_sz = a->replacementlen;
@@ -433,7 +433,7 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
a->instrlen - a->replacementlen);
insnbuf_sz += a->instrlen - a->replacementlen;
}
- DUMP_BYTES(insnbuf, insnbuf_sz, "%p: final_insn: ", instr);
+ DUMP_BYTES(insnbuf, insnbuf_sz, "%px: final_insn: ", instr);

text_poke_early(instr, insnbuf, insnbuf_sz);
}

Subject: [tip:x86/pti] x86/bugs: Drop one "mitigation" from dmesg

Commit-ID: 55fa19d3e51f33d9cd4056d25836d93abf9438db
Gitweb: https://git.kernel.org/tip/55fa19d3e51f33d9cd4056d25836d93abf9438db
Author: Borislav Petkov <[email protected]>
AuthorDate: Fri, 26 Jan 2018 13:11:39 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 26 Jan 2018 15:53:19 +0100

x86/bugs: Drop one "mitigation" from dmesg

Make

[ 0.031118] Spectre V2 mitigation: Mitigation: Full generic retpoline

into

[ 0.031118] Spectre V2: Mitigation: Full generic retpoline

to get rid of the doubled "mitigation" in the string.

Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: David Woodhouse <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Josh Poimboeuf <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

---
arch/x86/kernel/cpu/bugs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index bac7a35..c988a8a 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -91,7 +91,7 @@ static const char *spectre_v2_strings[] = {
};

#undef pr_fmt
-#define pr_fmt(fmt) "Spectre V2 mitigation: " fmt
+#define pr_fmt(fmt) "Spectre V2 : " fmt

static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
static bool spectre_v2_bad_module;

2018-01-26 16:25:41

by David Woodhouse

Subject: Re: [PATCH 3/4] x86/retpoline: Simplify vmexit_fill_RSB()

On Fri, 2018-01-26 at 14:24 +0100, Borislav Petkov wrote:
> diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
> index 1908214b9125..b889705f995a 100644
> --- a/arch/x86/include/asm/asm-prototypes.h
> +++ b/arch/x86/include/asm/asm-prototypes.h
> @@ -38,4 +38,7 @@ INDIRECT_THUNK(dx)
>  INDIRECT_THUNK(si)
>  INDIRECT_THUNK(di)
>  INDIRECT_THUNK(bp)
> +asmlinkage void __fill_rsb_clobber_ax(void);
> +asmlinkage void __clr_rsb_clobber_ax(void);
>  #endif /* CONFIG_RETPOLINE */
>

Dammit, have the IBM vowel-stealers escaped again? What was wrong with
'__clear_rsb_clobber_ax'?

>
> -/*
> - * Google experimented with loop-unrolling and this turned out to be
> - * the optimal version — two calls, each with their own speculation
> - * trap should their return address end up getting used, in a loop.
> - */

Let's not lose that comment?

Other than that, I think it'll look OK when it's a sane patch on top of
my existing tree instead of incremental on your last one. Thanks.



2018-01-26 16:49:12

by Borislav Petkov

Subject: Re: [PATCH 3/4] x86/retpoline: Simplify vmexit_fill_RSB()

On Fri, Jan 26, 2018 at 04:24:39PM +0000, David Woodhouse wrote:
> Dammit, have the IBM vowel-stealers escaped again?

Yeah, I love vowel dropping. :-)

> What was wrong with '__clear_rsb_clobber_ax'?

Nothing. I've dropped the clobber part too, btw, as I'm pushing/popping
%rAX around it. It is simply '__clear_rsb' now.

> Let's not lose that comment?

Ok.

> Other than that, I think it'll look OK when it's a sane patch on top of
> my existing tree instead of incremental on your last one. Thanks.

Yeah, the incremental diff was to show only what I changed.

Proper ones coming soon.

Thx.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-26 20:07:51

by Borislav Petkov

Subject: Re: [PATCH 3/4] x86/retpoline: Simplify vmexit_fill_RSB()

On Fri, Jan 26, 2018 at 05:47:46PM +0100, Borislav Petkov wrote:
> Proper ones coming soon.

Ok, here they are as a reply to this message. Lemme send them out for a
quick look, before I run them through the boxes tomorrow.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-26 20:08:57

by Borislav Petkov

Subject: [PATCH v2 1/2] x86/retpoline: Simplify vmexit_fill_RSB()

Simplify it to call an asm-function instead of pasting 41 insn bytes at
every call site. Also, add alignment to the macro as suggested here:

https://support.google.com/faqs/answer/7625886

Signed-off-by: Borislav Petkov <[email protected]>
Cc: David Woodhouse <[email protected]>
---
arch/x86/entry/entry_32.S | 2 +-
arch/x86/entry/entry_64.S | 2 +-
arch/x86/include/asm/asm-prototypes.h | 3 +++
arch/x86/include/asm/nospec-branch.h | 49 +++++------------------------------
arch/x86/lib/Makefile | 1 +
arch/x86/lib/retpoline.S | 44 +++++++++++++++++++++++++++++++
6 files changed, 56 insertions(+), 45 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 60c4c342316c..f7823a5a8714 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -252,7 +252,7 @@ ENTRY(__switch_to_asm)
* exist, overwrite the RSB with entries which capture
* speculative execution to prevent attack.
*/
- FILL_RETURN_BUFFER %ebx, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
+ FILL_RETURN_BUFFER RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
#endif

/* restore callee-saved registers */
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ff6f8022612c..7a190ff524e2 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -499,7 +499,7 @@ ENTRY(__switch_to_asm)
* exist, overwrite the RSB with entries which capture
* speculative execution to prevent attack.
*/
- FILL_RETURN_BUFFER %r12, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
+ FILL_RETURN_BUFFER RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
#endif

/* restore callee-saved registers */
diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
index 1908214b9125..4d111616524b 100644
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -38,4 +38,7 @@ INDIRECT_THUNK(dx)
INDIRECT_THUNK(si)
INDIRECT_THUNK(di)
INDIRECT_THUNK(bp)
+asmlinkage void __fill_rsb(void);
+asmlinkage void __clear_rsb(void);
+
#endif /* CONFIG_RETPOLINE */
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 865192a2cc31..4f88e1b2599f 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -27,30 +27,6 @@
#define RSB_CLEAR_LOOPS 32 /* To forcibly overwrite all entries */
#define RSB_FILL_LOOPS 16 /* To avoid underflow */

-/*
- * Google experimented with loop-unrolling and this turned out to be
- * the optimal version — two calls, each with their own speculation
- * trap should their return address end up getting used, in a loop.
- */
-#define __FILL_RETURN_BUFFER(reg, nr, sp) \
- mov $(nr/2), reg; \
-771: \
- call 772f; \
-773: /* speculation trap */ \
- pause; \
- lfence; \
- jmp 773b; \
-772: \
- call 774f; \
-775: /* speculation trap */ \
- pause; \
- lfence; \
- jmp 775b; \
-774: \
- dec reg; \
- jnz 771b; \
- add $(BITS_PER_LONG/8) * nr, sp;
-
#ifdef __ASSEMBLY__

/*
@@ -121,17 +97,9 @@
#endif
.endm

- /*
- * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
- * monstrosity above, manually.
- */
-.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
+.macro FILL_RETURN_BUFFER nr:req ftr:req
#ifdef CONFIG_RETPOLINE
- ANNOTATE_NOSPEC_ALTERNATIVE
- ALTERNATIVE "jmp .Lskip_rsb_\@", \
- __stringify(__FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP)) \
- \ftr
-.Lskip_rsb_\@:
+ ALTERNATIVE "", "call __clear_rsb", \ftr
#endif
.endm

@@ -206,15 +174,10 @@ extern char __indirect_thunk_end[];
static inline void vmexit_fill_RSB(void)
{
#ifdef CONFIG_RETPOLINE
- unsigned long loops;
-
- asm volatile (ANNOTATE_NOSPEC_ALTERNATIVE
- ALTERNATIVE("jmp 910f",
- __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)),
- X86_FEATURE_RETPOLINE)
- "910:"
- : "=r" (loops), ASM_CALL_CONSTRAINT
- : : "memory" );
+ alternative_input("",
+ "call __fill_rsb",
+ X86_FEATURE_RETPOLINE,
+ ASM_NO_INPUT_CLOBBER("memory"));
#endif
}

diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index f23934bbaf4e..69a473919260 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -27,6 +27,7 @@ lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o insn-eval.o
lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
lib-$(CONFIG_RETPOLINE) += retpoline.o
+OBJECT_FILES_NON_STANDARD_retpoline.o :=y

obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index c909961e678a..3dcabe2ea2d6 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -7,6 +7,7 @@
#include <asm/alternative-asm.h>
#include <asm/export.h>
#include <asm/nospec-branch.h>
+#include <asm/bitsperlong.h>

.macro THUNK reg
.section .text.__x86.indirect_thunk
@@ -19,6 +20,37 @@ ENDPROC(__x86_indirect_thunk_\reg)
.endm

/*
+ * Google experimented with loop-unrolling and this turned out to be
+ * the optimal version — two calls, each with their own speculation
+ * trap should their return address end up getting used, in a loop.
+ */
+.macro BOINK_RSB nr:req sp:req
+ push %_ASM_AX
+ mov $(\nr / 2), %_ASM_AX
+ .align 16
+771:
+ call 772f
+773: /* speculation trap */
+ pause
+ lfence
+ jmp 773b
+ .align 16
+772:
+ call 774f
+775: /* speculation trap */
+ pause
+ lfence
+ jmp 775b
+ .align 16
+774:
+ dec %_ASM_AX
+ jnz 771b
+ add $((BITS_PER_LONG/8) * \nr), \sp
+ pop %_ASM_AX
+.endm
+
+
+/*
* Despite being an assembler file we can't just use .irp here
* because __KSYM_DEPS__ only uses the C preprocessor and would
* only see one instance of "__x86_indirect_thunk_\reg" rather
@@ -46,3 +78,15 @@ GENERATE_THUNK(r13)
GENERATE_THUNK(r14)
GENERATE_THUNK(r15)
#endif
+
+ENTRY(__fill_rsb)
+ BOINK_RSB RSB_FILL_LOOPS, %_ASM_SP
+ ret
+END(__fill_rsb)
+EXPORT_SYMBOL_GPL(__fill_rsb)
+
+ENTRY(__clear_rsb)
+ BOINK_RSB RSB_CLEAR_LOOPS, %_ASM_SP
+ ret
+END(__clear_rsb)
+EXPORT_SYMBOL_GPL(__clear_rsb)
--
2.13.0


--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-26 20:09:53

by Borislav Petkov

Subject: [PATCH 2/2] x86/speculation: Simplify indirect_branch_prediction_barrier()

Make it all a function which does the WRMSR instead of having a hairy
inline asm.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 13 ++++---------
arch/x86/include/asm/processor.h | 4 ++++
arch/x86/kernel/cpu/bugs.c | 7 +++++++
4 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 6c6d862d66a1..6c033f6adc24 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -211,7 +211,7 @@
#define X86_FEATURE_MBA ( 7*32+18) /* Memory Bandwidth Allocation */
#define X86_FEATURE_RSB_CTXSW ( 7*32+19) /* Fill RSB on context switches */

-#define X86_FEATURE_IBPB ( 7*32+21) /* Indirect Branch Prediction Barrier enabled*/
+#define X86_FEATURE_IBPB ( 7*32+21) /* Indirect Branch Prediction Barrier enabled */

/* Virtualization flags: Linux defined, word 8 */
#define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 4f88e1b2599f..71ae2dd65259 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -183,15 +183,10 @@ static inline void vmexit_fill_RSB(void)

static inline void indirect_branch_prediction_barrier(void)
{
- asm volatile(ALTERNATIVE("",
- "movl %[msr], %%ecx\n\t"
- "movl %[val], %%eax\n\t"
- "movl $0, %%edx\n\t"
- "wrmsr",
- X86_FEATURE_IBPB)
- : : [msr] "i" (MSR_IA32_PRED_CMD),
- [val] "i" (PRED_CMD_IBPB)
- : "eax", "ecx", "edx", "memory");
+ alternative_input("",
+ "call __ibp_barrier",
+ X86_FEATURE_IBPB,
+ ASM_NO_INPUT_CLOBBER("eax", "ecx", "edx", "memory"));
}

#endif /* __ASSEMBLY__ */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index d3a67fba200a..4d372f1cea5a 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -971,4 +971,8 @@ bool xen_set_default_idle(void);

void stop_this_cpu(void *dummy);
void df_debug(struct pt_regs *regs, long error_code);
+
+#ifdef CONFIG_RETPOLINE
+void __ibp_barrier(void);
+#endif
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index be068aea6bda..448410fcffcf 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -304,3 +304,10 @@ ssize_t cpu_show_spectre_v2(struct device *dev,
spectre_v2_bad_module ? " - vulnerable module loaded" : "");
}
#endif
+
+#ifdef CONFIG_RETPOLINE
+void __ibp_barrier(void)
+{
+ __wrmsr(MSR_IA32_PRED_CMD, PRED_CMD_IBPB, 0);
+}
+#endif
--
2.13.0


--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-27 04:21:23

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH v2 1/2] x86/retpoline: Simplify vmexit_fill_RSB()

> + * Google experimented with loop-unrolling and this turned out to be
> + * the optimal version — two calls, each with their own speculation
> + * trap should their return address end up getting used, in a loop.
> + */
> +.macro BOINK_RSB nr:req sp:req


BOINK?

Really?

2018-01-27 09:03:11

by Borislav Petkov

Subject: Re: [PATCH v2 1/2] x86/retpoline: Simplify vmexit_fill_RSB()

On Fri, Jan 26, 2018 at 11:20:39PM -0500, Konrad Rzeszutek Wilk wrote:
> BOINK?
>
> Really?

There's always someone who's bound to get offended, right? So I better
change it to something boring, yes?

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-27 12:34:08

by David Woodhouse

Subject: Re: [PATCH 2/2] x86/speculation: Simplify indirect_branch_prediction_barrier()

On Fri, 2018-01-26 at 21:08 +0100, Borislav Petkov wrote:
>
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -971,4 +971,8 @@ bool xen_set_default_idle(void);
>  
>  void stop_this_cpu(void *dummy);
>  void df_debug(struct pt_regs *regs, long error_code);
> +
> +#ifdef CONFIG_RETPOLINE
> +void __ibp_barrier(void);
> +#endif
>  #endif /* _ASM_X86_PROCESSOR_H */

Did I already say that needs to live in asm-prototypes.h? And it needs
to be exported to modules, I think.



2018-01-27 13:22:24

by Borislav Petkov

Subject: Re: [PATCH 2/2] x86/speculation: Simplify indirect_branch_prediction_barrier()

On Sat, Jan 27, 2018 at 12:32:41PM +0000, David Woodhouse wrote:
> Did I already say that needs to live in asm-prototypes.h?

That's a C function.

> And it needs to be exported to modules, I think.

Right.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-27 14:05:25

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH v2 1/2] x86/retpoline: Simplify vmexit_fill_RSB()

On Sat, Jan 27, 2018 at 10:01:55AM +0100, Borislav Petkov wrote:
> On Fri, Jan 26, 2018 at 11:20:39PM -0500, Konrad Rzeszutek Wilk wrote:
> > BOINK?
> >
> > Really?
>
> There's always someone who's bound to get offended, right? So I better
> change it to something boring, yes?

https://www.networkworld.com/article/2222804/software/microsoft-code-contains-the-phrase--big-boobs------yes--really.html

comes to mind.

Perhaps
CRAM?


2018-01-29 17:14:25

by Peter Zijlstra

Subject: Re: [PATCH v2 1/2] x86/retpoline: Simplify vmexit_fill_RSB()

On Fri, Jan 26, 2018 at 09:07:25PM +0100, Borislav Petkov wrote:
> +.macro FILL_RETURN_BUFFER nr:req ftr:req
> #ifdef CONFIG_RETPOLINE
> + ALTERNATIVE "", "call __clear_rsb", \ftr
> #endif
> .endm
>
> @@ -206,15 +174,10 @@ extern char __indirect_thunk_end[];
> static inline void vmexit_fill_RSB(void)
> {
> #ifdef CONFIG_RETPOLINE
> + alternative_input("",
> + "call __fill_rsb",
> + X86_FEATURE_RETPOLINE,
> + ASM_NO_INPUT_CLOBBER("memory"));
> #endif
> }
>


> @@ -19,6 +20,37 @@ ENDPROC(__x86_indirect_thunk_\reg)
> .endm
>
> /*
> + * Google experimented with loop-unrolling and this turned out to be
> + * the optimal version — two calls, each with their own speculation
> + * trap should their return address end up getting used, in a loop.
> + */
> +.macro BOINK_RSB nr:req sp:req
> + push %_ASM_AX
> + mov $(\nr / 2), %_ASM_AX
> + .align 16
> +771:
> + call 772f
> +773: /* speculation trap */
> + pause
> + lfence
> + jmp 773b
> + .align 16
> +772:
> + call 774f
> +775: /* speculation trap */
> + pause
> + lfence
> + jmp 775b
> + .align 16
> +774:
> + dec %_ASM_AX
> + jnz 771b
> + add $((BITS_PER_LONG/8) * \nr), \sp
> + pop %_ASM_AX
> +.endm
> +
> +
> +/*
> * Despite being an assembler file we can't just use .irp here
> * because __KSYM_DEPS__ only uses the C preprocessor and would
> * only see one instance of "__x86_indirect_thunk_\reg" rather
> @@ -46,3 +78,15 @@ GENERATE_THUNK(r13)
> GENERATE_THUNK(r14)
> GENERATE_THUNK(r15)
> #endif
> +
> +ENTRY(__fill_rsb)
> + BOINK_RSB RSB_FILL_LOOPS, %_ASM_SP
> + ret
> +END(__fill_rsb)
> +EXPORT_SYMBOL_GPL(__fill_rsb)
> +
> +ENTRY(__clear_rsb)
> + BOINK_RSB RSB_CLEAR_LOOPS, %_ASM_SP
> + ret
> +END(__clear_rsb)
> +EXPORT_SYMBOL_GPL(__clear_rsb)


One thing I feel this ought to mention (in the Changelog probably) is
that it loses one RET for SKL+. That is, where we used to have 16
'safe' RETs before this, we now have 15.
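
(Presumably because __fill_rsb is now a real function ending in a ret
of its own, which consumes one of the 16 freshly stuffed RSB entries
on the way back to the caller.)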

2018-02-06 19:46:17

by David Woodhouse

Subject: Re: [PATCH 2/2] x86/speculation: Simplify indirect_branch_prediction_barrier()

On Fri, 2018-01-26 at 21:08 +0100, Borislav Petkov wrote:
> Make it all a function which does the WRMSR instead of having a hairy
> inline asm.

...

> + alternative_input("",
> +  "call __ibp_barrier",
> +  X86_FEATURE_IBPB,
> +  ASM_NO_INPUT_CLOBBER("eax", "ecx", "edx", "memory"));
>  }

Dammit. I know the best time to comment is *before* I add my own sign-
off to it and before Linus has merged it but... I think this is broken.

If you're calling a C function then you have to mark *all* the call-
clobbered registers as, well, clobbered.

If you really really really want to *call* something out of line, then
it would need to be implemented in asm.
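
(For reference, a sketch of the clobber list such a call would need on
x86-64 -- the full SysV call-clobbered set Josh enumerates below, plus
ASM_CALL_CONSTRAINT per kernel convention; illustrative, not a fix:)

    asm volatile("call __ibp_barrier"
                 : ASM_CALL_CONSTRAINT : :
                 "rax", "rcx", "rdx", "rsi", "rdi",
                 "r8", "r9", "r10", "r11", "memory", "cc");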



2018-02-06 23:32:34

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH 2/2] x86/speculation: Simplify indirect_branch_prediction_barrier()



On Tue, 2018-02-06 at 17:25 -0600, Josh Poimboeuf wrote:
> On Tue, Feb 06, 2018 at 07:44:52PM +0000, David Woodhouse wrote:
> >
> > On Fri, 2018-01-26 at 21:08 +0100, Borislav Petkov wrote:
> > >
> > > Make it all a function which does the WRMSR instead of having a hairy
> > > inline asm.
> > ...
> >
> > >
> > > + alternative_input("",
> > > +  "call __ibp_barrier",
> > > +  X86_FEATURE_IBPB,
> > > +  ASM_NO_INPUT_CLOBBER("eax", "ecx", "edx", "memory"));
> > >  }
> > Dammit. I know the best time to comment is *before* I add my own sign-
> > off to it and before Linus has merged it but... I think this is broken.
> >
> > If you're calling a C function then you have to mark *all* the call-
> > clobbered registers as, well, clobbered.
> >
> > If you really really really want to *call* something out of line, then
> > it would need to be implemented in asm.
>
> Hm.  In theory I agree this seems like a bug.  On x86_64 I believe we
> would need to mark the following registers as clobbered: r8-r11, ax, cx,
> dx, si, di, plus "memory" and "cc".
>
> But I'm scratching my head a bit, because we seem to have this bug all
> over the kernel.  (Grep for ASM_CALL_CONSTRAINT to see them.)
>
> Many of those inline asm calls have been around a long time.  So why
> hasn't it ever bitten us?

How many are actually calling C functions, not asm or other special
cases like firmware entry points?



2018-02-06 23:33:56

by Josh Poimboeuf

Subject: Re: [PATCH 2/2] x86/speculation: Simplify indirect_branch_prediction_barrier()

On Tue, Feb 06, 2018 at 07:44:52PM +0000, David Woodhouse wrote:
> On Fri, 2018-01-26 at 21:08 +0100, Borislav Petkov wrote:
> > Make it all a function which does the WRMSR instead of having a hairy
> > inline asm.
>
> ...
>
> > + alternative_input("",
> > +  "call __ibp_barrier",
> > +  X86_FEATURE_IBPB,
> > +  ASM_NO_INPUT_CLOBBER("eax", "ecx", "edx", "memory"));
> >  }
>
> Dammit. I know the best time to comment is *before* I add my own sign-
> off to it and before Linus has merged it but... I think this is broken.
>
> If you're calling a C function then you have to mark *all* the call-
> clobbered registers as, well, clobbered.
>
> If you really really really want to *call* something out of line, then
> it would need to be implemented in asm.

Hm. In theory I agree this seems like a bug. On x86_64 I believe we
would need to mark the following registers as clobbered: r8-r11, ax, cx,
dx, si, di, plus "memory" and "cc".

But I'm scratching my head a bit, because we seem to have this bug all
over the kernel. (Grep for ASM_CALL_CONSTRAINT to see them.)

Many of those inline asm calls have been around a long time. So why
hasn't it ever bitten us?

--
Josh

2018-02-06 23:52:05

by Josh Poimboeuf

Subject: Re: [PATCH 2/2] x86/speculation: Simplify indirect_branch_prediction_barrier()

On Tue, Feb 06, 2018 at 11:31:18PM +0000, David Woodhouse wrote:
>
>
> On Tue, 2018-02-06 at 17:25 -0600, Josh Poimboeuf wrote:
> > On Tue, Feb 06, 2018 at 07:44:52PM +0000, David Woodhouse wrote:
> > >
> > > On Fri, 2018-01-26 at 21:08 +0100, Borislav Petkov wrote:
> > > >
> > > > Make it all a function which does the WRMSR instead of having a hairy
> > > > inline asm.
> > > ...
> > >
> > > >
> > > > + alternative_input("",
> > > > +  "call __ibp_barrier",
> > > > +  X86_FEATURE_IBPB,
> > > > +  ASM_NO_INPUT_CLOBBER("eax", "ecx", "edx", "memory"));
> > > >  }
> > > Dammit. I know the best time to comment is *before* I add my own sign-
> > > off to it and before Linus has merged it but... I think this is broken.
> > >
> > > If you're calling a C function then you have to mark *all* the call-
> > > clobbered registers as, well, clobbered.
> > >
> > > If you really really really want to *call* something out of line, then
> > > it would need to be implemented in asm.
> >
> > Hm.  In theory I agree this seems like a bug.  On x86_64 I believe we
> > would need to mark the following registers as clobbered: r8-r11, ax, cx,
> > dx, si, di, plus "memory" and "cc".
> >
> > But I'm scratching my head a bit, because we seem to have this bug all
> > over the kernel.  (Grep for ASM_CALL_CONSTRAINT to see them.)
> >
> > Many of those inline asm calls have been around a long time.  So why
> > hasn't it ever bitten us?
>
> How many are actually calling C functions, not asm or other special
> cases like firmware entry points?

I think many, and maybe even most, are calling normal C functions.

--
Josh