2022-03-03 15:00:20

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice

Return trampoline must not use indirect branch to return; while this
preserves the RSB, it is fundamentally incompatible with IBT. Instead
use a retpoline like ROP gadget that defeats IBT while not unbalancing
the RSB.

And since ftrace_stub is no longer a plain RET, don't use it to copy
from. Since RET is a trivial instruction, poke it directly.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
arch/x86/kernel/ftrace.c | 9 ++-------
arch/x86/kernel/ftrace_64.S | 16 ++++++++++++----
2 files changed, 14 insertions(+), 11 deletions(-)

--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -316,12 +316,12 @@ create_trampoline(struct ftrace_ops *ops
unsigned long offset;
unsigned long npages;
unsigned long size;
- unsigned long retq;
unsigned long *ptr;
void *trampoline;
void *ip;
/* 48 8b 15 <offset> is movq <offset>(%rip), %rdx */
unsigned const char op_ref[] = { 0x48, 0x8b, 0x15 };
+ unsigned const char retq[] = { RET_INSN_OPCODE, INT3_INSN_OPCODE };
union ftrace_op_code_union op_ptr;
int ret;

@@ -359,12 +359,7 @@ create_trampoline(struct ftrace_ops *ops
goto fail;

ip = trampoline + size;
-
- /* The trampoline ends with ret(q) */
- retq = (unsigned long)ftrace_stub;
- ret = copy_from_kernel_nofault(ip, (void *)retq, RET_SIZE);
- if (WARN_ON(ret < 0))
- goto fail;
+ memcpy(ip, retq, RET_SIZE);

/* No need to test direct calls on created trampolines */
if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -176,10 +176,10 @@ SYM_FUNC_END(ftrace_caller);
SYM_FUNC_START(ftrace_epilogue)
/*
* This is weak to keep gas from relaxing the jumps.
- * It is also used to copy the RET for trampolines.
*/
SYM_INNER_LABEL_ALIGN(ftrace_stub, SYM_L_WEAK)
UNWIND_HINT_FUNC
+ ENDBR
RET
SYM_FUNC_END(ftrace_epilogue)

@@ -284,6 +284,7 @@ SYM_FUNC_START(__fentry__)
jnz trace

SYM_INNER_LABEL(ftrace_stub, SYM_L_GLOBAL)
+ ENDBR
RET

trace:
@@ -307,7 +308,7 @@ EXPORT_SYMBOL(__fentry__)

#ifdef CONFIG_FUNCTION_GRAPH_TRACER
SYM_FUNC_START(return_to_handler)
- subq $24, %rsp
+ subq $16, %rsp

/* Save the return values */
movq %rax, (%rsp)
@@ -319,7 +320,14 @@ SYM_FUNC_START(return_to_handler)
movq %rax, %rdi
movq 8(%rsp), %rdx
movq (%rsp), %rax
- addq $24, %rsp
- JMP_NOSPEC rdi
+
+ addq $16, %rsp
+ ANNOTATE_INTRA_FUNCTION_CALL
+ call .Ldo_rop
+ int3
+.Ldo_rop:
+ mov %rdi, (%rsp)
+ UNWIND_HINT_FUNC
+ RET
SYM_FUNC_END(return_to_handler)
#endif



2022-03-04 19:51:23

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice

On Thu, Mar 03, 2022 at 12:23:39PM +0100, Peter Zijlstra wrote:
> +
> + addq $16, %rsp
> + ANNOTATE_INTRA_FUNCTION_CALL
> + call .Ldo_rop
> + int3
> +.Ldo_rop:
> + mov %rdi, (%rsp)
> + UNWIND_HINT_FUNC
> + RET

Why the int3?

--
Josh

2022-03-04 20:00:07

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice

On Fri, Mar 04, 2022 at 09:51:54AM -0800, Josh Poimboeuf wrote:
> On Thu, Mar 03, 2022 at 12:23:39PM +0100, Peter Zijlstra wrote:
> > +
> > + addq $16, %rsp
> > + ANNOTATE_INTRA_FUNCTION_CALL
> > + call .Ldo_rop
> > + int3
> > +.Ldo_rop:
> > + mov %rdi, (%rsp)
> > + UNWIND_HINT_FUNC
> > + RET
>
> Why the int3?

Speculation trap :-) Either I'm too paranoid or not paranoid enough; but
without it it's just too close to a retpoline and it doesn't feel right.

2022-03-04 21:04:59

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice

On Fri, Mar 04, 2022 at 08:48:53PM +0100, Peter Zijlstra wrote:
> On Fri, Mar 04, 2022 at 09:51:54AM -0800, Josh Poimboeuf wrote:
> > On Thu, Mar 03, 2022 at 12:23:39PM +0100, Peter Zijlstra wrote:
> > > +
> > > + addq $16, %rsp
> > > + ANNOTATE_INTRA_FUNCTION_CALL
> > > + call .Ldo_rop
> > > + int3
> > > +.Ldo_rop:
> > > + mov %rdi, (%rsp)
> > > + UNWIND_HINT_FUNC
> > > + RET
> >
> > Why the int3?
>
> Speculation trap :-) Either I'm too paranoid or not paranoid enough; but
> without it it's just too close to a retpoline and it doesn't feel right.

Um, it *is* a retpoline :-)

Can you just use the RETPOLINE macro? Along with a comment stating why
it can't just do a JMP_NOSPEC?

--
Josh

2022-03-04 22:00:37

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice

On Fri, Mar 04, 2022 at 01:03:34PM -0800, Josh Poimboeuf wrote:
> On Fri, Mar 04, 2022 at 08:48:53PM +0100, Peter Zijlstra wrote:
> > On Fri, Mar 04, 2022 at 09:51:54AM -0800, Josh Poimboeuf wrote:
> > > On Thu, Mar 03, 2022 at 12:23:39PM +0100, Peter Zijlstra wrote:
> > > > +
> > > > + addq $16, %rsp
> > > > + ANNOTATE_INTRA_FUNCTION_CALL
> > > > + call .Ldo_rop
> > > > + int3
> > > > +.Ldo_rop:
> > > > + mov %rdi, (%rsp)
> > > > + UNWIND_HINT_FUNC
> > > > + RET
> > >
> > > Why the int3?
> >
> > Speculation trap :-) Either I'm too paranoid or not paranoid enough; but
> > without it it's just too close to a retpoline and it doesn't feel right.
>
> Um, it *is* a retpoline :-)
>
> Can you just use the RETPOLINE macro? Along with a comment stating why
> it can't just do a JMP_NOSPEC?

There is no RETPOLINE macro; or rather it is completely contained in
lib/retpoline.S and I'd sorta like to keep it that way.

That said, I can stick a comment on.

2022-03-04 22:48:36

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice

On Fri, Mar 04, 2022 at 10:44:54PM +0100, Peter Zijlstra wrote:
> On Fri, Mar 04, 2022 at 01:03:34PM -0800, Josh Poimboeuf wrote:
> > On Fri, Mar 04, 2022 at 08:48:53PM +0100, Peter Zijlstra wrote:
> > > On Fri, Mar 04, 2022 at 09:51:54AM -0800, Josh Poimboeuf wrote:
> > > > On Thu, Mar 03, 2022 at 12:23:39PM +0100, Peter Zijlstra wrote:
> > > > > +
> > > > > + addq $16, %rsp
> > > > > + ANNOTATE_INTRA_FUNCTION_CALL
> > > > > + call .Ldo_rop
> > > > > + int3
> > > > > +.Ldo_rop:
> > > > > + mov %rdi, (%rsp)
> > > > > + UNWIND_HINT_FUNC
> > > > > + RET
> > > >
> > > > Why the int3?
> > >
> > > Speculation trap :-) Either I'm too paranoid or not paranoid enough; but
> > > without it it's just too close to a retpoline and it doesn't feel right.
> >
> > Um, it *is* a retpoline :-)
> >
> > Can you just use the RETPOLINE macro? Along with a comment stating why
> > it can't just do a JMP_NOSPEC?
>
> There is no RETPOLINE macro; or rather it is completely contained in
> lib/retpoline.S and I'd sorta like to keep it that way.
>
> That said, I can stick a comment on.

The only reason it's in retpoline.S is because nobody else needed it.

It just seems weird to reinvent the wheel, especially with a slightly
different method of stopping speculation.

And I could envision other cases where we might want an unconditional
retpoline.

Your call though...

--
Josh

2022-03-04 23:13:43

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v3 18/39] x86/ibt,ftrace: Make function-graph play nice

From: Peter Zijlstra
> Sent: 04 March 2022 19:49
>
> On Fri, Mar 04, 2022 at 09:51:54AM -0800, Josh Poimboeuf wrote:
> > On Thu, Mar 03, 2022 at 12:23:39PM +0100, Peter Zijlstra wrote:
> > > +
> > > + addq $16, %rsp
> > > + ANNOTATE_INTRA_FUNCTION_CALL
> > > + call .Ldo_rop
> > > + int3
> > > +.Ldo_rop:
> > > + mov %rdi, (%rsp)
> > > + UNWIND_HINT_FUNC
> > > + RET
> >
> > Why the int3?
>
> Speculation trap :-) Either I'm too paranoid or not paranoid enough; but
> without it it's just too close to a retpoline and it doesn't feel right.

Isn't 'jmps .' good enough for a speculation trap?
I'm sure there is a potential issue using 'int 3' because
it is a slow instruction and might take some time to abort.

I actually remember something from a very old Intel doc that
told you not to mix code and data because you didn't want to
'accidentally' execute something like 'atan'.
I can't remember the full context - but it may have been
speculatively executing code after unconditional jumps!
There were certainly bigger problems because the cpu at that
time wouldn't abort the atan - so you had to wait for it to
finish.

I suspect you do need something between the call and label.
The sequence:
call 1f
1: pop %rax
is used to get the %pc (especially on 32bit) and is detected
so that it doesn't mess up the return stack.
So you probably want to avoid a call to the next instruction.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)