2018-01-07 22:11:58

by Woodhouse, David

Subject: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

This is a mitigation for the 'variant 2' attack described in
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html

Using GCC patches available from the hjl/indirect/gcc-7-branch/master
branch of https://github.com/hjl-tools/gcc/commits/hjl and by manually
patching assembler code, all vulnerable indirect branches (that occur
after userspace first runs) are eliminated from the kernel.

They are replaced with a 'retpoline' call sequence which deliberately
prevents speculation.
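
For reference, the core retpoline sequence for an indirect jump through
%rax looks like this (a sketch of the idea; the exact macros are in
patch 1 of the series):

	call	2f		# push the address of 1:; the RSB now
				# predicts that a 'ret' will return there
1:	pause			# speculative execution is trapped here
	jmp	1b
2:	mov	%rax, (%rsp)	# overwrite the return address on the
				# stack with the real branch target
	ret			# architecturally jumps to *%rax, while
				# speculation remains caught at 1: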

Fedora 27 packages of the updated compiler are available at
https://koji.fedoraproject.org/koji/taskinfo?taskID=24065739


v1: Initial post.
v2: Add CONFIG_RETPOLINE to build kernel without it.
Change warning messages.
Hide modpost warning message
v3: Update to the latest CET-capable retpoline version
Reinstate ALTERNATIVE support
v4: Finish reconciling Andi's and my patch sets, bug fixes.
Exclude objtool support for now
Add 'noretpoline' boot option
Add AMD retpoline alternative
v5: Silence MODVERSIONS warnings
Use pause;jmp loop instead of lfence;jmp
Switch to X86_FEATURE_RETPOLINE positive feature logic
Emit thunks inline from assembler macros
Merge AMD support into initial patch
v6: Update to latest GCC patches with no dots in symbols
Fix MODVERSIONS properly(ish)
Fix typo breaking 32-bit, introduced in V5
Never set X86_FEATURE_RETPOLINE_AMD yet, pending confirmation

Andi Kleen (3):
x86/retpoline/irq32: Convert assembler indirect jumps
x86/retpoline: Add boot time option to disable retpoline
x86/retpoline: Exclude objtool with retpoline

David Woodhouse (7):
x86/retpoline: Add initial retpoline support
x86/retpoline/crypto: Convert crypto assembler indirect jumps
x86/retpoline/entry: Convert entry assembler indirect jumps
x86/retpoline/ftrace: Convert ftrace assembler indirect jumps
x86/retpoline/hyperv: Convert assembler indirect jumps
x86/retpoline/xen: Convert Xen hypercall indirect jumps
x86/retpoline/checksum32: Convert assembler indirect jumps

Documentation/admin-guide/kernel-parameters.txt | 3 +
arch/x86/Kconfig | 17 ++++-
arch/x86/Kconfig.debug | 6 +-
arch/x86/Makefile | 10 +++
arch/x86/crypto/aesni-intel_asm.S | 5 +-
arch/x86/crypto/camellia-aesni-avx-asm_64.S | 3 +-
arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 3 +-
arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 3 +-
arch/x86/entry/entry_32.S | 5 +-
arch/x86/entry/entry_64.S | 12 +++-
arch/x86/include/asm/asm-prototypes.h | 25 +++++++
arch/x86/include/asm/cpufeatures.h | 2 +
arch/x86/include/asm/mshyperv.h | 18 ++---
arch/x86/include/asm/nospec-branch.h | 92 +++++++++++++++++++++++++
arch/x86/include/asm/xen/hypercall.h | 5 +-
arch/x86/kernel/cpu/common.c | 3 +
arch/x86/kernel/cpu/intel.c | 11 +++
arch/x86/kernel/ftrace_32.S | 6 +-
arch/x86/kernel/ftrace_64.S | 8 +--
arch/x86/kernel/irq_32.c | 9 +--
arch/x86/lib/Makefile | 1 +
arch/x86/lib/checksum_32.S | 7 +-
arch/x86/lib/retpoline.S | 48 +++++++++++++
23 files changed, 264 insertions(+), 38 deletions(-)
create mode 100644 arch/x86/include/asm/nospec-branch.h
create mode 100644 arch/x86/lib/retpoline.S

--
2.7.4


2018-01-07 22:12:04

by Woodhouse, David

Subject: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support

Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
the corresponding thunks. Provide assembler macros for invoking the thunks
in the same way that GCC does, from native and inline assembler.

This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In
some circumstances, IBRS microcode features may be used instead, and the
retpoline can be disabled.

On AMD CPUs if lfence is serialising, the retpoline can be dramatically
simplified to a simple "lfence; jmp *\reg". A future patch, after it has
been verified that lfence really is serialising in all circumstances, can
enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition
to X86_FEATURE_RETPOLINE.

[Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks]

Signed-off-by: David Woodhouse <[email protected]>
Acked-By: Arjan van de Ven <[email protected]>
---
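Illustrative note, not part of the commit: with
-mindirect-branch=thunk-extern (plus -mindirect-branch-register), GCC
stops emitting indirect branches entirely. An indirect call such as

	call	*%rax

is emitted instead as

	call	__x86_indirect_thunk_rax

and the thunk itself, provided by arch/x86/lib/retpoline.S below, does
a NOSPEC_JMP through the same register.
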
arch/x86/Kconfig | 13 +++++
arch/x86/Makefile | 10 ++++
arch/x86/include/asm/asm-prototypes.h | 25 ++++++++++
arch/x86/include/asm/cpufeatures.h | 2 +
arch/x86/include/asm/nospec-branch.h | 92 +++++++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/common.c | 3 ++
arch/x86/lib/Makefile | 1 +
arch/x86/lib/retpoline.S | 48 ++++++++++++++++++
8 files changed, 194 insertions(+)
create mode 100644 arch/x86/include/asm/nospec-branch.h
create mode 100644 arch/x86/lib/retpoline.S

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cd5199d..77c58ae 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -428,6 +428,19 @@ config GOLDFISH
def_bool y
depends on X86_GOLDFISH

+config RETPOLINE
+ bool "Avoid speculative indirect branches in kernel"
+ default y
+ help
+ Compile kernel with the retpoline compiler options to guard against
+ kernel-to-user data leaks by avoiding speculative indirect
+ branches. Requires a compiler with -mindirect-branch=thunk-extern
+ support for full protection. The kernel may run slower.
+
+ Without compiler support, at least indirect branches in assembler
+ code are eliminated. Since this includes the syscall entry path,
+ it is not entirely pointless.
+
config INTEL_RDT
bool "Intel Resource Director Technology support"
default n
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index a20eacd..918e550 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -235,6 +235,16 @@ KBUILD_CFLAGS += -Wno-sign-compare
#
KBUILD_CFLAGS += -fno-asynchronous-unwind-tables

+# Avoid indirect branches in kernel to deal with Spectre
+ifdef CONFIG_RETPOLINE
+ RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register)
+ ifneq ($(RETPOLINE_CFLAGS),)
+ KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
+ else
+ $(warning Retpoline not supported in compiler. System may be insecure.)
+ endif
+endif
+
archscripts: scripts_basic
$(Q)$(MAKE) $(build)=arch/x86/tools relocs

diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
index ff700d8..0927cdc 100644
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -11,7 +11,32 @@
#include <asm/pgtable.h>
#include <asm/special_insns.h>
#include <asm/preempt.h>
+#include <asm/asm.h>

#ifndef CONFIG_X86_CMPXCHG64
extern void cmpxchg8b_emu(void);
#endif
+
+#ifdef CONFIG_RETPOLINE
+#ifdef CONFIG_X86_32
+#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_e ## reg(void);
+#else
+#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_r ## reg(void);
+INDIRECT_THUNK(8)
+INDIRECT_THUNK(9)
+INDIRECT_THUNK(10)
+INDIRECT_THUNK(11)
+INDIRECT_THUNK(12)
+INDIRECT_THUNK(13)
+INDIRECT_THUNK(14)
+INDIRECT_THUNK(15)
+#endif
+INDIRECT_THUNK(ax)
+INDIRECT_THUNK(bx)
+INDIRECT_THUNK(cx)
+INDIRECT_THUNK(dx)
+INDIRECT_THUNK(si)
+INDIRECT_THUNK(di)
+INDIRECT_THUNK(bp)
+INDIRECT_THUNK(sp)
+#endif /* CONFIG_RETPOLINE */
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 511d909..2d65438 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -203,6 +203,8 @@
#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
#define X86_FEATURE_SME ( 7*32+10) /* AMD Secure Memory Encryption */
#define X86_FEATURE_PTI ( 7*32+11) /* Kernel Page Table Isolation enabled */
+#define X86_FEATURE_RETPOLINE ( 7*32+12) /* Intel Retpoline mitigation for Spectre variant 2 */
+#define X86_FEATURE_RETPOLINE_AMD ( 7*32+13) /* AMD Retpoline mitigation for Spectre variant 2 */
#define X86_FEATURE_INTEL_PPIN ( 7*32+14) /* Intel Processor Inventory Number */
#define X86_FEATURE_INTEL_PT ( 7*32+15) /* Intel Processor Trace */
#define X86_FEATURE_AVX512_4VNNIW ( 7*32+16) /* AVX-512 Neural Network Instructions */
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
new file mode 100644
index 0000000..b8c8eea
--- /dev/null
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __NOSPEC_BRANCH_H__
+#define __NOSPEC_BRANCH_H__
+
+#include <asm/alternative.h>
+#include <asm/alternative-asm.h>
+#include <asm/cpufeatures.h>
+
+#ifdef __ASSEMBLY__
+
+/*
+ * These are the bare retpoline primitives for indirect jmp and call.
+ * Do not use these directly; they only exist to make the ALTERNATIVE
+ * invocation below less ugly.
+ */
+.macro RETPOLINE_JMP reg:req
+ call 1112f
+1111: pause
+ jmp 1111b
+1112: mov \reg, (%_ASM_SP)
+ ret
+.endm
+
+.macro RETPOLINE_CALL reg:req
+ jmp 1113f
+1110: RETPOLINE_JMP \reg
+1113: call 1110b
+.endm
+
+/*
+ * NOSPEC_JMP and NOSPEC_CALL macros can be used instead of a simple
+ * indirect jmp/call which may be susceptible to the Spectre variant 2
+ * attack.
+ */
+.macro NOSPEC_JMP reg:req
+#ifdef CONFIG_RETPOLINE
+ ALTERNATIVE_2 __stringify(jmp *\reg), \
+ __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \
+ __stringify(lfence; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
+#else
+ jmp *\reg
+#endif
+.endm
+
+.macro NOSPEC_CALL reg:req
+#ifdef CONFIG_RETPOLINE
+ ALTERNATIVE_2 __stringify(call *\reg), \
+ __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
+ __stringify(lfence; call *\reg), X86_FEATURE_RETPOLINE_AMD
+#else
+ call *\reg
+#endif
+.endm
+
+#else /* __ASSEMBLY__ */
+
+#if defined(CONFIG_X86_64) && defined(RETPOLINE)
+/*
+ * Since the inline asm uses the %V modifier which is only in newer GCC,
+ * the 64-bit one is dependent on RETPOLINE not CONFIG_RETPOLINE.
+ */
+# define NOSPEC_CALL ALTERNATIVE( \
+ "call *%[thunk_target]\n", \
+ "call __x86_indirect_thunk_%V[thunk_target]\n", \
+ X86_FEATURE_RETPOLINE)
+# define THUNK_TARGET(addr) [thunk_target] "r" (addr)
+#elif defined(CONFIG_X86_32) && defined(CONFIG_RETPOLINE)
+/*
+ * For i386 we use the original ret-equivalent retpoline, because
+ * otherwise we'll run out of registers. We don't care about CET
+ * here, anyway.
+ */
+# define NOSPEC_CALL ALTERNATIVE( \
+ "call *%[thunk_target]\n", \
+ " jmp 1113f; " \
+ "1110: call 1112f; " \
+ "1111: pause; " \
+ " jmp 1111b; " \
+ "1112: addl $4, %%esp; " \
+ " pushl %[thunk_target]; " \
+ " ret; " \
+ "1113: call 1110b;\n", \
+ X86_FEATURE_RETPOLINE)
+# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
+#else /* No retpoline */
+# define NOSPEC_CALL "call *%[thunk_target]\n"
+# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
+#endif
+
+#endif /* __ASSEMBLY__ */
+#endif /* __NOSPEC_BRANCH_H__ */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 372ba3f..cfa5042 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -904,6 +904,9 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)

setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+#ifdef CONFIG_RETPOLINE
+ setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
+#endif

fpu__init_system(c);

diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 457f681..d435c89 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -26,6 +26,7 @@ lib-y += memcpy_$(BITS).o
lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o
lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
+lib-$(CONFIG_RETPOLINE) += retpoline.o

obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
new file mode 100644
index 0000000..74d0a1d
--- /dev/null
+++ b/arch/x86/lib/retpoline.S
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <linux/stringify.h>
+#include <linux/linkage.h>
+#include <asm/dwarf2.h>
+#include <asm/cpufeatures.h>
+#include <asm/alternative-asm.h>
+#include <asm/export.h>
+#include <asm/nospec-branch.h>
+
+.macro THUNK reg
+ .section .text.__x86.indirect_thunk.\reg
+
+ENTRY(__x86_indirect_thunk_\reg)
+ CFI_STARTPROC
+ NOSPEC_JMP %\reg
+ CFI_ENDPROC
+ENDPROC(__x86_indirect_thunk_\reg)
+.endm
+
+/*
+ * Despite being an assembler file we can't just use .irp here
+ * because __KSYM_DEPS__ only uses the C preprocessor and would
+ * only see one instance of "__x86_indirect_thunk_\reg" rather
+ * than one per register with the correct names. So we do it
+ * the simple and nasty way...
+ */
+#define EXPORT_THUNK(reg) EXPORT_SYMBOL(__x86_indirect_thunk_ ## reg)
+#define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg)
+
+GENERATE_THUNK(_ASM_AX)
+GENERATE_THUNK(_ASM_BX)
+GENERATE_THUNK(_ASM_CX)
+GENERATE_THUNK(_ASM_DX)
+GENERATE_THUNK(_ASM_SI)
+GENERATE_THUNK(_ASM_DI)
+GENERATE_THUNK(_ASM_BP)
+GENERATE_THUNK(_ASM_SP)
+#ifdef CONFIG_64BIT
+GENERATE_THUNK(r8)
+GENERATE_THUNK(r9)
+GENERATE_THUNK(r10)
+GENERATE_THUNK(r11)
+GENERATE_THUNK(r12)
+GENERATE_THUNK(r13)
+GENERATE_THUNK(r14)
+GENERATE_THUNK(r15)
+#endif
--
2.7.4

2018-01-07 22:12:14

by Woodhouse, David

Subject: [PATCH v6 04/10] x86/retpoline/ftrace: Convert ftrace assembler indirect jumps

Convert all indirect jumps in ftrace assembler code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <[email protected]>
Acked-By: Arjan van de Ven <[email protected]>
---
arch/x86/kernel/ftrace_32.S | 6 ++++--
arch/x86/kernel/ftrace_64.S | 8 ++++----
2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
index b6c6468..c3842c9 100644
--- a/arch/x86/kernel/ftrace_32.S
+++ b/arch/x86/kernel/ftrace_32.S
@@ -8,6 +8,7 @@
#include <asm/segment.h>
#include <asm/export.h>
#include <asm/ftrace.h>
+#include <asm/nospec-branch.h>

#ifdef CC_USING_FENTRY
# define function_hook __fentry__
@@ -197,7 +198,8 @@ ftrace_stub:
movl 0x4(%ebp), %edx
subl $MCOUNT_INSN_SIZE, %eax

- call *ftrace_trace_function
+ movl ftrace_trace_function, %ecx
+ NOSPEC_CALL %ecx

popl %edx
popl %ecx
@@ -241,5 +243,5 @@ return_to_handler:
movl %eax, %ecx
popl %edx
popl %eax
- jmp *%ecx
+ NOSPEC_JMP %ecx
#endif
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index c832291..0893068 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -7,7 +7,7 @@
#include <asm/ptrace.h>
#include <asm/ftrace.h>
#include <asm/export.h>
-
+#include <asm/nospec-branch.h>

.code64
.section .entry.text, "ax"
@@ -286,8 +286,8 @@ trace:
* ip and parent ip are used and the list function is called when
* function tracing is enabled.
*/
- call *ftrace_trace_function
-
+ movq ftrace_trace_function, %r8
+ NOSPEC_CALL %r8
restore_mcount_regs

jmp fgraph_trace
@@ -329,5 +329,5 @@ GLOBAL(return_to_handler)
movq 8(%rsp), %rdx
movq (%rsp), %rax
addq $24, %rsp
- jmp *%rdi
+ NOSPEC_JMP %rdi
#endif
--
2.7.4

2018-01-07 22:12:21

by Woodhouse, David

Subject: [PATCH v6 03/10] x86/retpoline/entry: Convert entry assembler indirect jumps

Convert indirect jumps in core 32/64bit entry assembler code to use
non-speculative sequences when CONFIG_RETPOLINE is enabled.

Don't use NOSPEC_CALL in entry_SYSCALL_64_fastpath because the return
address after the 'call' instruction must be *precisely* at the
.Lentry_SYSCALL_64_after_fastpath label for stub_ptregs_64 to work,
and the use of alternatives will mess that up unless we play horrid
games to prepend with NOPs and make the variants the same length. It's
not worth it; in the case where we ALTERNATIVE out the retpoline, the
first instruction at __x86_indirect_thunk_rax is going to be a bare
jmp *%rax anyway.

Signed-off-by: David Woodhouse <[email protected]>
Acked-By: Arjan van de Ven <[email protected]>
---
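Illustration, not part of the commit: ALTERNATIVE pads the shorter
variants with NOPs to the length of the longest one, so a NOSPEC_CALL
fastpath would look roughly like

	call	*sys_call_table(, %rax, 8)	# return address lands here,
	nop; nop; ...				# inside the NOP padding,
.Lentry_SYSCALL_64_after_fastpath_call:	# rather than at the label

and the return-address check in stub_ptregs_64 would no longer match.
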
arch/x86/entry/entry_32.S | 5 +++--
arch/x86/entry/entry_64.S | 12 +++++++++---
2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index ace8f32..cf9ef33 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -44,6 +44,7 @@
#include <asm/asm.h>
#include <asm/smap.h>
#include <asm/frame.h>
+#include <asm/nospec-branch.h>

.section .entry.text, "ax"

@@ -290,7 +291,7 @@ ENTRY(ret_from_fork)

/* kernel thread */
1: movl %edi, %eax
- call *%ebx
+ NOSPEC_CALL %ebx
/*
* A kernel thread is allowed to return here after successfully
* calling do_execve(). Exit to userspace to complete the execve()
@@ -919,7 +920,7 @@ common_exception:
movl %ecx, %es
TRACE_IRQS_OFF
movl %esp, %eax # pt_regs pointer
- call *%edi
+ NOSPEC_CALL %edi
jmp ret_from_exception
END(common_exception)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ed31d00..9bce6ed 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -37,6 +37,7 @@
#include <asm/pgtable_types.h>
#include <asm/export.h>
#include <asm/frame.h>
+#include <asm/nospec-branch.h>
#include <linux/err.h>

#include "calling.h"
@@ -187,7 +188,7 @@ ENTRY(entry_SYSCALL_64_trampoline)
*/
pushq %rdi
movq $entry_SYSCALL_64_stage2, %rdi
- jmp *%rdi
+ NOSPEC_JMP %rdi
END(entry_SYSCALL_64_trampoline)

.popsection
@@ -266,7 +267,12 @@ entry_SYSCALL_64_fastpath:
* It might end up jumping to the slow path. If it jumps, RAX
* and all argument registers are clobbered.
*/
+#ifdef CONFIG_RETPOLINE
+ movq sys_call_table(, %rax, 8), %rax
+ call __x86_indirect_thunk_rax
+#else
call *sys_call_table(, %rax, 8)
+#endif
.Lentry_SYSCALL_64_after_fastpath_call:

movq %rax, RAX(%rsp)
@@ -438,7 +444,7 @@ ENTRY(stub_ptregs_64)
jmp entry_SYSCALL64_slow_path

1:
- jmp *%rax /* Called from C */
+ NOSPEC_JMP %rax /* Called from C */
END(stub_ptregs_64)

.macro ptregs_stub func
@@ -517,7 +523,7 @@ ENTRY(ret_from_fork)
1:
/* kernel thread */
movq %r12, %rdi
- call *%rbx
+ NOSPEC_CALL %rbx
/*
* A kernel thread is allowed to return here after successfully
* calling do_execve(). Exit to userspace to complete the execve()
--
2.7.4

2018-01-07 22:12:27

by Woodhouse, David

Subject: [PATCH v6 05/10] x86/retpoline/hyperv: Convert assembler indirect jumps

Convert all indirect jumps in hyperv inline asm code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <[email protected]>
Acked-By: Arjan van de Ven <[email protected]>
---
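Not part of the commit: the general conversion pattern for an
inline-asm indirect call, sketched for x86-64. The function name is
made up for illustration, and the clobber list assumes a normal kernel
C-ABI callee (no FPU use):

#include <asm/asm.h>
#include <asm/nospec-branch.h>

static unsigned long indirect_call(unsigned long (*fn)(void))
{
	unsigned long ret;

	/* "call *%[thunk_target]" is patched to a retpoline thunk call
	 * when X86_FEATURE_RETPOLINE is set. */
	asm volatile(NOSPEC_CALL
		     : "=a" (ret), ASM_CALL_CONSTRAINT
		     : THUNK_TARGET(fn)
		     : "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11",
		       "cc", "memory");
	return ret;
}
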
arch/x86/include/asm/mshyperv.h | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 581bb54..6534e57 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -7,6 +7,7 @@
#include <linux/nmi.h>
#include <asm/io.h>
#include <asm/hyperv.h>
+#include <asm/nospec-branch.h>

/*
* The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent
@@ -186,10 +187,11 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
return U64_MAX;

__asm__ __volatile__("mov %4, %%r8\n"
- "call *%5"
+ NOSPEC_CALL
: "=a" (hv_status), ASM_CALL_CONSTRAINT,
"+c" (control), "+d" (input_address)
- : "r" (output_address), "m" (hv_hypercall_pg)
+ : "r" (output_address),
+ THUNK_TARGET(hv_hypercall_pg)
: "cc", "memory", "r8", "r9", "r10", "r11");
#else
u32 input_address_hi = upper_32_bits(input_address);
@@ -200,13 +202,13 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
if (!hv_hypercall_pg)
return U64_MAX;

- __asm__ __volatile__("call *%7"
+ __asm__ __volatile__(NOSPEC_CALL
: "=A" (hv_status),
"+c" (input_address_lo), ASM_CALL_CONSTRAINT
: "A" (control),
"b" (input_address_hi),
"D"(output_address_hi), "S"(output_address_lo),
- "m" (hv_hypercall_pg)
+ THUNK_TARGET(hv_hypercall_pg)
: "cc", "memory");
#endif /* !x86_64 */
return hv_status;
@@ -227,10 +229,10 @@ static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)

#ifdef CONFIG_X86_64
{
- __asm__ __volatile__("call *%4"
+ __asm__ __volatile__(NOSPEC_CALL
: "=a" (hv_status), ASM_CALL_CONSTRAINT,
"+c" (control), "+d" (input1)
- : "m" (hv_hypercall_pg)
+ : THUNK_TARGET(hv_hypercall_pg)
: "cc", "r8", "r9", "r10", "r11");
}
#else
@@ -238,13 +240,13 @@ static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
u32 input1_hi = upper_32_bits(input1);
u32 input1_lo = lower_32_bits(input1);

- __asm__ __volatile__ ("call *%5"
+ __asm__ __volatile__ (NOSPEC_CALL
: "=A"(hv_status),
"+c"(input1_lo),
ASM_CALL_CONSTRAINT
: "A" (control),
"b" (input1_hi),
- "m" (hv_hypercall_pg)
+ THUNK_TARGET(hv_hypercall_pg)
: "cc", "edi", "esi");
}
#endif
--
2.7.4

2018-01-07 22:12:34

by Woodhouse, David

Subject: [PATCH v6 07/10] x86/retpoline/checksum32: Convert assembler indirect jumps

Convert all indirect jumps in 32bit checksum assembler code to use
non-speculative sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <[email protected]>
Acked-By: Arjan van de Ven <[email protected]>
---
arch/x86/lib/checksum_32.S | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S
index 4d34bb5..98cf15d 100644
--- a/arch/x86/lib/checksum_32.S
+++ b/arch/x86/lib/checksum_32.S
@@ -29,7 +29,8 @@
#include <asm/errno.h>
#include <asm/asm.h>
#include <asm/export.h>
-
+#include <asm/nospec-branch.h>
+
/*
* computes a partial checksum, e.g. for TCP/UDP fragments
*/
@@ -156,7 +157,7 @@ ENTRY(csum_partial)
negl %ebx
lea 45f(%ebx,%ebx,2), %ebx
testl %esi, %esi
- jmp *%ebx
+ NOSPEC_JMP %ebx

# Handle 2-byte-aligned regions
20: addw (%esi), %ax
@@ -439,7 +440,7 @@ ENTRY(csum_partial_copy_generic)
andl $-32,%edx
lea 3f(%ebx,%ebx), %ebx
testl %esi, %esi
- jmp *%ebx
+ NOSPEC_JMP %ebx
1: addl $64,%esi
addl $64,%edi
SRC(movb -32(%edx),%bl) ; SRC(movb (%edx),%bl)
--
2.7.4

2018-01-07 22:12:40

by Woodhouse, David

Subject: [PATCH v6 08/10] x86/retpoline/irq32: Convert assembler indirect jumps

From: Andi Kleen <[email protected]>

Convert all indirect jumps in 32bit irq inline asm code to use
non-speculative sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Acked-By: Arjan van de Ven <[email protected]>
---
arch/x86/kernel/irq_32.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c
index a83b334..e1e58f7 100644
--- a/arch/x86/kernel/irq_32.c
+++ b/arch/x86/kernel/irq_32.c
@@ -20,6 +20,7 @@
#include <linux/mm.h>

#include <asm/apic.h>
+#include <asm/nospec-branch.h>

#ifdef CONFIG_DEBUG_STACKOVERFLOW

@@ -55,11 +56,11 @@ DEFINE_PER_CPU(struct irq_stack *, softirq_stack);
static void call_on_stack(void *func, void *stack)
{
asm volatile("xchgl %%ebx,%%esp \n"
- "call *%%edi \n"
+ NOSPEC_CALL
"movl %%ebx,%%esp \n"
: "=b" (stack)
: "0" (stack),
- "D"(func)
+ [thunk_target] "D"(func)
: "memory", "cc", "edx", "ecx", "eax");
}

@@ -95,11 +96,11 @@ static inline int execute_on_irq_stack(int overflow, struct irq_desc *desc)
call_on_stack(print_stack_overflow, isp);

asm volatile("xchgl %%ebx,%%esp \n"
- "call *%%edi \n"
+ NOSPEC_CALL
"movl %%ebx,%%esp \n"
: "=a" (arg1), "=b" (isp)
: "0" (desc), "1" (isp),
- "D" (desc->handle_irq)
+ [thunk_target] "D" (desc->handle_irq)
: "memory", "cc", "ecx");
return 1;
}
--
2.7.4

2018-01-07 22:12:45

by Woodhouse, David

Subject: [PATCH v6 02/10] x86/retpoline/crypto: Convert crypto assembler indirect jumps

Convert all indirect jumps in crypto assembler code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <[email protected]>
Acked-By: Arjan van de Ven <[email protected]>
---
arch/x86/crypto/aesni-intel_asm.S | 5 +++--
arch/x86/crypto/camellia-aesni-avx-asm_64.S | 3 ++-
arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 3 ++-
arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 3 ++-
4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index 16627fe..f128680 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -32,6 +32,7 @@
#include <linux/linkage.h>
#include <asm/inst.h>
#include <asm/frame.h>
+#include <asm/nospec-branch.h>

/*
* The following macros are used to move an (un)aligned 16 byte value to/from
@@ -2884,7 +2885,7 @@ ENTRY(aesni_xts_crypt8)
pxor INC, STATE4
movdqu IV, 0x30(OUTP)

- call *%r11
+ NOSPEC_CALL %r11

movdqu 0x00(OUTP), INC
pxor INC, STATE1
@@ -2929,7 +2930,7 @@ ENTRY(aesni_xts_crypt8)
_aesni_gf128mul_x_ble()
movups IV, (IVP)

- call *%r11
+ NOSPEC_CALL %r11

movdqu 0x40(OUTP), INC
pxor INC, STATE1
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index f7c495e..ba3f075 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -17,6 +17,7 @@

#include <linux/linkage.h>
#include <asm/frame.h>
+#include <asm/nospec-branch.h>

#define CAMELLIA_TABLE_BYTE_LEN 272

@@ -1227,7 +1228,7 @@ camellia_xts_crypt_16way:
vpxor 14 * 16(%rax), %xmm15, %xmm14;
vpxor 15 * 16(%rax), %xmm15, %xmm15;

- call *%r9;
+ NOSPEC_CALL %r9;

addq $(16 * 16), %rsp;

diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index eee5b39..9b0a88a 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -12,6 +12,7 @@

#include <linux/linkage.h>
#include <asm/frame.h>
+#include <asm/nospec-branch.h>

#define CAMELLIA_TABLE_BYTE_LEN 272

@@ -1343,7 +1344,7 @@ camellia_xts_crypt_32way:
vpxor 14 * 32(%rax), %ymm15, %ymm14;
vpxor 15 * 32(%rax), %ymm15, %ymm15;

- call *%r9;
+ NOSPEC_CALL %r9;

addq $(16 * 32), %rsp;

diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index 7a7de27..05178b44 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -45,6 +45,7 @@

#include <asm/inst.h>
#include <linux/linkage.h>
+#include <asm/nospec-branch.h>

## ISCSI CRC 32 Implementation with crc32 and pclmulqdq Instruction

@@ -172,7 +173,7 @@ continue_block:
movzxw (bufp, %rax, 2), len
lea crc_array(%rip), bufp
lea (bufp, len, 1), bufp
- jmp *bufp
+ NOSPEC_JMP bufp

################################################################
## 2a) PROCESS FULL BLOCKS:
--
2.7.4

2018-01-07 22:12:54

by Woodhouse, David

Subject: [PATCH v6 06/10] x86/retpoline/xen: Convert Xen hypercall indirect jumps

Convert the indirect call in the Xen hypercall to use a non-speculative
sequence when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <[email protected]>
Acked-By: Arjan van de Ven <[email protected]>
---
arch/x86/include/asm/xen/hypercall.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
index 7cb282e..393c004 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -44,6 +44,7 @@
#include <asm/page.h>
#include <asm/pgtable.h>
#include <asm/smap.h>
+#include <asm/nospec-branch.h>

#include <xen/interface/xen.h>
#include <xen/interface/sched.h>
@@ -217,9 +218,9 @@ privcmd_call(unsigned call,
__HYPERCALL_5ARG(a1, a2, a3, a4, a5);

stac();
- asm volatile("call *%[call]"
+ asm volatile(NOSPEC_CALL
: __HYPERCALL_5PARAM
- : [call] "a" (&hypercall_page[call])
+ : [thunk_target] "a" (&hypercall_page[call])
: __HYPERCALL_CLOBBER5);
clac();

--
2.7.4

2018-01-07 22:12:56

by Woodhouse, David

Subject: [PATCH v6 10/10] x86/retpoline: Exclude objtool with retpoline

From: Andi Kleen <[email protected]>

objtool's assembler nanny currently cannot deal with the code generated
by the retpoline compiler and throws hundreds of warnings, mostly
because it sees calls that don't have a symbolic target.

Exclude all the options that rely on objtool when RETPOLINE is active.

This mainly means that we use the frame pointer unwinder and livepatch
is not supported.

Eventually objtool can be fixed to handle this.

Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Acked-By: Arjan van de Ven <[email protected]>
---
arch/x86/Kconfig | 4 ++--
arch/x86/Kconfig.debug | 6 +++---
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 77c58ae..651d25f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -171,8 +171,8 @@ config X86
select HAVE_PERF_USER_STACK_DUMP
select HAVE_RCU_TABLE_FREE
select HAVE_REGS_AND_STACK_ACCESS_API
- select HAVE_RELIABLE_STACKTRACE if X86_64 && UNWINDER_FRAME_POINTER && STACK_VALIDATION
- select HAVE_STACK_VALIDATION if X86_64
+ select HAVE_RELIABLE_STACKTRACE if X86_64 && UNWINDER_FRAME_POINTER && STACK_VALIDATION && !RETPOLINE
+ select HAVE_STACK_VALIDATION if X86_64 && !RETPOLINE
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_UNSTABLE_SCHED_CLOCK
select HAVE_USER_RETURN_NOTIFIER
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 6293a87..9f3928d 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -359,8 +359,8 @@ config PUNIT_ATOM_DEBUG

choice
prompt "Choose kernel unwinder"
- default UNWINDER_ORC if X86_64
- default UNWINDER_FRAME_POINTER if X86_32
+ default UNWINDER_ORC if X86_64 && !RETPOLINE
+ default UNWINDER_FRAME_POINTER if X86_32 || RETPOLINE
---help---
This determines which method will be used for unwinding kernel stack
traces for panics, oopses, bugs, warnings, perf, /proc/<pid>/stack,
@@ -368,7 +368,7 @@ choice

config UNWINDER_ORC
bool "ORC unwinder"
- depends on X86_64
+ depends on X86_64 && !RETPOLINE
select STACK_VALIDATION
---help---
This option enables the ORC (Oops Rewind Capability) unwinder for
--
2.7.4

2018-01-07 22:13:24

by Woodhouse, David

Subject: [PATCH v6 09/10] x86/retpoline: Add boot time option to disable retpoline

From: Andi Kleen <[email protected]>

Add a 'noretpoline' boot option to disable retpoline and patch out the
extra sequences. It cannot patch out the jumps to the thunk functions
from code generated by the compiler, but those thunks turn into a single
indirect branch now.

Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Acked-By: Arjan van de Ven <[email protected]>
---
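Usage note, not part of the commit: on a CONFIG_RETPOLINE build, the
mitigation can then be disabled from the boot loader by appending the
option to the kernel command line, e.g. (the kernel path and other
options here are only an example):

	linux /boot/vmlinuz-4.15.0 root=/dev/sda1 ro noretpoline
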
Documentation/admin-guide/kernel-parameters.txt | 3 +++
arch/x86/kernel/cpu/intel.c | 11 +++++++++++
2 files changed, 14 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 9059917..d443141 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2596,6 +2596,9 @@

nohugeiomap [KNL,x86] Disable kernel huge I/O mappings.

+ noretpoline [X86] Disable the retpoline kernel indirect branch speculation
+ workarounds. System may allow data leaks with this option.
+
nosmt [KNL,S390] Disable symmetric multithreading (SMT).
Equivalent to smt=1.

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index b720dac..35e123e 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -31,6 +31,17 @@
#include <asm/apic.h>
#endif

+#ifdef RETPOLINE
+static int __init noretpoline_setup(char *__unused)
+{
+ pr_info("Retpoline runtime disabled\n");
+ setup_clear_cpu_cap(X86_FEATURE_RETPOLINE);
+ setup_clear_cpu_cap(X86_FEATURE_RETPOLINE_AMD);
+ return 1;
+}
+__setup("noretpoline", noretpoline_setup);
+#endif
+
/*
* Just in case our CPU detection goes bad, or you have a weird system,
* allow a way to override the automatic disabling of MPX.
--
2.7.4

2018-01-07 22:22:18

by Linus Torvalds

Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

On Sun, Jan 7, 2018 at 2:11 PM, David Woodhouse <[email protected]> wrote:
> This is a mitigation for the 'variant 2' attack described in
> https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html

Ok, I don't love the patches, but I see nothing horribly wrong here
either, and I assume the performance impact of this is pretty minimal.

Thomas? I'm obviously doing rc7 today without these, but I assume the
x86 maintainers are resigned to this all. And yes, we'll have at least
an rc8 this release..

Linus

2018-01-08 10:01:52

by Thomas Gleixner

Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

On Sun, 7 Jan 2018, Linus Torvalds wrote:

> On Sun, Jan 7, 2018 at 2:11 PM, David Woodhouse <[email protected]> wrote:
> > This is a mitigation for the 'variant 2' attack described in
> > https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
>
> Ok, I don't love the patches, but I see nothing horribly wrong here
> either, and I assume the performance impact of this is pretty minimal.
>
> Thomas? I'm obviously doing rc7 today without these, but I assume the
> x86 maintainers are resigned to this all.

That seems to be the general mental state for lots of involved people.

Thanks,

tglx



2018-01-08 10:25:11

by Thomas Gleixner

Subject: Re: [PATCH v6 10/10] x86/retpoline: Exclude objtool with retpoline

On Sun, 7 Jan 2018, David Woodhouse wrote:

Cc+ Josh Poimboeuf <[email protected]>

Sigh....

> From: Andi Kleen <[email protected]>
>
> objtool's assembler nanny currently cannot deal with the code generated
> by the retpoline compiler and throws hundreds of warnings, mostly
> because it sees calls that don't have a symbolic target.
>
> Exclude all the options that rely on objtool when RETPOLINE is active.
>
> This mainly means that we use the frame pointer unwinder and livepatch
> is not supported.
>
> Eventually objtool can be fixed to handle this.

2018-01-08 10:34:29

by Woodhouse, David

Subject: Re: [PATCH v6 10/10] x86/retpoline: Exclude objtool with retpoline

On Mon, 2018-01-08 at 11:25 +0100, Thomas Gleixner wrote:
> On Sun, 7 Jan 2018, David Woodhouse wrote:
>
> Cc+ Josh Poimboeuf <[email protected]>
>
> Sigh....
>
> > From: Andi Kleen <[email protected]>
> > 
> > objtool's assembler nanny currently cannot deal with the code generated
> > by the retpoline compiler and throws hundreds of warnings, mostly
> > because it sees calls that don't have a symbolic target.
> > 
> > Exclude all the options that rely on objtool when RETPOLINE is active.
> > 
> > This mainly means that we use the frame pointer unwinder and livepatch
> > is not supported.
> > 
> > Eventually objtool can be fixed to handle this.

I believe Josh is on vacation and I've asked Amit to take a look at
this, at least as far as ensuring that it doesn't actually disable
kpatch.



2018-01-08 10:39:01

by Jiri Kosina

Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

On Mon, 8 Jan 2018, Paul Turner wrote:

> user->kernel in the absence of SMEP:
> In the absence of SMEP, we must worry about user-generated RSB entries
> being consumable by kernel execution.
> Generally speaking, for synchronous execution this will not occur (e.g.
> syscall, interrupt), however, one important case remains.
> When we context switch between two threads, we should flush the RSB so that
> execution generated from the unbalanced return path on the thread that we
> just scheduled into, cannot consume RSB entries potentially installed by
> the prior thread.

I am still unclear whether this closes it completely, as when HT is on,
the RSB is shared between the threads, right? Therefore one thread can
poison it for the other without even a context switch happening.

--
Jiri Kosina
SUSE Labs

2018-01-08 10:42:46

by Paul Turner

Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

[ First send did not make the list because gmail ate its plain-text
forcing when I pasted content. ]

One detail that is missing is that we still need RSB refill in some cases.
This is not because the retpoline sequence itself will underflow (it
is actually guaranteed not to, since it consumes only RSB entries that
it generates.
But either to avoid poisoning of the RSB entries themselves, or to
avoid the hardware turning to alternate predictors on RSB underflow.

Enumerating the cases we care about:

user->kernel in the absence of SMEP:
In the absence of SMEP, we must worry about user-generated RSB entries
being consumable by kernel execution.
Generally speaking, for synchronous execution this will not occur
(e.g. syscall, interrupt); however, one important case remains.
When we context switch between two threads, we should flush the RSB so
that execution generated from the unbalanced return path on the thread
that we just scheduled into, cannot consume RSB entries potentially
installed by the prior thread.

kernel->kernel independent of SMEP:
While much harder to coordinate, facilities such as eBPF potentially
allow exploitable return targets to be created.
Generally speaking (particularly if eBPF has been disabled) the risk
is _much_ lower here, since we can only return into kernel execution
that was already occurring on another thread (which could likely be
attacked directly there, independent of RSB poisoning).

guest->hypervisor, independent of SMEP:
For guest ring0 -> host ring0 transitions, it is possible that the
tagging only records that an entry was generated in a ring0 context,
meaning that a guest-generated entry may be consumed by the host.
This admits:

hypervisor_run_vcpu_implementation() {
<enter hardware virtualization context>
… run virtualized work (1)
<leave hardware virtualization context>
< update vmcs state, prior to any function return > (2)
< return from hypervisor_run_vcpu_implementation() to handle VMEXIT > (3)
}

A guest can craft poisoned entries at (1) which, if not flushed at (2),
may immediately be eligible for consumption at (3).

The cases above involve the crafting and use of poisoned entries.
Recall also that one of the initial conditions was that we should
avoid RSB underflow, as some CPUs may try to use other indirect
predictors when this occurs.

The cases we care about here are:
- When we return _into_ protected execution. For the kernel, this
means when we exit interrupt context into kernel context, since we may
have emptied or reduced the number of RSB entries while in interrupt
context.
- Context switch (even if we are returning to user code, we need to at
least unwind the scheduler/triggering frames that preempted it
previously; considering that detail, this is a subset of the above,
but listed for completeness)
- On VMEXIT (it turns out we need to worry about both poisoned
entries and no entries; the solution is a single refill nonetheless).
- Leaving deeper (>C1) c-states, which may have flushed hardware state
- Where we are unwinding call-chains of >16 entries[*]

[*] This is obviously the trickiest case. Fortunately, it is tough to
exploit since such call-chains are reasonably rare, and the action must
typically be predicted at a considerable distance from where current
execution lies. Both factors dramatically increase the difficulty of an
attack and lower the bit-rate (the number of ops per attempt is
necessarily increased). For our systems, since we control the binary
image, we can determine where such chains occur through aggregate
profiling of every machine in the fleet. I'm happy to provide those
symbols, but they obviously fall short of complete coverage due to
code differences. Generally, this is a level of paranoia no typical
user will likely care about, and it only applies to a subset of CPUs.


A sequence for efficiently refilling the RSB is:

	mov	$8, %rax;
	.align	16;
3:	call	4f;
31:	pause; call 31b;
	.align	16;
4:	call	5f;
41:	pause; call 41b;
	.align	16;
5:	dec	%rax;
	jnz	3b;
	add	$(16*8), %rsp;

This implementation uses 8 loop iterations, with 2 calls per iteration.
This is marginally faster than a single call per iteration. We did not
observe useful benefit (particularly relative to text size) from
further unrolling. This may also be split into smaller (e.g. 4 or 8
call) segments where we can usefully pipeline/intermix with other
operations. It includes retpoline-type traps so that if an entry is
consumed, it cannot lead to controlled speculation. On my test system
it took ~43 cycles on average. Note that calls with a non-zero
displacement must be used, since zero-displacement calls may be
optimized not to interact with the RSB, due to their use for fetching
RIP in 32-bit relocatable code.
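
For reuse at the call sites enumerated above, this could be wrapped as
an assembler macro parameterised by the refill depth, since the
required depth is CPU-dependent (a sketch, not tested code; the macro
name and labels are made up here):

.macro FILL_RSB nr:req reg:req
	mov	$(\nr / 2), \reg	/* two calls per loop iteration */
	.align	16
771:	call	772f			/* RSB entry points at the trap at 773 */
773:	pause
	jmp	773b			/* speculation trap */
	.align	16
772:	call	774f			/* RSB entry points at the trap at 775 */
775:	pause
	jmp	775b			/* speculation trap */
	.align	16
774:	dec	\reg
	jnz	771b
	add	$(\nr * 8), %rsp	/* pop the \nr return addresses */
.endm

	/* e.g. on a 16-entry RSB: */
	FILL_RSB 16, %rax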


2018-01-08 10:45:38

by David Woodhouse

Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

On Mon, 2018-01-08 at 02:34 -0800, Paul Turner wrote:
> One detail that is missing is that we still need RSB refill in some
> cases.
> This is not because the retpoline sequence itself will underflow (it
> is actually guaranteed not to, since it consumes only RSB entries
> that it generates.  
> But either to avoid poisoning of the RSB entries themselves, or to
> avoid the hardware turning to alternate predictors on RSB underflow.
>
> Enumerating the cases we care about:
>
> • user->kernel in the absence of SMEP:
> In the absence of SMEP, we must worry about user-generated RSB
> entries being consumable by kernel execution.
> Generally speaking, for synchronous execution this will not occur
> (e.g. syscall, interrupt), however, one important case remains.
> When we context switch between two threads, we should flush the RSB
> so that execution generated from the unbalanced return path on the
> thread that we just scheduled into, cannot consume RSB entries
> potentially installed by the prior thread.

Or IBPB here, yes? That's what we had in the original patch set when
retpoline came last, and what I assume will be put back again once we
*finally* get our act together and reinstate the full set of microcode
patches.

> kernel->kernel independent of SMEP:
> While much harder to coordinate, facilities such as eBPF potentially
> allow exploitable return targets to be created.
> Generally speaking (particularly if eBPF has been disabled) the risk
> is _much_ lower here, since we can only return into kernel execution
> that was already occurring on another thread (which could e.g. likely
> be attacked there directly independent of RSB poisoning.)
>
> guest->hypervisor, independent of SMEP:
> For guest ring0 -> host ring0 transitions, it is possible that the
> tagging only includes that the entry was only generated in a ring0
> context.  Meaning that a guest generated entry may be consumed by the
> host.  This admits:

We are also stuffing the RSB on vmexit in the IBRS/IBPB patch set,
aren't we?



2018-01-08 10:45:46

by Peter Zijlstra

Subject: Re: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support

On Sun, Jan 07, 2018 at 10:11:16PM +0000, David Woodhouse wrote:
> +#ifdef __ASSEMBLY__
> +
> +/*
> + * These are the bare retpoline primitives for indirect jmp and call.
> + * Do not use these directly; they only exist to make the ALTERNATIVE
> + * invocation below less ugly.
> + */
> +.macro RETPOLINE_JMP reg:req
> + call 1112f
> +1111: pause
> + jmp 1111b
> +1112: mov \reg, (%_ASM_SP)
> + ret
> +.endm

Should this not use local name labels instead?

.macro RETPOLINE_JMP reg:req
call .Ldo_rop_\@
.Lspec_trap_\@:
pause
jmp .Lspec_trap_\@
.Ldo_rop_\@:
mov \reg, (%_ASM_SP)
ret
.endm

And I suppose it might be nice to put a little comment with them
explaining how they work.
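
Something like this, perhaps:

.macro RETPOLINE_JMP reg:req
	call	.Ldo_rop_\@		/* pushes &.Lspec_trap_\@; the RSB
					 * now predicts a return there */
.Lspec_trap_\@:
	pause				/* speculative execution spins here */
	jmp	.Lspec_trap_\@
.Ldo_rop_\@:
	mov	\reg, (%_ASM_SP)	/* replace the return address with
					 * the real branch target */
	ret				/* architecturally jumps to \reg */
.endm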

> +/*
> + * For i386 we use the original ret-equivalent retpoline, because
> + * otherwise we'll run out of registers. We don't care about CET
> + * here, anyway.
> + */
> +# define NOSPEC_CALL ALTERNATIVE( \
> + "call *%[thunk_target]\n", \
> + " jmp 1113f; " \
> + "1110: call 1112f; " \
> + "1111: pause; " \
> + " jmp 1111b; " \
> + "1112: addl $4, %%esp; " \
> + " pushl %[thunk_target]; " \
> + " ret; " \
> + "1113: call 1110b;\n", \
> + X86_FEATURE_RETPOLINE)

Ideally this would too, just not sure that works in inline asm.

2018-01-08 10:46:13

by Paul Turner

Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

On Mon, Jan 8, 2018 at 2:38 AM, Jiri Kosina <[email protected]> wrote:
> On Mon, 8 Jan 2018, Paul Turner wrote:
>
>> user->kernel in the absence of SMEP:
>> In the absence of SMEP, we must worry about user-generated RSB entries
>> being consumable by kernel execution.
>> Generally speaking, for synchronous execution this will not occur (e.g.
>> syscall, interrupt), however, one important case remains.
>> When we context switch between two threads, we should flush the RSB so that
>> execution generated from the unbalanced return path on the thread that we
>> just scheduled into, cannot consume RSB entries potentially installed by
>> the prior thread.
>
> I am still unclear whether this closes it completely, as when HT is on,
> the RSB is shared between the threads, right? Therefore one thread can
> poision it for the other without even context switch happening.
>

See 2.6.1.1 [Replicated resources]:
"The return stack predictor is replicated to improve branch
prediction of return instructions"

(This is part of the reason that the sequence is attractive; its use
of the RSB to control prediction naturally prevents cross-sibling
attack.)

> --
> Jiri Kosina
> SUSE Labs
>

2018-01-08 10:53:07

by David Woodhouse

Subject: Re: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support

On Mon, 2018-01-08 at 11:45 +0100, Peter Zijlstra wrote:
>
>
> Should this not use local name labels instead?
>
> .macro RETPOLINE_JMP reg:req
>         call    .Ldo_rop_\@
> .Lspec_trap_\@:
>         pause
>         jmp .Lspec_trap_\@
> .Ldo_rop_\@:
>         mov     \reg, (%_ASM_SP)
>         ret
> .endm

Not if you want to be able to use them twice in the same .S file.



2018-01-08 10:53:56

by Paul Turner

Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

On Mon, Jan 8, 2018 at 2:45 AM, David Woodhouse <[email protected]> wrote:
> On Mon, 2018-01-08 at 02:34 -0800, Paul Turner wrote:
>> One detail that is missing is that we still need RSB refill in some
>> cases.
>> This is not because the retpoline sequence itself will underflow (it
>> is actually guaranteed not to, since it consumes only RSB entries
>> that it generates.
>> But either to avoid poisoning of the RSB entries themselves, or to
>> avoid the hardware turning to alternate predictors on RSB underflow.
>>
>> Enumerating the cases we care about:
>>
>> • user->kernel in the absence of SMEP:
>> In the absence of SMEP, we must worry about user-generated RSB
>> entries being consumable by kernel execution.
>> Generally speaking, for synchronous execution this will not occur
>> (e.g. syscall, interrupt), however, one important case remains.
>> When we context switch between two threads, we should flush the RSB
>> so that execution generated from the unbalanced return path on the
>> thread that we just scheduled into, cannot consume RSB entries
>> potentially installed by the prior thread.
>
> Or IBPB here, yes? That's what we had in the original patch set when
> retpoline came last, and what I assume will be put back again once we
> *finally* get our act together and reinstate the full set of microcode
> patches.

IBPB is *much* more expensive than the sequence I suggested.
If the kernel has been protected with a retpoline compilation, it is
much faster to not use IBPB here; we only need to prevent
ret-poisoning in this case.

>
>> kernel->kernel independent of SMEP:
>> While much harder to coordinate, facilities such as eBPF potentially
>> allow exploitable return targets to be created.
>> Generally speaking (particularly if eBPF has been disabled) the risk
>> is _much_ lower here, since we can only return into kernel execution
>> that was already occurring on another thread (which could e.g. likely
>> be attacked there directly independent of RSB poisoning.)
>>
>> guest->hypervisor, independent of SMEP:
>> For guest ring0 -> host ring0 transitions, it is possible that the
>> tagging only includes that the entry was only generated in a ring0
>> context. Meaning that a guest generated entry may be consumed by the
>> host. This admits:
>
> We are also stuffing the RSB on vmexit in the IBRS/IBPB patch set,
> aren't we?

A) I am enumerating all of the cases for completeness. It was missed
by many that this detail was necessary on this patch, independently of
IBRS.
B) On the parts duplicated in (A), for specifics that contribute to
correctness in both cases, we should not hand-wave over the fact that
they may or may not be covered by another patch-set. Users need to
understand what's required for complete protection, particularly if
they are backporting.

2018-01-08 11:03:21

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support

On Mon, Jan 08, 2018 at 10:53:02AM +0000, David Woodhouse wrote:
> On Mon, 2018-01-08 at 11:45 +0100, Peter Zijlstra wrote:
> >
> >
> > Should this not use local name labels instead?
> >
> > .macro RETPOLINE_JMP reg:req
> >         call    .Ldo_rop_\@
> > .Lspec_trap_\@:
> >         pause
> >         jmp .Lspec_trap_\@
> > .Ldo_rop_\@:
> >         mov     \reg, (%_ASM_SP)
> >         ret
> > .endm
>
> Not if you want to be able to use them twice in the same .S file.

Should work fine, the \@ expands to a per-instance magic thing IIRC. All
the PTI helpers do this too and that works just fine, see
arch/x86/entry/calling.h
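
To illustrate, a minimal sketch (assuming GNU as; the numeric suffixes
below are only indicative, since \@ is a running count of all macro
invocations, not just of this macro):

.macro RETPOLINE_JMP reg:req
	call	.Ldo_rop_\@
.Lspec_trap_\@:			/* speculation trap: spin here if an entry is consumed */
	pause
	jmp	.Lspec_trap_\@
.Ldo_rop_\@:
	mov	\reg, (%_ASM_SP)	/* overwrite the return address with the real target */
	ret
.endm

/* Two uses in the same .S file assemble without label collisions: */
RETPOLINE_JMP %rax		/* expands with e.g. .Lspec_trap_4 / .Ldo_rop_4 */
RETPOLINE_JMP %rbx		/* expands with e.g. .Lspec_trap_5 / .Ldo_rop_5 */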

2018-01-08 11:16:28

by Andrew Cooper

[permalink] [raw]
Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

On 08/01/18 10:42, Paul Turner wrote:
> A sequence for efficiently refilling the RSB is:
> mov $8, %rax;
> .align 16;
> 3: call 4f;
> 3p: pause; call 3p;
> .align 16;
> 4: call 5f;
> 4p: pause; call 4p;
> .align 16;
> 5: dec %rax;
> jnz 3b;
> add $(16*8), %rsp;
> This implementation uses 8 loops, with 2 calls per iteration. This is
> marginally faster than a single call per iteration. We did not
> observe useful benefit (particularly relative to text size) from
> further unrolling. This may also be usefully split into smaller (e.g.
> 4 or 8 call) segments where we can usefully pipeline/intermix with
> other operations. It includes retpoline type traps so that if an
> entry is consumed, it cannot lead to controlled speculation. On my
> test system it took ~43 cycles on average. Note that non-zero
> displacement calls should be used as these may be optimized to not
> interact with the RSB due to their use in fetching RIP for 32-bit
> relocations.

Guidance from both Intel and AMD still states that 32 calls are required
in general.  Is your above code optimised for a specific processor which
you know the RSB to be smaller on?

~Andrew

2018-01-08 11:25:58

by Paul Turner

[permalink] [raw]
Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

For Intel, the manuals state that it's 16 entries -- 2.5.2.1.
Agner also reports 16 (presumably experimentally measured), e.g.
http://www.agner.org/optimize/microarchitecture.pdf [3.8].
For AMD it can be larger, for example 32 entries on Fam17h (but 16
entries on Fam16h).

For future-proofing a binary, or for a new AMD processor, 32 calls are
required. I would suggest tuning this based on the current CPU, which
saves cycles now while still covering the future case.
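
As a sketch (my assumptions: 64-bit, GNU as numeric local labels, and
that clobbering %rax is acceptable at the call site), a 32-call variant
of the sequence above would be:

	mov	$16, %rax		/* 16 iterations x 2 calls = 32 RSB entries */
	.align	16
771:	call	772f
773:	pause				/* speculation trap if an entry is consumed */
	jmp	773b
	.align	16
772:	call	774f
775:	pause
	jmp	775b
	.align	16
774:	dec	%rax
	jnz	771b
	add	$(32*8), %rsp		/* drop the 32 return addresses (64-bit) */

On a CPU known to have a 16-entry RSB, the same structure with 8
iterations (and an add of $(16*8)) suffices.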



On Mon, Jan 8, 2018 at 3:16 AM, Andrew Cooper <[email protected]> wrote:
> On 08/01/18 10:42, Paul Turner wrote:
>> A sequence for efficiently refilling the RSB is:
>> mov $8, %rax;
>> .align 16;
>> 3: call 4f;
>> 3p: pause; call 3p;
>> .align 16;
>> 4: call 5f;
>> 4p: pause; call 4p;
>> .align 16;
>> 5: dec %rax;
>> jnz 3b;
>> add $(16*8), %rsp;
>> This implementation uses 8 loops, with 2 calls per iteration. This is
>> marginally faster than a single call per iteration. We did not
>> observe useful benefit (particularly relative to text size) from
>> further unrolling. This may also be usefully split into smaller (e.g.
>> 4 or 8 call) segments where we can usefully pipeline/intermix with
>> other operations. It includes retpoline type traps so that if an
>> entry is consumed, it cannot lead to controlled speculation. On my
>> test system it took ~43 cycles on average. Note that non-zero
>> displacement calls should be used as these may be optimized to not
>> interact with the RSB due to their use in fetching RIP for 32-bit
>> relocations.
>
> Guidance from both Intel and AMD still states that 32 calls are required
> in general. Is your above code optimised for a specific processor which
> you know the RSB to be smaller on?
>
> ~Andrew

2018-01-08 12:45:56

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support


> On Mon, Jan 08, 2018 at 10:53:02AM +0000, David Woodhouse wrote:
>> On Mon, 2018-01-08 at 11:45 +0100, Peter Zijlstra wrote:
>> >
>> >
>> > Should this not use local name labels instead?
>> >
>> > .macro RETPOLINE_JMP reg:req
>> >         call    .Ldo_rop_\@
>> > .Lspec_trap_\@:
>> >         pause
>> >         jmp .Lspec_trap_\@
>> > .Ldo_rop_\@:
>> >         mov     \reg, (%_ASM_SP)
>> >         ret
>> > .endm
>>
>> Not if you want to be able to use them twice in the same .S file.
>
> Should work fine, the \@ expands to a per-instance magic thing IIRC. All
> the PTI helpers do this too and that works just fine, see
> arch/x86/entry/calling.h
>

Ah, OK. Will do that when home in a few hours then.

--
dwmw2

2018-01-08 12:49:44

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel


> On Mon, Jan 8, 2018 at 2:45 AM, David Woodhouse <[email protected]>
> wrote:
>> On Mon, 2018-01-08 at 02:34 -0800, Paul Turner wrote:
>>> One detail that is missing is that we still need RSB refill in some
>>> cases.
>>> This is not because the retpoline sequence itself will underflow (it
>>> is actually guaranteed not to, since it consumes only RSB entries
>>> that it generates.) Rather, it is either to avoid poisoning of the
>>> RSB entries themselves, or to avoid the hardware turning to alternate
>>> predictors on RSB underflow.
>>>
>>> Enumerating the cases we care about:
>>>
>>> • user->kernel in the absence of SMEP:
>>> In the absence of SMEP, we must worry about user-generated RSB
>>> entries being consumable by kernel execution.
>>> Generally speaking, for synchronous execution this will not occur
>>> (e.g. syscall, interrupt); however, one important case remains.
>>> When we context switch between two threads, we should flush the RSB
>>> so that execution generated from the unbalanced return path on the
>>> thread that we just scheduled into, cannot consume RSB entries
>>> potentially installed by the prior thread.
>>
>> Or IBPB here, yes? That's what we had in the original patch set when
>> retpoline came last, and what I assume will be put back again once we
>> *finally* get our act together and reinstate the full set of microcode
>> patches.
>
> IBPB is *much* more expensive than the sequence I suggested.
> If the kernel has been protected with a retpoline compilation, it is
> much faster to not use IBPB here; we only need to prevent
> ret-poisoning in this case.

Retpoline protects the kernel but IBPB is needed on context switch anyway
to protect userspace processes from each other.

But...

> A) I am enumerating all of the cases for completeness. It was missed
> by many that this detail was necessary on this patch, independently of
> IBRS.
> B) On the parts duplicated in (A), for specifics that contribute to
> correctness in both cases, we should not hand-wave over the fact that
> they may or may not be covered by another patch-set. Users need to
> understand what's required for complete protection, particularly if
> they are backporting.

... yes, agreed. Now that we are putting retpoline first, we shouldn't
miss things that we *were* doing anyway. TBH I really don't think we
should have split the patch sets apart; we'll work on getting the rest
on top ASAP.

--
dwmw2

2018-01-08 13:20:16

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v6 10/10] x86/retpoline: Exclude objtool with retpoline

On Mon, Jan 08, 2018 at 10:34:00AM +0000, Woodhouse, David wrote:
> On Mon, 2018-01-08 at 11:25 +0100, Thomas Gleixner wrote:
> > On Sun, 7 Jan 2018, David Woodhouse wrote:
> >
> > Cc+ Josh Poimboeuf <[email protected]>
> >
> > Sigh....
> >
> > > From: Andi Kleen <[email protected]>
> > > 
> > > objtool's assembler nanny currently cannot deal with the code generated
> > > by the retpoline compiler and throws hundreds of warnings, mostly
> > > because it sees calls that don't have a symbolic target.
> > > 
> > > Exclude all the options that rely on objtool when RETPOLINE is active.
> > > 
> > > This mainly means that we use the frame pointer unwinder and livepatch
> > > is not supported.
> > > 
> > > Eventually objtool can be fixed to handle this.
>
> I believe Josh is on vacation and I've asked Amit to take a look at
> this, at least as far as ensuring that it doesn't actually disable
> kpatch.

I'm back now and will start looking at it this week.

For future revisions of the patch set, can you add me to CC? And also
it would be good to add [email protected] as I'm sure the other x86
maintainers (mingo and hpa) want to see these patches.

--
Josh

2018-01-08 13:42:37

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support

On Sun, Jan 07, 2018 at 10:11:16PM +0000, David Woodhouse wrote:
> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> index a20eacd..918e550 100644
> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -235,6 +235,16 @@ KBUILD_CFLAGS += -Wno-sign-compare
> #
> KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
>
> +# Avoid indirect branches in kernel to deal with Spectre
> +ifdef CONFIG_RETPOLINE
> + RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register)
> + ifneq ($(RETPOLINE_CFLAGS),)
> + KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
> + else
> + $(warning Retpoline not supported in compiler. System may be insecure.)
> + endif
> +endif

I wonder if an error might be more appropriate than a warning. I
learned from experience that a lot of people don't see these Makefile
warnings, and this would be a dangerous one to miss.

Also if this were an error, you could get rid of the RETPOLINE define,
and that would be one less define cluttering up the already way-too-long
GCC arg list.

--
Josh

2018-01-08 13:46:39

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support

On Mon, 8 Jan 2018, Josh Poimboeuf wrote:
> On Sun, Jan 07, 2018 at 10:11:16PM +0000, David Woodhouse wrote:
> > diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> > index a20eacd..918e550 100644
> > --- a/arch/x86/Makefile
> > +++ b/arch/x86/Makefile
> > @@ -235,6 +235,16 @@ KBUILD_CFLAGS += -Wno-sign-compare
> > #
> > KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
> >
> > +# Avoid indirect branches in kernel to deal with Spectre
> > +ifdef CONFIG_RETPOLINE
> > + RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register)
> > + ifneq ($(RETPOLINE_CFLAGS),)
> > + KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
> > + else
> > + $(warning Retpoline not supported in compiler. System may be insecure.)
> > + endif
> > +endif
>
> I wonder if an error might be more appropriate than a warning. I
> learned from experience that a lot of people don't see these Makefile
> warnings, and this would be a dangerous one to miss.
>
> Also if this were an error, you could get rid of the RETPOLINE define,
> and that would be one less define cluttering up the already way-too-long
> GCC arg list.

It still allows the ASM part to be covered. Whether that's worth it, I can't tell.
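
For reference, the .S side in this series wraps indirect branches in
macros like NOSPEC_CALL, which alternatives can patch at boot even when
the compiler side is unavailable. Roughly (a sketch of the shape, using
the macro names from this series; the exact expansion in the patch
differs in detail):

.macro NOSPEC_CALL reg:req
#ifdef CONFIG_RETPOLINE
	/* boot-time patched: plain call unless the CPU needs retpoline */
	ALTERNATIVE "call *\reg", "RETPOLINE_CALL \reg", X86_FEATURE_RETPOLINE
#else
	call	*\reg
#endif
.endm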

Thanks,

tglx

2018-01-08 13:49:33

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v6 02/10] x86/retpoline/crypto: Convert crypto assembler indirect jumps

On Sun, Jan 07, 2018 at 10:11:17PM +0000, David Woodhouse wrote:
> Convert all indirect jumps in crypto assembler code to use non-speculative
> sequences when CONFIG_RETPOLINE is enabled.
>
> Signed-off-by: David Woodhouse <[email protected]>
> Acked-By: Arjan van de Ven <[email protected]>
> ---
> arch/x86/crypto/aesni-intel_asm.S | 5 +++--
> arch/x86/crypto/camellia-aesni-avx-asm_64.S | 3 ++-
> arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 3 ++-
> arch/x86/crypto/crc32c-pcl-intel-asm_64.S | 3 ++-
> 4 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
> index 16627fe..f128680 100644
> --- a/arch/x86/crypto/aesni-intel_asm.S
> +++ b/arch/x86/crypto/aesni-intel_asm.S
> @@ -32,6 +32,7 @@
> #include <linux/linkage.h>
> #include <asm/inst.h>
> #include <asm/frame.h>
> +#include <asm/nospec-branch.h>
>
> /*
> * The following macros are used to move an (un)aligned 16 byte value to/from
> @@ -2884,7 +2885,7 @@ ENTRY(aesni_xts_crypt8)
> pxor INC, STATE4
> movdqu IV, 0x30(OUTP)
>
> - call *%r11
> + NOSPEC_CALL %r11

Personally I find

CALL_NOSPEC %r11

to be more readable. The call is the action and the nospec is a detail.

And similarly for NOSPEC_JMP -> JMP_NOSPEC.

--
Josh

2018-01-08 13:53:12

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support

On Mon, Jan 08, 2018 at 02:46:32PM +0100, Thomas Gleixner wrote:
> On Mon, 8 Jan 2018, Josh Poimboeuf wrote:
> > On Sun, Jan 07, 2018 at 10:11:16PM +0000, David Woodhouse wrote:
> > > diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> > > index a20eacd..918e550 100644
> > > --- a/arch/x86/Makefile
> > > +++ b/arch/x86/Makefile
> > > @@ -235,6 +235,16 @@ KBUILD_CFLAGS += -Wno-sign-compare
> > > #
> > > KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
> > >
> > > +# Avoid indirect branches in kernel to deal with Spectre
> > > +ifdef CONFIG_RETPOLINE
> > > + RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register)
> > > + ifneq ($(RETPOLINE_CFLAGS),)
> > > + KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
> > > + else
> > > + $(warning Retpoline not supported in compiler. System may be insecure.)
> > > + endif
> > > +endif
> >
> > I wonder if an error might be more appropriate than a warning. I
> > learned from experience that a lot of people don't see these Makefile
> > warnings, and this would be a dangerous one to miss.
> >
> > Also if this were an error, you could get rid of the RETPOLINE define,
> > and that would be one less define cluttering up the already way-too-long
> > GCC arg list.
>
> It still allows the ASM part to be covered. Whether that's worth it, I can't tell.

If there's a makefile error above, then CONFIG_RETPOLINE would already
imply compiler support, so the ASM code with the new '%V' option could
just do 'ifdef CONFIG_RETPOLINE'.
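
For reference, a sketch of the C-side inline asm under that scheme: '%V'
makes GCC print the bare name of the register it allocated for the
operand, so the call can target the matching thunk. (Names follow this
series; the constraint and clobber details here are illustrative.)

#ifdef CONFIG_RETPOLINE
# define CALL_NOSPEC "call __x86_indirect_thunk_%V[thunk_target]\n"
#else
# define CALL_NOSPEC "call *%[thunk_target]\n"
#endif

/* usage: indirect call through 'fn' without an unprotected indirect branch */
asm volatile (CALL_NOSPEC : : [thunk_target] "r" (fn) : "memory");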

--
Josh

2018-01-08 14:26:30

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support


> On Mon, Jan 08, 2018 at 02:46:32PM +0100, Thomas Gleixner wrote:
>> On Mon, 8 Jan 2018, Josh Poimboeuf wrote:
>> > On Sun, Jan 07, 2018 at 10:11:16PM +0000, David Woodhouse wrote:
>> > > diff --git a/arch/x86/Makefile b/arch/x86/Makefile
>> > > index a20eacd..918e550 100644
>> > > --- a/arch/x86/Makefile
>> > > +++ b/arch/x86/Makefile
>> > > @@ -235,6 +235,16 @@ KBUILD_CFLAGS += -Wno-sign-compare
>> > > #
>> > > KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
>> > >
>> > > +# Avoid indirect branches in kernel to deal with Spectre
>> > > +ifdef CONFIG_RETPOLINE
>> > > + RETPOLINE_CFLAGS += $(call
>> cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register)
>> > > + ifneq ($(RETPOLINE_CFLAGS),)
>> > > + KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
>> > > + else
>> > > + $(warning Retpoline not supported in compiler. System may
>> be insecure.)
>> > > + endif
>> > > +endif
>> >
>> > I wonder if an error might be more appropriate than a warning. I
>> > learned from experience that a lot of people don't see these Makefile
>> > warnings, and this would be a dangerous one to miss.
>> >
>> > Also if this were an error, you could get rid of the RETPOLINE define,
>> > and that would be one less define cluttering up the already
>> way-too-long
>> > GCC arg list.
>>
>> It still allows the ASM part to be covered. Whether that's worth it, I
>> can't tell.
>
> If there's a makefile error above, then CONFIG_RETPOLINE would already
> imply compiler support, so the ASM code with the new '%V' option could
> just do 'ifdef CONFIG_RETPOLINE'.

I did look at ditching the -DRETPOLINE but there is benefit in doing the
sys_call_table jump even when GCC isn't updated. So I put it back.


--
dwmw2

2018-01-08 16:13:18

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

On Mon, Jan 08, 2018 at 02:42:13AM -0800, Paul Turner wrote:
>
> kernel->kernel independent of SMEP:
> While much harder to coordinate, facilities such as eBPF potentially
> allow exploitable return targets to be created.
> Generally speaking (particularly if eBPF has been disabled) the risk
> is _much_ lower here, since we can only return into kernel execution
> that was already occurring on another thread (which could e.g. likely
> be attacked there directly independent of RSB poisoning.)

we can remove the bpf interpreter without losing features:
https://patchwork.ozlabs.org/patch/856694/
Ironically, the JIT is more secure than the interpreter.

2018-01-08 17:54:20

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel


* Linus Torvalds <[email protected]> wrote:

> On Sun, Jan 7, 2018 at 2:11 PM, David Woodhouse <[email protected]> wrote:
> > This is a mitigation for the 'variant 2' attack described in
> > https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
>
> Ok, I don't love the patches, but I see nothing horribly wrong here
> either, and I assume the performance impact of this is pretty minimal.
>
> Thomas? I'm obviously doing rc7 today without these, but I assume the
> x86 maintainers are resigned to this all. And yes, we'll have at least
> an rc8 this release..

I'm definitely resigned to them, and with these patches being disclosed so late
we don't have any good choices left, so a tentative:

Acked-by: Ingo Molnar <[email protected]>

Thanks,

Ingo

2018-01-08 21:10:55

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

On Mon, 8 Jan 2018, Ingo Molnar wrote:
> * Linus Torvalds <[email protected]> wrote:
>
> > On Sun, Jan 7, 2018 at 2:11 PM, David Woodhouse <[email protected]> wrote:
> > > This is a mitigation for the 'variant 2' attack described in
> > > https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
> >
> > Ok, I don't love the patches, but I see nothing horribly wrong here
> > either, and I assume the performance impact of this is pretty minimal.
> >
> > Thomas? I'm obviously doing rc7 today without these, but I assume the
> > x86 maintainers are resigned to this all. And yes, we'll have at least
> > an rc8 this release..
>
> I'm definitely resigned to them, and with these patches being disclosed so late
> we don't have any good choices left, so a tentative:
>
> Acked-by: Ingo Molnar <[email protected]>

Just doing the last polishing on that. I'll add your ack.

2018-01-08 21:20:55

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support

On Mon, Jan 08, 2018 at 02:26:11PM -0000, David Woodhouse wrote:
>
> > On Mon, Jan 08, 2018 at 02:46:32PM +0100, Thomas Gleixner wrote:
> >> On Mon, 8 Jan 2018, Josh Poimboeuf wrote:
> >> > On Sun, Jan 07, 2018 at 10:11:16PM +0000, David Woodhouse wrote:
> >> > > diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> >> > > index a20eacd..918e550 100644
> >> > > --- a/arch/x86/Makefile
> >> > > +++ b/arch/x86/Makefile
> >> > > @@ -235,6 +235,16 @@ KBUILD_CFLAGS += -Wno-sign-compare
> >> > > #
> >> > > KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
> >> > >
> >> > > +# Avoid indirect branches in kernel to deal with Spectre
> >> > > +ifdef CONFIG_RETPOLINE
> >> > > + RETPOLINE_CFLAGS += $(call
> >> cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register)
> >> > > + ifneq ($(RETPOLINE_CFLAGS),)
> >> > > + KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
> >> > > + else
> >> > > + $(warning Retpoline not supported in compiler. System may
> >> be insecure.)
> >> > > + endif
> >> > > +endif
> >> >
> >> > I wonder if an error might be more appropriate than a warning. I
> >> > learned from experience that a lot of people don't see these Makefile
> >> > warnings, and this would be a dangerous one to miss.
> >> >
> >> > Also if this were an error, you could get rid of the RETPOLINE define,
> >> > and that would be one less define cluttering up the already
> >> way-too-long
> >> > GCC arg list.
> >>
> >> It still allows the ASM part to be covered. Whether that's worth it, I
> >> can't tell.
> >
> > If there's a makefile error above, then CONFIG_RETPOLINE would already
> > imply compiler support, so the ASM code with the new '%V' option could
> > just do 'ifdef CONFIG_RETPOLINE'.
>
> I did look at ditching the -DRETPOLINE but there is benefit in doing the
> sys_call_table jump even when GCC isn't updated. So I put it back.

What benefit is that? Doesn't it give the user a false sense of
security, since there's no shortage of other indirect branches to
attack?

--
Josh

2018-01-08 23:44:24

by David Woodhouse

[permalink] [raw]
Subject: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

This patch further hardens retpoline.

CPUs have return buffers which store the return address for
RET to predict function returns. Some CPUs (Skylake, some Broadwells)
can fall back to indirect branch prediction on return buffer underflow.

With retpoline we want to avoid uncontrolled indirect branches,
which could be poisoned by ring 3, so we need to avoid uncontrolled
return buffer underflows in the kernel.

This can happen when we're context switching from a shallower to a
deeper kernel stack. Returns on the deeper kernel stack would eventually
underflow the return buffer, at which point the CPU would again fall
back to the indirect branch predictor.

To guard against this fill the return buffer with controlled
content during context switch. This prevents any underflows.

We always fill the buffer with 30 entries: 32 minus 2 for at
least one call from entry_{64,32}.S to C code and another into
the function doing the filling.

That's pessimistic because we likely did more controlled kernel calls,
so in principle we could fill fewer. However, it's hard to maintain such
an invariant, and it may be broken by more aggressive compilers.
So err on the side of safety and always fill 30.

[dwmw2: Fix comments about nop between calls,
Move #ifdef CONFIG_RETPOLINE to call sites not macro]

Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
---
arch/x86/entry/entry_32.S | 17 +++++++++++++++++
arch/x86/entry/entry_64.S | 17 +++++++++++++++++
arch/x86/include/asm/nospec-branch.h | 30 ++++++++++++++++++++++++++++++
3 files changed, 64 insertions(+)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index cf9ef33d299b..b6b83b9d3a0b 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -250,6 +250,23 @@ ENTRY(__switch_to_asm)
popl %ebx
popl %ebp

+#ifdef CONFIG_RETPOLINE
+ /*
+ * When we switch from a shallower to a deeper call stack
+ * the call stack will underflow in the kernel in the next task.
+ * This could cause the CPU to fall back to indirect branch
+ * prediction, which may be poisoned.
+ *
+ * To guard against that always fill the return stack with
+ * known values.
+ *
+ * We do this in assembler because it needs to be before
+ * any calls on the new stack, and this can be difficult to
+ * ensure in a complex C function like __switch_to.
+ */
+ ALTERNATIVE "jmp __switch_to", "", X86_FEATURE_RETPOLINE
+ FILL_RETURN_BUFFER
+#endif
jmp __switch_to
END(__switch_to_asm)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 9bce6ed03353..1622e07c5ae8 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -495,6 +495,23 @@ ENTRY(__switch_to_asm)
popq %rbx
popq %rbp

+#ifdef CONFIG_RETPOLINE
+ /*
+ * When we switch from a shallower to a deeper call stack
+ * the call stack will underflow in the kernel in the next task.
+ * This could cause the CPU to fall back to indirect branch
+ * prediction, which may be poisoned.
+ *
+ * To guard against that always fill the return stack with
+ * known values.
+ *
+ * We do this in assembler because it needs to be before
+ * any calls on the new stack, and this can be difficult to
+ * ensure in a complex C function like __switch_to.
+ */
+ ALTERNATIVE "jmp __switch_to", "", X86_FEATURE_RETPOLINE
+ FILL_RETURN_BUFFER
+#endif
jmp __switch_to
END(__switch_to_asm)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index b8c8eeacb4be..3022b1a4de17 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -53,6 +53,36 @@
#endif
.endm

+/*
+ * We use 32-N: 32 is the max return buffer size, but there should
+ * have been at a minimum two controlled calls already: one into the
+ * kernel from entry*.S and another into the function containing this
+ * macro. So N=2, thus 30.
+ */
+#define NUM_BRANCHES_TO_FILL 30
+
+/*
+ * Fill the CPU return stack buffer to prevent indirect branch
+ * prediction on underflow. We need a 'nop' after each call so it
+ * isn't interpreted by the CPU as a simple 'push %eip', which would
+ * be handled specially and not put anything in the RSB.
+ *
+ * Required in various cases for retpoline and IBRS-based mitigations
+ * for Spectre variant 2 vulnerability.
+ */
+.macro FILL_RETURN_BUFFER
+ .rept NUM_BRANCHES_TO_FILL
+ call 1221f
+ nop
+1221:
+ .endr
+#ifdef CONFIG_64BIT
+ addq $8*NUM_BRANCHES_TO_FILL, %rsp
+#else
+ addl $4*NUM_BRANCHES_TO_FILL, %esp
+#endif
+.endm
+
#else /* __ASSEMBLY__ */

#if defined(CONFIG_X86_64) && defined(RETPOLINE)
--
2.14.3

2018-01-08 23:56:32

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Mon, Jan 8, 2018 at 3:44 PM, David Woodhouse <[email protected]> wrote:
>
> To guard against this fill the return buffer with controlled
> content during context switch. This prevents any underflows.

Ugh. I really dislike this patch. Everything else in the retpoline
patches makes me go "ok, that's reasonable". This one makes me go
"Eww".

It's hacky, it's ugly, and it looks pretty expensive too.

Is there really nothing more clever we can do?

Linus

2018-01-09 00:00:51

by Woodhouse, David

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Mon, 2018-01-08 at 15:56 -0800, Linus Torvalds wrote:
> On Mon, Jan 8, 2018 at 3:44 PM, David Woodhouse <[email protected]> wrote:
> >
> > To guard against this fill the return buffer with controlled
> > content during context switch. This prevents any underflows.
>
> Ugh. I really dislike this patch. Everything else in the retpoline
> patches makes me go "ok, that's reasonable". This one makes me go
> "Eww".
>
> It's hacky, it's ugly, and it looks pretty expensive too.
>
> Is there really nothing more clever we can do?

You get this part in the IBRS/microcode solution too. The IBRS MSR
doesn't catch everything; you still need to stuff the RSB in very
similar places (and/or use the IBPB MSR in some).



2018-01-09 00:06:38

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Mon, Jan 08, 2018 at 03:56:30PM -0800, Linus Torvalds wrote:
> On Mon, Jan 8, 2018 at 3:44 PM, David Woodhouse <[email protected]> wrote:
> >
> > To guard against this fill the return buffer with controlled
> > content during context switch. This prevents any underflows.
>
> Ugh. I really dislike this patch. Everything else in the retpoline
> patches makes me go "ok, that's reasonable". This one makes me go
> "Eww".
>
> It's hacky, it's ugly, and it looks pretty expensive too.

Modern cores are quite fast at executing calls.

>
> Is there really nothing more clever we can do?

We could be cleverer in selecting how many dummy calls to do.
But that would likely be fragile, hard to maintain and more
complicated, and I doubt it would buy that much.

Don't really have a better proposal, sorry.

-Andi

2018-01-09 00:35:34

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Mon, Jan 8, 2018 at 3:58 PM, Woodhouse, David <[email protected]> wrote:
>>
>> Is there really nothing more clever we can do?
>
> You get this part in the IBRS/microcode solution too. The IBRS MSR
> doesn't catch everything; you still need to stuff the RSB in very
> similar places (and/or use the IBPB MSR in some).

So I was really hoping that in places like context switching etc, we'd
be able to instead effectively kill off any exploits by clearing
registers.

That should make it pretty damn hard to then find a matching "gadget"
that actually does anything interesting/powerful.

Together with Spectre already being pretty hard to take advantage of,
and the eBPF people making those user-provided gadgets inaccessible,
it really should be a pretty powerful fix.

Hmm?

Linus

2018-01-09 00:42:36

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Mon, 2018-01-08 at 16:35 -0800, Linus Torvalds wrote:
> On Mon, Jan 8, 2018 at 3:58 PM, Woodhouse, David <[email protected]> wrote:
> >>
> >> Is there really nothing more clever we can do?
> >
> > You get this part in the IBRS/microcode solution too. The IBRS MSR
> > doesn't catch everything; you still need to stuff the RSB in very
> > similar places (and/or use the IBPB MSR in some).
>
> So I was really hoping that in places like context switching etc, we'd
> be able to instead effectively kill off any exploits by clearing
> registers.
>
> That should make it pretty damn hard to then find a matching "gadget"
> that actually does anything interesting/powerful.
>
> Together with Spectre already being pretty hard to take advantage of,
> and the eBPF people making those user-proivided gadgets inaccessible,
> it really should be a pretty powerful fix.

Hm... on a context switch you're reloading the registers that were in
the other saved context. If the attacker does something they know will
go into a deep call stack and then sleep, and they pollute the BTB for
every 'ret' instruction in that stack¹, then the registers at the time
a 'ret' goes AWOL are not necessarily something you can 'clear' at the
time of the context switch.

And yes, a lot of this v2 stuff down past the sys_call_table branch
really *is* pretty damn hard already, especially the 'ret' based ones.

¹ perhaps they do the BTB-pollution in their other CPU-hogging thread
  that runs on the same CPU when the attack thread is sleeping? I'm not
  really trying to actually write an attack here, just thinking out
  loud and being uncertain that you've actually proved there *isn't*
  one :)



2018-01-09 00:44:27

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

> So I was really hoping that in places like context switching etc, we'd
> be able to instead effectively kill off any exploits by clearing
> registers.
>
> That should make it pretty damn hard to then find a matching "gadget"
> that actually does anything interesting/powerful.
>
> Together with Spectre already being pretty hard to take advantage of,
> and the eBPF people making those user-proivided gadgets inaccessible,
> it really should be a pretty powerful fix.
>
> Hmm?

Essentially the RSB are hidden registers, and the only way to clear them
is the FILL_RETURN_BUFFER sequence. I don't see how clearing anything else
would help?

-Andi

2018-01-09 00:48:15

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Mon, Jan 8, 2018 at 4:42 PM, David Woodhouse <[email protected]> wrote:
>
> Hm... on a context switch you're reloading the registers that were in
> the other saved context.

Actually, iirc we used to very actively try to minimize that by having
the inline asm mark a lot of registers as clobbered.

We moved away from that and now have that "switch_to_asm()" call
instead, but that was for unrelated reasons.

If I remember our old inline asm, we actually had *very* little real
data that was actually live on context switch, particularly that last
"branch to new EIP" point.

Partly because we had different targets, one of which was that "return
from fork" case.

But maybe I mis-remember. Wouldn't be the first time. This is code I
used to know well, but that was many many moons ago, now there are
other suckers^W maintainers who actually work with it.

Linus

2018-01-09 00:55:46

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Mon, 2018-01-08 at 16:48 -0800, Linus Torvalds wrote:
> On Mon, Jan 8, 2018 at 4:42 PM, David Woodhouse <[email protected]> wrote:
> >
> >
> > Hm... on a context switch you're reloading the registers that were in
> > the other saved context.
>
> Actually, iirc we used to very actively try to minimize that by having
> the inline asm mark a lot of registers as clobbered.

Sure, in that case it makes sense purely as a matter of hygiene to
explicitly clear the registers which were marked as clobbered.

In the original patch set that Intel was collecting — before they
threw it all out the fucking window and started again from scratch on
the day the embargo broke — there were patches to do the same thing on
syscall entry too, for precisely the same reason.



2018-01-09 00:58:22

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Mon, Jan 8, 2018 at 4:44 PM, Andi Kleen <[email protected]> wrote:
>
> Essentially the RSB are hidden registers, and the only way to clear them
> is the FILL_RETURN_BUFFER sequence. I don't see how clearing anything else
> would help?

Forget theory. Look at practice.

Let's just assume that the attacker can write arbitrarily to the RSB
state. Just accept it.

If you accept that, then you turn the question instead into: are there
things we can do to make that useless to an attacker.

And there the point is that even if you control the RSB contents, you
need to find something relevant to *put* in the RSB. You need to find
the gadget that makes that control of the RSB useful.

That's where clearing the other registers comes in. Particularly for
the shallow cases (maybe the attack involves calling a shallow system
call that goes to the scheduler immediately, like "pause()").

Finding those gadgets is already hard. And then you have to find them
in such a way that you control some state from which you can start
reading out arbitrary memory. So you need to not just find the gadget,
you need to pass in an interesting pointer to it.

Preferably a pointer you control easily, like one of the registers
that nobody used on the way from the attacking user space to the
scheduler() call. That's already going to be damn hard, but with the C
compiler saving many registers by default, it might not be impossible.

And THAT is where "clear the registers" comes in. It adds _another_
huge barrier to an attack that was already pretty hard to begin with.

If we clear the registers, what the hell are you going to put in the
RSB that helps you?

I really think that people need to think about the actual _practical_
side here. We're never ever going to solve all theoretical timing
attacks. But when it comes to something like Spectre, it's already
very much non-trivial to attack. When then looking at something like
"a few call chains deep in the scheduler", it got harder still. If we
clear registers and don't give the attacker a way to leak data, that's
yet another big barrier.

So instead of saying "we have to flush the return stack", I'm saying
that we should look at things that make flushing the return stack
_unnecessary_, simply because even if the attacker were to control it
entirely, they'd still be up shit creek without a paddle.
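
A sketch of what that could look like (assuming it runs after the
pt_regs have been saved on entry, so the values are no longer needed;
the register list here is illustrative):

	/* Zero the caller-clobbered GPRs so that mis-speculated gadgets
	 * have no attacker-controlled inputs to dereference. Writing the
	 * 32-bit half clears the full 64-bit register.
	 */
	xorl	%edx, %edx
	xorl	%ecx, %ecx
	xorl	%esi, %esi
	xorl	%edi, %edi
	xorl	%r8d, %r8d
	xorl	%r9d, %r9d
	xorl	%r10d, %r10d
	xorl	%r11d, %r11d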

Linus

2018-01-09 01:16:14

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

> If we clear the registers, what the hell are you going to put in the
> RSB that helps you?

RSB allows you to control chains of gadgets.

You can likely find some chain of gadgets that set up constants in
registers in a lot of useful ways. Perhaps not in every way (so it may
be hard to scan through all of memory), but it's likely you could find
gadgets that result in a lot of useful direct-mapped addresses, which
the next gadget can then reference.

RAX especially is quite vulnerable to this, because there will be a lot
of code that does "modify RAX in interesting ways; ret".

> So instead of saying "we have to flush the return stack", I'm saying
> that we should look at things that make flushing the return stack
> _unnecessary_, simply because even if the attacker were to control it
> entirely, they'd still be up shit creek without a paddle.

I agree that clearing registers is useful (was just hacking on that patch).

-Andi

2018-01-09 01:16:26

by Andrew Cooper

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On 09/01/2018 00:58, Linus Torvalds wrote:
> On Mon, Jan 8, 2018 at 4:44 PM, Andi Kleen <[email protected]> wrote:
>> Essentially the RSB are hidden registers, and the only way to clear them
>> is the FILL_RETURN_BUFFER sequence. I don't see how clearing anything else
>> would help?
> Forget theory. Look at practice.
>
> Let's just assume that the attacker can write arbitrarily to the RSB
> state. Just accept it.
>
> If you accept that, then you turn the question instead into: are there
> things we can do to make that useless to an attacker.
>
> And there the point is that even if you control the RSB contents, you
> need to find something relevant to *put* in the RSB. You need to find
> the gadget that makes that control of the RSB useful.

This is where the problem lies.  An attacker with arbitrary control of
the RSB can redirect speculation arbitrarily.  (And I agree, this is a
good default assumption to take).

If SMEP is not active, speculation can go anywhere, including to a user
controlled gadget which can reload any registers it needs, including
with immediate constants.

If SMEP is active, the attacker's control of speculation is restricted to
supervisor executable mappings.

The real question is whether it is worth special casing the SMEP-active
case given that for the SMEP-inactive case, your only viable option is
to refill the RSB and discard any potentially poisoned mappings.

~Andrew

2018-01-09 01:21:37

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch II

On Mon, Jan 08, 2018 at 05:16:02PM -0800, Andi Kleen wrote:
> > If we clear the registers, what the hell are you going to put in the
> > RSB that helps you?
>
> RSB allows you to control chains of gadgets.

I admit the gadget thing is a bit obscure.

There's another case we were actually more worried about:

On Skylake and Broadwell, when the RSB underflows it will fall back to the
indirect branch predictor, which can be poisoned and which we try to
avoid using with retpoline. So we try to avoid underflows, and this
filling helps us with that.

Does that make more sense?

-Andi

2018-01-09 01:24:04

by Woodhouse, David

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch II

On Mon, 2018-01-08 at 17:21 -0800, Andi Kleen wrote:
> On Mon, Jan 08, 2018 at 05:16:02PM -0800, Andi Kleen wrote:
> > > If we clear the registers, what the hell are you going to put in the
> > > RSB that helps you?
> > 
> > RSB allows you to control chains of gadgets.
>
> I admit the gadget thing is a bit obscure.
>
> There's another case we were actually more worried about: 
>
> On Skylake and Broadwell, when the RSB underflows it will fall back to the
> indirect branch predictor, which can be poisoned and which we try to
> avoid using with retpoline. So we try to avoid underflows, and this
> filling helps us with that.

That's no longer true for Broadwell with the latest microcode, right?

And is why we *were* talking about using retpoline only for pre-
Skylake, and using IBRS on Skylake onwards (where IBRS isn't quite so
horribly slow anyway).



2018-01-09 01:49:26

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch II

> > On Skylake and Broadwell, when the RSB underflows it will fall back to the
> > indirect branch predictor, which can be poisoned and which we try to
> > avoid using with retpoline. So we try to avoid underflows, and this
> > filling helps us with that.
>
> That's no longer true for Broadwell with the latest microcode, right?

That's right.

-Andi

2018-01-09 01:54:26

by Paul Turner

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch II

On Mon, Jan 8, 2018 at 5:21 PM, Andi Kleen <[email protected]> wrote:
> On Mon, Jan 08, 2018 at 05:16:02PM -0800, Andi Kleen wrote:
>> > If we clear the registers, what the hell are you going to put in the
>> > RSB that helps you?
>>
>> RSB allows you to control chains of gadgets.
>
> I admit the gadget thing is a bit obscure.
>
> There's another case we were actually more worried about:
>
> On Skylake and Broadwell, when the RSB underflows it will fall back to the

(Broadwell without Microcode)

> indirect branch predictor, which can be poisoned and which we try to
> avoid using with retpoline. So we try to avoid underflows, and this
> filling helps us with that.
>
> Does that make more sense?

The majority of the confusion does not stem from this being complicated.
It's that there's been great reluctance to document the details which
are different -- or changed by the microcode -- even at a high level
such as this.
Because of this, some of the details instead include vague "things are
different on Skylake" notes, with no exposition.

>
> -Andi

2018-01-09 03:27:47

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch


> On Jan 8, 2018, at 5:15 PM, Andrew Cooper <[email protected]> wrote:
>
>> On 09/01/2018 00:58, Linus Torvalds wrote:
>>> On Mon, Jan 8, 2018 at 4:44 PM, Andi Kleen <[email protected]> wrote:
>>> Essentially the RSB are hidden registers, and the only way to clear them
>>> is the FILL_RETURN_BUFFER sequence. I don't see how clearing anything else
>>> would help?
>> Forget theory. Look at practice.
>>
>> Let's just assume that the attacker can write arbitrarily to the RSB
>> state. Just accept it.
>>
>> If you accept that, then you turn the question instead into: are there
>> things we can do to make that useless to an attacker.
>>
>> And there the point is that even if you control the RSB contents, you
>> need to find something relevant to *put* in the RSB. You need to find
>> the gadget that makes that control of the RSB useful.
>
> This is where the problem lies. An attacker with arbitrary control of
> the RSB can redirect speculation arbitrarily. (And I agree, this is a
> good default assumption to take).
>
> If SMEP is not active, speculation can go anywhere, including to a user
> controlled gadget which can reload any registers it needs, including
> with immediate constants.

I thought that, even on pre-SMEP hardware, the CPU wouldn't speculatively execute from NX pages. And PTI marks user memory NX in kernel mode.

>
> If SMEP is active, the attacker's control of speculation is restricted to
> supervisor executable mappings.
>
> The real question is whether it is worth special casing the SMEP-active
> case given that for the SMEP-inactive case, your only viable option is
> to refill the RSB and discard any potentially poisoned mappings.
>
> ~Andrew

2018-01-09 12:36:56

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support

On Mon, Jan 08, 2018 at 02:46:32PM +0100, Thomas Gleixner wrote:
> On Mon, 8 Jan 2018, Josh Poimboeuf wrote:

> > I wonder if an error might be more appropriate than a warning. I
> > learned from experience that a lot of people don't see these Makefile
> > warnings, and this would be a dangerous one to miss.
> >
> > Also if this were an error, you could get rid of the RETPOLINE define,
> > and that would be one less define cluttering up the already way-too-long
> > GCC arg list.
>
> > It still allows the ASM part to be covered. Whether that's worth it, I can't tell.

So elsewhere you stated we're dropping support for GCC without asm-goto
(<4.5). Does it then make sense to go one step further and mandate a
retpoline-capable compiler, which would put us at >=4.9 (for x86)?

That would get rid of this weird case as well.

2018-01-09 13:04:59

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Mon, 2018-01-08 at 19:27 -0800, Andy Lutomirski wrote:
> > 
> > If SMEP is not active, speculation can go anywhere, including to a user
> > controlled gadget which can reload any registers it needs, including
> > with immediate constants.
>
> I thought that, even on pre-SMEP hardware, the CPU wouldn't
> speculatively execute from NX pages.  And PTI marks user memory NX
> in kernel mode.

Hm, now that could be useful. 

Do *all* the KPTI backports (some of which are reimplementations rather
than strictly backports) mark user memory NX?



2018-01-09 13:11:12

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Tue, Jan 09, 2018 at 01:04:20PM +0000, David Woodhouse wrote:
> On Mon, 2018-01-08 at 19:27 -0800, Andy Lutomirski wrote:
> > >
> > > If SMEP is not active, speculation can go anywhere, including to a user
> > > controlled gadget which can reload any registers it needs, including
> > > with immediate constants.
> >
> > I thought that, even on pre-SMEP hardware, the CPU wouldn't
> > speculatively execute from NX pages. And PTI marks user memory NX
> > in kernel mode.
>
> Hm, now that could be useful.
>
> Do *all* the KPTI backports (some of which are reimplementations rather
> than strictly backports) mark user memory NX?

That doesn't really matter for upstream though; but see here:

https://lkml.kernel.org/r/[email protected]

2018-01-09 13:35:47

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support

On Tue, 9 Jan 2018, Peter Zijlstra wrote:

> On Mon, Jan 08, 2018 at 02:46:32PM +0100, Thomas Gleixner wrote:
> > On Mon, 8 Jan 2018, Josh Poimboeuf wrote:
>
> > > I wonder if an error might be more appropriate than a warning. I
> > > learned from experience that a lot of people don't see these Makefile
> > > warnings, and this would be a dangerous one to miss.
> > >
> > > Also if this were an error, you could get rid of the RETPOLINE define,
> > > and that would be one less define cluttering up the already way-too-long
> > > GCC arg list.
> >
> > It still allows the ASM part to be covered. Whether that's worth it, I can't tell.
>
> So elsewhere you stated we're dropping support for GCC without asm-goto
> (<4.5). Does it then make sense to go one step further and mandate a
> retpoline-capable compiler, which would put us at >=4.9 (for x86)?
>
> That would get rid of this weird case as well.

I agree in principle, though the difference is that retpoline-capable
compilers are not available today, while GCC with asm goto is.

The reasoning for the minimal thing was to cover at least the obvious easy
targets, e.g. the sys_call_table jump, as the deeper ones are harder.

Thanks,

tglx

2018-01-09 13:48:10

by Woodhouse, David

[permalink] [raw]
Subject: Re: [PATCH v6 01/10] x86/retpoline: Add initial retpoline support

On Tue, 2018-01-09 at 13:36 +0100, Peter Zijlstra wrote:
> On Mon, Jan 08, 2018 at 02:46:32PM +0100, Thomas Gleixner wrote:
> > On Mon, 8 Jan 2018, Josh Poimboeuf wrote:
>
> > > I wonder if an error might be more appropriate than a warning.  I
> > > learned from experience that a lot of people don't see these Makefile
> > > warnings, and this would be a dangerous one to miss.
> > > 
> > > Also if this were an error, you could get rid of the RETPOLINE define,
> > > and that would be one less define cluttering up the already way-too-long
> > > GCC arg list.
> > 
> > It still allows the ASM part to be covered. Whether that's worth it, I can't tell.
>
> So elsewhere you stated we're dropping support for GCC without asm-goto
> (<4.5). Does it then make sense to go one step further and mandate a
> retpoline-capable compiler, which would put us at >=4.9 (for x86)?
>
> That would get rid of this weird case as well.

Yeah... I don't have strong feelings there.

Arjan (IIRC) had asked me to keep it this way.

The idea was that those were the *easy* targets for an attacker to
find; especially in entry_64.S it's asm all the way to the indirect
branch. A rootkit might be targeted at entirely unpatched systems which
leave that vulnerable, and *even* though there are other targets which
could be found with more work, doing just the asm code might well end
up protecting from such an attack in practice.

The CONFIG_RETPOLINE/!RETPOLINE case isn't really *that* much
complexity; I don't really care about that either.

On the whole, I'm inclined to leave it as it is without further
bikeshedding for now. We can change it later once all those GCC
releases have been made with the backports, perhaps.



2018-01-09 17:53:05

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Tue, Jan 9, 2018 at 5:04 AM, David Woodhouse <[email protected]> wrote:
> On Mon, 2018-01-08 at 19:27 -0800, Andy Lutomirski wrote:
>> >
>> > If SMEP is not active, speculation can go anywhere, including to a user
>> > controlled gadget which can reload any registers it needs, including
>> > with immediate constants.
>>
>> I thought that, even on pre-SMEP hardware, the CPU wouldn't
>> speculatively execute from NX pages. And PTI marks user memory NX
>> in kernel mode.
>
> Hm, now that could be useful.
>
> Do *all* the KPTI backports (some of which are reimplementations rather
> than strictly backports) mark user memory NX?

Yup. The KAISERish ports (4.9 and 4.4) have the same feature.

-Kees

--
Kees Cook
Pixel Security

2018-01-09 18:09:21

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Tue, Jan 9, 2018 at 5:04 AM, David Woodhouse <[email protected]> wrote:
> On Mon, 2018-01-08 at 19:27 -0800, Andy Lutomirski wrote:
>>
>> I thought that, even on pre-SMEP hardware, the CPU wouldn't
>> speculatively execute from NX pages. And PTI marks user memory NX
>> in kernel mode.
>
> Hm, now that could be useful.
>
> Do *all* the KPTI backports (some of which are reimplementations rather
> than strictly backports) mark user memory NX?

As Kees said - yes. We're looking at patches to mark some processes
trusted and turn off KPTI for those, and then they'll obviously not
have that, but the whole point is that you'd trust them.

Also notice that the fundamental statement of "If SMEP is not active,
speculation can go anywhere" is not necessarily true either.

One of the reasons why the BTB can be poisoned from user space in the
first place is because the BTB doesn't have the full range of virtual
address bits for BTB lookup.

But at least on Intel, I think they don't have the full range of
virtual address bits for the _result_ either. The upper bits are
simply taken from the target. So the prediction only looks at - and
only affects - the lower bits of the indirect branch.

Which means that the whole "speculation can go anywhere" is garbage,
at least in general. You can't force the kernel to speculatively jump
to a user space address, because even if you have complete control of
the BTB, you only end up controlling the low bits of the predicted
address, because those are the only ones that exist in the BTB.

That all is obviously very implementation-dependent. You could have a
BTB that uses a small set of bits for the index checking, but the full
set of bits for target.

But since the problem (for the kernel) mostly arises from exactly the
fact that the BTB doesn't have the full bits for indexing, I suspect
the "not full bits in the target" is common too.

Linus

2018-01-10 15:21:33

by Woodhouse, David

[permalink] [raw]
Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

On Mon, 2018-01-08 at 02:42 -0800, Paul Turner wrote:
>
> While the cases above involve the crafting and use of poisoned
> entries.  Recall also that one of the initial conditions was that we
> should avoid RSB underflow as some CPUs may try to use other indirect
> predictors when this occurs.

I think we should start by deliberately ignoring the CPUs which use the
other indirect predictors on RSB underflow. Those CPUs don't perform
*quite* so badly with IBRS anyway.

Let's get the minimum amount of RSB handling in to cope with the pre-
SKL CPUs, and then see if we really do want to extend it to make SKL
100% secure in retpoline mode or not.

So let's go through your list of cases and attempt to distinguish the
underflow concerns (which I declare we don't care about for now) from
the pollution (which we care about especially for non-SMEP) concerns...

> The cases we care about here are:
> - When we return _into_ protected execution.  For the kernel, this
> means when we exit interrupt context into kernel context, since we may
> have emptied or reduced the number of RSB entries while in interrupt
> context.

Don't care about that particular example. That's underflow-only.

However, we *do* care about entry to kernel code from userspace, for
interrupts and system calls etc. Basically everywhere that the IBRS
code would be setting IBRS, we need to flush the RSB (if !SMEP, I
think).

> - Context switch (even if we are returning to user code, we need to
> at least unwind the scheduler/triggering frames that preempted it
> previously; considering that detail, this is a subset of the above,
> but listed for completeness)

Don't care. This is underflow-only. (Which means I think we want to
drop Andi's patch?)

> - On VMEXIT (it turns out we need to worry about both poisoned
> entries, and no entries, the solution is a single refill
> nonetheless).

Do care. This fixes pollution from the guest, and even SMEP isn't
enough to make us not care.

> - Leaving deeper (>C1) c-states, which may have flushed hardware
> state

Don't care.

> - Where we are unwinding call-chains of >16 entries[*]

Don't care.

Overall, I think the RSB-stuffing is needed in all the same places that
it's needed with IBRS.



2018-01-10 15:32:51

by Dr. David Alan Gilbert

[permalink] [raw]
Subject: Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

* Woodhouse, David ([email protected]) wrote:
> On Mon, 2018-01-08 at 02:42 -0800, Paul Turner wrote:
> >
> > While the cases above involve the crafting and use of poisoned
> > entries. Recall also that one of the initial conditions was that we
> > should avoid RSB underflow as some CPUs may try to use other indirect
> > predictors when this occurs.
>
> I think we should start by deliberately ignoring the CPUs which use the
> other indirect predictors on RSB underflow. Those CPUs don't perform
> *quite* so badly with IBRS anyway.
>
> Let's get the minimum amount of RSB handling in to cope with the pre-
> SKL CPUs, and then see if we really do want to extend it to make SKL
> 100% secure in retpoline mode or not.

How do you make decisions on which CPU you're running on?
I'm worried about the case of a VM that starts off on an older host
and then gets live migrated to a new Skylake.
For Intel CPUs we've historically been safe to live migrate
to any newer host based on having all the features that the old one had,
with the guest still seeing the flags etc. for the old CPU.

Dave
--
Dr. David Alan Gilbert / [email protected] / Manchester, UK

2018-02-16 18:52:26

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch

On Tue 2018-01-09 13:04:20, David Woodhouse wrote:
> On Mon, 2018-01-08 at 19:27 -0800, Andy Lutomirski wrote:
> > >
> > > If SMEP is not active, speculation can go anywhere, including to a user
> > > controlled gadget which can reload any registers it needs, including
> > > with immediate constants.
> >
> > I thought that, even on pre-SMEP hardware, the CPU wouldn't
> > speculatively execute from NX pages. And PTI marks user memory NX
> > in kernel mode.
>
> Hm, now that could be useful.
>
> Do *all* the KPTI backports (some of which are reimplementations rather
> than strictly backports) mark user memory NX?

Hmm. We'd still want to do something on 32-bit, and those might not
even have NX support in hardware.

Pentium 4 (and such) is probably advanced enough to be vulnerable to
spectre, but not new enough to support NX...

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html