2018-01-29 22:05:53

by Woodhouse, David

[permalink] [raw]
Subject: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

From: Tim Chen <[email protected]>

Flush indirect branches when switching into a process that marked itself
non dumpable. This protects high value processes like gpg better,
without having too high performance overhead.

If done naïvely, we could switch to a kernel idle thread and then back
to the original process, such as:

process A -> idle -> process A

In such scenario, we do not have to do IBPB here even though the process
is non-dumpable, as we are switching back to the same process after a
hiatus.

To avoid the redundant IBPB, which is expensive, we track the last mm
user context ID. The cost is to have an extra u64 mm context id to track
the last mm we were using before switching to the init_mm used by idle.
Avoiding the extra IBPB is probably worth the extra memory for this
common scenario.

For those cases where tlb_defer_switch_to_init_mm() returns true (non
PCID), lazy tlb will defer switch to init_mm, so we will not be changing
the mm for the process A -> idle -> process A switch. So IBPB will be
skipped for this case.

Thanks to the reviewers and Andy Lutomirski for the suggestion of
using ctx_id which got rid of the problem of mm pointer recycling.

Signed-off-by: Tim Chen <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
---
arch/x86/include/asm/tlbflush.h | 2 ++
arch/x86/mm/tlb.c | 33 ++++++++++++++++++++++++++++++++-
2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 3effd3c..4405c4b 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -174,6 +174,8 @@ struct tlb_state {
struct mm_struct *loaded_mm;
u16 loaded_mm_asid;
u16 next_asid;
+ /* last user mm's ctx id */
+ u64 last_ctx_id;

/*
* We can be in one of several states:
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index a156195..7489890 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -6,13 +6,14 @@
#include <linux/interrupt.h>
#include <linux/export.h>
#include <linux/cpu.h>
+#include <linux/debugfs.h>

#include <asm/tlbflush.h>
#include <asm/mmu_context.h>
+#include <asm/nospec-branch.h>
#include <asm/cache.h>
#include <asm/apic.h>
#include <asm/uv/uv.h>
-#include <linux/debugfs.h>

/*
* TLB flushing, formerly SMP-only
@@ -219,6 +220,27 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
} else {
u16 new_asid;
bool need_flush;
+ u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id);
+
+ /*
+ * Avoid user/user BTB poisoning by flushing the branch
+ * predictor when switching between processes. This stops
+ * one process from doing Spectre-v2 attacks on another.
+ *
+ * As an optimization, flush indirect branches only when
+ * switching into processes that disable dumping. This
+ * protects high value processes like gpg, without having
+ * too high performance overhead. IBPB is *expensive*!
+ *
+ * This will not flush branches when switching into kernel
+ * threads. It will also not flush if we switch to idle
+ * thread and back to the same process. It will flush if we
+ * switch to a different non-dumpable process.
+ */
+ if (tsk && tsk->mm &&
+ tsk->mm->context.ctx_id != last_ctx_id &&
+ get_dumpable(tsk->mm) != SUID_DUMP_USER)
+ indirect_branch_prediction_barrier();

if (IS_ENABLED(CONFIG_VMAP_STACK)) {
/*
@@ -268,6 +290,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0);
}

+ /*
+ * Record last user mm's context id, so we can avoid
+ * flushing branch buffer with IBPB if we switch back
+ * to the same user.
+ */
+ if (next != &init_mm)
+ this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id);
+
this_cpu_write(cpu_tlbstate.loaded_mm, next);
this_cpu_write(cpu_tlbstate.loaded_mm_asid, new_asid);
}
@@ -345,6 +375,7 @@ void initialize_tlbstate_and_flush(void)
write_cr3(build_cr3(mm->pgd, 0));

/* Reinitialize tlbstate. */
+ this_cpu_write(cpu_tlbstate.last_ctx_id, mm->context.ctx_id);
this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0);
this_cpu_write(cpu_tlbstate.next_asid, 1);
this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
--
2.7.4



2018-01-30 19:02:22

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Mon, Jan 29, 2018 at 10:04:47PM +0000, David Woodhouse wrote:
> From: Tim Chen <[email protected]>
>
> Flush indirect branches when switching into a process that marked itself
> non dumpable. This protects high value processes like gpg better,
> without having too high performance overhead.

I wonder what the point of this patch is. An audit of my laptop shows
only a single user of PR_SET_DUMPABLE: systemd-coredump.

[ And yes, I have gpg-agent running. Also, a grep of the gnupg source
doesn't show any evidence of it being used there. So the gpg thing
seems to be a myth. ]

But also, I much preferred the original version of the patch which only
skipped IBPB when 'prev' could ptrace 'next'.

If performance is a concern, let's look at that in more detail. But I
don't see how the solution to a performance issue could possibly be
"leave (almost) all tasks vulnerable by default."

If the argument is that everyone should "rebuild the world" with
retpolines, then this patch would still be pointless, as we wouldn't
even need IBPB.

--
Josh

2018-01-30 21:10:49

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Mon, Jan 29, 2018 at 10:04:47PM +0000, David Woodhouse wrote:
> + if (tsk && tsk->mm &&
> + tsk->mm->context.ctx_id != last_ctx_id &&
> + get_dumpable(tsk->mm) != SUID_DUMP_USER)
> + indirect_branch_prediction_barrier();

Ok, so while staring at this, someone just came up with the following
sequence:

1. Malicious process runs with UID=A, does BTB poisoning
2. Sensitive process (e.g. gpg) starts also with UID=A, no IBPB flush occurs since task is initially dumpable
3. gpg now does prctl(PR_SET_DUMPABLE, ...) to clear the dumpable flag
4. gpg now does sensitive stuff that it thinks is protected
5. gpg does indirect branches that shouldn't be influenced by the malicious process

Now, if you switch between steps 3. and 4., you're good because gpg
became non-dumpable. But if you *don't* switch, the bad BTB entries are
still there.

So, *actually*, we need to flush IBPB in set_dumpable() too, when we
clear SUID_DUMP_USER.

Or, are we missing something obvious here and that is not needed because
of reasons I haven't thought about?

I know, gpg doesn't do prctl() but disables core dumping with
setrlimit() but there might be other processes who do that...

Thx.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-30 21:16:29

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On 01/30/2018 12:38 PM, Borislav Petkov wrote:
> On Mon, Jan 29, 2018 at 10:04:47PM +0000, David Woodhouse wrote:
>> + if (tsk && tsk->mm &&
>> + tsk->mm->context.ctx_id != last_ctx_id &&
>> + get_dumpable(tsk->mm) != SUID_DUMP_USER)
>> + indirect_branch_prediction_barrier();
>
> Ok, so while staring at this, someone just came up with the following
> sequence:
>
> 1. Malicious process runs with UID=A, does BTB poisoning
> 2. Sensitive process (e.g. gpg) starts also with UID=A, no IBPB flush occurs since task is initially dumpable
> 3. gpg now does prctl(PR_SET_DUMPABLE, ...) to clear the dumpable flag
> 4. gpg now does sensitive stuff that it thinks is protected
> 5. gpg does indirect branches that shouldn't be influenced by the malicious process
>
> Now, if you switch between steps 3. and 4., you're good because gpg
> became non-dumpable. But if you *don't* switch, the bad BTB entries are
> still there.

The attacker has to guarantee itself to run right before victim. And
the window of attack is very small, as only the first context switch to victim
may be vulnerable. The victim will be immune on the next switch.
For timing attack, the attacker needs to scan one bit at a time.
So probably the attacker could glean 1 bit of information
on the switch back to the attacker and then the attacker could not scan
further. So it doesn't seem to be very practical attack if the victim
has set itself to be non-dumpable.

Tim

>
> So, *actually*, we need to flush IBPB in set_dumpable() too, when we
> clear SUID_DUMP_USER.
>
> Or, are we missing something obvious here and that is not needed because
> of reasons I haven't thought about?
>
> I know, gpg doesn't do prctl() but disables core dumping with
> setrlimit() but there might be other processes who do that...
>
> Thx.
>


2018-01-30 21:24:07

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On 01/30/2018 09:48 AM, Josh Poimboeuf wrote:
> On Mon, Jan 29, 2018 at 10:04:47PM +0000, David Woodhouse wrote:
>> From: Tim Chen <[email protected]>
>>
>> Flush indirect branches when switching into a process that marked itself
>> non dumpable. This protects high value processes like gpg better,
>> without having too high performance overhead.
>
> I wonder what the point of this patch is. An audit of my laptop shows
> only a single user of PR_SET_DUMPABLE: systemd-coredump.

This is an opt in approach. For processes who need extra
security, it set itself as non-dumpable. Then it can
ensure that it doesn't see any poisoned BTB.

>
> [ And yes, I have gpg-agent running. Also, a grep of the gnupg source
> doesn't show any evidence of it being used there. So the gpg thing
> seems to be a myth. ]

I'm less familiar with gpg-agent. Dave was the one who
put in comments about gpg-agent in this patch so perhaps
he can comment.

>
> But also, I much preferred the original version of the patch which only
> skipped IBPB when 'prev' could ptrace 'next'.

For the A->kernel thread->B scenario, you will need context of A
to decide if you need IBPB when switching to B. You need to
worry about whether the context of A has been released ... etc if
you want to use ptrace.

>
> If performance is a concern, let's look at that in more detail. But I
> don't see how the solution to a performance issue could possibly be
> "leave (almost) all tasks vulnerable by default."
>

Thanks.

Tim

2018-01-30 22:19:14

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Tue, Jan 30, 2018 at 01:03:20PM -0800, Tim Chen wrote:
> So it doesn't seem to be very practical attack if the victim has set
> itself to be non-dumpable.

Probably, but considering how cheap our fix is, we might just as well
plug that hole too.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-30 22:22:00

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Tue, Jan 30, 2018 at 01:23:17PM -0800, Tim Chen wrote:
> I'm less familiar with gpg-agent. Dave was the one who
> put in comments about gpg-agent in this patch so perhaps
> he can comment.

So I looked at gpg-agent and AFAICT, it disables core dumping with
setrlimit().

I wasn't able to attach to it either with gdb but didn't find where we
disable the attaching, for the couple of minutes I grepped through it.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-30 22:28:14

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Tue, 30 Jan 2018, Borislav Petkov wrote:

> On Tue, Jan 30, 2018 at 01:23:17PM -0800, Tim Chen wrote:
> > I'm less familiar with gpg-agent. Dave was the one who
> > put in comments about gpg-agent in this patch so perhaps
> > he can comment.
>
> So I looked at gpg-agent and AFAICT, it disables core dumping with
> setrlimit().

setrlimit() does not end up fiddling with the dumpable bits in
mm->flags. So no.

> I wasn't able to attach to it either with gdb but didn't find where we
> disable the attaching, for the couple of minutes I grepped through it.

prctl(PR_SET_DUMPABLE, 0) = 0

That does the trick.

Thanks,

tglx

2018-01-30 22:29:15

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On 01/30/2018 01:57 PM, Borislav Petkov wrote:
> On Tue, Jan 30, 2018 at 01:03:20PM -0800, Tim Chen wrote:
>> So it doesn't seem to be very practical attack if the victim has set
>> itself to be non-dumpable.
>
> Probably, but considering how cheap our fix is, we might just as well
> plug that hole too.
>

If the process has multiple threads running on different cpus,
you will need to set IBPB on all cpus they are running in
order to achieve your purpose. So it is not necessarily cheap.
But I don't think it is really necessary.

Tim

2018-01-30 22:44:25

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Tue, Jan 30, 2018 at 02:26:53PM -0800, Tim Chen wrote:
> If the process has multiple threads running on different cpus,

I'm talking about issuing the barrier in set_dumpable(). What threads on
multiple CPUs?

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-30 22:57:49

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Tue, Jan 30, 2018 at 11:21:58PM +0100, Thomas Gleixner wrote:
> prctl(PR_SET_DUMPABLE, 0) = 0
>
> That does the trick.

Ok, found it: https://dev.gnupg.org/T1211

Looks like this is debian-specific for now. Not in gnupg mainlne, AFAICT.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

Subject: [tip:x86/pti] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

Commit-ID: 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7
Gitweb: https://git.kernel.org/tip/18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7
Author: Tim Chen <[email protected]>
AuthorDate: Mon, 29 Jan 2018 22:04:47 +0000
Committer: Thomas Gleixner <[email protected]>
CommitDate: Tue, 30 Jan 2018 23:09:21 +0100

x86/speculation: Use Indirect Branch Prediction Barrier in context switch

Flush indirect branches when switching into a process that marked itself
non dumpable. This protects high value processes like gpg better,
without having too high performance overhead.

If done naïvely, we could switch to a kernel idle thread and then back
to the original process, such as:

process A -> idle -> process A

In such scenario, we do not have to do IBPB here even though the process
is non-dumpable, as we are switching back to the same process after a
hiatus.

To avoid the redundant IBPB, which is expensive, we track the last mm
user context ID. The cost is to have an extra u64 mm context id to track
the last mm we were using before switching to the init_mm used by idle.
Avoiding the extra IBPB is probably worth the extra memory for this
common scenario.

For those cases where tlb_defer_switch_to_init_mm() returns true (non
PCID), lazy tlb will defer switch to init_mm, so we will not be changing
the mm for the process A -> idle -> process A switch. So IBPB will be
skipped for this case.

Thanks to the reviewers and Andy Lutomirski for the suggestion of
using ctx_id which got rid of the problem of mm pointer recycling.

Signed-off-by: Tim Chen <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

---
arch/x86/include/asm/tlbflush.h | 2 ++
arch/x86/mm/tlb.c | 33 ++++++++++++++++++++++++++++++++-
2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index d33e4a2..2b8f18c 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -174,6 +174,8 @@ struct tlb_state {
struct mm_struct *loaded_mm;
u16 loaded_mm_asid;
u16 next_asid;
+ /* last user mm's ctx id */
+ u64 last_ctx_id;

/*
* We can be in one of several states:
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 5bfe61a..012d026 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -6,13 +6,14 @@
#include <linux/interrupt.h>
#include <linux/export.h>
#include <linux/cpu.h>
+#include <linux/debugfs.h>

#include <asm/tlbflush.h>
#include <asm/mmu_context.h>
+#include <asm/nospec-branch.h>
#include <asm/cache.h>
#include <asm/apic.h>
#include <asm/uv/uv.h>
-#include <linux/debugfs.h>

/*
* TLB flushing, formerly SMP-only
@@ -247,6 +248,27 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
} else {
u16 new_asid;
bool need_flush;
+ u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id);
+
+ /*
+ * Avoid user/user BTB poisoning by flushing the branch
+ * predictor when switching between processes. This stops
+ * one process from doing Spectre-v2 attacks on another.
+ *
+ * As an optimization, flush indirect branches only when
+ * switching into processes that disable dumping. This
+ * protects high value processes like gpg, without having
+ * too high performance overhead. IBPB is *expensive*!
+ *
+ * This will not flush branches when switching into kernel
+ * threads. It will also not flush if we switch to idle
+ * thread and back to the same process. It will flush if we
+ * switch to a different non-dumpable process.
+ */
+ if (tsk && tsk->mm &&
+ tsk->mm->context.ctx_id != last_ctx_id &&
+ get_dumpable(tsk->mm) != SUID_DUMP_USER)
+ indirect_branch_prediction_barrier();

if (IS_ENABLED(CONFIG_VMAP_STACK)) {
/*
@@ -292,6 +314,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0);
}

+ /*
+ * Record last user mm's context id, so we can avoid
+ * flushing branch buffer with IBPB if we switch back
+ * to the same user.
+ */
+ if (next != &init_mm)
+ this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id);
+
this_cpu_write(cpu_tlbstate.loaded_mm, next);
this_cpu_write(cpu_tlbstate.loaded_mm_asid, new_asid);
}
@@ -369,6 +399,7 @@ void initialize_tlbstate_and_flush(void)
write_cr3(build_cr3(mm->pgd, 0));

/* Reinitialize tlbstate. */
+ this_cpu_write(cpu_tlbstate.last_ctx_id, mm->context.ctx_id);
this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0);
this_cpu_write(cpu_tlbstate.next_asid, 1);
this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);

2018-01-31 01:03:04

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On 01/30/2018 02:43 PM, Borislav Petkov wrote:
> On Tue, Jan 30, 2018 at 02:26:53PM -0800, Tim Chen wrote:
>> If the process has multiple threads running on different cpus,
>
> I'm talking about issuing the barrier in set_dumpable(). What threads on
> multiple CPUs?
>

As dumpable is a property in mm->flags, it affects all threads running on other cpus sharing
the same mm. If you issue IBPB only on the cpu that perform the set_dumpable(),
the theoretical hole you are trying to close still exist on threads running on
other cpu.

time ----->
(cpu A) set_dumpable victim (thread1), issue IBPB
(cpu B) attacker -> victim (thread2), missed IBPB -> attacker -> victim (IBPB issued)


That said, I think the risk is minuscule and is not worth the cost to set IBPB on the other cpus.

Tim


2018-01-31 01:06:29

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Tue, Jan 30, 2018 at 04:25:26PM -0800, Tim Chen wrote:
> As dumpable is a property in mm->flags, it affects all threads running
> on other cpus sharing the same mm.

It is not about sharing the same mm - it is about sharing the RSB. How
many logical CPUs share an RSB? If it is per core (which can have two
threads) then issuing the barrier should have effect on both threads.
Thus one barrier is enough.

IOW, the granularity is determined by how many logical CPUs share the
RSB not by how many logical CPUs share an mm.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-01-31 04:38:24

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Tue, Jan 30, 2018 at 01:23:17PM -0800, Tim Chen wrote:
> On 01/30/2018 09:48 AM, Josh Poimboeuf wrote:
> > On Mon, Jan 29, 2018 at 10:04:47PM +0000, David Woodhouse wrote:
> >> From: Tim Chen <[email protected]>
> >>
> >> Flush indirect branches when switching into a process that marked itself
> >> non dumpable. This protects high value processes like gpg better,
> >> without having too high performance overhead.
> >
> > I wonder what the point of this patch is. An audit of my laptop shows
> > only a single user of PR_SET_DUMPABLE: systemd-coredump.
>
> This is an opt in approach. For processes who need extra
> security, it set itself as non-dumpable. Then it can
> ensure that it doesn't see any poisoned BTB.

I don't want other users reading my applications' memory.

I don't want other containers reading my containers' memory.

I don't want *any* user tasks reading root daemons' memory.

Those are not unreasonable expectations.

So now I have to go and modify all my containers and applications to set
PR_SET_DUMPABLE? That seems highly impractical and unlikely.

Plus, I happen to *like* core dumps.

The other option is to rebuild the entire userland with retpolines, but
again, that would make this patch completely pointless.

> > [ And yes, I have gpg-agent running. Also, a grep of the gnupg source
> > doesn't show any evidence of it being used there. So the gpg thing
> > seems to be a myth. ]
>
> I'm less familiar with gpg-agent. Dave was the one who
> put in comments about gpg-agent in this patch so perhaps
> he can comment.
>
> >
> > But also, I much preferred the original version of the patch which only
> > skipped IBPB when 'prev' could ptrace 'next'.
>
> For the A->kernel thread->B scenario, you will need context of A
> to decide if you need IBPB when switching to B. You need to
> worry about whether the context of A has been released ... etc if
> you want to use ptrace.

Is that why the ptrace approach was abandoned? Surely that's a solvable
problem? We have some smart people on lkml. And anyway I didn't see it
discussed anywhere. In the worst case we could just always do IBPB when
switching between kernel and user tasks.

--
Josh

2018-01-31 07:16:35

by Dominik Brodowski

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Tue, Jan 30, 2018 at 02:39:45PM -0800, tip-bot for Tim Chen wrote:
> Commit-ID: 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7
> Gitweb: https://git.kernel.org/tip/18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7
> Author: Tim Chen <[email protected]>
> AuthorDate: Mon, 29 Jan 2018 22:04:47 +0000
> Committer: Thomas Gleixner <[email protected]>
> CommitDate: Tue, 30 Jan 2018 23:09:21 +0100
>
> x86/speculation: Use Indirect Branch Prediction Barrier in context switch
>
> Flush indirect branches when switching into a process that marked itself
> non dumpable. This protects high value processes like gpg better,
> without having too high performance overhead.

For the record, I am still opposed to limit this to non-dumpable processes.
Whether a process needs protection by IBPB on context switches is a
different question to whether a process should be allowed to be dumped,
though the former may be a superset of the latter. In my opinion, IBPB
should be enabled on all context switches to userspace processes, until we
have a clear mitigation strategy for userspace against Spectre-v2 designed
and implemented.

Thanks,
Dominik

--------------------------
From: Dominik Brodowski <[email protected]>
Date: Wed, 31 Jan 2018 07:43:12 +0100
Subject: [PATCH] x86/speculation: Do not limit Indirect Branch Prediction Barrier to non-dumpable processes

Whether a process needs protection by IBPB on context switches is a
different question to whether a process should be allowed to be dumped,
though the former may be a superset of the latter. Enable IBPB on all
context switches to a different userspace process, until we have a clear
mitigation strategy for userspace against Spectre-v2 designed and
implemented.

Signed-off-by: Dominik Brodowski <[email protected]>

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 012d02624848..f54897b68b16 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -255,19 +255,13 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
* predictor when switching between processes. This stops
* one process from doing Spectre-v2 attacks on another.
*
- * As an optimization, flush indirect branches only when
- * switching into processes that disable dumping. This
- * protects high value processes like gpg, without having
- * too high performance overhead. IBPB is *expensive*!
- *
* This will not flush branches when switching into kernel
* threads. It will also not flush if we switch to idle
* thread and back to the same process. It will flush if we
- * switch to a different non-dumpable process.
+ * switch to a different user process.
*/
if (tsk && tsk->mm &&
- tsk->mm->context.ctx_id != last_ctx_id &&
- get_dumpable(tsk->mm) != SUID_DUMP_USER)
+ tsk->mm->context.ctx_id != last_ctx_id)
indirect_branch_prediction_barrier();

if (IS_ENABLED(CONFIG_VMAP_STACK)) {

2018-01-31 13:25:30

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Wed, Jan 31, 2018 at 08:03:00AM +0100, Dominik Brodowski wrote:
> On Tue, Jan 30, 2018 at 02:39:45PM -0800, tip-bot for Tim Chen wrote:
> > Commit-ID: 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7
> > Gitweb: https://git.kernel.org/tip/18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7
> > Author: Tim Chen <[email protected]>
> > AuthorDate: Mon, 29 Jan 2018 22:04:47 +0000
> > Committer: Thomas Gleixner <[email protected]>
> > CommitDate: Tue, 30 Jan 2018 23:09:21 +0100
> >
> > x86/speculation: Use Indirect Branch Prediction Barrier in context switch
> >
> > Flush indirect branches when switching into a process that marked itself
> > non dumpable. This protects high value processes like gpg better,
> > without having too high performance overhead.
>
> For the record, I am still opposed to limit this to non-dumpable processes.
> Whether a process needs protection by IBPB on context switches is a
> different question to whether a process should be allowed to be dumped,
> though the former may be a superset of the latter. In my opinion, IBPB
> should be enabled on all context switches to userspace processes, until we
> have a clear mitigation strategy for userspace against Spectre-v2 designed
> and implemented.
>
> Thanks,
> Dominik
>
> --------------------------
> From: Dominik Brodowski <[email protected]>
> Date: Wed, 31 Jan 2018 07:43:12 +0100
> Subject: [PATCH] x86/speculation: Do not limit Indirect Branch Prediction Barrier to non-dumpable processes
>
> Whether a process needs protection by IBPB on context switches is a
> different question to whether a process should be allowed to be dumped,
> though the former may be a superset of the latter. Enable IBPB on all
> context switches to a different userspace process, until we have a clear
> mitigation strategy for userspace against Spectre-v2 designed and
> implemented.
>
> Signed-off-by: Dominik Brodowski <[email protected]>

Acked-by: Josh Poimboeuf <[email protected]>

--
Josh

2018-01-31 23:56:44

by Tim Chen

[permalink] [raw]
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On 01/30/2018 07:59 PM, Josh Poimboeuf wrote:
> On Tue, Jan 30, 2018 at 01:23:17PM -0800, Tim Chen wrote:
>> On 01/30/2018 09:48 AM, Josh Poimboeuf wrote:
>>> On Mon, Jan 29, 2018 at 10:04:47PM +0000, David Woodhouse wrote:
>>>> From: Tim Chen <[email protected]>
>>>>
>>>> Flush indirect branches when switching into a process that marked itself
>>>> non dumpable. This protects high value processes like gpg better,
>>>> without having too high performance overhead.
>>>
>>> I wonder what the point of this patch is. An audit of my laptop shows
>>> only a single user of PR_SET_DUMPABLE: systemd-coredump.
>>
>> This is an opt in approach. For processes who need extra
>> security, it set itself as non-dumpable. Then it can
>> ensure that it doesn't see any poisoned BTB.
>
> I don't want other users reading my applications' memory.
>
> I don't want other containers reading my containers' memory.
>
> I don't want *any* user tasks reading root daemons' memory.
>
> Those are not unreasonable expectations.
>
> So now I have to go and modify all my containers and applications to set
> PR_SET_DUMPABLE? That seems highly impractical and unlikely.
>
> Plus, I happen to *like* core dumps.
>
> The other option is to rebuild the entire userland with retpolines, but
> again, that would make this patch completely pointless.
>
>>> [ And yes, I have gpg-agent running. Also, a grep of the gnupg source
>>> doesn't show any evidence of it being used there. So the gpg thing
>>> seems to be a myth. ]
>>
>> I'm less familiar with gpg-agent. Dave was the one who
>> put in comments about gpg-agent in this patch so perhaps
>> he can comment.
>>
>>>
>>> But also, I much preferred the original version of the patch which only
>>> skipped IBPB when 'prev' could ptrace 'next'.
>>
>> For the A->kernel thread->B scenario, you will need context of A
>> to decide if you need IBPB when switching to B. You need to
>> worry about whether the context of A has been released ... etc if
>> you want to use ptrace.
>
> Is that why the ptrace approach was abandoned? Surely that's a solvable
> problem? We have some smart people on lkml. And anyway I didn't see it
> discussed anywhere. In the worst case we could just always do IBPB when
> switching between kernel and user tasks.
>

I think dumpable is not the end all policy. It is a reasonable starting point
to provide us a means to secure the most sensitive processes without
IBPBing the world. It is on the performance end of the security/performance trade off.

For people who opt for more security, it is reasonable to consider
alternate policies to distinguish friend and foe so we know if we are coming
from a potentially hostile environment. Ptrace is one means to do so, and probably
there are other ways depending on usages. I hope we can have a discussion on what we should
use to determine if two processes are friend or foe. Say do all the processes
from the same containers are considered friends with each other?
I think once we have this decided, actually putting in IBPB will simple.

Tim

2018-02-01 08:26:33

by Christian Brauner

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Wed, Jan 31, 2018 at 08:03:00AM +0100, Dominik Brodowski wrote:
> On Tue, Jan 30, 2018 at 02:39:45PM -0800, tip-bot for Tim Chen wrote:
> > Commit-ID: 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7
> > Gitweb: https://git.kernel.org/tip/18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7
> > Author: Tim Chen <[email protected]>
> > AuthorDate: Mon, 29 Jan 2018 22:04:47 +0000
> > Committer: Thomas Gleixner <[email protected]>
> > CommitDate: Tue, 30 Jan 2018 23:09:21 +0100
> >
> > x86/speculation: Use Indirect Branch Prediction Barrier in context switch
> >
> > Flush indirect branches when switching into a process that marked itself
> > non dumpable. This protects high value processes like gpg better,
> > without having too high performance overhead.
>
> For the record, I am still opposed to limit this to non-dumpable processes.
> Whether a process needs protection by IBPB on context switches is a
> different question to whether a process should be allowed to be dumped,

This is especially true since you can get into a non-dumpable state
implicitly due to setuid() in a new user namespace. So avoiding the
performance hit due to IBPD is not necessary a concious decision taken
by calling prctl(), that is to say even if you care about performance
and don't want to take the IBPB hit you might accidently have to take it
when doing a setuid(). Also there are definitely workloads where
you want to be able to ptrace a task while at the same time having it do
IBPB. So let's not limit this to non-dumpable taks! Thanks Dominik!

> though the former may be a superset of the latter. In my opinion, IBPB
> should be enabled on all context switches to userspace processes, until we
> have a clear mitigation strategy for userspace against Spectre-v2 designed
> and implemented.
>
> Thanks,
> Dominik
>
> --------------------------
> From: Dominik Brodowski <[email protected]>
> Date: Wed, 31 Jan 2018 07:43:12 +0100
> Subject: [PATCH] x86/speculation: Do not limit Indirect Branch Prediction Barrier to non-dumpable processes
>
> Whether a process needs protection by IBPB on context switches is a
> different question to whether a process should be allowed to be dumped,
> though the former may be a superset of the latter. Enable IBPB on all
> context switches to a different userspace process, until we have a clear
> mitigation strategy for userspace against Spectre-v2 designed and
> implemented.

Thanks Dominik. That makes a lot more sense!

>
> Signed-off-by: Dominik Brodowski <[email protected]>
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 012d02624848..f54897b68b16 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -255,19 +255,13 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
> * predictor when switching between processes. This stops
> * one process from doing Spectre-v2 attacks on another.
> *
> - * As an optimization, flush indirect branches only when
> - * switching into processes that disable dumping. This
> - * protects high value processes like gpg, without having
> - * too high performance overhead. IBPB is *expensive*!
> - *
> * This will not flush branches when switching into kernel
> * threads. It will also not flush if we switch to idle
> * thread and back to the same process. It will flush if we
> - * switch to a different non-dumpable process.
> + * switch to a different user process.
> */
> if (tsk && tsk->mm &&
> - tsk->mm->context.ctx_id != last_ctx_id &&
> - get_dumpable(tsk->mm) != SUID_DUMP_USER)
> + tsk->mm->context.ctx_id != last_ctx_id)
> indirect_branch_prediction_barrier();
>
> if (IS_ENABLED(CONFIG_VMAP_STACK)) {

2018-02-01 08:34:48

by David Woodhouse

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Wed, 2018-01-31 at 08:03 +0100, Dominik Brodowski wrote:
> Whether a process needs protection by IBPB on context switches is a
> different question to whether a process should be allowed to be dumped,
> though the former may be a superset of the latter. Enable IBPB on all
> context switches to a different userspace process, until we have a clear
> mitigation strategy for userspace against Spectre-v2 designed and
> implemented.
>
> ...
>                 if (tsk && tsk->mm &&
> -                   tsk->mm->context.ctx_id != last_ctx_id &&
> -                   get_dumpable(tsk->mm) != SUID_DUMP_USER)
> +                   tsk->mm->context.ctx_id != last_ctx_id)
>                         indirect_branch_prediction_barrier();


I understand your argument and I sympathise.

But that's going to hurt a *lot*, and we don't even have a viable
proof-of-concept for a user←→user Spectre v2 attack, do we? It's only
theoretical?

Show a working PoC and it makes the argument somewhat more
convincing...


Attachments:
smime.p7s (5.09 kB)

2018-02-01 15:41:41

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Thu, Feb 01, 2018 at 08:31:53AM +0000, David Woodhouse wrote:
> On Wed, 2018-01-31 at 08:03 +0100, Dominik Brodowski wrote:
> > Whether a process needs protection by IBPB on context switches is a
> > different question to whether a process should be allowed to be dumped,
> > though the former may be a superset of the latter. Enable IBPB on all
> > context switches to a different userspace process, until we have a clear
> > mitigation strategy for userspace against Spectre-v2 designed and
> > implemented.
> >
> > ...
> >                 if (tsk && tsk->mm &&
> > -                   tsk->mm->context.ctx_id != last_ctx_id &&
> > -                   get_dumpable(tsk->mm) != SUID_DUMP_USER)
> > +                   tsk->mm->context.ctx_id != last_ctx_id)
> >                         indirect_branch_prediction_barrier();
>
>
> I understand your argument and I sympathise.
>
> But that's going to hurt a *lot*, and we don't even have a viable
> proof-of-concept for a user←→user Spectre v2 attack, do we? It's only
> theoretical?
>
> Show a working PoC and it makes the argument somewhat more
> convincing...

Fair point. From what I can gather, user space ASLR seems to be the
only layer of protection before a POC would be feasible. So, unless I'm
mistaken, which is very possible, it seems to be a question of how much
we trust ASLR.

--
Josh

2018-02-04 19:42:26

by Dominik Brodowski

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Thu, Feb 01, 2018 at 08:31:53AM +0000, David Woodhouse wrote:
> On Wed, 2018-01-31 at 08:03 +0100, Dominik Brodowski wrote:
> > Whether a process needs protection by IBPB on context switches is a
> > different question to whether a process should be allowed to be dumped,
> > though the former may be a superset of the latter. Enable IBPB on all
> > context switches to a different userspace process, until we have a clear
> > mitigation strategy for userspace against Spectre-v2 designed and
> > implemented.
> >
> > ...
> >                 if (tsk && tsk->mm &&
> > -                   tsk->mm->context.ctx_id != last_ctx_id &&
> > -                   get_dumpable(tsk->mm) != SUID_DUMP_USER)
> > +                   tsk->mm->context.ctx_id != last_ctx_id)
> >                         indirect_branch_prediction_barrier();
>
>
> I understand your argument and I sympathise.
>
> But that's going to hurt a *lot*, and we don't even have a viable
> proof-of-concept for a user←→user Spectre v2 attack, do we? It's only
> theoretical?

Wasn't the PoC in the Spectre paper user←→user (though on a different OS)?
And what makes KVM←→KVM so much more likely/dangerous/..., that IBPB will
be done there unconditionally (AFAICS)?


And, somewhat related, @Tim Chen:

On Wed, Jan 31, 2018 at 03:25:44PM -0800, Tim Chen wrote:
> For people who opt for more security, it is reasonable to consider
> alternate policies to distinguish friend and foe so we know if we are coming
> from a potentially hostile environment. Ptrace is one means to do so, and probably
> there are other ways depending on usages. I hope we can have a discussion on what we should
> use to determine if two processes are friend or foe. Say do all the processes
> from the same containers are considered friends with each other?

To my understanding, the concept of "containers" is meant to be kept outside
of the kernel. What *namespaces* / *control groups* can be considered
friends with each other?


Thanks,
Dominik

2018-02-05 14:21:58

by David Woodhouse

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On Tue, 2018-01-30 at 14:39 -0800, tip-bot for Tim Chen wrote:
> Thanks to the reviewers and Andy Lutomirski for the suggestion of
> using ctx_id which got rid of the problem of mm pointer recycling.

That one doesn't backport well to 4.9. Suggestions welcome.


Attachments:
smime.p7s (5.09 kB)

2018-02-05 19:37:23

by Tim Chen

[permalink] [raw]
Subject: Re: [tip:x86/pti] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

On 02/05/2018 06:18 AM, David Woodhouse wrote:
> On Tue, 2018-01-30 at 14:39 -0800, tip-bot for Tim Chen wrote:
>> Thanks to the reviewers and Andy Lutomirski for the suggestion of
>> using ctx_id which got rid of the problem of mm pointer recycling.
>
> That one doesn't backport well to 4.9. Suggestions welcome.
>

Will something like the following work for 4.9 using active_mm?
This patch is not really tested, but just
want to put it out here to see if this is a reasonable backport.

Tim

Signed-off-by: Tim Chen <[email protected]>
---
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index a7655f6..4994db2 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -9,6 +9,7 @@

#include <asm/tlbflush.h>
#include <asm/mmu_context.h>
+#include <asm/nospec-branch.h>
#include <asm/cache.h>
#include <asm/apic.h>
#include <asm/uv/uv.h>
@@ -75,6 +76,9 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
struct task_struct *tsk)
{
unsigned cpu = smp_processor_id();
+#ifdef CONFIG_SMP
+ struct mm_struct *active_mm = this_cpu_read(cpu_tlbstate.active_mm);
+#endif

if (likely(prev != next)) {
if (IS_ENABLED(CONFIG_VMAP_STACK)) {
@@ -91,6 +95,28 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
set_pgd(pgd, init_mm.pgd[stack_pgd_index]);
}

+ /*
+ * Avoid user/user BTB poisoning by flushing the branch
+ * predictor when switching between processes. This stops
+ * one process from doing Spectre-v2 attacks on another.
+ *
+ * As an optimization, flush indirect branches only when
+ * switching into processes that disable dumping. This
+ * protects high value processes like gpg, without having
+ * too high performance overhead. IBPB is *expensive*!
+ *
+ * This will not flush branches when switching into kernel
+ * threads. It will also not flush if we switch to idle
+ * thread and back to the same process. It will flush if we
+ * switch to a different non-dumpable process.
+ */
+ if (tsk && tsk->mm &&
+#ifdef CONFIG_SMP
+ next != active_mm &&
+#endif
+ get_dumpable(tsk->mm) != SUID_DUMP_USER)
+ indirect_branch_prediction_barrier();
+
#ifdef CONFIG_SMP
this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK);
this_cpu_write(cpu_tlbstate.active_mm, next);