Hi all, please cc me on replies. Hardware discussion may be found here:
https://groups.google.com/forum/?nomobile=true#!topic/comp.arch/mzXXTU2GUSo
I am designing a new processor, based on RISC-V, that is intended as a
hybrid GPU, VPU and CPU. For various reasons, it needs to be a
multi-issue Out-of-Order engine. The innocent question was therefore
asked, "how is Spectre to be dealt with?", which threw a massive
spanner in the works.
The processor is being designed to use multi-issue as a means to
implement Vector Processing. For example: for predicated elements,
several instructions (one per element) will be thrown into the
*standard* multi-issue instruction queue, and cancelled only when the
register containing the predicate mask is available and has been
decoded. Thus, resources are taken up that will affect and be affected
by other instructions, which is exactly the kind of shared-resource
contention that Spectre timing attacks exploit.
ooops.
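To make that resource contention concrete, here is a minimal toy model
in C. It is purely illustrative: the structure, the names and the
"cancel on mask arrival" step are my own sketch of the behaviour
described above, not the actual microarchitecture.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define VL 8   /* vector length: one issue-queue entry per element */

    struct iq_entry {
        int  element;   /* which vector element this micro-op covers */
        bool valid;     /* still occupying an issue-queue slot?      */
    };

    /* Every element is issued into the *standard* multi-issue queue
     * before the predicate mask is known: all VL slots get consumed. */
    static void issue_all_elements(struct iq_entry iq[VL])
    {
        for (int e = 0; e < VL; e++) {
            iq[e].element = e;
            iq[e].valid   = true;  /* slot taken, competing for resources */
        }
    }

    /* Only once the register holding the predicate mask has been read
     * can the masked-out elements be cancelled and their slots freed. */
    static void cancel_on_mask_arrival(struct iq_entry iq[VL], uint8_t mask)
    {
        for (int e = 0; e < VL; e++)
            if (!(mask & (1u << e)))
                iq[e].valid = false;  /* cancelled, but it held resources */
    }

    int main(void)
    {
        struct iq_entry iq[VL];
        issue_all_elements(iq);
        cancel_on_mask_arrival(iq, 0x0F);  /* only elements 0-3 survive */
        for (int e = 0; e < VL; e++)
            printf("element %d: %s\n", e,
                   iq[e].valid ? "executes" : "cancelled");
        return 0;
    }

The point of the model: even the elements that end up cancelled have
already occupied queue slots and competed for resources, and that is
observable through timing.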
Standard Spectre mitigation would completely destroy the performance
and viability of the project's Vector Engine, as well as many other
features.
So I have a proposal that, if correct and implemented, may be adopted
by other architectures as a mitigation solution that allows out-of-
order execution to continue to be used. It is a collaborative solution
that specifically requires explicit instructions to be added (and
called) at the appropriate time(s).
The issue with Spectre attacks is that untrusted code may cause past
OR FUTURE instructions to change the amount of time in which they
complete. An in-order architecture does not have this problem (except
where pipeline stalls occur), as there are almost always enough
resources available to allow instructions (pipelines) to proceed
without blocking.
OoO typically has resource bottlenecks that are affected by other
instructions. The whole POINT of an OoO design is to run ahead,
utilising these resources speculatively and, duh, out of order.
To deal with absolutely every possible flaw in the OoO paradigm is a
total nightmare. Performance, as people are discovering, is utterly
trashed. Code complexity, in both software and hardware terms, goes
mental. Intel had to REMOVE hyperthreading from its latest
processors; the cross-thread timing leakage is that bad.
There is another way to ensure that untrusted code cannot affect
secure code: clear out the "internal state" of the processor before
letting it proceed to run the untrusted code.
In this way it becomes impossible for untrusted code to ascertain the
state of the processor, because it has been reset back to a known
uniform (blank) state.
This REQUIRES an actual instruction that programs (and the kernel) may
call. It is NOT ENOUGH for the linux kernel to try to deal with
absolutely every possible situation automatically; it is a total
nightmare to even try.
It is also not enough for the hardware to try to deal with this on its
own: that is insanely complex as well. The only truly safe
hardware-only option is to abandon all of the benefits of OoO and go
back to in-order SINGLE issue performance levels.
Clearly, neither option is viable or acceptable.
A hybrid solution is a reasonable compromise, and may even be possible
to implement right now: on processors that do not have the proposed
new instruction, code issues sufficient NOPs (or other suitably
researched instructions) to create a "processor internal state"
firebreak between secure and untrusted code.
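As a sketch of the software side: the helper name, the
HAVE_FIREBREAK_INSN guard and the count of 100 NOPs below are all
placeholders of mine, to be replaced by whatever per-processor
research turns out to be needed.

    /* Hypothetical "speculation firebreak", to be called just before
     * handing control from trusted code to untrusted code.           */
    static inline void speculation_firebreak(void)
    {
    #ifdef HAVE_FIREBREAK_INSN
        /* new hardware: the single proposed "drain speculation" opcode
         * would go here (shown as a plain nop, as no encoding exists yet) */
        __asm__ __volatile__("nop" ::: "memory");
    #else
        /* older OoO hardware: a researched number of NOPs (100 is a guess)
         * acting as a firebreak while in-flight work drains               */
        for (int i = 0; i < 100; i++)
            __asm__ __volatile__("nop" ::: "memory");
    #endif
    }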
The hardware version of the firebreak opcode would WAIT until the
processor internal state has been cleared out. All outstanding
speculative instructions would be cancelled. All instructions still in
the pipelines would be allowed to complete, and their results written
to the register file. Only then would the processor be allowed to
proceed.
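In toy form (a sketch of the intended semantics only, nothing like
RTL; the struct and counters are invented purely for illustration):

    /* Toy model of the firebreak: cancel speculation, drain the rest,
     * write back, and only then let fetch/issue continue.             */
    struct core_state {
        int speculative_in_flight;  /* uncommitted speculative micro-ops   */
        int committed_in_flight;    /* non-speculative work still in pipes */
        int stalled;                /* 1 while fetch/issue is held back    */
    };

    static void firebreak(struct core_state *core)
    {
        core->stalled = 1;                     /* WAIT: hold fetch/issue    */
        core->speculative_in_flight = 0;       /* cancel all speculation    */
        while (core->committed_in_flight > 0)  /* let real work complete    */
            core->committed_in_flight--;       /* and write back results    */
        core->stalled = 0;                     /* only now proceed to the
                                                  untrusted code            */
    }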
It is not enough to have these "firebreak" calls done automatically by
the linux kernel: they need to be part of standard applications. An
example is firefox, which has a single process for javascript. Spectre
attacks have been shown to exist using untrusted arbitrary javascript,
and if that javascript is being executed by a single process, then it
is the responsibility of that process to call the "firebreak" just
before allowing the untrusted javascript to execute.
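The call site is trivial; the hard part is identifying every such
boundary. Here run_untrusted_javascript() and struct js_context are
made-up stand-ins for whatever the application's real interpreter or
JIT entry point is, and speculation_firebreak() is the hypothetical
helper sketched earlier.

    struct js_context;                               /* opaque, made up */
    void run_untrusted_javascript(struct js_context *, const char *);
    void speculation_firebreak(void);       /* from the earlier sketch */

    /* trusted application code about to hand control to untrusted input */
    void execute_untrusted(struct js_context *ctx, const char *script)
    {
        speculation_firebreak();                /* reset internal state    */
        run_untrusted_javascript(ctx, script);  /* untrusted code runs now */
    }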
This is going to be a mammoth task. The alternatives are to continue
as things are, which is a mess that cannot be cleaned up by either of
(mutually exclusive) hardware or software alone.
Thoughts and feedback appreciated.
l.
> This is going to be a mammoth task. The alternatives are to continue
> as things are, which is a mess that cannot be cleaned up by either of
> (mutually exclusive) hardware or software alone.
>
> Thoughts and feedback appreciated.
You need to be talking to the JIT developers not asking here I think.
Speculative attacks in JIT environments are a topic an order of magnitude
or more complex than the kernel cases because there isn't even process
isolation between the JIT, JIT engine and support logic.
Alan
On Friday, January 18, 2019, Alan Cox <[email protected]> wrote:
>
> > This is going to be a mammoth task. The alternatives are to continue
> > as things are, which is a mess that cannot be cleaned up by either of
> > (mutually exclusive) hardware or software alone.
> >
> > Thoughts and feedback appreciated.
>
> You need to be talking to the JIT developers not asking here I think.
> Speculative attacks in JIT environments are a topic an order of magnitude
> or more complex than the kernel cases because there isn't even process
> isolation between the JIT, JIT engine and support logic.
>
Hi Alan, thanks for engaging on this. Deep breath: it's everything.
OpenSSL, linux kernel, uboot, JIT developers, PAM, system calls,
interrupts, exceptions, everything. Anywhere any time there is a
transition (of any kind, not just JIT environments) from trusted code
to arbitrary untrusted code, whether it be linux kernel, uboot,
applications, BIOSes, literally and absolutely anything and
everything, on every processor that is OoO, regardless of ISA.
In essence our basic fundamental assumptions about security separation
are... gone [for OoO processors].
Since I wrote the OP I found that the RISC-V BOOM team had done some
research, and had also concluded that explicitly called speculation
"fences" were the sanest solution. Link to discussion:
https://groups.google.com/forum/?nomobile=true#!topic/riscv-boom/yxDwmpjtQrE
Where my expertise runs out is whether it should be libc6 that calls
the firebreak instruction (or if one does not exist, a set of 100 NOPs
or whatever is found to be best suited for a given OoO processor), or
whether it should be the linux kernel that does so, perhaps at the
context-switch point.
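Purely as an illustration of placement (none of this is real linux
kernel code, and every name in it is made up), the kernel-side option
would put the fence on the privileged-to-unprivileged transition:

    struct task;                                 /* hypothetical task struct */
    void switch_to_user_context(struct task *);  /* hypothetical resume stub */
    void speculation_firebreak(void);            /* from the earlier sketch  */

    void return_to_userspace(struct task *next)
    {
    #ifndef CPU_CLEARS_SPECULATION_ON_TRAP  /* hypothetical capability flag  */
        speculation_firebreak();         /* drain before dropping privilege  */
    #endif
        switch_to_user_context(next);    /* resume the next userspace task   */
    }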
An OoO processor that has been designed to clear the entire
speculation state on a context switch, interrupt or exception clearly
would not need the firebreak (fence) instruction to be called in the
linux kernel context switch (from kernelspace to userspace, as you do
not want info to leak from privileged to non-privileged); however,
those that do not have such a capability just as clearly would.
Should the same fence be called on the switch from userspace to
kernelspace? Honestly I do not know; I believe it would depend on the
level of paranoia :) Do you trust the linux kernel not to be
compromised? If it is, do you consider it Game Over already? That
sort of thing. Don't know the answer there.
So, yes, JIT is definitely more complex, and the fence will definitely
help... however the problem is everywhere: *all* software needs to
engage on this and begin the long, arduous process of designing,
agreeing on, and then implementing a sane mitigation strategy.
Yes, BIOSes too. (Anyone still think proprietary BIOSes are a good idea? Intel?)
Or, we can all go back to using 25+ year old x86 processors (486s,
yay! Anyone still got an original OLPC, with the Geode LX500, or even
a Vortex86?), or use ARM Cortex A7 32 bit, Cortex A53 64 bit, or
MIPS64, anything that's in-order that can be found... *if* they can be
found... :)
l.