Date: Thu, 4 Jan 2018 17:52:38 +0100
From: Andrea Arcangeli <aarcange@redhat.com>
To: "Woodhouse, David" <dwmw@amazon.co.uk>
Cc: "pavel@ucw.cz" <pavel@ucw.cz>,
        "pbonzini@redhat.com" <pbonzini@redhat.com>,
        "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "tim.c.chen@linux.intel.com" <tim.c.chen@linux.intel.com>,
        "torvalds@linux-foundation.org" <torvalds@linux-foundation.org>,
        "tglx@linutronix.de" <tglx@linutronix.de>,
        "andi@firstfloor.org" <andi@firstfloor.org>,
        "gnomes@lxorguk.ukuu.org.uk" <gnomes@lxorguk.ukuu.org.uk>,
        "dave.hansen@intel.com" <dave.hansen@intel.com>,
        "gregkh@linux-foundation.org" <gregkh@linux-foundation.org>
Subject: Re: Avoid speculative indirect calls in kernel
Message-ID: <20180104165238.GF13348@redhat.com>
References: <20180103230934.15788-1-andi@firstfloor.org>
 <CA+55aFzCocoK+4kxAUEhaxxba4RTv3ewBmhiZ8Osc9iDkBtCEQ@mail.gmail.com>
 <e64025e6-9c4f-703d-cb69-29eeb9cc16bd@redhat.com>
 <20180104114231.GB1702@amd>
 <1515066469.12987.112.camel@amazon.co.uk>
 <94b12025-b27c-04d2-8726-c07a3af6b265@redhat.com>
 <7a3584c6-0c00-d807-5130-13d1f4b34102@citrix.com>
 <1515079777.12987.149.camel@amazon.co.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <1515079777.12987.149.camel@amazon.co.uk>
User-Agent: Mutt/1.9.2 (2017-12-15)
Sender: linux-kernel-owner@vger.kernel.org

On Thu, Jan 04, 2018 at 03:29:37PM +0000, Woodhouse, David wrote:
> On Thu, 2018-01-04 at 14:51 +0000, Andrew Cooper wrote:
> > 
> > > * never turn off indirect branch prediction, but use a branch prediction
> > > barrier on every mode switch (needed for current AMD microcode)
> > 
> > Where have you got this idea from?? Using IBPB on every mode switch
> > would be an insane overhead to take, and isn't necessary.
> 
> AMD *only* has IBPB and not IBRS, but IIRC you don't need to do it on

AMD families 0x10, 0x12 and 0x16 basically have IBRS and no IBPB; those
work perfectly fine in ibrs 2 ibpb 1 mode, with variant#2 fixed and
zero overhead.

> every context switch into the kernel; only when switching between
> VMs/processes?

Some AMD CPUs only have IBPB and no IBRS; there IBPB has to be issued
on every kernel entry or vmexit to give the same security as ibrs 1
ibpb 1 (modulo SMT/HT, but that's not the spectre PoC, and you can also
rule that out mathematically by simply using cpu pinning as you already
do, or by disabling SMT if you care that much). Note ibrs 1 ibpb 1 also
won't cover HT effects of guest/user mode vs guest/user mode, so cpu
pinning may be advisable anyway in your case (even with ibrs 1 ibpb 1,
no difference).
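
A minimal sketch of the rule above, not code from any patch: all names
here (enum transition, needs_ibpb) are hypothetical, and the IBRS-capable
branch assumes what was suggested earlier in the thread, namely that with
IBRS in use a barrier is only wanted when switching between
processes/VMs.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum transition {
	KERNEL_ENTRY,	/* user mode to kernel: syscall, irq, exception */
	VMEXIT,		/* guest to host */
	CONTEXT_SWITCH,	/* switch between processes or VMs */
};

/* Return true if an IBPB should be issued on this transition. */
static bool needs_ibpb(bool has_ibrs, bool has_ibpb, enum transition t)
{
	if (!has_ibpb)
		return false;	/* no barrier available to issue */

	if (!has_ibrs)
		return true;	/* IBPB-only: barrier on every transition */

	/* Assumption: IBRS covers kernel entry and vmexit. */
	return t == CONTEXT_SWITCH;
}
```

The IBPB-only case is the expensive one, since the barrier runs on
every kernel entry and vmexit instead of only on context switches.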

Of course everything can be trivially opted out of at runtime and all
measurable performance restored, but by default it boots in the most
secure config available, and that makes a spectre variant#2 attack
impossible even when only ibpb is available.

> I need to pull in the AMD lfence alternative for retpoline, giving us a
> 3-way choice of the existing retpoline thunk, "lfence; jmp *%\reg", and
> a bare "jmp *%\reg".
> 
> Then the IBRS bits can be added on top.

"AMD lfence and retpoline" in the same sentence sounds like somebody
else also cares about spectre variant#2 on AMD. "Retpoline" only ever
makes sense in the spectre variant#2 context, so either ibrs 0 ibpb 2
mode makes some sense too, or a special lfence retpoline for AMD should
not be worth mentioning in the first place.