Date: Thu, 4 Jan 2018 16:39:54 +0000
From: Mark Rutland
To: "Eric W. Biederman"
Cc: Dan Williams, "torvalds@linux-foundation.org",
    "linux-kernel@vger.kernel.org", "peterz@infradead.org",
    "tglx@linutronix.de", "alan@linux.intel.com", "Reshetova, Elena",
    "gnomes@lxorguk.ukuu.org.uk", "gregkh@linuxfoundation.org",
    "jikos@kernel.org", "linux-arch@vger.kernel.org"
Subject: Re: [RFC PATCH] asm/generic: introduce if_nospec and nospec_barrier
Message-ID: <20180104163759.5apqt6lnsfowudcl@salmiak>
References: <151502463248.33513.5960736946233335087.stgit@dwillia2-desk3.amr.corp.intel.com>
    <20180104010754.22ca6a74@alans-desktop>
    <1515035438.20588.4.camel@intel.com>
    <87vagiusj1.fsf@xmission.com>
    <87wp0xu12k.fsf@xmission.com>
In-Reply-To: <87wp0xu12k.fsf@xmission.com>

On Thu, Jan 04, 2018 at 08:54:11AM -0600, Eric W. Biederman wrote:
> Dan Williams writes:
> > On Wed, Jan 3, 2018 at 9:01 PM, Eric W. Biederman wrote:
> >> "Williams, Dan J" writes:
> Either the patch you presented missed a whole lot like 90%+ of the
> user/kernel interface or there is some mitigating factor that I am not
> seeing. Either way until reasonable people can read the code and
> agree on the potential exploitability of it, I will be nacking these
> patches.

As Dan mentioned, this is the result of auditing some static analysis reports.
I don't think it was claimed that this was complete, just that these are
locations that we're fairly certain need attention.

Auditing the entire user/kernel interface is going to take time, and I don't
think we should ignore this corpus in the meantime (though we certainly want
to avoid a whack-a-mole game).

[...]

> >>> diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
> >>> index 8ca9915befc8..7f83abdea255 100644
> >>> --- a/net/mpls/af_mpls.c
> >>> +++ b/net/mpls/af_mpls.c
> >>> @@ -81,6 +81,8 @@ static struct mpls_route *mpls_route_input_rcu(struct net *net, unsigned index)
> >>>       if (index < net->mpls.platform_labels) {
> >>>               struct mpls_route __rcu **platform_label =
> >>>                       rcu_dereference(net->mpls.platform_label);
> >>> +
> >>> +             osb();
> >>>               rt = rcu_dereference(platform_label[index]);
> >>>       }
> >>>       return rt;
> >>
> >> Ouch! This adds a barrier in the middle of an rcu lookup, on the
> >> fast path for routing mpls packets. Which if memory serves will
> >> noticably slow down software processing of mpls packets.
> >>
> >> Why does osb() fall after the branch for validity? So that we allow
> >> speculation up until then?
> >
> > It falls there so that the cpu only issues reads with known good 'index' values.
> >
> >> I suspect it would be better to have those barriers in the tun/tap
> >> interfaces where userspace can inject packets and thus time them. Then
> >> the code could still speculate and go fast for remote packets.
> >>
> >> Or does the speculation stomping have to be immediately at the place
> >> where we use data from userspace to perform a table lookup?
> >
> > The speculation stomping barrier has to be between where we validate
> > the input and when we may speculate on invalid input.
>
> So a serializing instruction at the kernel/user boundary (like say
> loading cr3) is not enough? That would seem to break any chance of a
> controlled timing.

Unfortunately, it isn't sufficient to do this at the kernel/user boundary.
Any subsequent bounds check can be mis-speculated regardless of prior
serialization.

Such serialization has to occur *after* the relevant bounds check, but *before*
use of the value that was checked.

Where it's possible to audit user-provided values up front, we may be able to
batch checks to amortize the cost of such serialization, but typically bounds
checks are spread arbitrarily deep in the kernel.

[...]

> Given what I have seen in other parts of the thread I think an and
> instruction that just limits the index to a sane range is generally
> applicable, and should be cheap enough to not care about.

Where feasible, this sounds good to me.

However, since many places have dynamic bounds which aren't necessarily
powers-of-two, I'm not sure how applicable this is.

Thanks,
Mark.
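
To make the two options above concrete, here is a minimal userspace sketch. It
is not taken from the patch. The names spec_barrier(), index_mask(),
lookup_barrier(), lookup_masked() and table are all hypothetical, and
spec_barrier() is only a compiler barrier standing in for osb(), so this shows
where each mitigation sits rather than providing any real protection. The
masking variant is one possible way to cope with a bound that is not a power
of two: rather than AND-ing with (size - 1), it derives an all-ones or
all-zeroes mask from the index and the bound without a branch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TABLE_SIZE 6 /* deliberately not a power of two */

static const int table[TABLE_SIZE] = { 10, 11, 12, 13, 14, 15 };

/*
 * Hypothetical stand-in for osb(). This is only a compiler barrier
 * (GCC/Clang syntax); a real speculation barrier would be an
 * architecture-specific instruction.
 */
#define spec_barrier() __asm__ __volatile__("" ::: "memory")

/* Option 1: barrier *after* the bounds check, *before* the dependent load. */
static int lookup_barrier(size_t index)
{
	int val = -1;

	if (index < TABLE_SIZE) {
		spec_barrier();
		val = table[index];
	}
	return val;
}

/*
 * Option 2: build a mask that is all ones when index < size and zero
 * otherwise, using only unsigned arithmetic (no branch).
 * Assumes size <= SIZE_MAX / 2 + 1.
 */
static size_t index_mask(size_t index, size_t size)
{
	/* The top bit of t is set exactly when index >= size. */
	size_t t = index | (size - 1 - index);

	return (t >> (sizeof(size_t) * CHAR_BIT - 1)) - 1;
}

static int lookup_masked(size_t index)
{
	int val = -1;

	if (index < TABLE_SIZE) {
		/* An out-of-range index is forced to 0 by the mask. */
		index &= index_mask(index, TABLE_SIZE);
		val = table[index];
	}
	return val;
}

/* Small self-check exercising both variants. */
static void check_lookups(void)
{
	assert(lookup_barrier(2) == 12);
	assert(lookup_barrier(TABLE_SIZE) == -1);
	assert(lookup_masked(2) == 12);
	assert(lookup_masked(TABLE_SIZE) == -1);
}
```

In the masked variant the architectural result is unchanged, since the mask is
all ones whenever the check passes. The intent is that if the check is
mis-speculated, the load is steered to index 0 instead of an
attacker-controlled offset. Whether a compiler preserves that property is a
separate question that this sketch does not address.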