From: Andy Lutomirski
Date: Thu, 11 Oct 2018 14:17:10 -0700
Subject: Re: [PATCH] x86: entry: flush the cache if syscall error
To: Kees Cook
Cc: Andrew Lutomirski, Kristen Carlson Accardi, Kernel Hardening, Thomas Gleixner, Ingo Molnar, Borislav Petkov, "H. Peter Anvin", X86 ML, LKML
References: <20181011185458.10186-1-kristen@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 11, 2018 at 1:55 PM Kees Cook wrote:
>
> On Thu, Oct 11, 2018 at 1:48 PM, Andy Lutomirski wrote:
> > On Thu, Oct 11, 2018 at 11:55 AM Kristen Carlson Accardi
> > wrote:
> >>
> >> This patch aims to make it harder to perform cache timing attacks on data
> >> left behind by system calls. If we have an error returned from a syscall,
> >> flush the L1 cache.
> >>
> >> It's important to note that this patch is not addressing any specific
> >> exploit, nor is it intended to be a complete defense against anything.
> >> It is intended to be a low-cost way of eliminating some of the side effects
> >> of a failed system call.
> >>
> >> A performance test using sysbench on one hyperthread and a script which
> >> attempts to repeatedly access files it does not have permission to access
> >> on the other hyperthread found no significant performance impact.
> >>
> >> Suggested-by: Alan Cox
> >> Signed-off-by: Kristen Carlson Accardi
> >> ---
> >>  arch/x86/Kconfig        |  9 +++++++++
> >>  arch/x86/entry/common.c | 18 ++++++++++++++++++
> >>  2 files changed, 27 insertions(+)
> >>
> >> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> >> index 1a0be022f91d..bde978eb3b4e 100644
> >> --- a/arch/x86/Kconfig
> >> +++ b/arch/x86/Kconfig
> >> @@ -445,6 +445,15 @@ config RETPOLINE
> >>  	  code are eliminated. Since this includes the syscall entry path,
> >>  	  it is not entirely pointless.
> >>
> >> +config SYSCALL_FLUSH
> >> +	bool "Clear L1 Cache on syscall errors"
> >> +	default n
> >> +	help
> >> +	  Selecting 'y' allows the L1 cache to be cleared upon return of
> >> +	  an error code from a syscall if the CPU supports "flush_l1d".
> >> +	  This may reduce the likelihood of speculative execution style
> >> +	  attacks on syscalls.
> >> +
> >>  config INTEL_RDT
> >>  	bool "Intel Resource Director Technology support"
> >>  	default n
> >> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> >> index 3b2490b81918..26de8ea71293 100644
> >> --- a/arch/x86/entry/common.c
> >> +++ b/arch/x86/entry/common.c
> >> @@ -268,6 +268,20 @@ __visible inline void syscall_return_slowpath(struct pt_regs *regs)
> >>  	prepare_exit_to_usermode(regs);
> >>  }
> >>
> >> +__visible inline void l1_cache_flush(struct pt_regs *regs)
> >> +{
> >> +	if (IS_ENABLED(CONFIG_SYSCALL_FLUSH) &&
> >> +	    static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
> >> +		if (regs->ax == 0 || regs->ax == -EAGAIN ||
> >> +		    regs->ax == -EEXIST || regs->ax == -ENOENT ||
> >> +		    regs->ax == -EXDEV || regs->ax == -ETIMEDOUT ||
> >> +		    regs->ax == -ENOTCONN || regs->ax == -EINPROGRESS)
> >
> > What about ax > 0? (Or more generally, any ax outside the range of -1
> > .. -4095 or whatever the error range is.) As it stands, it looks like
> > you'll flush on successful read(), write(), recv(), etc., and that
> > could seriously hurt performance on real workloads.
>
> Seems like just changing the "ax == 0" check into "ax >= 0" would solve that?

I can easily imagine that there are other errors for which performance
matters. EBUSY comes to mind.

>
> I think this looks like a good idea. It might be worth adding a
> comment about the checks to explain why those errors are whitelisted.
> It's a cheap and effective mitigation for "unknown future problems"
> that doesn't degrade normal workloads.

I still want to see two pieces of information before I ack a patch like this:

- How long does L1D_FLUSH take, roughly?
- An example of a type of attack that would be mitigated.

For the latter, I assume that the goal is to mitigate attacks
where a syscall speculatively loads something sensitive and then
fails. But, before it fails, it leaks the information it
speculatively loaded, and that leak ends up in L1D but *not* in other
cache levels. And somehow the L1D can't be probed quickly enough in a
parallel thread to meaningfully get the information out of L1D. Or
maybe it's trying to mitigate the use of failing syscalls to
get some sensitive data into L1D and then using something like L1TF to
read it out.

But this really needs to be clarified. Alan said that a bunch of the
"yet another Spectre variant" attacks would have been mitigated by
this patch. An explanation of *how* would be in order.

And we should seriously consider putting this kind of thing under
CONFIG_SPECULATIVE_MITIGATIONS or similar, the idea being that it
doesn't mitigate a clear, known attack family.
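
To make the return-value question concrete, here is a rough userspace
sketch of the decision being discussed: treat only values in the
-1 .. -4095 range as errors, and skip the flush for errors that are
common on hot paths. It is not the posted patch and not kernel code.
The names sketch_is_error(), sketch_should_flush() and
sketch_cheap_errors are made up for illustration, the 4095 bound is
assumed to match the kernel's MAX_ERRNO, and the EBUSY entry is an
addition suggested in this thread rather than something the patch
contains.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: same bound as the kernel's MAX_ERRNO (errors are -1 .. -4095). */
#define SKETCH_MAX_ERRNO 4095L

/* True only when ax lies in the syscall error range [-4095, -1]. */
static bool sketch_is_error(long ax)
{
	return ax < 0 && ax >= -SKETCH_MAX_ERRNO;
}

/* Hypothetical list of errors that are frequent enough to skip the flush. */
static const int sketch_cheap_errors[] = {
	EAGAIN, EEXIST, ENOENT, EXDEV, ETIMEDOUT, ENOTCONN, EINPROGRESS,
	EBUSY,	/* not in the posted patch; raised in this thread */
};

/* Decide whether a syscall return value should trigger an L1D flush. */
static bool sketch_should_flush(long ax)
{
	size_t i;
	size_t n = sizeof(sketch_cheap_errors) / sizeof(sketch_cheap_errors[0]);

	/* Success, including ax > 0 from read()/write()/recv(): never flush. */
	if (!sketch_is_error(ax))
		return false;

	/* Expected, performance-sensitive errors: do not flush. */
	for (i = 0; i < n; i++)
		if (ax == -(long)sketch_cheap_errors[i])
			return false;

	/* Any other error, e.g. -EPERM or -EACCES: flush. */
	return true;
}

/* Small self-check covering the cases mentioned in the thread. */
static void sketch_self_check(void)
{
	assert(!sketch_should_flush(0));	/* plain success */
	assert(!sketch_should_flush(512));	/* e.g. bytes returned by read() */
	assert(!sketch_should_flush(-EAGAIN));	/* skipped error */
	assert(sketch_should_flush(-EACCES));	/* permission failure */
}
```

Whether a skip list like this is the right shape at all depends on
the two open questions above: the cost of the flush, and which
attack it is supposed to stop.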