From: "Reshetova, Elena"
To: Alexei Starovoitov, Alan Cox
Cc: Jiri Kosina, "Williams, Dan J", Linus Torvalds,
	Linux Kernel Mailing List, Mark Rutland, linux-arch@vger.kernel.org,
	Peter Zijlstra, Greg KH, Thomas Gleixner, netdev@vger.kernel.org,
	Daniel Borkmann, "David S. Miller"
Subject: RE: [RFC PATCH] asm/generic: introduce if_nospec and nospec_barrier
Date: Thu, 4 Jan 2018 09:16:25 +0000
Message-ID: <2236FBA76BA1254E88B949DDB74E612B802D25BE@IRSMSX102.ger.corp.intel.com>
In-Reply-To: <20180104031239.f2sat4ozqvqxaatr@ast-mbp.dhcp.thefacebook.com>
References: <20180103223827.39601-1-mark.rutland@arm.com>
	<151502463248.33513.5960736946233335087.stgit@dwillia2-desk3.amr.corp.intel.com>
	<20180104010754.22ca6a74@alans-desktop>
	<20180104021553.32084de3@alans-desktop>
	<20180104031239.f2sat4ozqvqxaatr@ast-mbp.dhcp.thefacebook.com>

> On Thu, Jan 04, 2018 at 02:15:53AM +0000, Alan Cox wrote:
> > > > Elena has done the work of auditing static analysis reports to a dozen
> > > > or so locations that need some 'nospec' handling.
> > >
> > > How exactly is that related (especially in longer-term support terms) to
> > > BPF anyway?
> >
> > If you read the papers you need a very specific construct in order to not
> > only cause a speculative load of an address you choose but also to then
> > manage to cause a second operation that in some way reveals bits of data
> > or allows you to ask questions.
> >
> > BPF allows you to construct those sequences relatively easily and it's
> > the one case where a user space application can fairly easily place code
> > it wants to execute in the kernel. Without BPF you have to find the right
> > construct in the kernel, prime all the right predictions and measure the
> > result without getting killed off. There are places you can do that but
> > they are not so easy and we don't (at this point) think there are that
> > many.
>
> for BPF in particular we're thinking to do a different fix.
> Instead of killing speculation we can let cpu speculate.
> The fix will include rounding up bpf maps to nearest power of 2 and
> inserting bpf_and operation on the index after bounds check,
> so cpu can continue speculate beyond bounds check, but it will
> load from zero-ed memory.
> So this nospec arch dependent hack won't be necessary for bpf side,
> but may still be needed in other parts of the kernel.
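If I understand the proposal correctly, the map lookup path would end up
doing roughly the following. This is only a sketch of my reading of it,
with made-up names and kernel-style types, not an actual patch:

	/*
	 * Sketch only: the value area is allocated rounded up to a power of
	 * two and zero-initialized, and the index is masked after the
	 * architectural bounds check.  Even if the CPU speculates past the
	 * check with an out-of-bounds index, the AND keeps the speculative
	 * load inside the zeroed allocation.
	 */
	struct example_array {
		u32 max_entries;	/* what the user asked for */
		u32 index_mask;		/* roundup_pow_of_two(max_entries) - 1 */
		u64 value[];		/* roundup_pow_of_two(max_entries) zeroed slots */
	};

	static u64 example_lookup(struct example_array *a, u32 index)
	{
		if (index >= a->max_entries)
			return 0;		/* architectural bounds check */

		index &= a->index_mask;		/* clamps speculative accesses too */
		return a->value[index];
	}

For in-bounds indices the mask is a no-op, so the fast path stays
speculation-friendly instead of being serialized.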
I think this is a much better approach than what we have been doing
internally so far. My initial fix back in August for this was to insert a
minimal set of lfence barriers (the osb() barrier below; see the P.S. at
the end of this mail for roughly what it expands to), placed based on how
the attack was using constants to index eBPF maps:

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c

 #define BPF_R0	regs[BPF_REG_0]

@@ -939,6 +940,7 @@ static unsigned int ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn,
 		DST = IMM;
 		CONT;
 	LD_IMM_DW:
+		osb();
 		DST = (u64) (u32) insn[0].imm | ((u64) (u32) insn[1].imm) << 32;
 		insn++;
 		CONT;
@@ -1200,6 +1202,7 @@ static unsigned int ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn,
 		*(SIZE *)(unsigned long) (DST + insn->off) = IMM;	\
 		CONT;							\
 	LDX_MEM_##SIZEOP:						\
+		osb();							\
 		DST = *(SIZE *)(unsigned long) (SRC + insn->off);	\
 		CONT;

And similar changes in the x86 JIT (dependent on constant blinding being
enabled):

@@ -400,7 +413,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 		case BPF_ADD: b2 = 0x01; break;
 		case BPF_SUB: b2 = 0x29; break;
 		case BPF_AND: b2 = 0x21; break;
-		case BPF_OR: b2 = 0x09; break;
+		case BPF_OR: b2 = 0x09; emit_memory_barrier(&prog); break;
 		case BPF_XOR: b2 = 0x31; break;
 		}
 		if (BPF_CLASS(insn->code) == BPF_ALU64)
@@ -647,6 +660,16 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 	case BPF_ALU64 | BPF_RSH | BPF_X:
 	case BPF_ALU64 | BPF_ARSH | BPF_X:
+		/* If blinding is enabled, each
+		 * BPF_LD | BPF_IMM | BPF_DW instruction
+		 * is converted to 4 eBPF instructions with
+		 * BPF_ALU64_IMM(BPF_LSH, BPF_REG_AX, 32)
+		 * always present (number 3). Detect such cases
+		 * and insert memory barriers. */
+		if ((BPF_CLASS(insn->code) == BPF_ALU64)
+		    && (BPF_OP(insn->code) == BPF_LSH)
+		    && (src_reg == BPF_REG_AX))
+			emit_memory_barrier(&prog);
 		/* check for bad case when dst_reg == rcx */
 		if (dst_reg == BPF_REG_4) {
 			/* mov r11, dst_reg */

But this is a really ugly way to prevent speculation, IMO, so I think your
approach is much better. If you need help testing the fixes, I can do it,
unless you already have the attack code yourself.

> Also note that variant 1 is talking about exploiting prog_array
> bpf feature which had 64-bit access prior to
> commit 90caccdd8cc0 ("bpf: fix bpf_tail_call() x64 JIT")
> That was a fix for JIT and not related to this cpu issue, but
> I believe it breaks the existing exploit.

Actually, I could not get the existing exploit working on anything past
4.11, but at the time I could not see any fundamental change in eBPF that
would have prevented it.

Best Regards,
Elena.
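P.S. Since I referred to it above: osb() in the interpreter hunks is just
an lfence-based speculation barrier, i.e. something along these lines (the
exact definition in my tree may differ slightly):

	#define osb()	asm volatile("lfence" ::: "memory")

and emit_memory_barrier() on the JIT side does the same thing by emitting
the lfence instruction (bytes 0x0F, 0xAE, 0xE8) into the generated image.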