From: David Woodhouse
To: arjan@linux.intel.com, tglx@linutronix.de, karahmed@amazon.de,
    x86@kernel.org, linux-kernel@vger.kernel.org,
    tim.c.chen@linux.intel.com, bp@alien8.de, peterz@infradead.org,
    pbonzini@redhat.com, ak@linux.intel.com,
    torvalds@linux-foundation.org, gregkh@linux-foundation.org,
    mingo@kernel.org, luto@kernel.org, linux@dominikbrodowski.net
Subject: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch
Date: Mon, 29 Jan 2018 22:04:47 +0000
Message-Id: <1517263487-3708-1-git-send-email-dwmw@amazon.co.uk>
X-Mailer: git-send-email 2.7.4
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Tim Chen

Flush indirect branches when switching into a process that marked itself
non-dumpable. This protects high value processes like gpg better,
without having too high performance overhead.

If done naïvely, we could switch to a kernel idle thread and then back
to the original process, such as:

    process A -> idle -> process A

In such a scenario, we do not have to do IBPB here even though the
process is non-dumpable, as we are switching back to the same process
after a hiatus.

To avoid the redundant IBPB, which is expensive, we track the last mm
user context ID. The cost is to have an extra u64 mm context id to track
the last mm we were using before switching to the init_mm used by idle.
Avoiding the extra IBPB is probably worth the extra memory for this
common scenario.

For those cases where tlb_defer_switch_to_init_mm() returns true
(non-PCID), lazy TLB will defer the switch to init_mm, so we will not be
changing the mm for the process A -> idle -> process A switch. So IBPB
will be skipped for this case.

Thanks to the reviewers and Andy Lutomirski for the suggestion of
using ctx_id which got rid of the problem of mm pointer recycling.

Signed-off-by: Tim Chen
Signed-off-by: David Woodhouse
---
 arch/x86/include/asm/tlbflush.h |  2 ++
 arch/x86/mm/tlb.c               | 33 ++++++++++++++++++++++++++++++++-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 3effd3c..4405c4b 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -174,6 +174,8 @@ struct tlb_state {
 	struct mm_struct *loaded_mm;
 	u16 loaded_mm_asid;
 	u16 next_asid;
+	/* last user mm's ctx id */
+	u64 last_ctx_id;
 
 	/*
 	 * We can be in one of several states:
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index a156195..7489890 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -6,13 +6,14 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
+#include
 #include
 #include
 #include
-#include
 
 /*
  *	TLB flushing, formerly SMP-only
@@ -219,6 +220,27 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	} else {
 		u16 new_asid;
 		bool need_flush;
+		u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id);
+
+		/*
+		 * Avoid user/user BTB poisoning by flushing the branch
+		 * predictor when switching between processes. This stops
+		 * one process from doing Spectre-v2 attacks on another.
+		 *
+		 * As an optimization, flush indirect branches only when
+		 * switching into processes that disable dumping. This
+		 * protects high value processes like gpg, without having
+		 * too high performance overhead. IBPB is *expensive*!
+		 *
+		 * This will not flush branches when switching into kernel
+		 * threads. It will also not flush if we switch to idle
+		 * thread and back to the same process. It will flush if we
+		 * switch to a different non-dumpable process.
+		 */
+		if (tsk && tsk->mm &&
+		    tsk->mm->context.ctx_id != last_ctx_id &&
+		    get_dumpable(tsk->mm) != SUID_DUMP_USER)
+			indirect_branch_prediction_barrier();
 
 		if (IS_ENABLED(CONFIG_VMAP_STACK)) {
 			/*
@@ -268,6 +290,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0);
 		}
 
+		/*
+		 * Record last user mm's context id, so we can avoid
+		 * flushing branch buffer with IBPB if we switch back
+		 * to the same user.
+		 */
+		if (next != &init_mm)
+			this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id);
+
 		this_cpu_write(cpu_tlbstate.loaded_mm, next);
 		this_cpu_write(cpu_tlbstate.loaded_mm_asid, new_asid);
 	}
@@ -345,6 +375,7 @@ void initialize_tlbstate_and_flush(void)
 	write_cr3(build_cr3(mm->pgd, 0));
 
 	/* Reinitialize tlbstate. */
+	this_cpu_write(cpu_tlbstate.last_ctx_id, mm->context.ctx_id);
 	this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0);
 	this_cpu_write(cpu_tlbstate.next_asid, 1);
 	this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
-- 
2.7.4
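
The skip logic described in the commit message can be tried out away
from the kernel. What follows is a minimal userspace sketch, not kernel
code: struct sketch_mm, struct sketch_cpu, sketch_switch_to() and
SKETCH_DUMP_USER are hypothetical names made up for illustration, a
NULL pointer stands in for a kernel thread (init_mm in the patch), and
a counter stands in for indirect_branch_prediction_barrier(). It also
assumes every call is a real mm switch, so it does not model the lazy
TLB case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SUID_DUMP_USER; any other value means non-dumpable. */
#define SKETCH_DUMP_USER 1

/* Hypothetical stand-in for the parts of an mm that the check looks at. */
struct sketch_mm {
	uint64_t ctx_id;	/* assumed unique per address space */
	int dumpable;		/* SKETCH_DUMP_USER for an ordinary process */
};

/* Hypothetical stand-in for the per-CPU field added by the patch. */
struct sketch_cpu {
	uint64_t last_ctx_id;	/* ctx_id of the last user mm seen on this CPU */
	unsigned int barriers;	/* number of IBPBs the model would have issued */
};

/*
 * Model one context switch. A NULL next stands for a kernel thread such
 * as idle. Returns true when the model would issue an IBPB.
 */
static bool sketch_switch_to(struct sketch_cpu *cpu, const struct sketch_mm *next)
{
	bool flush;

	assert(cpu != NULL);

	if (next == NULL)
		return false;	/* kernel thread: no flush, last_ctx_id kept */

	/* Flush only for a different user context that is non-dumpable. */
	flush = next->ctx_id != cpu->last_ctx_id &&
		next->dumpable != SKETCH_DUMP_USER;
	if (flush)
		cpu->barriers++;

	/* Remember the last user context so that A -> idle -> A is skipped. */
	cpu->last_ctx_id = next->ctx_id;
	return flush;
}
```

With A non-dumpable and B dumpable, feeding the sequence A, idle, A, B, A
through sketch_switch_to() on a zero-initialised struct sketch_cpu counts
a barrier on the first and on the last switch only. Comparing ctx_id
values rather than mm pointers is what keeps a freed and reallocated mm
from being mistaken for the previous one.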