From: Tim Chen
To: stable@vger.kernel.org, Greg Kroah-Hartman
Cc: Tim Chen, Andy Lutomirski, Nadav Amit, Thomas Gleixner, Andrew Morton,
    Arjan van de Ven, Borislav Petkov, Dave Hansen, Linus Torvalds,
    Mel Gorman, Peter Zijlstra, Rik van Riel, Ingo Molnar, David Woodhouse,
    ak@linux.intel.com, karahmed@amazon.de, pbonzini@redhat.com,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux@dominikbrodowski.net, gregkh@linux-foundation.org
Subject: [PATCH 2/2] x86/speculation: Use Indirect Branch Prediction Barrier in context switch
Date: Fri, 2 Mar 2018 13:32:10 -0800
X-Mailer: git-send-email 2.9.4
X-Mailing-List: linux-kernel@vger.kernel.org

commit: 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7

Flush indirect branches when switching into a process that marked itself
non-dumpable. This protects high-value processes like gpg better, without
too high a performance overhead.

If done naïvely, we could switch to a kernel idle thread and then back to
the original process, such as:

    process A -> idle -> process A

In such a scenario, we do not have to do IBPB here even though the process
is non-dumpable, as we are switching back to the same process after a
hiatus.

To avoid the redundant IBPB, which is expensive, we track the last mm user
context ID. The cost is an extra u64 mm context id to track the last mm we
were using before switching to the init_mm used by idle. Avoiding the extra
IBPB is probably worth the extra memory for this common scenario.

For those cases where tlb_defer_switch_to_init_mm() returns true (non
PCID), lazy tlb will defer the switch to init_mm, so we will not be
changing the mm for the process A -> idle -> process A switch. So IBPB will
be skipped for this case.

Thanks to the reviewers and Andy Lutomirski for the suggestion of using
ctx_id, which got rid of the problem of mm pointer recycling.

Signed-off-by: Tim Chen
Signed-off-by: David Woodhouse
Signed-off-by: Thomas Gleixner
Cc: ak@linux.intel.com
Cc: karahmed@amazon.de
Cc: arjan@linux.intel.com
Cc: torvalds@linux-foundation.org
Cc: linux@dominikbrodowski.net
Cc: peterz@infradead.org
Cc: bp@alien8.de
Cc: luto@kernel.org
Cc: pbonzini@redhat.com
Cc: gregkh@linux-foundation.org
Link: https://lkml.kernel.org/r/1517263487-3708-1-git-send-email-dwmw@amazon.co.uk
---
 arch/x86/include/asm/tlbflush.h |  2 ++
 arch/x86/mm/tlb.c               | 31 +++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 94146f6..99185a0 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -68,6 +68,8 @@ static inline void invpcid_flush_all_nonglobals(void)
 struct tlb_state {
 	struct mm_struct *active_mm;
 	int state;
+	/* last user mm's ctx id */
+	u64 last_ctx_id;
 
 	/*
 	 * Access to this CR4 shadow and to H/W CR4 is protected by
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index fa74bf5..eac92e2 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -10,6 +10,7 @@
 
 #include
 #include
+#include
 #include
 #include
 #include
@@ -106,6 +107,28 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	unsigned cpu = smp_processor_id();
 
 	if (likely(prev != next)) {
+		u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id);
+
+		/*
+		 * Avoid user/user BTB poisoning by flushing the branch
+		 * predictor when switching between processes. This stops
+		 * one process from doing Spectre-v2 attacks on another.
+		 *
+		 * As an optimization, flush indirect branches only when
+		 * switching into processes that disable dumping. This
+		 * protects high value processes like gpg, without having
+		 * too high performance overhead. IBPB is *expensive*!
+		 *
+		 * This will not flush branches when switching into kernel
+		 * threads. It will also not flush if we switch to idle
+		 * thread and back to the same process. It will flush if we
+		 * switch to a different non-dumpable process.
+		 */
+		if (tsk && tsk->mm &&
+		    tsk->mm->context.ctx_id != last_ctx_id &&
+		    get_dumpable(tsk->mm) != SUID_DUMP_USER)
+			indirect_branch_prediction_barrier();
+
 		if (IS_ENABLED(CONFIG_VMAP_STACK)) {
 			/*
 			 * If our current stack is in vmalloc space and isn't
@@ -120,6 +143,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 				set_pgd(pgd, init_mm.pgd[stack_pgd_index]);
 		}
 
+		/*
+		 * Record last user mm's context id, so we can avoid
+		 * flushing branch buffer with IBPB if we switch back
+		 * to the same user.
+		 */
+		if (next != &init_mm)
+			this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id);
+
 		this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK);
 		this_cpu_write(cpu_tlbstate.active_mm, next);
 
-- 
2.9.4
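
The flush decision in the switch_mm_irqs_off() hunks can be modelled in
user space to check the process A -> idle -> process A reasoning. The
following is only a rough sketch, not kernel code: struct sim_mm, struct
sim_cpu, sim_switch_mm() and sim_demo_sequence() are hypothetical names
invented for illustration, a NULL mm stands in for a kernel thread or idle
running on init_mm, a ctx_id of 0 is assumed to mean "no user mm seen yet",
and a counter stands in for indirect_branch_prediction_barrier().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of mm_struct the check looks at. */
struct sim_mm {
	uint64_t ctx_id;  /* unique per address space, assumed never reused */
	bool dumpable;    /* false models get_dumpable() != SUID_DUMP_USER */
};

/* Hypothetical stand-in for the per-CPU state added by the patch. */
struct sim_cpu {
	uint64_t last_ctx_id;     /* models cpu_tlbstate.last_ctx_id */
	unsigned int ibpb_count;  /* counts simulated IBPB flushes */
};

/*
 * Model one context switch. next == NULL models a kernel thread or idle
 * (assumption: this replaces both the tsk->mm and next != &init_mm checks).
 * Returns true if a (simulated) IBPB would have been issued.
 */
static bool sim_switch_mm(struct sim_cpu *cpu, const struct sim_mm *next)
{
	bool flushed = false;

	/* Flush only for a non-dumpable mm that differs from the last user mm. */
	if (next && next->ctx_id != cpu->last_ctx_id && !next->dumpable) {
		cpu->ibpb_count++;
		flushed = true;
	}

	/* Remember the last user mm; kernel threads leave it untouched. */
	if (next)
		cpu->last_ctx_id = next->ctx_id;

	return flushed;
}

/* Run a fixed switch sequence and return how many flushes it caused. */
static unsigned int sim_demo_sequence(void)
{
	struct sim_cpu cpu = { .last_ctx_id = 0, .ibpb_count = 0 };
	const struct sim_mm a = { .ctx_id = 1, .dumpable = false };
	const struct sim_mm b = { .ctx_id = 2, .dumpable = false };
	const struct sim_mm c = { .ctx_id = 3, .dumpable = true };

	sim_switch_mm(&cpu, &a);    /* first entry into A: flush */
	sim_switch_mm(&cpu, NULL);  /* A -> idle: no flush */
	sim_switch_mm(&cpu, &a);    /* idle -> A: same ctx_id, no flush */
	sim_switch_mm(&cpu, &b);    /* A -> B: different non-dumpable mm, flush */
	sim_switch_mm(&cpu, &c);    /* B -> C: C is dumpable, no flush */
	sim_switch_mm(&cpu, &a);    /* C -> A: different ctx_id, flush */

	assert(cpu.ibpb_count == 3);
	return cpu.ibpb_count;
}
```

In this model the A -> idle -> A round trip costs no additional flush, while
entering a different non-dumpable mm always does. Comparing ctx_id values
rather than mm pointers is what keeps the check safe against a recycled
pointer, as noted in the changelog.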