From: Tim Chen
To: linux-kernel@vger.kernel.org
Cc: Tim Chen, KarimAllah Ahmed, Andi Kleen, Andrea Arcangeli,
    Andy Lutomirski, Arjan van de Ven, Ashok Raj, Asit Mallick,
    Borislav Petkov, Dan Williams, Dave Hansen, David Woodhouse,
    Greg Kroah-Hartman, H.
    Peter Anvin, Ingo Molnar, Janakarajan Natarajan, Joerg Roedel,
    Jun Nakajima, Laura Abbott, Linus Torvalds, Masami Hiramatsu,
    Paolo Bonzini, Peter Zijlstra, rkrcmar@redhat.com, Thomas Gleixner,
    Tom Lendacky, x86@kernel.org
Subject: [PATCH v2] x86/ibpb: Skip IBPB when we switch back to same user process
Date: Thu, 25 Jan 2018 15:37:17 -0800

Thanks to the reviewers, and to Andy Lutomirski for suggesting the use
of ctx_id, which got rid of the problem of mm pointer recycling. Here's
an update of this patch based on Andy's suggestion.

We could switch to a kernel idle thread and then back to the original
process, such as:

	process A -> idle -> process A

In such a scenario, we do not have to do IBPB even though the process
is non-dumpable, as we are switching back to the same process after a
hiatus.

We track the context id of the last user mm before we switch to init_mm
(via leave_mm, when tlb_defer_switch_to_init_mm returns false, i.e. PCID
is available). The cost is an extra u64 mm context id per CPU to track
the last mm we were using before switching to the init_mm used by idle.
Avoiding the extra IBPB is probably worth the extra memory for this
common scenario.

For the cases where tlb_defer_switch_to_init_mm returns true (no PCID),
lazy TLB will defer the switch to init_mm, so we will not be changing
the mm for the process A -> idle -> process A switch, and IBPB will be
skipped for that case as well.

v2:
1. Save the last user context id instead of the last user mm, to avoid
   the problem of a recycled mm pointer.

Signed-off-by: Tim Chen
---
 arch/x86/include/asm/tlbflush.h |  2 ++
 arch/x86/mm/tlb.c               | 23 ++++++++++++++++-------
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 3effd3c..4405c4b 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -174,6 +174,8 @@ struct tlb_state {
 	struct mm_struct *loaded_mm;
 	u16 loaded_mm_asid;
 	u16 next_asid;
+	/* last user mm's ctx id */
+	u64 last_ctx_id;
 
 	/*
 	 * We can be in one of several states:
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 33f5f97..2179b90 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -220,6 +220,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	} else {
 		u16 new_asid;
 		bool need_flush;
+		u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id);
 
 		/*
 		 * Avoid user/user BTB poisoning by flushing the branch predictor
@@ -230,14 +231,13 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		 * switching into processes that disable dumping.
 		 *
 		 * This will not flush branches when switching into kernel
-		 * threads, but it would flush them when switching to the
-		 * idle thread and back.
-		 *
-		 * It might be useful to have a one-off cache here
-		 * to also not flush the idle case, but we would need some
-		 * kind of stable sequence number to remember the previous mm.
+		 * threads. It will also not flush if we switch to idle
+		 * thread and back to the same process. It will flush if we
+		 * switch to a different non-dumpable process.
 		 */
-		if (tsk && tsk->mm && get_dumpable(tsk->mm) != SUID_DUMP_USER)
+		if (tsk && tsk->mm &&
+		    tsk->mm->context.ctx_id != last_ctx_id &&
+		    get_dumpable(tsk->mm) != SUID_DUMP_USER)
 			indirect_branch_prediction_barrier();
 
 		if (IS_ENABLED(CONFIG_VMAP_STACK)) {
@@ -288,6 +288,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0);
 		}
 
+		/*
+		 * Record last user mm's context id, so we can avoid
+		 * flushing branch buffer with IBPB if we switch back
+		 * to the same user.
+		 */
+		if (next != &init_mm)
+			this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id);
+
 		this_cpu_write(cpu_tlbstate.loaded_mm, next);
 		this_cpu_write(cpu_tlbstate.loaded_mm_asid, new_asid);
 	}
@@ -365,6 +373,7 @@ void initialize_tlbstate_and_flush(void)
 	write_cr3(build_cr3(mm->pgd, 0));
 
 	/* Reinitialize tlbstate. */
+	this_cpu_write(cpu_tlbstate.last_ctx_id, mm->context.ctx_id);
 	this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0);
 	this_cpu_write(cpu_tlbstate.next_asid, 1);
 	this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
-- 
2.9.4
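
As an illustration of the scheme above, here is a minimal user-space C
model of the last_ctx_id check -- a sketch, not kernel code. struct mm,
dumpable, ibpb() and switch_mm() below are hypothetical stand-ins for
mm_struct, get_dumpable(), indirect_branch_prediction_barrier() and
switch_mm_irqs_off(); only the decision logic mirrors the patch.

/*
 * User-space model of the last_ctx_id logic -- illustrative only.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct mm {
	uint64_t ctx_id;	/* stable id, never recycled */
	bool dumpable;		/* models get_dumpable() == SUID_DUMP_USER */
};

static uint64_t last_ctx_id;	/* models cpu_tlbstate.last_ctx_id */
static int ibpb_count;

static void ibpb(void)
{
	ibpb_count++;		/* stands in for the real barrier */
}

static void switch_mm(const struct mm *next)
{
	if (!next)		/* kernel/idle thread: user mm unchanged */
		return;
	/* Flush only when entering a *different* non-dumpable mm. */
	if (next->ctx_id != last_ctx_id && !next->dumpable)
		ibpb();
	last_ctx_id = next->ctx_id;
}

int main(void)
{
	struct mm a = { .ctx_id = 1, .dumpable = false };
	struct mm b = { .ctx_id = 2, .dumpable = false };

	switch_mm(&a);		/* first switch to A: IBPB */
	switch_mm(NULL);	/* A -> idle: no user mm, no IBPB */
	switch_mm(&a);		/* idle -> A: same ctx_id, IBPB skipped */
	switch_mm(&b);		/* A -> B: different ctx_id, IBPB */

	printf("IBPB issued %d time(s)\n", ibpb_count);	/* prints 2 */
	return 0;
}

Because the comparison is on a u64 context id that is never reused,
unlike the mm pointer itself, a freed and recycled mm_struct cannot
masquerade as the previous task -- which is exactly the recycling
problem the v2 change addresses.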