Received: by 2002:a05:7412:a9a2:b0:e2:908c:2ebd with SMTP id o34csp1208244rdh; Fri, 27 Oct 2023 07:40:34 -0700 (PDT) X-Google-Smtp-Source: AGHT+IG4IqCwthvofvPnwnML/BzE6ZdZywiQDhnkTTYPW895YDn7x7yhnx5zU1Ch918k0ekQMW20 X-Received: by 2002:a0d:d58d:0:b0:5ad:d4be:7d08 with SMTP id x135-20020a0dd58d000000b005add4be7d08mr5270320ywd.17.1698417633925; Fri, 27 Oct 2023 07:40:33 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698417633; cv=none; d=google.com; s=arc-20160816; b=lteYrOn7lvSboHloAXSfpzAcvvSaqe1ZXPEFD+TzIzAh1L+7QqwbQLaVorrDIqmjwi AdyBNaFd+OX5IDF7r9SjcSulUljucuqS9uTraKgsw5M/JZvP5EUzSm4DkpPlTAlaKW+j oyOyIWDsUhl8ZkoKzZKBcSzwPZD3I98sgKncXqrpQ6u/UmPaWtLnKkDK2tMGv3xBRj/r G5tHSOxzAfnZltENtGuRMyw1+SG1AK7k4FI5BuSqDNO2gpwbJQojTH7EKojpfSEi+ehX oE7wBRDsk+negowUGi5dupjTATMLuuQzj5TrmRD4gDmm63AKvp6UVhbrQ8Apgxt3TCKR l7nA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=7ib1O0KtQFcIuzsnAxnO/yCusjbnTobTJA8lJBPj/ZM=; fh=9bGxxPI+IiFttPs3QhfhBAZDnxipaVmWpf+QSj2Bc/s=; b=vjZ3U4DBpgL43XNE3lQtUfb2kXgkav8TKWFVKivU5qX37WuCIRHJ5IPy/LMAJzy6gJ ZXKmfle87Sff295rXs5MzPJ+W0xrEPdwVIWcG6j1CdAYYVWScVxhb9djYqgxRp+x4vdU ldpXEJYNlEwy7NamQeXvM080GKvZzsLSPUcWe05eI42Fvzk8scvotf3EDdamPDSIaSOH JZNms8vK+QsbWb6cSZNO8i8y7IRm6RGF1eeFqdrO4KMU42sMm1feeusT7QKvY4FemDsZ 09eHe0JlukXW2YZ++ENmAUzwG6gERCPml6q0a8JvbxbIAgxLc9ztKjU2Fa8xiuxD2qPC xY6Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Z95Un8nI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.38 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from fry.vger.email (fry.vger.email. [23.128.96.38]) by mx.google.com with ESMTPS id s63-20020a817742000000b005a7bece8902si2631593ywc.381.2023.10.27.07.40.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 27 Oct 2023 07:40:33 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.38 as permitted sender) client-ip=23.128.96.38; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Z95Un8nI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.38 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by fry.vger.email (Postfix) with ESMTP id 2FE788340310; Fri, 27 Oct 2023 07:40:03 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at fry.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346111AbjJ0OjJ (ORCPT + 99 others); Fri, 27 Oct 2023 10:39:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57818 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1346018AbjJ0OjE (ORCPT ); Fri, 27 Oct 2023 10:39:04 -0400 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4E3418F; Fri, 27 Oct 2023 07:39:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698417541; x=1729953541; h=date:from:to:cc:subject:message-id:references: mime-version:in-reply-to; bh=xfVDUL3C6EU2sR+yZZNhlJ+3NJv0rEqWLQFJQSLp6Pk=; b=Z95Un8nIP2g5kboWzTwtTofMVI/z2wIQxNOO+Zvcd+vh5xecf+AdjqBe eLNJS/qfe8XHAO/xd1r3K7uaBOMlEYONxxHSbRLa22Q4rhyEGGBoKF4SH E2TzHHwTYsseEyQnnizLpdWT0F0o9d9Dx3z92bAtwiPVSr5Bz466fROBA 8T1rswWygQZCxl1fs+vvpSuL7xYgVcnYqdAQjNLRj5XPq7oW1gDA4UXBy 4vQum3xhkJYha5ybCD6apZoqrart2aHEjAaPOkxgYHnIO4+Lmc3Pw2Gjd lsyXIKDPviG313r8d5z6AMOUbZCvh3kYBEJKBSBoI5R6sQVFt979k2mDc A==; X-IronPort-AV: E=McAfee;i="6600,9927,10876"; a="606722" X-IronPort-AV: E=Sophos;i="6.03,256,1694761200"; d="scan'208";a="606722" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Oct 2023 07:39:00 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10876"; a="736094439" X-IronPort-AV: E=Sophos;i="6.03,256,1694761200"; d="scan'208";a="736094439" Received: from dmnassar-mobl.amr.corp.intel.com (HELO desk) ([10.212.203.39]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Oct 2023 07:38:59 -0700 Date: Fri, 27 Oct 2023 07:38:59 -0700 From: Pawan Gupta To: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Peter Zijlstra , Josh Poimboeuf , Andy Lutomirski , Jonathan Corbet , Sean Christopherson , Paolo Bonzini , tony.luck@intel.com, ak@linux.intel.com, tim.c.chen@linux.intel.com, Andrew Cooper , Nikolay Borisov Cc: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kvm@vger.kernel.org, Alyssa Milburn , Daniel Sneddon , antonio.gomez.iglesias@linux.intel.com, Greg Kroah-Hartman , Pawan Gupta Subject: [PATCH v4 4/6] x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key Message-ID: <20231027-delay-verw-v4-4-9a3622d4bcf7@linux.intel.com> X-Mailer: b4 0.12.3 References: <20231027-delay-verw-v4-0-9a3622d4bcf7@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20231027-delay-verw-v4-0-9a3622d4bcf7@linux.intel.com> X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on fry.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (fry.vger.email [0.0.0.0]); Fri, 27 Oct 2023 07:40:03 -0700 (PDT) The VERW mitigation at exit-to-user is enabled via a static branch mds_user_clear. This static branch is never toggled after boot, and can be safely replaced with an ALTERNATIVE() which is convenient to use in asm. Switch to ALTERNATIVE() to use the VERW mitigation late in exit-to-user path. Also remove the now redundant VERW in exc_nmi() and arch_exit_to_user_mode(). Signed-off-by: Pawan Gupta --- Documentation/arch/x86/mds.rst | 38 +++++++++++++++++++++++++----------- arch/x86/include/asm/entry-common.h | 1 - arch/x86/include/asm/nospec-branch.h | 12 ------------ arch/x86/kernel/cpu/bugs.c | 15 ++++++-------- arch/x86/kernel/nmi.c | 2 -- arch/x86/kvm/vmx/vmx.c | 2 +- 6 files changed, 34 insertions(+), 36 deletions(-) diff --git a/Documentation/arch/x86/mds.rst b/Documentation/arch/x86/mds.rst index e73fdff62c0a..a5c5091b9ccd 100644 --- a/Documentation/arch/x86/mds.rst +++ b/Documentation/arch/x86/mds.rst @@ -95,6 +95,9 @@ The kernel provides a function to invoke the buffer clearing: mds_clear_cpu_buffers() +Also macro CLEAR_CPU_BUFFERS is meant to be used in ASM late in exit-to-user +path. This macro works for cases where GPRs can't be clobbered. + The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state (idle) transitions. @@ -138,17 +141,30 @@ Mitigation points When transitioning from kernel to user space the CPU buffers are flushed on affected CPUs when the mitigation is not disabled on the kernel - command line. The migitation is enabled through the static key - mds_user_clear. - - The mitigation is invoked in prepare_exit_to_usermode() which covers - all but one of the kernel to user space transitions. The exception - is when we return from a Non Maskable Interrupt (NMI), which is - handled directly in do_nmi(). - - (The reason that NMI is special is that prepare_exit_to_usermode() can - enable IRQs. In NMI context, NMIs are blocked, and we don't want to - enable IRQs with NMIs blocked.) + command line. The mitigation is enabled through the feature flag + X86_FEATURE_CLEAR_CPU_BUF. + + The mitigation is invoked just before transitioning to userspace after + user registers are restored. This is done to minimize the window in + which kernel data could be accessed after VERW e.g. via an NMI after + VERW. + + **Corner case not handled** + Interrupts returning to kernel don't clear CPUs buffers since the + exit-to-user path is expected to do that anyways. But, there could be + a case when an NMI is generated in kernel after the exit-to-user path + has cleared the buffers. This case is not handled and NMI returning to + kernel don't clear CPU buffers because: + + 1. It is rare to get an NMI after VERW, but before returning to userspace. + 2. For an unprivileged user, there is no known way to make that NMI + less rare or target it. + 3. It would take a large number of these precisely-timed NMIs to mount + an actual attack. There's presumably not enough bandwidth. + 4. The NMI in question occurs after a VERW, i.e. when user state is + restored and most interesting data is already scrubbed. Whats left + is only the data that NMI touches, and that may or may not be of + any interest. 2. C-State transition diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h index ce8f50192ae3..7e523bb3d2d3 100644 --- a/arch/x86/include/asm/entry-common.h +++ b/arch/x86/include/asm/entry-common.h @@ -91,7 +91,6 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs, static __always_inline void arch_exit_to_user_mode(void) { - mds_user_clear_cpu_buffers(); amd_clear_divider(); } #define arch_exit_to_user_mode arch_exit_to_user_mode diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 005e69f93115..12b8e86678bf 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -553,7 +553,6 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_stibp); DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb); DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb); -DECLARE_STATIC_KEY_FALSE(mds_user_clear); DECLARE_STATIC_KEY_FALSE(mds_idle_clear); DECLARE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush); @@ -585,17 +584,6 @@ static __always_inline void mds_clear_cpu_buffers(void) asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc"); } -/** - * mds_user_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability - * - * Clear CPU buffers if the corresponding static key is enabled - */ -static __always_inline void mds_user_clear_cpu_buffers(void) -{ - if (static_branch_likely(&mds_user_clear)) - mds_clear_cpu_buffers(); -} - /** * mds_idle_clear_cpu_buffers - Mitigation for MDS vulnerability * diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 10499bcd4e39..00aab0c0937f 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -111,9 +111,6 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_ibpb); /* Control unconditional IBPB in switch_mm() */ DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb); -/* Control MDS CPU buffer clear before returning to user space */ -DEFINE_STATIC_KEY_FALSE(mds_user_clear); -EXPORT_SYMBOL_GPL(mds_user_clear); /* Control MDS CPU buffer clear before idling (halt, mwait) */ DEFINE_STATIC_KEY_FALSE(mds_idle_clear); EXPORT_SYMBOL_GPL(mds_idle_clear); @@ -252,7 +249,7 @@ static void __init mds_select_mitigation(void) if (!boot_cpu_has(X86_FEATURE_MD_CLEAR)) mds_mitigation = MDS_MITIGATION_VMWERV; - static_branch_enable(&mds_user_clear); + setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF); if (!boot_cpu_has(X86_BUG_MSBDS_ONLY) && (mds_nosmt || cpu_mitigations_auto_nosmt())) @@ -356,7 +353,7 @@ static void __init taa_select_mitigation(void) * For guests that can't determine whether the correct microcode is * present on host, enable the mitigation for UCODE_NEEDED as well. */ - static_branch_enable(&mds_user_clear); + setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF); if (taa_nosmt || cpu_mitigations_auto_nosmt()) cpu_smt_disable(false); @@ -424,7 +421,7 @@ static void __init mmio_select_mitigation(void) */ if (boot_cpu_has_bug(X86_BUG_MDS) || (boot_cpu_has_bug(X86_BUG_TAA) && boot_cpu_has(X86_FEATURE_RTM))) - static_branch_enable(&mds_user_clear); + setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF); else static_branch_enable(&mmio_stale_data_clear); @@ -484,12 +481,12 @@ static void __init md_clear_update_mitigation(void) if (cpu_mitigations_off()) return; - if (!static_key_enabled(&mds_user_clear)) + if (!boot_cpu_has(X86_FEATURE_CLEAR_CPU_BUF)) goto out; /* - * mds_user_clear is now enabled. Update MDS, TAA and MMIO Stale Data - * mitigation, if necessary. + * X86_FEATURE_CLEAR_CPU_BUF is now enabled. Update MDS, TAA and MMIO + * Stale Data mitigation, if necessary. */ if (mds_mitigation == MDS_MITIGATION_OFF && boot_cpu_has_bug(X86_BUG_MDS)) { diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c index a0c551846b35..ebfff8dca661 100644 --- a/arch/x86/kernel/nmi.c +++ b/arch/x86/kernel/nmi.c @@ -551,8 +551,6 @@ DEFINE_IDTENTRY_RAW(exc_nmi) if (this_cpu_dec_return(nmi_state)) goto nmi_restart; - if (user_mode(regs)) - mds_user_clear_cpu_buffers(); if (IS_ENABLED(CONFIG_NMI_CHECK_CPU)) { WRITE_ONCE(nsp->idt_seq, nsp->idt_seq + 1); WARN_ON_ONCE(nsp->idt_seq & 0x1); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 72e3943f3693..24e8694b83fc 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7229,7 +7229,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu, /* L1D Flush includes CPU buffer clear to mitigate MDS */ if (static_branch_unlikely(&vmx_l1d_should_flush)) vmx_l1d_flush(vcpu); - else if (static_branch_unlikely(&mds_user_clear)) + else if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF)) mds_clear_cpu_buffers(); else if (static_branch_unlikely(&mmio_stale_data_clear) && kvm_arch_has_assigned_device(vcpu->kvm)) -- 2.34.1