From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Kai Huang, Zhi Wang
Subject: [PATCH v12 016/106] KVM: TDX: Refuse to unplug the last cpu on the package
Date: Mon, 27 Feb 2023 00:22:15 -0800
Message-Id: <2569a9b652be6e69ef69b5a930cdb170f773b4aa.1677484918.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Isaku Yamahata

To reclaim a TDX HKID (i.e. when deleting a guest TD),
TDH.PHYMEM.PAGE.WBINVD must be called on all packages. If any TDX HKID is
active, refuse to offline the last online CPU of a package, so that every
package keeps at least one online CPU. Add an arch callback for CPU offline.

Because the TDX 1.0 spec does not support suspend, this also refuses suspend
while TDs are running. If no TD is running, suspend is allowed.

Suggested-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/vmx/main.c            |  1 +
 arch/x86/kvm/vmx/tdx.c             | 43 +++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/x86_ops.h         |  2 ++
 arch/x86/kvm/x86.c                 |  5 ++++
 include/linux/kvm_host.h           |  1 +
 virt/kvm/kvm_main.c                | 12 +++++++--
 8 files changed, 63 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index c30d2d2ad686..f763981b7dbc 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -18,6 +18,7 @@ KVM_X86_OP(check_processor_compatibility)
 KVM_X86_OP(hardware_enable)
 KVM_X86_OP(hardware_disable)
 KVM_X86_OP(hardware_unsetup)
+KVM_X86_OP_OPTIONAL_RET0(offline_cpu)
 KVM_X86_OP(has_emulated_msr)
 KVM_X86_OP(vcpu_after_set_cpuid)
 KVM_X86_OP(is_vm_type_supported)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f4e82ee3d668..5ca84fd5bd43 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1543,6 +1543,7 @@ struct kvm_x86_ops {
 	int (*hardware_enable)(void);
 	void (*hardware_disable)(void);
 	void (*hardware_unsetup)(void);
+	int (*offline_cpu)(void);
 	bool (*has_emulated_msr)(struct kvm *kvm, u32 index);
 	void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index cdc73c09bf0b..a15ee25d47e0 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -108,6 +108,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.check_processor_compatibility = vmx_check_processor_compat,
 
 	.hardware_unsetup = vt_hardware_unsetup,
+	.offline_cpu = tdx_offline_cpu,
 
 	.hardware_enable = vmx_hardware_enable,
 	.hardware_disable = vmx_hardware_disable,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index b172fcb075b2..8d657bacc050 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -63,6 +63,7 @@ static struct tdx_info tdx_info;
  */
 static DEFINE_MUTEX(tdx_lock);
 static struct mutex *tdx_mng_key_config_lock;
+static atomic_t nr_configured_hkid;
 
 static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid)
 {
@@ -238,7 +239,8 @@ void tdx_mmu_release_hkid(struct kvm *kvm)
 		pr_err("tdh_mng_key_freeid failed. HKID %d is leaked.\n",
 			kvm_tdx->hkid);
 		return;
-	}
+	} else
+		atomic_dec(&nr_configured_hkid);
 
 free_hkid:
 	tdx_hkid_free(kvm_tdx);
@@ -615,6 +617,8 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params)
 		if (ret)
 			break;
 	}
+	if (!ret)
+		atomic_inc(&nr_configured_hkid);
 	cpus_read_unlock();
 	free_cpumask_var(packages);
 	if (ret)
@@ -833,3 +837,40 @@ void tdx_hardware_unsetup(void)
 	/* kfree accepts NULL. */
 	kfree(tdx_mng_key_config_lock);
 }
+
+int tdx_offline_cpu(void)
+{
+	int curr_cpu = smp_processor_id();
+	cpumask_var_t packages;
+	int ret = 0;
+	int i;
+
+	/* No TD is running.  Allow any cpu to be offline. */
+	if (!atomic_read(&nr_configured_hkid))
+		return 0;
+
+	/*
+	 * In order to reclaim TDX HKID, (i.e. when deleting guest TD), need to
+	 * call TDH.PHYMEM.PAGE.WBINVD on all packages to program all memory
+	 * controller with pconfig.  If we have active TDX HKID, refuse to
+	 * offline the last online cpu.
+	 */
+	if (!zalloc_cpumask_var(&packages, GFP_KERNEL))
+		return -ENOMEM;
+	for_each_online_cpu(i) {
+		if (i != curr_cpu)
+			cpumask_set_cpu(topology_physical_package_id(i), packages);
+	}
+	/* Check if this cpu is the last online cpu of this package. */
+	if (!cpumask_test_cpu(topology_physical_package_id(curr_cpu), packages))
+		ret = -EBUSY;
+	free_cpumask_var(packages);
+	if (ret)
+		/*
+		 * Because it's hard for human operator to understand the
+		 * reason, warn it.
+		 */
+		pr_warn_ratelimited("TDX requires all packages to have an online CPU. "
+				    "Delete all TDs in order to offline all CPUs of a package.\n");
+	return ret;
+}
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index e497a5347329..4960e7d58add 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -142,6 +142,7 @@ int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops);
 void tdx_hardware_unsetup(void);
 bool tdx_is_vm_type_supported(unsigned long type);
 int tdx_dev_ioctl(void __user *argp);
+int tdx_offline_cpu(void);
 
 int tdx_vm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
 int tdx_vm_init(struct kvm *kvm);
@@ -153,6 +154,7 @@ static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return -ENOS
 static inline void tdx_hardware_unsetup(void) {}
 static inline bool tdx_is_vm_type_supported(unsigned long type) { return false; }
 static inline int tdx_dev_ioctl(void __user *argp) { return -EOPNOTSUPP; };
+static inline int tdx_offline_cpu(void) { return 0; }
 
 static inline int tdx_vm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap) { return -EINVAL; };
 static inline int tdx_vm_init(struct kvm *kvm) { return -EOPNOTSUPP; }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b2dd5670f552..a2a2d62d490a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12119,6 +12119,11 @@ void kvm_arch_hardware_disable(void)
 	drop_user_return_notifiers();
 }
 
+int kvm_arch_offline_cpu(unsigned int cpu)
+{
+	return static_call(kvm_x86_offline_cpu)();
+}
+
 bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
 {
 	return vcpu->kvm->arch.bsp_vcpu_id == vcpu->vcpu_id;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7b0bef248dd8..f6470338d5fa 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1478,6 +1478,7 @@ static inline void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) {}
 int kvm_arch_hardware_enable(void);
 void kvm_arch_hardware_disable(void);
 #endif
+int kvm_arch_offline_cpu(unsigned int cpu);
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
 bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3f1a55834440..f8495e27d210 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5509,13 +5509,21 @@ static void hardware_disable_nolock(void *junk)
 	__this_cpu_write(hardware_enabled, false);
 }
 
+__weak int kvm_arch_offline_cpu(unsigned int cpu)
+{
+	return 0;
+}
+
 static int kvm_offline_cpu(unsigned int cpu)
 {
+	int r = 0;
+
 	mutex_lock(&kvm_lock);
-	if (kvm_usage_count)
+	r = kvm_arch_offline_cpu(cpu);
+	if (!r && kvm_usage_count)
 		hardware_disable_nolock(NULL);
 	mutex_unlock(&kvm_lock);
-	return 0;
+	return r;
 }
 
 static void hardware_disable_all_nolock(void)
-- 
2.25.1
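
Illustrative note, not part of the patch: the decision made in
tdx_offline_cpu() can be modelled in plain userspace C to make the rule
easier to follow. This is only a sketch under stated assumptions. The
function name model_offline_cpu(), the MODEL_EBUSY constant and the
pkg_of_cpu/online arrays are hypothetical stand-ins for
for_each_online_cpu(), topology_physical_package_id() and the
nr_configured_hkid counter. It also compares package ids directly instead
of building a cpumask of packages as the patch does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's EBUSY errno value. */
#define MODEL_EBUSY 16

/*
 * Userspace model of the check (assumption: not kernel code).
 * Returns 0 if curr_cpu may go offline, or -MODEL_EBUSY if at least one
 * HKID is configured and curr_cpu is the last online CPU of its package.
 */
static int model_offline_cpu(const int *pkg_of_cpu, const bool *online,
			     size_t nr_cpus, size_t curr_cpu,
			     int nr_configured_hkid)
{
	size_t i;

	/* No TD holds an HKID, so any CPU may be offlined. */
	if (nr_configured_hkid == 0)
		return 0;

	/* Look for another online CPU in the same package. */
	for (i = 0; i < nr_cpus; i++) {
		if (i != curr_cpu && online[i] &&
		    pkg_of_cpu[i] == pkg_of_cpu[curr_cpu])
			return 0;
	}

	/* curr_cpu is the last online CPU of its package. */
	return -MODEL_EBUSY;
}
```

With two packages of two CPUs each and one CPU of the second package
already offline, the model allows offlining a CPU of the first package but
refuses the remaining CPU of the second one while an HKID is configured.
Once the HKID count drops to zero, the same request is allowed.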