From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini,
	erdemaktas@google.com, Sean Christopherson, Sagi Shahar,
	David Matlack, Kai Huang, Zhi Wang, chen.bo@intel.com
Subject: [PATCH v14 022/113] KVM: TDX: Refuse to unplug the last cpu on the package
Date: Sun, 28 May 2023 21:19:04 -0700
X-Mailer: git-send-email 2.25.1

From: Isaku Yamahata

To reclaim a TDX HKID (i.e. when deleting a guest TD),
TDH.PHYMEM.PAGE.WBINVD must be called on all packages. If there is an
active TDX HKID, refuse to offline the last online CPU of a package so
that at least one CPU stays online per package. Add an arch callback
for CPU offline.

Because TDX does not support suspend according to the TDX 1.0 spec, this
also refuses suspend while TDs are running. If no TD is running, suspend
is allowed.

Suggested-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/vmx/main.c            |  1 +
 arch/x86/kvm/vmx/tdx.c             | 43 +++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/x86_ops.h         |  2 ++
 arch/x86/kvm/x86.c                 |  5 ++++
 include/linux/kvm_host.h           |  1 +
 virt/kvm/kvm_main.c                | 12 +++++++--
 8 files changed, 63 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 8dc33d14e7da..54fe5baf99ed 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -18,6 +18,7 @@ KVM_X86_OP(check_processor_compatibility)
 KVM_X86_OP(hardware_enable)
 KVM_X86_OP(hardware_disable)
 KVM_X86_OP(hardware_unsetup)
+KVM_X86_OP_OPTIONAL_RET0(offline_cpu)
 KVM_X86_OP(has_emulated_msr)
 KVM_X86_OP(vcpu_after_set_cpuid)
 KVM_X86_OP(is_vm_type_supported)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0d460c425c75..f5a861c8b88b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1540,6 +1540,7 @@ struct kvm_x86_ops {
 	int (*hardware_enable)(void);
 	void (*hardware_disable)(void);
 	void (*hardware_unsetup)(void);
+	int (*offline_cpu)(void);
 	bool (*has_emulated_msr)(struct kvm *kvm, u32 index);
 	void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 2c2afbc46254..fd49e931f6bb 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -123,6 +123,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.check_processor_compatibility = vmx_check_processor_compat,
 
 	.hardware_unsetup = vt_hardware_unsetup,
+	.offline_cpu = tdx_offline_cpu,
 
 	.hardware_enable = vt_hardware_enable,
 	.hardware_disable = vmx_hardware_disable,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 3ce5a9bee613..646fbf2b630e 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -63,6 +63,7 @@ static struct tdx_info tdx_info;
  */
 static DEFINE_MUTEX(tdx_lock);
 static struct mutex *tdx_mng_key_config_lock;
+static atomic_t nr_configured_hkid;
 
 static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid)
 {
@@ -231,7 +232,8 @@ void tdx_mmu_release_hkid(struct kvm *kvm)
 		pr_err("tdh_mng_key_freeid failed. HKID %d is leaked.\n",
 		       kvm_tdx->hkid);
 		return;
-	}
+	} else
+		atomic_dec(&nr_configured_hkid);
 
 free_hkid:
 	tdx_hkid_free(kvm_tdx);
@@ -606,6 +608,8 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params)
 		if (ret)
 			break;
 	}
+	if (!ret)
+		atomic_inc(&nr_configured_hkid);
 	cpus_read_unlock();
 	free_cpumask_var(packages);
 	if (ret) {
@@ -857,3 +861,40 @@ void tdx_hardware_unsetup(void)
 	/* kfree accepts NULL. */
 	kfree(tdx_mng_key_config_lock);
 }
+
+int tdx_offline_cpu(void)
+{
+	int curr_cpu = smp_processor_id();
+	cpumask_var_t packages;
+	int ret = 0;
+	int i;
+
+	/* No TD is running. Allow any cpu to be offline. */
+	if (!atomic_read(&nr_configured_hkid))
+		return 0;
+
+	/*
+	 * In order to reclaim TDX HKID, (i.e. when deleting guest TD), need to
+	 * call TDH.PHYMEM.PAGE.WBINVD on all packages to program all memory
+	 * controller with pconfig. If we have active TDX HKID, refuse to
+	 * offline the last online cpu.
+	 */
+	if (!zalloc_cpumask_var(&packages, GFP_KERNEL))
+		return -ENOMEM;
+	for_each_online_cpu(i) {
+		if (i != curr_cpu)
+			cpumask_set_cpu(topology_physical_package_id(i), packages);
+	}
+	/* Check if this cpu is the last online cpu of this package. */
+	if (!cpumask_test_cpu(topology_physical_package_id(curr_cpu), packages))
+		ret = -EBUSY;
+	free_cpumask_var(packages);
+	if (ret)
+		/*
+		 * Because it's hard for human operator to understand the
+		 * reason, warn it.
+		 */
+		pr_warn_ratelimited("TDX requires all packages to have an online CPU. "
+				    "Delete all TDs in order to offline all CPUs of a package.\n");
+	return ret;
+}
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 8482d6aaebca..b13774ee94a0 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -139,6 +139,7 @@ void vmx_setup_mce(struct kvm_vcpu *vcpu);
 int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops);
 void tdx_hardware_unsetup(void);
 bool tdx_is_vm_type_supported(unsigned long type);
+int tdx_offline_cpu(void);
 
 int tdx_vm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
 int tdx_vm_init(struct kvm *kvm);
@@ -149,6 +150,7 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return -ENOSYS; }
 static inline void tdx_hardware_unsetup(void) {}
 static inline bool tdx_is_vm_type_supported(unsigned long type) { return false; }
+static inline int tdx_offline_cpu(void) { return 0; }
 
 static inline int tdx_vm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap) { return -EINVAL; };
 static inline int tdx_vm_init(struct kvm *kvm) { return -EOPNOTSUPP; }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2a85f437b6ce..3b7b90d3499a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12253,6 +12253,11 @@ void kvm_arch_hardware_disable(void)
 	drop_user_return_notifiers();
 }
 
+int kvm_arch_offline_cpu(unsigned int cpu)
+{
+	return static_call(kvm_x86_offline_cpu)();
+}
+
 bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
 {
 	return vcpu->kvm->arch.bsp_vcpu_id == vcpu->vcpu_id;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 833a80313cee..44d8209d9869 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1475,6 +1475,7 @@ static inline void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) {}
 int kvm_arch_hardware_enable(void);
 void kvm_arch_hardware_disable(void);
 #endif
+int kvm_arch_offline_cpu(unsigned int cpu);
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
 bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 186544f6a515..7078698924bb 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5386,13 +5386,21 @@ static void hardware_disable_nolock(void *junk)
 	__this_cpu_write(hardware_enabled, false);
 }
 
+__weak int kvm_arch_offline_cpu(unsigned int cpu)
+{
+	return 0;
+}
+
 static int kvm_offline_cpu(unsigned int cpu)
 {
+	int r = 0;
+
 	mutex_lock(&kvm_lock);
-	if (kvm_usage_count)
+	r = kvm_arch_offline_cpu(cpu);
+	if (!r && kvm_usage_count)
 		hardware_disable_nolock(NULL);
 	mutex_unlock(&kvm_lock);
-	return 0;
+	return r;
 }
 
 static void hardware_disable_all_nolock(void)
-- 
2.25.1
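
For readers who want to reason about the refusal rule outside the kernel,
here is a minimal userspace sketch of the decision tdx_offline_cpu() makes.
It is not the patch's implementation: pkg_of[], online[],
last_online_cpu_in_package() and model_offline_cpu() are hypothetical names
made up for this illustration, and the sketch compares package IDs directly
instead of building a cpumask as the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace model, not kernel code: pkg_of[i] is the package
 * ID of CPU i and online[i] says whether CPU i is currently online.
 */
static bool last_online_cpu_in_package(const int *pkg_of, const bool *online,
				       size_t nr_cpus, size_t cpu)
{
	assert(cpu < nr_cpus);

	for (size_t i = 0; i < nr_cpus; i++) {
		/* Another online CPU in the same package means cpu is not the last. */
		if (i != cpu && online[i] && pkg_of[i] == pkg_of[cpu])
			return false;
	}
	return true;
}

/*
 * Same decision as described in the patch: with no configured HKID any CPU
 * may go offline; otherwise refuse to offline a package's last online CPU.
 */
static int model_offline_cpu(int nr_configured_hkid, const int *pkg_of,
			     const bool *online, size_t nr_cpus, size_t cpu)
{
	if (nr_configured_hkid == 0)
		return 0;
	if (last_online_cpu_in_package(pkg_of, online, nr_cpus, cpu))
		return -EBUSY;
	return 0;
}
```

With two packages of two CPUs each and one CPU of the second package already
offline, the model refuses to offline the remaining CPU of that package only
while the HKID count is non-zero.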