From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Kai Huang, Zhi Wang, chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com
Subject: [PATCH v15 023/115] KVM: TDX: Refuse to unplug the last cpu on the package
Date: Tue, 25 Jul 2023 15:13:34 -0700

From: Isaku Yamahata

Reclaiming a TDX HKID (i.e. when deleting a guest TD) requires calling
TDH.PHYMEM.PAGE.WBINVD on all packages. If any TDX HKID is active, refuse
to offline the last online CPU of a package so that every package keeps at
least one CPU online. Add an arch callback for CPU offline.

Because the TDX 1.0 spec doesn't support suspend, this also refuses suspend
while TDs are running. If no TD is running, suspend is allowed.

Suggested-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/vmx/main.c            |  1 +
 arch/x86/kvm/vmx/tdx.c             | 44 +++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/x86_ops.h         |  2 ++
 arch/x86/kvm/x86.c                 |  5 ++++
 include/linux/kvm_host.h           |  1 +
 virt/kvm/kvm_main.c                | 12 ++++++--
 8 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index a574e7eb04f3..d711829fb26a 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -18,6 +18,7 @@ KVM_X86_OP(check_processor_compatibility)
 KVM_X86_OP(hardware_enable)
 KVM_X86_OP(hardware_disable)
 KVM_X86_OP(hardware_unsetup)
+KVM_X86_OP_OPTIONAL_RET0(offline_cpu)
 KVM_X86_OP(has_emulated_msr)
 KVM_X86_OP(vcpu_after_set_cpuid)
 KVM_X86_OP(is_vm_type_supported)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6ce2f512458e..5deb39828820 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1542,6 +1542,7 @@ struct kvm_x86_ops {
 	int (*hardware_enable)(void);
 	void (*hardware_disable)(void);
 	void (*hardware_unsetup)(void);
+	int (*offline_cpu)(void);
 	bool (*has_emulated_msr)(struct kvm *kvm, u32 index);
 	void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index ef08a46b04b3..d9c8becfe749 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -121,6 +121,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.check_processor_compatibility = vmx_check_processor_compat,
 
 	.hardware_unsetup = vt_hardware_unsetup,
+	.offline_cpu = tdx_offline_cpu,
 
 	.hardware_enable = vt_hardware_enable,
 	.hardware_disable = vmx_hardware_disable,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index eb94572631aa..36d687e7c3f3 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -64,6 +64,7 @@ static struct tdx_info tdx_info __ro_after_init;
  */
 static DEFINE_MUTEX(tdx_lock);
 static struct mutex *tdx_mng_key_config_lock;
+static atomic_t nr_configured_hkid;
 
 static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid)
 {
@@ -232,7 +233,8 @@ void tdx_mmu_release_hkid(struct kvm *kvm)
 		pr_err("tdh_mng_key_freeid failed. HKID %d is leaked.\n",
 			kvm_tdx->hkid);
 		return;
-	}
+	} else
+		atomic_dec(&nr_configured_hkid);
 
 free_hkid:
 	tdx_hkid_free(kvm_tdx);
@@ -635,6 +637,8 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params,
 		if (ret)
 			break;
 	}
+	if (!ret)
+		atomic_inc(&nr_configured_hkid);
 	cpus_read_unlock();
 	free_cpumask_var(packages);
 	if (ret) {
@@ -921,3 +925,41 @@ void tdx_hardware_unsetup(void)
 	/* kfree accepts NULL. */
 	kfree(tdx_mng_key_config_lock);
 }
+
+int tdx_offline_cpu(void)
+{
+	int curr_cpu = smp_processor_id();
+	cpumask_var_t packages;
+	int ret = 0;
+	int i;
+
+	/* No TD is running. Allow any cpu to be offline. */
+	if (!atomic_read(&nr_configured_hkid))
+		return 0;
+
+	/*
+	 * In order to reclaim TDX HKID, (i.e. when deleting guest TD), need to
+	 * call TDH.PHYMEM.PAGE.WBINVD on all packages to program all memory
+	 * controller with pconfig. If we have active TDX HKID, refuse to
+	 * offline the last online cpu.
+	 */
+	if (!zalloc_cpumask_var(&packages, GFP_KERNEL))
+		return -ENOMEM;
+	for_each_online_cpu(i) {
+		if (i != curr_cpu)
+			cpumask_set_cpu(topology_physical_package_id(i), packages);
+	}
+	/* Check if this cpu is the last online cpu of this package. */
+	if (!cpumask_test_cpu(topology_physical_package_id(curr_cpu), packages))
+		ret = -EBUSY;
+	free_cpumask_var(packages);
+	if (ret)
+		/*
+		 * Because it's hard for human operator to understand the
+		 * reason, warn it.
+		 */
+#define MSG_ALLPKG_ONLINE \
+	"TDX requires all packages to have an online CPU. Delete all TDs in order to offline all CPUs of a package.\n"
+		pr_warn_ratelimited(MSG_ALLPKG_ONLINE);
+	return ret;
+}
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index fc5348dd20da..9394a7148c5e 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -139,6 +139,7 @@ void vmx_setup_mce(struct kvm_vcpu *vcpu);
 int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops);
 void tdx_hardware_unsetup(void);
 bool tdx_is_vm_type_supported(unsigned long type);
+int tdx_offline_cpu(void);
 
 int tdx_vm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
 int tdx_vm_init(struct kvm *kvm);
@@ -149,6 +150,7 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return -EOPNOTSUPP; }
 static inline void tdx_hardware_unsetup(void) {}
 static inline bool tdx_is_vm_type_supported(unsigned long type) { return false; }
+static inline int tdx_offline_cpu(void) { return 0; }
 
 static inline int tdx_vm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bc7cdd8cbbb0..29a71f722fbb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12259,6 +12259,11 @@ void kvm_arch_hardware_disable(void)
 	drop_user_return_notifiers();
 }
 
+int kvm_arch_offline_cpu(unsigned int cpu)
+{
+	return static_call(kvm_x86_offline_cpu)();
+}
+
 bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
 {
 	return vcpu->kvm->arch.bsp_vcpu_id == vcpu->vcpu_id;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 42b5a2ccc9d1..e8770afce5cf 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1486,6 +1486,7 @@ static inline void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) {}
 int kvm_arch_hardware_enable(void);
 void kvm_arch_hardware_disable(void);
 #endif
+int kvm_arch_offline_cpu(unsigned int cpu);
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
 bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ec2e879bb3f2..60ed0f613bce 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5441,13 +5441,21 @@ static void hardware_disable_nolock(void *junk)
 	__this_cpu_write(hardware_enabled, false);
 }
 
+__weak int kvm_arch_offline_cpu(unsigned int cpu)
+{
+	return 0;
+}
+
 static int kvm_offline_cpu(unsigned int cpu)
 {
+	int r = 0;
+
 	mutex_lock(&kvm_lock);
-	if (kvm_usage_count)
+	r = kvm_arch_offline_cpu(cpu);
+	if (!r && kvm_usage_count)
 		hardware_disable_nolock(NULL);
 	mutex_unlock(&kvm_lock);
-	return 0;
+	return r;
 }
 
 static void hardware_disable_all_nolock(void)
-- 
2.25.1
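
Note for readers of the archive: the decision made by tdx_offline_cpu() can be tried outside the kernel. What follows is only a minimal user-space sketch of the same "last online CPU of a package" rule, not the patch's code. struct cpu_info, may_offline_cpu() and MAX_PACKAGES are hypothetical names invented for illustration; a plain bool array stands in for the cpumask, and a plain int stands in for the nr_configured_hkid atomic counter.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PACKAGES 64	/* hypothetical upper bound for this sketch */

/* Hypothetical stand-in for host topology data: one entry per CPU. */
struct cpu_info {
	int package_id;	/* physical package the CPU sits in */
	bool online;
};

/*
 * Return 0 if curr_cpu may go offline, -EBUSY if it is the last online CPU
 * of its package while at least one HKID is configured, and -EINVAL on
 * malformed input.
 */
int may_offline_cpu(const struct cpu_info *cpus, size_t nr_cpus,
		    size_t curr_cpu, int nr_configured_hkid)
{
	bool has_other_online_cpu[MAX_PACKAGES] = { false };
	size_t i;

	if (curr_cpu >= nr_cpus)
		return -EINVAL;

	/* No HKID is configured, so no per-package flush will be needed. */
	if (nr_configured_hkid == 0)
		return 0;

	/* Mark each package that keeps an online CPU other than curr_cpu. */
	for (i = 0; i < nr_cpus; i++) {
		if (cpus[i].package_id < 0 || cpus[i].package_id >= MAX_PACKAGES)
			return -EINVAL;
		if (i != curr_cpu && cpus[i].online)
			has_other_online_cpu[cpus[i].package_id] = true;
	}

	/* Refuse if nothing else in curr_cpu's package stays online. */
	return has_other_online_cpu[cpus[curr_cpu].package_id] ? 0 : -EBUSY;
}
```

With two packages where package 1 has a single online CPU, asking to offline that CPU yields -EBUSY as long as the HKID count is non-zero, and 0 once the count drops to zero. This mirrors how the patch lets suspend and hot-unplug proceed only when no TD holds an HKID.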