Received: by 2002:a05:6358:e9c4:b0:b2:91dc:71ab with SMTP id hc4csp4119374rwb; Sun, 7 Aug 2022 15:36:06 -0700 (PDT) X-Google-Smtp-Source: AA6agR5xmBvLCSJQOweraW7HFRlAsw0ErsCsGkkaApg9GrOA5+wK7pYuAC0vgubb06QE/IZRcZfZ X-Received: by 2002:a05:6a00:c88:b0:52e:e004:e570 with SMTP id a8-20020a056a000c8800b0052ee004e570mr9628972pfv.2.1659911766209; Sun, 07 Aug 2022 15:36:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1659911766; cv=none; d=google.com; s=arc-20160816; b=jw6HDvm6WNExtno0MJjBHTwi7dTXtXav3FzEVv277NUp7dNCM5T5YraVTuCbnLSnd5 MqbRZuZq7/j6LJcgqipQguqSCET9S7gNdOd2PijSqNGYR9XkGnTqK442epSt8ep6R5Vz 1CKsQUfMigfvl/RknMQbm99FJg7WTLgZhT5l609aQtWo12/i7jG1yIjuGtoQ+R7w/iTy X3Kbom7DqtFOwEYxykRZBtBd9wqw7K+ERy1SInNGDvMaYd6EeJwxk/EDC1iA2KRG4cp9 W5lX4diLMqrGdNzTuRtZp7qSsaQRKVZQ3S8o99plLSLdD82nYlk2ezUNSERq8hlSz5Qu NEQw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=gssjcMuNHXGLAwXaB2UFV3lLw1aqcbvjsVGHRi2uxhE=; b=VNVcwUbuNk6GJ8PZrZ3E5eZCouZVkVy2KoeoB3Tr+/uPTND4pCfAvpJuRzvZoOnRaI xHT2beyZ6bm0HikJWKiVA9HZ3tZpel71XRbTtyBJBZuMHwMeA2TduZw24ldcseLeh8lA lQ1l9YMxqLn1SWWQAn0RmDTF8c/RBQBU7mYh5SxL4iK4p1spNx72c9PXdPX7QbM58qNZ xsZsLlrgp8hb7qnU7lvK5/33WsOGCLu890YNwQO7IyYzS0NA6ViM62lilx438UAXr0Uc UHRwY3vYRmtbKdp3dV+TSYOpYk6//ZK01fgZ02y+B2Vwt30tnoXcJfuvKQZz96/3Nz0i ppwQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=YGha2l1z; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y184-20020a638ac1000000b004089d1b3912si9300014pgd.204.2022.08.07.15.35.52; Sun, 07 Aug 2022 15:36:06 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=YGha2l1z; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236921AbiHGWE7 (ORCPT + 99 others); Sun, 7 Aug 2022 18:04:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48522 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233760AbiHGWCl (ORCPT ); Sun, 7 Aug 2022 18:02:41 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 36F136583; Sun, 7 Aug 2022 15:02:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1659909757; x=1691445757; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3+QnoU7kHGBGWTy3FL8qVIsaiLLvTwE8vMfL10ZxM08=; b=YGha2l1zNx4thHgPQ27MUAymkQJQWvYWXnKlmp7vjdChKs9kQ0ikHNRy fE/3qf0/bSBfO/940M3spDmgvKkWsRrvJRmFpAMlMMgx6WMSEZFrLm0ga znMnDrVJI5JHlCw7jhPXMHDx5lhuDeRsvpkRGtipxQwOzykJN/gkd29AL Q3/Wg7CuNMPnNt7YISLo8bkxIZaQcIV7jyc3PAyuz4DnCiWppcVR6WuI3 CLx1AN0qDMjmTxtX/ofdZyLMzldzXFMvXl5XQBvwmDH250HghDSSYwxF0 Jw2PAeCPDI4vLltZshuIa+7czpoBx8nR75cdM1tTVFgG5q4W0CuSAUcFu Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10432"; a="289224090" X-IronPort-AV: E=Sophos;i="5.93,220,1654585200"; d="scan'208";a="289224090" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Aug 2022 15:02:32 -0700 X-IronPort-AV: E=Sophos;i="5.93,220,1654585200"; d="scan'208";a="663682509" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Aug 2022 15:02:32 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar Subject: [PATCH v8 023/103] KVM: TDX: initialize VM with TDX specific parameters Date: Sun, 7 Aug 2022 15:01:08 -0700 Message-Id: <031bea8db0c579b4866a33faeb85ce4d461dc8a3.1659854790.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-5.0 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Xiaoyao Li TDX requires additional parameters for TDX VM for confidential execution to protect its confidentiality of its memory contents and its CPU state from any other software, including VMM. When creating guest TD VM before creating vcpu, the number of vcpu, TSC frequency (that is same among vcpus. and it can't be changed.) CPUIDs which is emulated by the TDX module. It means guest can trust those CPUIDs. and sha384 values for measurement. Add new subcommand, KVM_TDX_INIT_VM, to pass parameters for TDX guest. It assigns encryption key to the TDX guest for memory encryption. TDX encrypts memory per-guest bases. It assigns device model passes per-VM parameters for the TDX guest. The maximum number of vcpus, tsc frequency (TDX guest has fised VM-wide TSC frequency. not per-vcpu. The TDX guest can not change it.), attributes (production or debug), available extended features (which is reflected into guest XCR0, IA32_XSS MSR), cpuids, sha384 measurements, and etc. This subcommand is called before creating vcpu and KVM_SET_CPUID2, i.e. cpuids configurations aren't available yet. So CPUIDs configuration values needs to be passed in struct kvm_init_vm. It's device model responsibility to make this cpuid config for KVM_TDX_INIT_VM and KVM_SET_CPUID2. Signed-off-by: Xiaoyao Li Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/tdx.h | 3 + arch/x86/include/uapi/asm/kvm.h | 33 +++++ arch/x86/kvm/vmx/tdx.c | 199 ++++++++++++++++++++++++++ arch/x86/kvm/vmx/tdx.h | 22 +++ tools/arch/x86/include/uapi/asm/kvm.h | 33 +++++ 5 files changed, 290 insertions(+) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index a32e8881e758..8a1905ae3ad6 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -89,6 +89,9 @@ static inline long tdx_kvm_hypercall(unsigned int nr, unsigned long p1, #endif /* CONFIG_INTEL_TDX_GUEST && CONFIG_KVM_GUEST */ #ifdef CONFIG_INTEL_TDX_HOST + +/* -1 indicates CPUID leaf with no sub-leaves. */ +#define TDX_CPUID_NO_SUBLEAF ((u32)-1) struct tdx_cpuid_config { u32 leaf; u32 sub_leaf; diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 9effc64e547e..97ce34d746af 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -538,6 +538,7 @@ struct kvm_pmu_event_filter { /* Trust Domain eXtension sub-ioctl() commands. */ enum kvm_tdx_cmd_id { KVM_TDX_CAPABILITIES = 0, + KVM_TDX_INIT_VM, KVM_TDX_CMD_NR_MAX, }; @@ -583,4 +584,36 @@ struct kvm_tdx_capabilities { struct kvm_tdx_cpuid_config cpuid_configs[0]; }; +struct kvm_tdx_init_vm { + __u64 attributes; + __u32 max_vcpus; + __u32 padding; + __u64 mrconfigid[6]; /* sha384 digest */ + __u64 mrowner[6]; /* sha384 digest */ + __u64 mrownerconfig[6]; /* sha348 digest */ + union { + /* + * KVM_TDX_INIT_VM is called before vcpu creation, thus before + * KVM_SET_CPUID2. CPUID configurations needs to be passed. + * + * This configuration supersedes KVM_SET_CPUID{,2}. + * The user space VMM, e.g. qemu, should make them consistent + * with this values. + * sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES(256) + * = 8KB. + */ + struct { + struct kvm_cpuid2 cpuid; + /* 8KB with KVM_MAX_CPUID_ENTRIES. */ + struct kvm_cpuid_entry2 entries[]; + }; + /* + * For future extensibility. + * The size(struct kvm_tdx_init_vm) = 16KB. + * This should be enough given sizeof(TD_PARAMS) = 1024 + */ + __u64 reserved[2028]; + }; +}; + #endif /* _ASM_X86_KVM_H */ diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index d3b9f653da4b..dcd2f460275e 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -424,6 +424,202 @@ int tdx_dev_ioctl(void __user *argp) return 0; } +/* + * cpuid entry lookup in TDX cpuid config way. + * The difference is how to specify index(subleaves). + * Specify index to TDX_CPUID_NO_SUBLEAF for CPUID leaf with no-subleaves. + */ +static const struct kvm_cpuid_entry2 *tdx_find_cpuid_entry( + const struct kvm_cpuid2 *cpuid, u32 function, u32 index) +{ + int i; + + /* In TDX CPU CONFIG, TDX_CPUID_NO_SUBLEAF means index = 0. */ + if (index == TDX_CPUID_NO_SUBLEAF) + index = 0; + + for (i = 0; i < cpuid->nent; i++) { + const struct kvm_cpuid_entry2 *e = &cpuid->entries[i]; + + if (e->function == function && + (e->index == index || + !(e->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX))) + return e; + } + return NULL; +} + +static int setup_tdparams(struct kvm *kvm, struct td_params *td_params, + struct kvm_tdx_init_vm *init_vm) +{ + const struct kvm_cpuid2 *cpuid = &init_vm->cpuid; + const struct kvm_cpuid_entry2 *entry; + u64 guest_supported_xcr0; + u64 guest_supported_xss; + int max_pa; + int i; + + td_params->max_vcpus = init_vm->max_vcpus; + + td_params->attributes = init_vm->attributes; + if (td_params->attributes & TDX_TD_ATTRIBUTE_PERFMON) { + /* + * TODO: save/restore PMU related registers around TDENTER. + * Once it's done, remove this guard. + */ + pr_warn("TD doesn't support perfmon yet. KVM needs to save/restore " + "host perf registers properly.\n"); + return -EOPNOTSUPP; + } + + for (i = 0; i < tdx_caps.nr_cpuid_configs; i++) { + const struct tdx_cpuid_config *config = &tdx_caps.cpuid_configs[i]; + const struct kvm_cpuid_entry2 *entry = + tdx_find_cpuid_entry(cpuid, config->leaf, config->sub_leaf); + struct tdx_cpuid_value *value = &td_params->cpuid_values[i]; + + if (!entry) + continue; + + value->eax = entry->eax & config->eax; + value->ebx = entry->ebx & config->ebx; + value->ecx = entry->ecx & config->ecx; + value->edx = entry->edx & config->edx; + } + + max_pa = 36; + entry = tdx_find_cpuid_entry(cpuid, 0x80000008, 0); + if (entry) + max_pa = entry->eax & 0xff; + + td_params->eptp_controls = VMX_EPTP_MT_WB; + /* + * No CPU supports 4-level && max_pa > 48. + * "5-level paging and 5-level EPT" section 4.1 4-level EPT + * "4-level EPT is limited to translating 48-bit guest-physical + * addresses." + * cpu_has_vmx_ept_5levels() check is just in case. + */ + if (cpu_has_vmx_ept_5levels() && max_pa > 48) { + td_params->eptp_controls |= VMX_EPTP_PWL_5; + td_params->exec_controls |= TDX_EXEC_CONTROL_MAX_GPAW; + } else { + td_params->eptp_controls |= VMX_EPTP_PWL_4; + } + + /* Setup td_params.xfam */ + entry = tdx_find_cpuid_entry(cpuid, 0xd, 0); + if (entry) + guest_supported_xcr0 = (entry->eax | ((u64)entry->edx << 32)); + else + guest_supported_xcr0 = 0; + guest_supported_xcr0 &= kvm_caps.supported_xcr0; + + entry = tdx_find_cpuid_entry(cpuid, 0xd, 1); + if (entry) + guest_supported_xss = (entry->ecx | ((u64)entry->edx << 32)); + else + guest_supported_xss = 0; + /* PT can be exposed to TD guest regardless of KVM's XSS support */ + guest_supported_xss &= (kvm_caps.supported_xss | XFEATURE_MASK_PT); + + td_params->xfam = guest_supported_xcr0 | guest_supported_xss; + if (td_params->xfam & XFEATURE_MASK_LBR) { + /* + * TODO: once KVM supports LBR(save/restore LBR related + * registers around TDENTER), remove this guard. + */ + pr_warn("TD doesn't support LBR yet. KVM needs to save/restore " + "IA32_LBR_DEPTH properly.\n"); + return -EOPNOTSUPP; + } + + if (td_params->xfam & XFEATURE_MASK_XTILE) { + /* + * TODO: once KVM supports AMX(save/restore AMX related + * registers around TDENTER), remove this guard. + */ + pr_warn("TD doesn't support AMX yet. KVM needs to save/restore " + "IA32_XFD, IA32_XFD_ERR properly.\n"); + return -EOPNOTSUPP; + } + + td_params->tsc_frequency = + TDX_TSC_KHZ_TO_25MHZ(kvm->arch.default_tsc_khz); + +#define MEMCPY_SAME_SIZE(dst, src) \ + do { \ + BUILD_BUG_ON(sizeof(dst) != sizeof(src)); \ + memcpy((dst), (src), sizeof(dst)); \ + } while (0) + + MEMCPY_SAME_SIZE(td_params->mrconfigid, init_vm->mrconfigid); + MEMCPY_SAME_SIZE(td_params->mrowner, init_vm->mrowner); + MEMCPY_SAME_SIZE(td_params->mrownerconfig, init_vm->mrownerconfig); + + return 0; +} + +static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + struct kvm_tdx_init_vm *init_vm = NULL; + struct td_params *td_params = NULL; + struct tdx_module_output out; + int ret; + u64 err; + + BUILD_BUG_ON(sizeof(*init_vm) != 16 * 1024); + BUILD_BUG_ON((sizeof(*init_vm) - offsetof(typeof(*init_vm), entries)) / + sizeof(init_vm->entries[0]) < KVM_MAX_CPUID_ENTRIES); + BUILD_BUG_ON(sizeof(struct td_params) != 1024); + + if (is_td_initialized(kvm)) + return -EINVAL; + + if (cmd->flags) + return -EINVAL; + + init_vm = kzalloc(sizeof(*init_vm), GFP_KERNEL); + if (copy_from_user(init_vm, (void __user *)cmd->data, sizeof(*init_vm))) { + ret = -EFAULT; + goto out; + } + + if (init_vm->max_vcpus > KVM_MAX_VCPUS) { + ret = -EINVAL; + goto out; + } + + td_params = kzalloc(sizeof(struct td_params), GFP_KERNEL); + if (!td_params) { + ret = -ENOMEM; + goto out; + } + + ret = setup_tdparams(kvm, td_params, init_vm); + if (ret) + goto out; + + err = tdh_mng_init(kvm_tdx->tdr.pa, __pa(td_params), &out); + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_MNG_INIT, err, &out); + ret = -EIO; + goto out; + } + + kvm_tdx->tsc_offset = td_tdcs_exec_read64(kvm_tdx, TD_TDCS_EXEC_TSC_OFFSET); + kvm_tdx->attributes = td_params->attributes; + kvm_tdx->xfam = td_params->xfam; + kvm->max_vcpus = td_params->max_vcpus; + +out: + /* kfree() accepts NULL. */ + kfree(init_vm); + kfree(td_params); + return ret; +} + int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { struct kvm_tdx_cmd tdx_cmd; @@ -437,6 +633,9 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) mutex_lock(&kvm->lock); switch (tdx_cmd.id) { + case KVM_TDX_INIT_VM: + r = tdx_td_init(kvm, &tdx_cmd); + break; default: r = -EINVAL; goto out; diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index 8058b6b153f8..3e5782438dc9 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -20,7 +20,11 @@ struct kvm_tdx { struct tdx_td_page tdr; struct tdx_td_page *tdcs; + u64 attributes; + u64 xfam; int hkid; + + u64 tsc_offset; }; struct vcpu_tdx { @@ -50,6 +54,11 @@ static inline struct vcpu_tdx *to_tdx(struct kvm_vcpu *vcpu) return container_of(vcpu, struct vcpu_tdx, vcpu); } +static inline bool is_td_initialized(struct kvm *kvm) +{ + return !!kvm->max_vcpus; +} + static __always_inline void tdvps_vmcs_check(u32 field, u8 bits) { BUILD_BUG_ON_MSG(__builtin_constant_p(field) && (field) & 0x1, @@ -135,6 +144,19 @@ TDX_BUILD_TDVPS_ACCESSORS(64, VMCS, vmcs); TDX_BUILD_TDVPS_ACCESSORS(64, STATE_NON_ARCH, state_non_arch); TDX_BUILD_TDVPS_ACCESSORS(8, MANAGEMENT, management); +static __always_inline u64 td_tdcs_exec_read64(struct kvm_tdx *kvm_tdx, u32 field) +{ + struct tdx_module_output out; + u64 err; + + err = tdh_mng_rd(kvm_tdx->tdr.pa, TDCS_EXEC(field), &out); + if (unlikely(err)) { + pr_err("TDH_MNG_RD[EXEC.0x%x] failed: 0x%llx\n", field, err); + return 0; + } + return out.r8; +} + #else static inline int tdx_module_setup(void) { return -ENODEV; }; diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h index ca85a070ac19..965a1c2e347d 100644 --- a/tools/arch/x86/include/uapi/asm/kvm.h +++ b/tools/arch/x86/include/uapi/asm/kvm.h @@ -532,6 +532,7 @@ struct kvm_pmu_event_filter { /* Trust Domain eXtension sub-ioctl() commands. */ enum kvm_tdx_cmd_id { KVM_TDX_CAPABILITIES = 0, + KVM_TDX_INIT_VM, KVM_TDX_CMD_NR_MAX, }; @@ -577,4 +578,36 @@ struct kvm_tdx_capabilities { struct kvm_tdx_cpuid_config cpuid_configs[0]; }; +struct kvm_tdx_init_vm { + __u64 attributes; + __u32 max_vcpus; + __u32 padding; + __u64 mrconfigid[6]; /* sha384 digest */ + __u64 mrowner[6]; /* sha384 digest */ + __u64 mrownerconfig[6]; /* sha348 digest */ + union { + /* + * KVM_TDX_INIT_VM is called before vcpu creation, thus before + * KVM_SET_CPUID2. CPUID configurations needs to be passed. + * + * This configuration supersedes KVM_SET_CPUID{,2}. + * The user space VMM, e.g. qemu, should make them consistent + * with this values. + * sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES(256) + * = 8KB. + */ + struct { + struct kvm_cpuid2 cpuid; + /* 8KB with KVM_MAX_CPUID_ENTRIES. */ + struct kvm_cpuid_entry2 entries[]; + }; + /* + * For future extensibility. + * The size(struct kvm_tdx_init_vm) = 16KB. + * This should be enough given sizeof(TD_PARAMS) = 1024 + */ + __u64 reserved[2028]; + }; +}; + #endif /* _ASM_X86_KVM_H */ -- 2.25.1