Received: by 2002:a05:6358:e9c4:b0:b2:91dc:71ab with SMTP id hc4csp4119156rwb; Sun, 7 Aug 2022 15:35:45 -0700 (PDT) X-Google-Smtp-Source: AA6agR4um6fHhNux3t1zv5CzJIkTCkaFCPtMMDrCSjPVd/HYzwaeB1isU7fEfrMsWeACSHuVOpdG X-Received: by 2002:a05:6a00:bc5:b0:52b:49c9:d26c with SMTP id x5-20020a056a000bc500b0052b49c9d26cmr15867068pfu.73.1659911744851; Sun, 07 Aug 2022 15:35:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1659911744; cv=none; d=google.com; s=arc-20160816; b=hCO555Wg9awZHCD7qzm0X0Y4wPIPEWLAJyqoEstsWsa5U/aJrRRdmpRMiUqfk4eNyM mIs1Wj0msLX3c8gXT0z/10EYlOZN7nbgU/nIgjAKBWLXGp6/TJb+S0IaddwozfAqxxTO 9uJqv4gJeCRehqRJLP/3vFUchBV596vyXqrZCv+9rq1qTpobi/X+3ATpThtXgZCX/FK2 W2kmbwIYccx6TGYNIq/kU4iwtRDTKgpEb8l9gX7+3HH9Iy/qmYN3LQYh3inizTn91u+o SReC38UWZxTlZXZ85TCCK6ovsH2zSP9FPOUHj8mWt7KMuBoGZ/OYvqUtXGnz83mcDacy UskQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Up7k1x5RgYk+WQneh8t0Y0WmrEDYgi18yyEWhHUhMZY=; b=ZQNDy9c/TEWWOdfMAbz/Lvy6Ltjv4fX2YwWMJPUajxxOGP7ajUVhizzkUAdgHRgNyh bNRrVsKb5ynxvd1Ffnga23nNIsZ1CqMrro8yd71wzljaxbl+ej/VfHUceIv59mjRi98z KxMQ69t3BajUEPJeVwlR5dTBP5nh70Xv5BcJlwH50ibjZMuav/npNVFTQpNn9O0L0Cb+ GFxdciZsHrAI8rxn/00ODgQrFwY2PD8BniCJLVgcj0XUM+jv0XDm0ND23t69ekr7y5BE 8Wf+98RGlKRmDSz915pGZN4xDG8MMDuPlrA1pLJMuzetx7Yrh429H2JJnIuFkrK7X4mf xfmQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=HxWY66jv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 6-20020a630806000000b0041b0abf6cd2si8862091pgi.647.2022.08.07.15.35.30; Sun, 07 Aug 2022 15:35:44 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=HxWY66jv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237712AbiHGWGG (ORCPT + 99 others); Sun, 7 Aug 2022 18:06:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48876 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234530AbiHGWCu (ORCPT ); Sun, 7 Aug 2022 18:02:50 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 039DA64F7; Sun, 7 Aug 2022 15:02:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1659909759; x=1691445759; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DQrE78lf8SL2z2kgWstG9f1z56ycM0LUNVG5ybdfBsM=; b=HxWY66jvZLUB/xTrYyS25ueG2/GjHsrD3AsOmJj1Jslejd1vxY6Ar18/ zWrMk+gmQNKM+jJy90l9N28kXxlX/mkWrycl6jLWbu5KCCQ75hfBk0iLi TNSjAn24yhE+oB2y9nQVGz3SVcnaK+/MS4mWlzYk0Y3Zb6J8mXmeTRsBn mI6HZTHJ2iInUbddq4FboD/ympXeYzvutA9AXAHctrW/9uzY9hUf+BXlT /K5Mulcfz6gNtC2bfhhGJgsEZgJ/nX2jzBp7tfx73JaktGDdtYcBl2nnt YFnlFL6gdBs2Yaj7kQIfWtPmTY4+suPH0CCbyxVeWV9Z4aBTVrjbyYHjk g==; X-IronPort-AV: E=McAfee;i="6400,9594,10432"; a="289224097" X-IronPort-AV: E=Sophos;i="5.93,220,1654585200"; d="scan'208";a="289224097" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Aug 2022 15:02:32 -0700 X-IronPort-AV: E=Sophos;i="5.93,220,1654585200"; d="scan'208";a="663682518" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Aug 2022 15:02:32 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar Subject: [PATCH v8 026/103] KVM: TDX: allocate/free TDX vcpu structure Date: Sun, 7 Aug 2022 15:01:11 -0700 Message-Id: <2347416df71bf525bd716a036e92e21a2e2a7520.1659854790.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-5.0 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Isaku Yamahata The next step of TDX guest creation is to create vcpu. Allocate TDX vcpu structures, initialize it. Allocate pages of TDX vcpu for the TDX module. In the case of the conventional case, cpuid is empty at the initialization. and cpuid is configured after the vcpu initialization. Because TDX supports only X2APIC mode, cpuid is forcibly initialized to support X2APIC on the vcpu initialization. Signed-off-by: Isaku Yamahata --- arch/x86/kvm/vmx/main.c | 40 +++++++++-- arch/x86/kvm/vmx/tdx.c | 135 +++++++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/x86_ops.h | 8 +++ 3 files changed, 179 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 067f5de56c53..4f4ed4ad65a7 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -73,6 +73,38 @@ static void vt_vm_free(struct kvm *kvm) return tdx_vm_free(kvm); } +static int vt_vcpu_precreate(struct kvm *kvm) +{ + if (is_td(kvm)) + return 0; + + return vmx_vcpu_precreate(kvm); +} + +static int vt_vcpu_create(struct kvm_vcpu *vcpu) +{ + if (is_td_vcpu(vcpu)) + return tdx_vcpu_create(vcpu); + + return vmx_vcpu_create(vcpu); +} + +static void vt_vcpu_free(struct kvm_vcpu *vcpu) +{ + if (is_td_vcpu(vcpu)) + return tdx_vcpu_free(vcpu); + + return vmx_vcpu_free(vcpu); +} + +static void vt_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) +{ + if (is_td_vcpu(vcpu)) + return tdx_vcpu_reset(vcpu, init_event); + + return vmx_vcpu_reset(vcpu, init_event); +} + static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp) { if (!is_td(kvm)) @@ -98,10 +130,10 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .vm_destroy = vt_vm_destroy, .vm_free = vt_vm_free, - .vcpu_precreate = vmx_vcpu_precreate, - .vcpu_create = vmx_vcpu_create, - .vcpu_free = vmx_vcpu_free, - .vcpu_reset = vmx_vcpu_reset, + .vcpu_precreate = vt_vcpu_precreate, + .vcpu_create = vt_vcpu_create, + .vcpu_free = vt_vcpu_free, + .vcpu_reset = vt_vcpu_reset, .prepare_switch_to_guest = vmx_prepare_switch_to_guest, .vcpu_load = vmx_vcpu_load, diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index dcd2f460275e..ee682a65b233 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -6,6 +6,7 @@ #include "capabilities.h" #include "x86_ops.h" #include "tdx.h" +#include "x86.h" #undef pr_fmt #define pr_fmt(fmt) "tdx: " fmt @@ -47,6 +48,11 @@ static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid) return pa | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits); } +static inline bool is_td_vcpu_created(struct vcpu_tdx *tdx) +{ + return tdx->tdvpr.added; +} + static inline bool is_td_created(struct kvm_tdx *kvm_tdx) { return kvm_tdx->tdr.added; @@ -378,6 +384,135 @@ int tdx_vm_init(struct kvm *kvm) return ret; } +int tdx_vcpu_create(struct kvm_vcpu *vcpu) +{ + struct vcpu_tdx *tdx = to_tdx(vcpu); + int ret, i; + + /* TDX only supports x2APIC, which requires an in-kernel local APIC. */ + if (!vcpu->arch.apic) + return -EINVAL; + + fpstate_set_confidential(&vcpu->arch.guest_fpu); + + ret = tdx_alloc_td_page(&tdx->tdvpr); + if (ret) + return ret; + + tdx->tdvpx = kcalloc(tdx_caps.tdvpx_nr_pages, sizeof(*tdx->tdvpx), + GFP_KERNEL_ACCOUNT); + if (!tdx->tdvpx) { + ret = -ENOMEM; + goto free_tdvpr; + } + for (i = 0; i < tdx_caps.tdvpx_nr_pages; i++) { + ret = tdx_alloc_td_page(&tdx->tdvpx[i]); + if (ret) + goto free_tdvpx; + } + + vcpu->arch.efer = EFER_SCE | EFER_LME | EFER_LMA | EFER_NX; + + vcpu->arch.cr0_guest_owned_bits = -1ul; + vcpu->arch.cr4_guest_owned_bits = -1ul; + + vcpu->arch.tsc_offset = to_kvm_tdx(vcpu->kvm)->tsc_offset; + vcpu->arch.l1_tsc_offset = vcpu->arch.tsc_offset; + vcpu->arch.guest_state_protected = + !(to_kvm_tdx(vcpu->kvm)->attributes & TDX_TD_ATTRIBUTE_DEBUG); + + return 0; + +free_tdvpx: + /* @i points at the TDVPX page that failed allocation. */ + for (--i; i >= 0; i--) + free_page(tdx->tdvpx[i].va); + kfree(tdx->tdvpx); +free_tdvpr: + free_page(tdx->tdvpr.va); + + return ret; +} + +void tdx_vcpu_free(struct kvm_vcpu *vcpu) +{ + struct vcpu_tdx *tdx = to_tdx(vcpu); + int i; + + /* Can't reclaim or free pages if teardown failed. */ + if (is_hkid_assigned(to_kvm_tdx(vcpu->kvm))) + return; + + for (i = 0; i < tdx_caps.tdvpx_nr_pages; i++) + tdx_reclaim_td_page(&tdx->tdvpx[i]); + kfree(tdx->tdvpx); + tdx_reclaim_td_page(&tdx->tdvpr); +} + +void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm); + struct vcpu_tdx *tdx = to_tdx(vcpu); + struct msr_data apic_base_msr; + u64 err; + int i; + + /* TDX doesn't support INIT event. */ + if (WARN_ON(init_event)) + goto td_bugged; + if (WARN_ON(is_td_vcpu_created(tdx))) + goto td_bugged; + + err = tdh_vp_create(kvm_tdx->tdr.pa, tdx->tdvpr.pa); + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_VP_CREATE, err, NULL); + goto td_bugged; + } + tdx_mark_td_page_added(&tdx->tdvpr); + + for (i = 0; i < tdx_caps.tdvpx_nr_pages; i++) { + err = tdh_vp_addcx(tdx->tdvpr.pa, tdx->tdvpx[i].pa); + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_VP_ADDCX, err, NULL); + goto td_bugged; + } + tdx_mark_td_page_added(&tdx->tdvpx[i]); + } + + if (!vcpu->arch.cpuid_entries) { + /* + * On cpu creation, cpuid entry is blank. Forcibly enable + * X2APIC feature to allow X2APIC. + */ + struct kvm_cpuid_entry2 *e; + + e = kvmalloc_array(1, sizeof(*e), GFP_KERNEL_ACCOUNT); + *e = (struct kvm_cpuid_entry2) { + .function = 1, /* Features for X2APIC */ + .index = 0, + .eax = 0, + .ebx = 0, + .ecx = 1ULL << 21, /* X2APIC */ + .edx = 0, + }; + vcpu->arch.cpuid_entries = e; + vcpu->arch.cpuid_nent = 1; + } + apic_base_msr.data = APIC_DEFAULT_PHYS_BASE | LAPIC_MODE_X2APIC; + if (kvm_vcpu_is_reset_bsp(vcpu)) + apic_base_msr.data |= MSR_IA32_APICBASE_BSP; + apic_base_msr.host_initiated = true; + if (WARN_ON(kvm_set_apic_base(vcpu, &apic_base_msr))) + goto td_bugged; + + vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; + + return; + +td_bugged: + vcpu->kvm->vm_bugged = true; +} + int tdx_dev_ioctl(void __user *argp) { struct kvm_tdx_capabilities __user *user_caps; diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index f0fe40c7ac34..b98bbcd9ef42 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -138,6 +138,10 @@ int tdx_vm_init(struct kvm *kvm); void tdx_mmu_release_hkid(struct kvm *kvm); void tdx_vm_free(struct kvm *kvm); +int tdx_vcpu_create(struct kvm_vcpu *vcpu); +void tdx_vcpu_free(struct kvm_vcpu *vcpu); +void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event); + int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); #else static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return 0; } @@ -150,6 +154,10 @@ static inline void tdx_mmu_release_hkid(struct kvm *kvm) {} static inline void tdx_flush_shadow_all_private(struct kvm *kvm) {} static inline void tdx_vm_free(struct kvm *kvm) {} +static inline int tdx_vcpu_create(struct kvm_vcpu *vcpu) { return -EOPNOTSUPP; } +static inline void tdx_vcpu_free(struct kvm_vcpu *vcpu) {} +static inline void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) {} + static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; } #endif -- 2.25.1