Received: by 2002:a05:6359:c8b:b0:c7:702f:21d4 with SMTP id go11csp3487739rwb; Fri, 30 Sep 2022 04:27:57 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4ueQc8YK8TOy8ECk9zk5av8FLGZfcsxQGRVjLmlN8+ePHUMkxxSdwpwnneF3HjEu52tsN1 X-Received: by 2002:a05:6402:35cb:b0:451:6a0a:6688 with SMTP id z11-20020a05640235cb00b004516a0a6688mr7440171edc.415.1664537277586; Fri, 30 Sep 2022 04:27:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664537277; cv=none; d=google.com; s=arc-20160816; b=HpNdZPSc3hKnEgyjEYP6ZaecrsWgZLH21UpA3ODfBAbGUYLKQotkgjn5L675rwRO63 ZxQ1+h4hSBYqAiP8JMmxdcqncHYiqui8c426kQ/rXZemA2LyX47NaH3rL6/AiFK2VcU6 WQsVtiDsfclL0bUFGWKGHfp1PuJA0bzUJ9TWKP7BR2Q0J8626szJ/VeZFi4tHfeUl4Dr T92kH3aHSChTf4qxco8UrwKQZWU1z3bsIvV4tp+dMzvvHSeTd2zpsYkcUKyuSuNn9iQY E9GhjqtMXlvmhvEzdWMAszIWHmb5ABZkHI3B8SPruJdfmEkkmeK5q/gpXhfOCC8pSSq+ CpSQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=RuYsvdGSqytuDCQTaOmi7niidL7jCVYHOzuvhnUoHcg=; b=WWshFR4d7Ac1xjsMR0aKg6zRZ9wFTOe2bg0sCgtfypDa/K62j2hhsttZgbnm6HeC66 BbCK3geTbWFcMDnYKzZ6QEOtu9QKFdpyzbWdrH97g7M8JoAeAWeNjVC4CEaqMqZQa5Ns d7K4nMtuxcpDcJMeEJT1c5AA/KweYm0/JVUuF0VYiEMuKt59vwtEmWW0DfqfNiuOKsf4 15VBZlAkbf6/UYVwNx/b2uNfVR2W0xBp4hIwZK6SPqDhFzQUGJHUem8XsmmCi9AA2C/c xXEPlBlHIveQlCX9ziYdCWI78OWivp+0ztkcZTNBc420T00xjeWuqRH5Fb4unXqYBWwb pF2A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=m6cY7vG5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id s22-20020a50ab16000000b00457463e2dfbsi1501265edc.512.2022.09.30.04.27.31; Fri, 30 Sep 2022 04:27:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=m6cY7vG5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231964AbiI3Kco (ORCPT + 99 others); Fri, 30 Sep 2022 06:32:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53738 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232653AbiI3K1I (ORCPT ); Fri, 30 Sep 2022 06:27:08 -0400 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 129735B7B2; Fri, 30 Sep 2022 03:20:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664533211; x=1696069211; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=cIu6PvsPhZ0UQ/64xg/km6uQfrpF5sEyR8sXlvhDs4Y=; b=m6cY7vG51kYqMbjyjegOxsViZhPW09EEp1LXbrLrdSDMhZJxmWFhzP1o 8jceFltZE4DgIroe0FZC5r1kV6yyaCE6tLZxAnNbiV3s16EYTrutlM1y1 4l4EDDEXq+wmrNL0m08odmygOxxqZfmwSxjg32RFAQ+94wRqykOpxIYgS MdJmVNH3JUC+iBXp4hz9jS1ye7cBM5MdqrmTo6/tGQ6caaAZx6e/RMMI8 J8IPEww4qAP4dYub8kucp+U7YxG4XVtRyvQZciVoRpm9j8o/1WOq04hH5 RVHHOZFFODRZnekp7BXDN7jfXAVtQf1U7ORQf1WeQ3IPX9nFeoV0bp7pU w==; X-IronPort-AV: E=McAfee;i="6500,9779,10485"; a="328540187" X-IronPort-AV: E=Sophos;i="5.93,358,1654585200"; d="scan'208";a="328540187" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 03:19:09 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10485"; a="726807849" X-IronPort-AV: E=Sophos;i="5.93,358,1654585200"; d="scan'208";a="726807849" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Sep 2022 03:19:08 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar Subject: [PATCH v9 103/105] Documentation/virt/kvm: Document on Trust Domain Extensions(TDX) Date: Fri, 30 Sep 2022 03:18:37 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-7.2 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Isaku Yamahata Add documentation to Intel Trusted Domain Extensions(TDX) support. Signed-off-by: Isaku Yamahata --- Documentation/virt/kvm/api.rst | 9 +- Documentation/virt/kvm/index.rst | 2 + Documentation/virt/kvm/intel-tdx.rst | 345 +++++++++++++++++++++++++++ 3 files changed, 355 insertions(+), 1 deletion(-) create mode 100644 Documentation/virt/kvm/intel-tdx.rst diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index ebf5d2177933..3aa64ba9bb2b 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -1426,6 +1426,9 @@ It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl. The KVM_SET_MEMORY_REGION does not allow fine grained control over memory allocation and is deprecated. +For TDX guest, deleting/moving memory region loses guest memory contents. +Read only region isn't supported. Only as-id 0 is supported. + 4.36 KVM_SET_TSS_ADDR --------------------- @@ -4712,7 +4715,7 @@ H_GET_CPU_CHARACTERISTICS hypercall. :Capability: basic :Architectures: x86 -:Type: vm +:Type: vm ioctl, vcpu ioctl :Parameters: an opaque platform specific structure (in/out) :Returns: 0 on success; -1 on error @@ -4724,6 +4727,10 @@ Currently, this ioctl is used for issuing Secure Encrypted Virtualization (SEV) commands on AMD Processors. The SEV commands are defined in Documentation/virt/kvm/x86/amd-memory-encryption.rst. +Currently, this ioctl is used for issuing Trusted Domain Extensions +(TDX) commands on Intel Processors. The TDX commands are defined in +Documentation/virt/kvm/intel-tdx.rst. + 4.111 KVM_MEMORY_ENCRYPT_REG_REGION ----------------------------------- diff --git a/Documentation/virt/kvm/index.rst b/Documentation/virt/kvm/index.rst index e0a2c74e1043..cdb8b43ce797 100644 --- a/Documentation/virt/kvm/index.rst +++ b/Documentation/virt/kvm/index.rst @@ -18,3 +18,5 @@ KVM locking vcpu-requests review-checklist + + intel-tdx diff --git a/Documentation/virt/kvm/intel-tdx.rst b/Documentation/virt/kvm/intel-tdx.rst new file mode 100644 index 000000000000..6999b0f4f6c2 --- /dev/null +++ b/Documentation/virt/kvm/intel-tdx.rst @@ -0,0 +1,345 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=================================== +Intel Trust Domain Extensions (TDX) +=================================== + +Overview +======== +TDX stands for Trust Domain Extensions which isolates VMs from +the virtual-machine manager (VMM)/hypervisor and any other software on +the platform. For details, see the specifications [1]_, whitepaper [2]_, +architectural extensions specification [3]_, module documentation [4]_, +loader interface specification [5]_, guest-hypervisor communication +interface [6]_, virtual firmware design guide [7]_, and other resources +([8]_, [9]_, [10]_, [11]_, and [12]_). + + +API description +=============== + +KVM_MEMORY_ENCRYPT_OP +--------------------- +:Type: vm ioctl, vcpu ioctl + +For TDX operations, KVM_MEMORY_ENCRYPT_OP is re-purposed to be generic +ioctl with TDX specific sub ioctl command. + +:: + + /* Trust Domain eXtension sub-ioctl() commands. */ + enum kvm_tdx_cmd_id { + KVM_TDX_CAPABILITIES = 0, + KVM_TDX_INIT_VM, + KVM_TDX_INIT_VCPU, + KVM_TDX_INIT_MEM_REGION, + KVM_TDX_FINALIZE_VM, + + KVM_TDX_CMD_NR_MAX, + }; + + struct kvm_tdx_cmd { + /* enum kvm_tdx_cmd_id */ + __u32 id; + /* flags for sub-commend. If sub-command doesn't use this, set zero. */ + __u32 flags; + /* + * data for each sub-command. An immediate or a pointer to the actual + * data in process virtual address. If sub-command doesn't use it, + * set zero. + */ + __u64 data; + /* + * Auxiliary error code. The sub-command may return TDX SEAMCALL + * status code in addition to -Exxx. + * Defined for consistency with struct kvm_sev_cmd. + */ + __u64 error; + /* Reserved: Defined for consistency with struct kvm_sev_cmd. */ + __u64 unused; + }; + +KVM_TDX_CAPABILITIES +-------------------- +:Type: vm ioctl + +Subset of TDSYSINFO_STRCUCT retrieved by TDH.SYS.INFO TDX SEAM call will be +returned. Which describes about Intel TDX module. + +- id: KVM_TDX_CAPABILITIES +- flags: must be 0 +- data: pointer to struct kvm_tdx_capabilities +- error: must be 0 +- unused: must be 0 + +:: + + struct kvm_tdx_cpuid_config { + __u32 leaf; + __u32 sub_leaf; + __u32 eax; + __u32 ebx; + __u32 ecx; + __u32 edx; + }; + + struct kvm_tdx_capabilities { + __u64 attrs_fixed0; + __u64 attrs_fixed1; + __u64 xfam_fixed0; + __u64 xfam_fixed1; + + __u32 nr_cpuid_configs; + struct kvm_tdx_cpuid_config cpuid_configs[0]; + }; + + +KVM_TDX_INIT_VM +--------------- +:Type: vm ioctl + +Does additional VM initialization specific to TDX which corresponds to +TDH.MNG.INIT TDX SEAM call. + +- id: KVM_TDX_INIT_VM +- flags: must be 0 +- data: pointer to struct kvm_tdx_init_vm +- error: must be 0 +- unused: must be 0 + +:: + + struct kvm_tdx_init_vm { + __u32 max_vcpus; + __u32 reserved; + __u64 attributes; + __u64 cpuid; /* pointer to struct kvm_cpuid2 */ + __u64 mrconfigid[6]; /* sha384 digest */ + __u64 mrowner[6]; /* sha384 digest */ + __u64 mrownerconfig[6]; /* sha348 digest */ + __u64 reserved[43]; /* must be zero for future extensibility */ + }; + + +KVM_TDX_INIT_VCPU +----------------- +:Type: vcpu ioctl + +Does additional VCPU initialization specific to TDX which corresponds to +TDH.VP.INIT TDX SEAM call. + +- id: KVM_TDX_INIT_VCPU +- flags: must be 0 +- data: initial value of the guest TD VCPU RCX +- error: must be 0 +- unused: must be 0 + +KVM_TDX_INIT_MEM_REGION +----------------------- +:Type: vm ioctl + +Encrypt a memory continuous region which corresponding to TDH.MEM.PAGE.ADD +TDX SEAM call. +If KVM_TDX_MEASURE_MEMORY_REGION flag is specified, it also extends measurement +which corresponds to TDH.MR.EXTEND TDX SEAM call. + +- id: KVM_TDX_INIT_VCPU +- flags: flags + currently only KVM_TDX_MEASURE_MEMORY_REGION is defined +- data: pointer to struct kvm_tdx_init_mem_region +- error: must be 0 +- unused: must be 0 + +:: + + #define KVM_TDX_MEASURE_MEMORY_REGION (1UL << 0) + + struct kvm_tdx_init_mem_region { + __u64 source_addr; + __u64 gpa; + __u64 nr_pages; + }; + + +KVM_TDX_FINALIZE_VM +------------------- +:Type: vm ioctl + +Complete measurement of the initial TD contents and mark it ready to run +which corresponds to TDH.MR.FINALIZE + +- id: KVM_TDX_FINALIZE_VM +- flags: must be 0 +- data: must be 0 +- error: must be 0 +- unused: must be 0 + +KVM TDX creation flow +===================== +In addition to KVM normal flow, new TDX ioctls need to be called. The control flow +looks like as follows. + +#. system wide capability check + + * KVM_CAP_VM_TYPES: check if VM type is supported and if TDX_VM_TYPE is + supported. + +#. creating VM + + * KVM_CREATE_VM + * KVM_TDX_CAPABILITIES: query if TDX is supported on the platform. + * KVM_TDX_INIT_VM: pass TDX specific VM parameters. + +#. creating VCPU + + * KVM_CREATE_VCPU + * KVM_TDX_INIT_VCPU: pass TDX specific VCPU parameters. + +#. initializing guest memory + + * allocate guest memory and initialize page same to normal KVM case + In TDX case, parse and load TDVF into guest memory in addition. + * KVM_TDX_INIT_MEM_REGION to add and measure guest pages. + If the pages has contents above, those pages need to be added. + Otherwise the contents will be lost and guest sees zero pages. + * KVM_TDX_FINALIAZE_VM: Finalize VM and measurement + This must be after KVM_TDX_INIT_MEM_REGION. + +#. run vcpu + +Design discussion +================= + +Coexistence of normal(VMX) VM and TD VM +--------------------------------------- +It's required to allow both legacy(normal VMX) VMs and new TD VMs to +coexist. Otherwise the benefits of VM flexibility would be eliminated. +The main issue for it is that the logic of kvm_x86_ops callbacks for +TDX is different from VMX. On the other hand, the variable, +kvm_x86_ops, is global single variable. Not per-VM, not per-vcpu. + +Several points to be considered: + + * No or minimal overhead when TDX is disabled(CONFIG_INTEL_TDX_HOST=n). + * Avoid overhead of indirect call via function pointers. + * Contain the changes under arch/x86/kvm/vmx directory and share logic + with VMX for maintenance. + Even though the ways to operation on VM (VMX instruction vs TDX + SEAM call) is different, the basic idea remains same. So, many + logic can be shared. + * Future maintenance + The huge change of kvm_x86_ops in (near) future isn't expected. + a centralized file is acceptable. + +- Wrapping kvm x86_ops: The current choice + + Introduce dedicated file for arch/x86/kvm/vmx/main.c (the name, + main.c, is just chosen to show main entry points for callbacks.) and + wrapper functions around all the callbacks with + "if (is-tdx) tdx-callback() else vmx-callback()". + + Pros: + + - No major change in common x86 KVM code. The change is (mostly) + contained under arch/x86/kvm/vmx/. + - When TDX is disabled(CONFIG_INTEL_TDX_HOST=n), the overhead is + optimized out. + - Micro optimization by avoiding function pointer. + + Cons: + + - Many boiler plates in arch/x86/kvm/vmx/main.c. + +KVM MMU Changes +--------------- +KVM MMU needs to be enhanced to handle Secure/Shared-EPT. The +high-level execution flow is mostly same to normal EPT case. +EPT violation/misconfiguration -> invoke TDP fault handler -> +resolve TDP fault -> resume execution. (or emulate MMIO) +The difference is, that S-EPT is operated(read/write) via TDX SEAM +call which is expensive instead of direct read/write EPT entry. +One bit of GPA (51 or 47 bit) is repurposed so that it means shared +with host(if set to 1) or private to TD(if cleared to 0). + +- The current implementation + + * Reuse the existing MMU code with minimal update. Because the + execution flow is mostly same. But additional operation, TDX call + for S-EPT, is needed. So add hooks for it to kvm_x86_ops. + * For performance, minimize TDX SEAM call to operate on S-EPT. When + getting corresponding S-EPT pages/entry from faulting GPA, don't + use TDX SEAM call to read S-EPT entry. Instead create shadow copy + in host memory. + Repurpose the existing kvm_mmu_page as shadow copy of S-EPT and + associate S-EPT to it. + * Treats share bit as attributes. mask/unmask the bit where + necessary to keep the existing traversing code works. + Introduce kvm.arch.gfn_shared_mask and use "if (gfn_share_mask)" + for special case. + + * 0 : for non-TDX case + * 51 or 47 bit set for TDX case. + + Pros: + + - Large code reuse with minimal new hooks. + - Execution path is same. + + Cons: + + - Complicates the existing code. + - Repurpose kvm_mmu_page as shadow of Secure-EPT can be confusing. + +New KVM API, ioctl (sub)command, to manage TD VMs +------------------------------------------------- +Additional KVM API are needed to control TD VMs. The operations on TD +VMs are specific to TDX. + +- Piggyback and repurpose KVM_MEMORY_ENCRYPT_OP + + Although not all operation isn't memory encryption, repupose to get + TDX specific ioctls. + + Pros: + + - No major change in common x86 KVM code. + + Cons: + + - The operations aren't actually memory encryption, but operations + on TD VMs. + +References +========== + +.. [1] TDX specification + https://software.intel.com/content/www/us/en/develop/articles/intel-trust-domain-extensions.html +.. [2] Intel Trust Domain Extensions (Intel TDX) + https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-whitepaper-final9-17.pdf +.. [3] Intel CPU Architectural Extensions Specification + https://software.intel.com/content/dam/develop/external/us/en/documents/intel-tdx-cpu-architectural-specification.pdf +.. [4] Intel TDX Module 1.0 EAS + https://software.intel.com/content/dam/develop/external/us/en/documents/intel-tdx-module-1eas.pdf +.. [5] Intel TDX Loader Interface Specification + https://software.intel.com/content/dam/develop/external/us/en/documents/intel-tdx-seamldr-interface-specification.pdf +.. [6] Intel TDX Guest-Hypervisor Communication Interface + https://software.intel.com/content/dam/develop/external/us/en/documents/intel-tdx-guest-hypervisor-communication-interface.pdf +.. [7] Intel TDX Virtual Firmware Design Guide + https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1. +.. [8] intel public github + + * kvm TDX branch: https://github.com/intel/tdx/tree/kvm + * TDX guest branch: https://github.com/intel/tdx/tree/guest + +.. [9] tdvf + https://github.com/tianocore/edk2-staging/tree/TDVF +.. [10] KVM forum 2020: Intel Virtualization Technology Extensions to + Enable Hardware Isolated VMs + https://osseu2020.sched.com/event/eDzm/intel-virtualization-technology-extensions-to-enable-hardware-isolated-vms-sean-christopherson-intel +.. [11] Linux Security Summit EU 2020: + Architectural Extensions for Hardware Virtual Machine Isolation + to Advance Confidential Computing in Public Clouds - Ravi Sahita + & Jun Nakajima, Intel Corporation + https://osseu2020.sched.com/event/eDOx/architectural-extensions-for-hardware-virtual-machine-isolation-to-advance-confidential-computing-in-public-clouds-ravi-sahita-jun-nakajima-intel-corporation +.. [12] [RFCv2,00/16] KVM protected memory extension + https://lkml.org/lkml/2020/10/20/66 -- 2.25.1