From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, Kai Huang, chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com, Binbin Wu
Subject: [PATCH v19 058/130] KVM: x86/mmu: Add a private pointer to struct kvm_mmu_page
Date: Mon, 26 Feb 2024 00:26:00 -0800
Message-Id: <9d86b5a2787d20ffb5a58f86e43601a660521f16.1708933498.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

For a private GPA, the CPU refers to a private page table whose contents
are encrypted.  Dedicated APIs must be used to operate on it (e.g. to
update/read a PTE entry), and they are expensive.  When KVM resolves a
KVM page fault, it walks the page tables.  To reuse the existing KVM MMU
code and to mitigate the heavy cost of directly walking the private page
table, allocate one more page for a dummy page table that the KVM MMU
code can walk directly.  Resolve the KVM page fault with the existing
code, then perform the additional operations necessary for the private
page table.  To distinguish such cases, the existing KVM page table is
called a shared page table (i.e. not associated with a private page
table), and a page table with an associated private page table is called
a private page table.  The relationship is depicted below.

Add a private pointer to struct kvm_mmu_page for the private page table,
and add helper functions to allocate/initialize/free a private page
table page.

              KVM page fault                      |
                     |                            |
                     |                            |
                     V                            |
        -------------+----------                  |
        |                      |                  |
        V                      V                  |
    shared GPA            private GPA             |
        |                      |                  |
        V                      V                  |
  shared PT root         dummy PT root            |   private PT root
        |                      |                  |          |
        V                      V                  |          V
    shared PT             dummy PT ----propagate---->   private PT
        |                      |                  |          |
        |                      \-----------------+------\    |
        |                                        |      |    |
        V                                        |      V    V
  shared guest page                              |   private guest page
                                                 |
  non-encrypted memory                           |   encrypted memory
                                                 |

PT: page table
- Shared PT is visible to KVM and it is used by the CPU.
- Private PT is used by the CPU but it is invisible to KVM.
- Dummy PT is visible to KVM but not used by the CPU.  It is used to
  propagate PT changes to the actual private PT, which is used by the
  CPU.

Signed-off-by: Isaku Yamahata
Reviewed-by: Binbin Wu
---
v19:
- typo in the comment in kvm_mmu_alloc_private_spt()
- drop CONFIG_KVM_MMU_PRIVATE
---
 arch/x86/include/asm/kvm_host.h |  5 +++
 arch/x86/kvm/mmu/mmu.c          |  7 ++++
 arch/x86/kvm/mmu/mmu_internal.h | 63 ++++++++++++++++++++++++++++++---
 arch/x86/kvm/mmu/tdp_mmu.c      |  1 +
 4 files changed, 72 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index dcc6f7c38a83..efd3fda1c177 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -825,6 +825,11 @@ struct kvm_vcpu_arch {
 	struct kvm_mmu_memory_cache mmu_shadow_page_cache;
 	struct kvm_mmu_memory_cache mmu_shadowed_info_cache;
 	struct kvm_mmu_memory_cache mmu_page_header_cache;
+	/*
+	 * This cache is to allocate private page table. E.g. Secure-EPT used
+	 * by the TDX module.
+	 */
+	struct kvm_mmu_memory_cache mmu_private_spt_cache;
 
 	/*
 	 * QEMU userspace and the guest each have their own FPU state.
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index eeebbc67e42b..0d6d4506ec97 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -685,6 +685,12 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
 				       1 + PT64_ROOT_MAX_LEVEL + PTE_PREFETCH_NUM);
 	if (r)
 		return r;
+	if (kvm_gfn_shared_mask(vcpu->kvm)) {
+		r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_private_spt_cache,
+					       PT64_ROOT_MAX_LEVEL);
+		if (r)
+			return r;
+	}
 	r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadow_page_cache,
 				       PT64_ROOT_MAX_LEVEL);
 	if (r)
@@ -704,6 +710,7 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu)
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache);
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadow_page_cache);
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadowed_info_cache);
+	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_private_spt_cache);
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache);
 }
 
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index e3f54701f98d..002f3f80bf3b 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -101,7 +101,21 @@ struct kvm_mmu_page {
 		int root_count;
 		refcount_t tdp_mmu_root_count;
 	};
-	unsigned int unsync_children;
+	union {
+		struct {
+			unsigned int unsync_children;
+			/*
+			 * Number of writes since the last time traversal
+			 * visited this page.
+			 */
+			atomic_t write_flooding_count;
+		};
+		/*
+		 * Associated private shadow page table, e.g. Secure-EPT page
+		 * passed to the TDX module.
+		 */
+		void *private_spt;
+	};
 	union {
 		struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */
 		tdp_ptep_t ptep;
@@ -124,9 +138,6 @@ struct kvm_mmu_page {
 	int clear_spte_count;
 #endif
 
-	/* Number of writes since the last time traversal visited this page. */
-	atomic_t write_flooding_count;
-
 #ifdef CONFIG_X86_64
 	/* Used for freeing the page asynchronously if it is a TDP MMU page.
 	 */
 	struct rcu_head rcu_head;
@@ -150,6 +161,50 @@ static inline bool is_private_sp(const struct kvm_mmu_page *sp)
 	return kvm_mmu_page_role_is_private(sp->role);
 }
 
+static inline void *kvm_mmu_private_spt(struct kvm_mmu_page *sp)
+{
+	return sp->private_spt;
+}
+
+static inline void kvm_mmu_init_private_spt(struct kvm_mmu_page *sp, void *private_spt)
+{
+	sp->private_spt = private_spt;
+}
+
+static inline void kvm_mmu_alloc_private_spt(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+{
+	bool is_root = vcpu->arch.root_mmu.root_role.level == sp->role.level;
+
+	KVM_BUG_ON(!kvm_mmu_page_role_is_private(sp->role), vcpu->kvm);
+	if (is_root)
+		/*
+		 * Because TDX module assigns root Secure-EPT page and set it to
+		 * Secure-EPTP when TD vcpu is created, secure page table for
+		 * root isn't needed.
+		 */
+		sp->private_spt = NULL;
+	else {
+		/*
+		 * Because the TDX module doesn't trust VMM and initializes
+		 * the pages itself, KVM doesn't initialize them.  Allocate
+		 * pages with garbage and give them to the TDX module.
+		 */
+		sp->private_spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_private_spt_cache);
+		/*
+		 * Because mmu_private_spt_cache is topped up before starting
+		 * kvm page fault resolving, the allocation above shouldn't
+		 * fail.
+		 */
+		WARN_ON_ONCE(!sp->private_spt);
+	}
+}
+
+static inline void kvm_mmu_free_private_spt(struct kvm_mmu_page *sp)
+{
+	if (sp->private_spt)
+		free_page((unsigned long)sp->private_spt);
+}
+
 static inline bool kvm_mmu_page_ad_need_write_protect(struct kvm_mmu_page *sp)
 {
 	/*
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 87233b3ceaef..d47f0daf1b03 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -53,6 +53,7 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
 
 static void tdp_mmu_free_sp(struct kvm_mmu_page *sp)
 {
+	kvm_mmu_free_private_spt(sp);
 	free_page((unsigned long)sp->spt);
 	kmem_cache_free(mmu_page_header_cache, sp);
 }
-- 
2.25.1