Date: Thu, 28 Jul 2022 12:41:51 -0700
From: David Matlack
To: isaku.yamahata@intel.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com, Paolo Bonzini
Subject: Re: [PATCH v7 044/102] KVM: x86/mmu: Add a private pointer to struct kvm_mmu_page
References: <392839e09c48ff4e14a598ff6ed8bd56429bf17b.1656366338.git.isaku.yamahata@intel.com>
In-Reply-To: <392839e09c48ff4e14a598ff6ed8bd56429bf17b.1656366338.git.isaku.yamahata@intel.com>

On Mon, Jun 27, 2022 at 02:53:36PM -0700, isaku.yamahata@intel.com wrote:
> From: Isaku Yamahata
>
> For private GPA, CPU refers a private page table whose contents are
> encrypted.  The dedicated APIs to operate on it (e.g. updating/reading its
> PTE entry) are used and their cost is expensive.
>
> When KVM resolves KVM page fault, it walks the page tables.  To reuse the
> existing KVM MMU code and mitigate the heavy cost to directly walk
> encrypted private page table, allocate a more page to mirror the existing
> KVM page table.  Resolve KVM page fault with the existing code, and do
> additional operations necessary for the mirrored private page table.  To
> distinguish such cases, the existing KVM page table is called a shared page
> table (i.e. no mirrored private page table), and the KVM page table with
> mirrored private page table is called a private page table.  The
> relationship is depicted below.
>
> Add private pointer to struct kvm_mmu_page for mirrored private page table
> and add helper functions to allocate/initialize/free a mirrored private
> page table page.  Also, add helper functions to check if a given
> kvm_mmu_page is private.  The later patch introduces hooks to operate on
> the mirrored private page table.
>
>             KVM page fault                      |
>                    |                            |
>                    V                            |
>       -------------+----------                  |
>       |                      |                  |
>       V                      V                  |
>   shared GPA             private GPA            |
>       |                      |                  |
>       V                      V                  |
> CPU/KVM shared PT root  KVM private PT root     |  CPU private PT root
>       |                      |                  |          |
>       V                      V                  |          V
>   shared PT              private PT <----mirror---->  mirrored private PT
>       |                      |                  |          |
>       |                      \------------------+------\   |
>       |                      |                  |      |   |
>       V                      |                  V      V   V
>  shared guest page           |             private guest page
>                              |
>  non-encrypted memory        |          encrypted memory
>                              |
>
> PT: page table
>
> Both CPU and KVM refer to CPU/KVM shared page table.  Private page table
> is used only by KVM.  CPU refers to mirrored private page table.
>
> Signed-off-by: Isaku Yamahata
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/mmu/mmu.c          |  9 ++++
>  arch/x86/kvm/mmu/mmu_internal.h | 84 +++++++++++++++++++++++++++++++++
>  arch/x86/kvm/mmu/tdp_mmu.c      |  3 ++
>  4 files changed, 97 insertions(+)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index f4d4ed41641b..bfc934dc9a33 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -716,6 +716,7 @@ struct kvm_vcpu_arch {
>  	struct kvm_mmu_memory_cache mmu_shadow_page_cache;
>  	struct kvm_mmu_memory_cache mmu_gfn_array_cache;
>  	struct kvm_mmu_memory_cache mmu_page_header_cache;
> +	struct kvm_mmu_memory_cache mmu_private_sp_cache;

I notice that mmu_private_sp_cache.gfp_zero is left unset so these pages
may contain garbage. Is this by design because the TDX module can't rely
on the contents being zero and has to take care of initializing the page
itself? i.e. GFP_ZERO would be a waste of cycles?

If I'm correct please include a comment here in the next revision to
explain why GFP_ZERO is not necessary.

>
>  	/*
>  	 * QEMU userspace and the guest each have their own FPU state.
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index c517c7bca105..a5bf3e40e209 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -691,6 +691,13 @@ static int mmu_topup_shadow_page_cache(struct kvm_vcpu *vcpu)
>  	int start, end, i, r;
>  	bool is_tdp_mmu = is_tdp_mmu_enabled(vcpu->kvm);
>
> +	if (kvm_gfn_shared_mask(vcpu->kvm)) {
> +		r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_private_sp_cache,
> +					       PT64_ROOT_MAX_LEVEL);
> +		if (r)
> +			return r;
> +	}
> +
>  	if (is_tdp_mmu && shadow_nonpresent_value)
>  		start = kvm_mmu_memory_cache_nr_free_objects(mc);
>
> @@ -732,6 +739,7 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu)
>  {
>  	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache);
>  	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadow_page_cache);
> +	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_private_sp_cache);
>  	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_gfn_array_cache);
>  	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache);
>  }
> @@ -1736,6 +1744,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct
>  	if (!direct)
>  		sp->gfns = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_gfn_array_cache);
>  	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
> +	kvm_mmu_init_private_sp(sp, NULL);

This is unnecessary. kvm_mmu_page structs are zero-initialized so
private_sp will already be NULL.

>
>  	/*
>  	 * active_mmu_pages must be a FIFO list, as kvm_zap_obsolete_pages()
> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index 44a04fad4bed..9f3a6bea60a3 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -55,6 +55,10 @@ struct kvm_mmu_page {
>  	u64 *spt;
>  	/* hold the gfn of each spte inside spt */
>  	gfn_t *gfns;
> +#ifdef CONFIG_KVM_MMU_PRIVATE
> +	/* associated private shadow page, e.g. SEPT page. */

Can we use "Secure EPT" instead of SEPT in KVM code and comments? (i.e.
also including variable names like sept_page -> secure_ept_page)

"SEPT" looks like a misspelling of SPTE, which is used all over KVM. It
will be difficult to read code that contains both acronyms.

> +	void *private_sp;

Please name this "private_spt" and move it up next to "spt".

"sp" or "shadow page" is used to refer to kvm_mmu_page structs. For
example, look at all the code in KVM that uses `struct kvm_mmu_page *sp`.

"spt" is "shadow page table", i.e. the actual page table memory. See
kvm_mmu_page.spt. Calling this field "private_spt" makes it obvious that
this pointer is pointing to a page table.

Also please update the language in the comment accordingly to "private
shadow page table".

> +#endif
>  	/* Currently serving as active root */
>  	union {
>  		int root_count;
> @@ -115,6 +119,86 @@ static inline int kvm_mmu_page_as_id(struct kvm_mmu_page *sp)
>  	return kvm_mmu_role_as_id(sp->role);
>  }
>
> +/*
> + * TDX vcpu allocates page for root Secure EPT page and assigns to CPU secure
> + * EPT pointer.  KVM doesn't need to allocate and link to the secure EPT.
> + * Dummy value to make is_pivate_sp() return true.
> + */
> +#define KVM_MMU_PRIVATE_SP_ROOT	((void *)1)
> +
> +#ifdef CONFIG_KVM_MMU_PRIVATE
> +static inline bool is_private_sp(struct kvm_mmu_page *sp)
> +{
> +	return !!sp->private_sp;
> +}
> +
> +static inline bool is_private_sptep(u64 *sptep)
> +{
> +	WARN_ON(!sptep);
> +	return is_private_sp(sptep_to_sp(sptep));
> +}
> +
> +static inline void *kvm_mmu_private_sp(struct kvm_mmu_page *sp)
> +{
> +	return sp->private_sp;
> +}
> +
> +static inline void kvm_mmu_init_private_sp(struct kvm_mmu_page *sp, void *private_sp)
> +{
> +	sp->private_sp = private_sp;
> +}
> +
> +/* Valid sp->role.level is required. */
> +static inline void kvm_mmu_alloc_private_sp(
> +	struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, bool is_root)
> +{
> +	if (is_root)
> +		sp->private_sp = KVM_MMU_PRIVATE_SP_ROOT;
> +	else
> +		sp->private_sp = kvm_mmu_memory_cache_alloc(
> +			&vcpu->arch.mmu_private_sp_cache);
> +	/*
> +	 * Because mmu_private_sp_cache is topped up before staring kvm page
> +	 * fault resolving, the allocation above shouldn't fail.
> +	 */
> +	WARN_ON_ONCE(!sp->private_sp);
> +}
> +
> +static inline void kvm_mmu_free_private_sp(struct kvm_mmu_page *sp)
> +{
> +	if (sp->private_sp != KVM_MMU_PRIVATE_SP_ROOT)
> +		free_page((unsigned long)sp->private_sp);
> +}
> +#else
> +static inline bool is_private_sp(struct kvm_mmu_page *sp)
> +{
> +	return false;
> +}
> +
> +static inline bool is_private_sptep(u64 *sptep)
> +{
> +	return false;
> +}
> +
> +static inline void *kvm_mmu_private_sp(struct kvm_mmu_page *sp)
> +{
> +	return NULL;
> +}
> +
> +static inline void kvm_mmu_init_private_sp(struct kvm_mmu_page *sp, void *private_sp)
> +{
> +}
> +
> +static inline void kvm_mmu_alloc_private_sp(
> +	struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, bool is_root)
> +{
> +}
> +
> +static inline void kvm_mmu_free_private_sp(struct kvm_mmu_page *sp)
> +{
> +}
> +#endif
> +
>  static inline bool kvm_mmu_page_ad_need_write_protect(struct kvm_mmu_page *sp)
>  {
>  	/*
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 7eb41b176d1e..b2568b062faa 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -72,6 +72,8 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
>
>  static void tdp_mmu_free_sp(struct kvm_mmu_page *sp)
>  {
> +	if (is_private_sp(sp))
> +		kvm_mmu_free_private_sp(sp);
>  	free_page((unsigned long)sp->spt);
>  	kmem_cache_free(mmu_page_header_cache, sp);
>  }
> @@ -295,6 +297,7 @@ static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, tdp_ptep_t sptep,
>  	sp->gfn = gfn;
>  	sp->ptep = sptep;
>  	sp->tdp_mmu_page = true;
> +	kvm_mmu_init_private_sp(sp);
>
>  	trace_kvm_mmu_get_page(sp, true);
>  }
> --
> 2.25.1
>