Date: Mon, 9 Aug 2021 12:27:03 +0800
From: Yu Zhang
To: Wei Huang <wei.huang2@amd.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, pbonzini@redhat.com,
	seanjc@google.com, vkuznets@redhat.com, wanpengli@tencent.com,
	jmattson@google.com, joro@8bytes.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com
Subject: Re: [PATCH v2 1/3] KVM: x86: Allow CPU to force vendor-specific TDP level
Message-ID: <20210809042703.25gfuuvujicc3vj7@linux.intel.com>
References: <20210808192658.2923641-1-wei.huang2@amd.com>
 <20210808192658.2923641-2-wei.huang2@amd.com>
 <20210809035806.5cqdqm5vkexvngda@linux.intel.com>

On Sun, Aug 08, 2021 at 11:11:40PM -0500, Wei Huang wrote:
> 
> 
> On 8/8/21 10:58 PM, Yu Zhang wrote:
> > On Sun, Aug 08, 2021 at 02:26:56PM -0500, Wei Huang wrote:
> > > AMD future CPUs will require a 5-level NPT if host CR4.LA57 is set.
> > 
> > Sorry, but why? NPT is not indexed by HVA.
> 
> NPT is not indexed by HVA - it is always indexed by GPA. What I meant is
> that the NPT page table level has to be the same as the host OS page table
> level: if a 5-level page table is enabled in the host OS (CR4.LA57=1), the
> guest NPT has to be 5-level too.

I know what you meant. But may I ask why?

B.R.
Yu

> 
> > 
> > > To prevent kvm_mmu_get_tdp_level() from incorrectly changing NPT level
> > > on behalf of CPUs, add a new parameter in kvm_configure_mmu() to force
> > > a fixed TDP level.
> > > 
> > > Signed-off-by: Wei Huang <wei.huang2@amd.com>
> > > ---
> > >  arch/x86/include/asm/kvm_host.h |  5 ++---
> > >  arch/x86/kvm/mmu/mmu.c          | 10 ++++++++--
> > >  arch/x86/kvm/svm/svm.c          |  4 +++-
> > >  arch/x86/kvm/vmx/vmx.c          |  3 ++-
> > >  4 files changed, 15 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > index 974cbfb1eefe..6d16f75cc8da 100644
> > > --- a/arch/x86/include/asm/kvm_host.h
> > > +++ b/arch/x86/include/asm/kvm_host.h
> > > @@ -723,7 +723,6 @@ struct kvm_vcpu_arch {
> > >  
> > >  	u64 reserved_gpa_bits;
> > >  	int maxphyaddr;
> > > -	int max_tdp_level;
> > >  
> > >  	/* emulate context */
> > >  
> > > @@ -1747,8 +1746,8 @@ void kvm_mmu_invalidate_gva(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> > >  void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid);
> > >  void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd);
> > >  
> > > -void kvm_configure_mmu(bool enable_tdp, int tdp_max_root_level,
> > > -		       int tdp_huge_page_level);
> > > +void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
> > > +		       int tdp_max_root_level, int tdp_huge_page_level);
> > >  
> > >  static inline u16 kvm_read_ldt(void)
> > >  {
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 66f7f5bc3482..c11ee4531f6d 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -97,6 +97,7 @@ module_param_named(flush_on_reuse, force_flush_and_sync_on_reuse, bool, 0644);
> > >  bool tdp_enabled = false;
> > >  
> > >  static int max_huge_page_level __read_mostly;
> > > +static int tdp_root_level __read_mostly;
> > 
> > I think this is a broken design - meaning KVM can only use 5-level or
> > 4-level NPT for all VMs.
> 
> Broken normally means non-functional or buggy, which doesn't apply here. A
> good TLB design should be able to offset the potential overhead of a
> 5-level page table for most cases.
> 
> > 
> > B.R.
> > Yu
> > 
> > >  static int max_tdp_level __read_mostly;
> > >  
> > >  enum {
> > > @@ -4562,6 +4563,10 @@ static union kvm_mmu_role kvm_calc_mmu_role_common(struct kvm_vcpu *vcpu,
> > >  
> > >  static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu)
> > >  {
> > > +	/* tdp_root_level is architecture forced level, use it if nonzero */
> > > +	if (tdp_root_level)
> > > +		return tdp_root_level;
> > > +
> > >  	/* Use 5-level TDP if and only if it's useful/necessary. */
> > >  	if (max_tdp_level == 5 && cpuid_maxphyaddr(vcpu) <= 48)
> > >  		return 4;
> > > @@ -5253,10 +5258,11 @@ void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid)
> > >  	 */
> > >  }
> > >  
> > > -void kvm_configure_mmu(bool enable_tdp, int tdp_max_root_level,
> > > -		       int tdp_huge_page_level)
> > > +void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
> > > +		       int tdp_max_root_level, int tdp_huge_page_level)
> > >  {
> > >  	tdp_enabled = enable_tdp;
> > > +	tdp_root_level = tdp_forced_root_level;
> > >  	max_tdp_level = tdp_max_root_level;
> > >  
> > >  	/*
> > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > > index e8ccab50ebf6..f361d466e18e 100644
> > > --- a/arch/x86/kvm/svm/svm.c
> > > +++ b/arch/x86/kvm/svm/svm.c
> > > @@ -1015,7 +1015,9 @@ static __init int svm_hardware_setup(void)
> > >  	if (!boot_cpu_has(X86_FEATURE_NPT))
> > >  		npt_enabled = false;
> > >  
> > > -	kvm_configure_mmu(npt_enabled, get_max_npt_level(), PG_LEVEL_1G);
> > > +	/* Force VM NPT level equal to the host's max NPT level */
> > > +	kvm_configure_mmu(npt_enabled, get_max_npt_level(),
> > > +			  get_max_npt_level(), PG_LEVEL_1G);
> > >  	pr_info("kvm: Nested Paging %sabled\n", npt_enabled ? "en" : "dis");
> > >  
> > >  	/* Note, SEV setup consumes npt_enabled. */
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index 927a552393b9..034e1397c7d5 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -7803,7 +7803,8 @@ static __init int hardware_setup(void)
> > >  		ept_lpage_level = PG_LEVEL_2M;
> > >  	else
> > >  		ept_lpage_level = PG_LEVEL_4K;
> > > -	kvm_configure_mmu(enable_ept, vmx_get_max_tdp_level(), ept_lpage_level);
> > > +	kvm_configure_mmu(enable_ept, 0, vmx_get_max_tdp_level(),
> > > +			  ept_lpage_level);
> > >  
> > >  	/*
> > >  	 * Only enable PML when hardware supports PML feature, and both EPT
> > > -- 
> > > 2.31.1
> > > 
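
For reference, the mechanism debated above fits in a few lines of C. The
sketch below is a stand-alone user-space rendering of the patch's behavior,
not kernel code: configure_mmu_demo() and tdp_level_demo() are hypothetical
stand-ins for kvm_configure_mmu() and kvm_mmu_get_tdp_level(), and the
guest's MAXPHYADDR is passed in directly instead of read via
cpuid_maxphyaddr().

#include <stdio.h>

static int tdp_root_level;	/* nonzero: vendor-forced root level (SVM) */
static int max_tdp_level;	/* upper bound chosen by the vendor module */

/* Stand-in for the new kvm_configure_mmu(): the forced level comes first. */
static void configure_mmu_demo(int forced_root_level, int max_root_level)
{
	tdp_root_level = forced_root_level;
	max_tdp_level = max_root_level;
}

/* Stand-in for kvm_mmu_get_tdp_level(): a forced level short-circuits. */
static int tdp_level_demo(int guest_maxphyaddr)
{
	if (tdp_root_level)
		return tdp_root_level;

	/* Use 5-level TDP if and only if it's useful/necessary. */
	if (max_tdp_level == 5 && guest_maxphyaddr <= 48)
		return 4;

	return max_tdp_level;
}

int main(void)
{
	/* VMX passes 0: 5-level EPT is used only when the guest needs it. */
	configure_mmu_demo(0, 5);
	printf("VMX, guest MAXPHYADDR 48 -> %d-level\n", tdp_level_demo(48));

	/* SVM on an LA57 host forces 5-level NPT for every VM. */
	configure_mmu_demo(5, 5);
	printf("SVM, guest MAXPHYADDR 48 -> %d-level\n", tdp_level_demo(48));

	return 0;
}

Compiled as-is, this prints 4-level for the VMX case and 5-level for the SVM
case, which is the asymmetry Yu is questioning: with the forced level, a host
with CR4.LA57=1 uses 5-level NPT for all VMs regardless of guest MAXPHYADDR,
while VMX keeps the "5-level only when necessary" heuristic.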