Message-ID: <8e0934a0-c478-413a-8a58-36f7d20c23e9@linux.intel.com>
Date: Mon, 20 Nov 2023 18:54:07 +0800
Subject: Re: [PATCH v6 07/16] KVM: MMU: Introduce level info in PFERR code
To: isaku.yamahata@intel.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Kai Huang, Zhi Wang, chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com, Xiaoyao Li
From: Binbin Wu

On 11/7/2023 11:00 PM, isaku.yamahata@intel.com wrote:
> From: Xiaoyao Li
>
> For TDX, EPT violation can happen when TDG.MEM.PAGE.ACCEPT.
> And TDG.MEM.PAGE.ACCEPT contains the desired accept page level of TD guest.
>
> 1. KVM can map it with 4KB page while TD guest wants to accept 2MB page.
>
>    TD geust will get TDX_PAGE_SIZE_MISMATCH and it should try to accept

s/geust/guest

>    4KB size.
>
> 2. KVM can map it with 2MB page while TD guest wants to accept 4KB page.
>
>    KVM needs to honor it because
>    a) there is no way to tell guest KVM maps it as 2MB size. And
>    b) guest accepts it in 4KB size since guest knows some other 4KB page
>       in the same 2MB range will be used as shared page.
>
> For case 2, it need to pass desired page level to MMU's
> page_fault_handler. Use bit 29:31 of kvm PF error code for this purpose.

The level info is needed not only for case 2. KVM also needs it so that
it can map a 2MB page when the TD guest wants to accept a 2MB page.

>
> Signed-off-by: Xiaoyao Li
> Signed-off-by: Isaku Yamahata
> ---
>  arch/x86/include/asm/kvm_host.h |  3 +++
>  arch/x86/kvm/mmu/mmu.c          |  5 +++++
>  arch/x86/kvm/vmx/common.h       |  6 +++++-
>  arch/x86/kvm/vmx/tdx.c          | 15 ++++++++++++++-
>  arch/x86/kvm/vmx/tdx.h          | 19 +++++++++++++++++++
>  arch/x86/kvm/vmx/vmx.c          |  2 +-
>  6 files changed, 47 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index edcafcd650db..eed36c1eedb7 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -261,6 +261,8 @@ enum x86_intercept_stage;
>  #define PFERR_FETCH_BIT 4
>  #define PFERR_PK_BIT 5
>  #define PFERR_SGX_BIT 15
> +#define PFERR_LEVEL_START_BIT 29
> +#define PFERR_LEVEL_END_BIT 31
>  #define PFERR_GUEST_FINAL_BIT 32
>  #define PFERR_GUEST_PAGE_BIT 33
>  #define PFERR_GUEST_ENC_BIT 34
> @@ -273,6 +275,7 @@ enum x86_intercept_stage;
>  #define PFERR_FETCH_MASK	BIT(PFERR_FETCH_BIT)
>  #define PFERR_PK_MASK	BIT(PFERR_PK_BIT)
>  #define PFERR_SGX_MASK	BIT(PFERR_SGX_BIT)
> +#define PFERR_LEVEL_MASK	GENMASK_ULL(PFERR_LEVEL_END_BIT, PFERR_LEVEL_START_BIT)
>  #define PFERR_GUEST_FINAL_MASK	BIT_ULL(PFERR_GUEST_FINAL_BIT)
>  #define PFERR_GUEST_PAGE_MASK	BIT_ULL(PFERR_GUEST_PAGE_BIT)
>  #define PFERR_GUEST_ENC_MASK	BIT_ULL(PFERR_GUEST_ENC_BIT)
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index eb17a508c5d1..265177cedf37 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4615,6 +4615,11 @@ bool __kvm_mmu_honors_guest_mtrrs(bool vm_has_noncoherent_dma)
>
>  int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>  {
> +	u8 err_level = (fault->error_code & PFERR_LEVEL_MASK) >> PFERR_LEVEL_START_BIT;
> +
> +	if (err_level)
> +		fault->max_level = min(fault->max_level, err_level);
> +
>  	/*
>  	 * If the guest's MTRRs may be used to compute the "real" memtype,
>  	 * restrict the mapping level to ensure KVM uses a consistent memtype
> diff --git a/arch/x86/kvm/vmx/common.h b/arch/x86/kvm/vmx/common.h
> index 027aa4175d2c..bb00433932ee 100644
> --- a/arch/x86/kvm/vmx/common.h
> +++ b/arch/x86/kvm/vmx/common.h
> @@ -67,7 +67,8 @@ static inline void vmx_handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu,
>  }
>
>  static inline int __vmx_handle_ept_violation(struct kvm_vcpu *vcpu, gpa_t gpa,
> -					     unsigned long exit_qualification)
> +					     unsigned long exit_qualification,
> +					     int err_page_level)
>  {
>  	u64 error_code;
>
> @@ -90,6 +91,9 @@ static inline int __vmx_handle_ept_violation(struct kvm_vcpu *vcpu, gpa_t gpa,
>  	if (kvm_is_private_gpa(vcpu->kvm, gpa))
>  		error_code |= PFERR_GUEST_ENC_MASK;
>
> +	if (err_page_level > 0)
> +		error_code |= (err_page_level << PFERR_LEVEL_START_BIT) & PFERR_LEVEL_MASK;
> +
>  	return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
>  }
>
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index 31598b84811f..e4167f08b58b 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -1803,7 +1803,20 @@ void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
>
>  static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
>  {
> +	union tdx_ext_exit_qualification ext_exit_qual;
>  	unsigned long exit_qual;
> +	int err_page_level = 0;
> +
> +	ext_exit_qual.full = tdexit_ext_exit_qual(vcpu);
> +
> +	if (ext_exit_qual.type >= NUM_EXT_EXIT_QUAL) {

Can we add an unlikely() hint here?

> +		pr_err("EPT violation at gpa 0x%lx, with invalid ext exit qualification type 0x%x\n",
> +			tdexit_gpa(vcpu), ext_exit_qual.type);
> +		kvm_vm_bugged(vcpu->kvm);
> +		return 0;
> +	} else if (ext_exit_qual.type == EXT_EXIT_QUAL_ACCEPT) {
> +		err_page_level = tdx_sept_level_to_pg_level(ext_exit_qual.req_sept_level);
> +	}
>
>  	if (kvm_is_private_gpa(vcpu->kvm, tdexit_gpa(vcpu))) {
>  		/*
> @@ -1830,7 +1843,7 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
>  	}
>
>  	trace_kvm_page_fault(vcpu, tdexit_gpa(vcpu), exit_qual);
> -	return __vmx_handle_ept_violation(vcpu, tdexit_gpa(vcpu), exit_qual);
> +	return __vmx_handle_ept_violation(vcpu, tdexit_gpa(vcpu), exit_qual, err_page_level);
>  }
>
>  static int tdx_handle_ept_misconfig(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
> index 54c3f6b83571..37ee944c36a1 100644
> --- a/arch/x86/kvm/vmx/tdx.h
> +++ b/arch/x86/kvm/vmx/tdx.h
> @@ -72,6 +72,25 @@ union tdx_exit_reason {
>  	u64 full;
>  };
>
> +union tdx_ext_exit_qualification {
> +	struct {
> +		u64 type		: 4;
> +		u64 reserved0		: 28;
> +		u64 req_sept_level	: 3;
> +		u64 err_sept_level	: 3;
> +		u64 err_sept_state	: 8;
> +		u64 err_sept_is_leaf	: 1;
> +		u64 reserved1		: 17;
> +	};
> +	u64 full;
> +};
> +
> +enum tdx_ext_exit_qualification_type {
> +	EXT_EXIT_QUAL_NONE,
> +	EXT_EXIT_QUAL_ACCEPT,
> +	NUM_EXT_EXIT_QUAL,
> +};
> +
>  struct vcpu_tdx {
>  	struct kvm_vcpu vcpu;
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 28732925792e..ae9ba0731521 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -5753,7 +5753,7 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
>  	if (unlikely(allow_smaller_maxphyaddr && kvm_vcpu_is_illegal_gpa(vcpu, gpa)))
>  		return kvm_emulate_instruction(vcpu, 0);
>
> -	return __vmx_handle_ept_violation(vcpu, gpa, exit_qualification);
> +	return __vmx_handle_ept_violation(vcpu, gpa, exit_qualification, 0);
>  }
>
>  static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
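
For anyone following the thread, the level encoding discussed above can be
sketched in userspace C roughly as follows. This is only an illustration
under stated assumptions, not code from the series: the PFERR_LEVEL_* bit
positions are copied from the quoted patch, while pferr_set_level(),
pferr_get_level(), clamp_max_level() and the PG_LEVEL_* values (assumed
here to be 1 for 4KB, 2 for 2MB, 3 for 1GB) are hypothetical stand-ins.

```c
#include <stdint.h>

/* Bit positions taken from the quoted patch. */
#define PFERR_LEVEL_START_BIT 29
#define PFERR_LEVEL_END_BIT   31
/* Userspace stand-in for GENMASK_ULL(31, 29). */
#define PFERR_LEVEL_MASK \
	(((UINT64_C(1) << (PFERR_LEVEL_END_BIT - PFERR_LEVEL_START_BIT + 1)) - 1) \
	 << PFERR_LEVEL_START_BIT)

/* Assumed page-level numbering; hypothetical names for this sketch. */
enum { PG_LEVEL_NONE = 0, PG_LEVEL_4K = 1, PG_LEVEL_2M = 2, PG_LEVEL_1G = 3 };

/* Hypothetical helper: store a requested level in bits 29:31. */
static inline uint64_t pferr_set_level(uint64_t error_code, int level)
{
	/* Widen before shifting so the shift never happens in a signed int. */
	if (level > 0)
		error_code |= ((uint64_t)level << PFERR_LEVEL_START_BIT) &
			      PFERR_LEVEL_MASK;
	return error_code;
}

/* Hypothetical helper: read the level back; 0 means none was requested. */
static inline uint8_t pferr_get_level(uint64_t error_code)
{
	return (uint8_t)((error_code & PFERR_LEVEL_MASK) >> PFERR_LEVEL_START_BIT);
}

/* Hypothetical helper mirroring the min() clamp in the mmu.c hunk. */
static inline uint8_t clamp_max_level(uint8_t max_level, uint64_t error_code)
{
	uint8_t err_level = pferr_get_level(error_code);

	if (err_level && err_level < max_level)
		return err_level;
	return max_level;
}
```

In this sketch the clamp only ever lowers the limit: a 4KB request caps a
fault that could otherwise map at 1GB, and an error code without level bits
leaves max_level untouched.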