From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini,
	erdemaktas@google.com, Sean Christopherson, Sagi Shahar,
	David Matlack, Kai Huang, Zhi Wang, chen.bo@intel.com,
	hang.yuan@intel.com, tina.zhang@intel.com
Subject: [RFC PATCH v5 15/16] KVM: x86/mmu: Make kvm fault handler aware of large page of private memslot
Date: Mon, 16 Oct 2023 09:21:06 -0700
Message-Id: 
X-Mailer: git-send-email 2.25.1
In-Reply-To: 
References: 
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Isaku Yamahata <isaku.yamahata@intel.com>

struct kvm_page_fault.req_level is the page level that determines the
faulted-in page size.  For now it is calculated only for conventional kvm
memslots, by host_pfn_mapping_level(), which walks the host page table.
However, host_pfn_mapping_level() cannot be used for a private kvm memslot
because pages of a private kvm memslot aren't mapped into the user virtual
address space.  Instead, the page order is returned when getting the pfn.
Remember it in struct kvm_page_fault and use it.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/mmu/mmu.c          | 34 ++++++++++++++++++----------------
 arch/x86/kvm/mmu/mmu_internal.h | 12 +++++++++++-
 arch/x86/kvm/mmu/tdp_mmu.c      |  2 +-
 3 files changed, 30 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a1a9b0bc4f1a..e77cfe133dfe 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3158,10 +3158,10 @@ static int host_pfn_mapping_level(struct kvm *kvm, gfn_t gfn,
 
 static int __kvm_mmu_max_mapping_level(struct kvm *kvm,
 				       const struct kvm_memory_slot *slot,
-				       gfn_t gfn, int max_level, bool is_private)
+				       gfn_t gfn, int max_level, int host_level,
+				       bool is_private)
 {
 	struct kvm_lpage_info *linfo;
-	int host_level;
 
 	max_level = min(max_level, max_huge_page_level);
 	for ( ; max_level > PG_LEVEL_4K; max_level--) {
@@ -3170,24 +3170,23 @@ static int __kvm_mmu_max_mapping_level(struct kvm *kvm,
 			break;
 	}
 
-	if (is_private)
-		return max_level;
-
 	if (max_level == PG_LEVEL_4K)
 		return PG_LEVEL_4K;
 
-	host_level = host_pfn_mapping_level(kvm, gfn, slot);
+	if (!is_private) {
+		WARN_ON_ONCE(host_level != PG_LEVEL_NONE);
+		host_level = host_pfn_mapping_level(kvm, gfn, slot);
+	}
+	WARN_ON_ONCE(host_level == PG_LEVEL_NONE);
 	return min(host_level, max_level);
 }
 
 int kvm_mmu_max_mapping_level(struct kvm *kvm,
 			      const struct kvm_memory_slot *slot, gfn_t gfn,
-			      int max_level)
+			      int max_level, bool faultin_private)
 {
-	bool is_private = kvm_slot_can_be_private(slot) &&
-		kvm_mem_is_private(kvm, gfn);
-
-	return __kvm_mmu_max_mapping_level(kvm, slot, gfn, max_level, is_private);
+	return __kvm_mmu_max_mapping_level(kvm, slot, gfn, max_level,
+					   PG_LEVEL_NONE, faultin_private);
 }
 
 void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
@@ -3212,7 +3211,8 @@ void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	 */
 	fault->req_level = __kvm_mmu_max_mapping_level(vcpu->kvm, slot,
 						       fault->gfn, fault->max_level,
-						       fault->is_private);
+						       fault->host_level,
+						       kvm_is_faultin_private(fault));
 	if (fault->req_level == PG_LEVEL_4K ||
 	    fault->huge_page_disallowed)
 		return;
@@ -4328,6 +4328,7 @@ static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu,
 				   struct kvm_page_fault *fault)
 {
 	int max_order, r;
+	u8 max_level;
 
 	if (!kvm_slot_can_be_private(fault->slot)) {
 		kvm_mmu_prepare_memory_fault_exit(vcpu, fault);
@@ -4341,8 +4342,9 @@ static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu,
 		return r;
 	}
 
-	fault->max_level = min(kvm_max_level_for_order(max_order),
-			       fault->max_level);
+	max_level = kvm_max_level_for_order(max_order);
+	fault->host_level = max_level;
+	fault->max_level = min(max_level, fault->max_level);
 	fault->map_writable = !(fault->slot->flags & KVM_MEM_READONLY);
 
 	return RET_PF_CONTINUE;
@@ -4392,7 +4394,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 		return -EFAULT;
 	}
 
-	if (fault->is_private)
+	if (kvm_is_faultin_private(fault))
 		return kvm_faultin_pfn_private(vcpu, fault);
 
 	async = false;
@@ -6804,7 +6806,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 		 */
 		if (sp->role.direct &&
 		    sp->role.level < kvm_mmu_max_mapping_level(kvm, slot, sp->gfn,
							       PG_LEVEL_NUM)) {
+							       PG_LEVEL_NUM, false)) {
 			kvm_zap_one_rmap_spte(kvm, rmap_head, sptep);
 
 			if (kvm_available_flush_remote_tlbs_range())
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 099e8fe929c6..813d405fc11e 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -358,6 +358,9 @@ struct kvm_page_fault {
 	 * is changing its own translation in the guest page tables.
 	 */
 	bool write_fault_to_shadow_pgtable;
+
+	/* valid only for private memslot && private gfn */
+	enum pg_level host_level;
 };
 
 int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
@@ -452,7 +455,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 
 int kvm_mmu_max_mapping_level(struct kvm *kvm,
 			      const struct kvm_memory_slot *slot, gfn_t gfn,
-			      int max_level);
+			      int max_level, bool faultin_private);
 void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
 void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_level);
@@ -470,4 +473,11 @@ static inline bool kvm_hugepage_test_mixed(struct kvm_memory_slot *slot, gfn_t g
 }
 #endif
 
+static inline bool kvm_is_faultin_private(const struct kvm_page_fault *fault)
+{
+	if (IS_ENABLED(CONFIG_KVM_GENERIC_PRIVATE_MEM))
+		return fault->is_private && kvm_slot_can_be_private(fault->slot);
+	return false;
+}
+
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 9cb63613d831..cac48881c5f1 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -2302,7 +2302,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
 			continue;
 
 		max_mapping_level = kvm_mmu_max_mapping_level(kvm, slot,
-							      iter.gfn, PG_LEVEL_NUM);
+							      iter.gfn, PG_LEVEL_NUM, false);
 		if (max_mapping_level < iter.level)
 			continue;
 
-- 
2.25.1
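
Note for context: the "page order" mentioned in the commit message is log2 of
the number of 4KB pages backing the gfn, as reported when the pfn is obtained
from the private (gmem) backing store; kvm_max_level_for_order() converts it
to a KVM page level before fault->max_level is clamped.  A minimal sketch of
that order-to-level conversion follows, assuming 4KB base pages and x86's
usual enum pg_level numbering; the constants and helper name below are
restated locally for illustration and are not the series' actual code.

/* Illustrative sketch only -- not the series' kvm_max_level_for_order(). */
#include <stdio.h>

/* Restated locally; in KVM these come from enum pg_level. */
#define PG_LEVEL_4K 1
#define PG_LEVEL_2M 2
#define PG_LEVEL_1G 3

#define PMD_ORDER  9	/* 2MB == 2^9  * 4KB */
#define PUD_ORDER 18	/* 1GB == 2^18 * 4KB */

/*
 * Map a backing-page order to the largest page level it fully covers:
 * an order-10 folio contains a whole 2MB region (order 9) but not a
 * whole 1GB region (order 18), so the fault maps at 2MB at most.
 */
static int max_level_for_order(int order)
{
	if (order >= PUD_ORDER)
		return PG_LEVEL_1G;
	if (order >= PMD_ORDER)
		return PG_LEVEL_2M;
	return PG_LEVEL_4K;
}

int main(void)
{
	/* fault->host_level would record this value before clamping. */
	printf("order  0 -> level %d\n", max_level_for_order(0));  /* 4K */
	printf("order 10 -> level %d\n", max_level_for_order(10)); /* 2M */
	printf("order 18 -> level %d\n", max_level_for_order(18)); /* 1G */
	return 0;
}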