Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932436AbbKMCOL (ORCPT ); Thu, 12 Nov 2015 21:14:11 -0500 Received: from tama50.ecl.ntt.co.jp ([129.60.39.147]:39335 "EHLO tama50.ecl.ntt.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754394AbbKMCOJ (ORCPT ); Thu, 12 Nov 2015 21:14:09 -0500 Subject: Re: [PATCH 09/10 RFC] KVM: x86: MMU: Move parent_pte handling from kvm_mmu_get_page() to link_shadow_page() To: Paolo Bonzini References: <20151112204849.ba920599a8426d7196a0df73@lab.ntt.co.jp> <20151112205656.33f23effece7e6914c3e2646@lab.ntt.co.jp> <5644A1CE.1040906@redhat.com> Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org From: Takuya Yoshikawa Message-ID: <564547AA.20002@lab.ntt.co.jp> Date: Fri, 13 Nov 2015 11:15:06 +0900 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: <5644A1CE.1040906@redhat.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-TM-AS-MML: disable Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4349 Lines: 133 On 2015/11/12 23:27, Paolo Bonzini wrote: > On 12/11/2015 12:56, Takuya Yoshikawa wrote: >> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h >> index 9d21b44..f414ca6 100644 >> --- a/arch/x86/kvm/paging_tmpl.h >> +++ b/arch/x86/kvm/paging_tmpl.h >> @@ -598,7 +598,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr, >> goto out_gpte_changed; >> >> if (sp) >> - link_shadow_page(it.sptep, sp, PT_GUEST_ACCESSED_MASK); >> + link_shadow_page(vcpu, it.sptep, sp, PT_GUEST_ACCESSED_MASK); >> } >> > > Here I think you can remove completely the > > if (sp) > kvm_mmu_put_page(sp, it.sptep); > > later in FNAME(fetch). Apart from this nit, it's okay. Yes, that's what this patch does below: > @@ -629,8 +629,6 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr, > return emulate; > > out_gpte_changed: > - if (sp) > - kvm_mmu_put_page(sp, it.sptep); > kvm_release_pfn_clean(pfn); > return 0; > } Since this is the only user of kvm_mmu_put_page(), it also removes the definition: > @@ -2268,11 +2268,6 @@ static void kvm_mmu_page_unlink_children(struct kvm *kvm, > mmu_page_zap_pte(kvm, sp, sp->spt + i); > } > > -static void kvm_mmu_put_page(struct kvm_mmu_page *sp, u64 *parent_pte) > -{ > - mmu_page_remove_parent_pte(sp, parent_pte); > -} > - > static void kvm_mmu_unlink_parents(struct kvm *kvm, struct kvm_mmu_page *sp) > { > u64 *sptep; Actually, I don't understand why this is named kvm_mmu_put_page() for just removing parent_pte pointer from the sp->parent_ptes pointer chain. > On to kvm_mmu_get_page... > > if (!direct) { > if (rmap_write_protect(vcpu, gfn)) > kvm_flush_remote_tlbs(vcpu->kvm); > if (level > PT_PAGE_TABLE_LEVEL && need_sync) > kvm_sync_pages(vcpu, gfn); > > This seems fishy. > > need_sync is set if sp->unsync, but then the parents have not been > unsynced yet. Reaching here means that kvm_mmu_get_page() could not return sp from inside the for_each_gfn_sp() loop above, so even without this patch, mark_unsync() has not been called. Here, sp holds the new page allocated by kvm_mmu_alloc_page(). One confusing thing is that hlist_add_head() right before this "if (!direct)" line has already added the new sp to the hash list, so it will be found by for_each_gfn_indirect_valid_sp() in kvm_sync_pages(). Because this sp is new and sp->unsync is not set, kvm_sync_pages() will just skip it and look for other sp's whose ->unsync were found to be set in the for_each_gfn_sp() loop. I'm not 100% sure if the existence of the parent_pte pointer in the newly created sp->parent_ptes chain alone makes any difference: > @@ -2127,7 +2122,6 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, > sp = kvm_mmu_alloc_page(vcpu, direct); > > sp->parent_ptes.val = 0; > - mmu_page_add_parent_pte(vcpu, sp, parent_pte); > > sp->gfn = gfn; > sp->role = role; > On the other hand, all calls to kvm_mmu_get_page except for the > roots are followed by link_shadow_page... Perhaps if parent_pte != NULL > you can call link_shadow_page directly from kvm_mmu_get_page. The call > would go before the "if (!direct)" and it would subsume all the existing > calls. > > We could probably also warn if > > (parent_pte == NULL) > != (level == vcpu->arch.mmu.root_level) > > in kvm_mmu_get_page. I think we should set the spte after init_shadow_page_table(), and to make this subsume all the existing calls, we need to change the "return sp;" in the for_each_gfn_sp() loop to a goto statement so that the end of this function will become something like this: init_shadow_page(sp); out: if (parent_pte) { mmu_page_add_parent_pte(vcpu, sp, parent_pte); link_shadow_page(parent_pte, sp, accessed); } trace_kvm_mmu_get_page(sp, created); return sp; So, "bool accessed" needs to be passed to kvm_mmu_get_page(). But any way, we need to understand if mmu_page_add_parent_pte() really needs to be placed before the "if (!direct)" block. Takuya -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/