References: <20200925212302.3979661-1-bgardon@google.com> <20200925212302.3979661-11-bgardon@google.com> <20200930163740.GD32672@linux.intel.com>
In-Reply-To: <20200930163740.GD32672@linux.intel.com>
From: Ben Gardon
Date: Tue, 6 Oct 2020 15:33:21 -0700
Subject: Re: [PATCH 10/22] kvm: mmu: Add TDP MMU PF handler
To: Sean Christopherson
Cc: LKML, kvm, Cannon Matthews, Paolo Bonzini, Peter Xu, Peter Shier,
    Peter Feiner, Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li,
    Vitaly Kuznetsov, Xiao Guangrong

On Wed, Sep 30, 2020 at 9:37 AM Sean Christopherson wrote:
>
> On Fri, Sep 25, 2020 at 02:22:50PM -0700, Ben Gardon wrote:
> > @@ -4113,8 +4088,9 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
> >         if (page_fault_handle_page_track(vcpu, error_code, gfn))
> >                 return RET_PF_EMULATE;
> >
> > -       if (fast_page_fault(vcpu, gpa, error_code))
> > -               return RET_PF_RETRY;
> > +       if (!is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
> > +               if (fast_page_fault(vcpu, gpa, error_code))
> > +                       return RET_PF_RETRY;
>
> It'll probably be easier to handle is_tdp_mmu() in fast_page_fault().

I'd prefer to keep this check here: that way, in the fast page fault
path we can handle the case where we do have a TDP MMU root with a TDP
MMU fast PF handler, mirroring the split below between __direct_map and
the TDP MMU PF handler.
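Concretely, something like this (rough sketch only; a TDP MMU fast PF
handler doesn't exist yet in this series, so kvm_tdp_mmu_fast_page_fault
is a made-up name for illustration):

        /*
         * Sketch: dispatch on the root type once, so each MMU gets its
         * own fast-path handler, mirroring the __direct_map /
         * kvm_tdp_mmu_page_fault split below.
         */
        if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa)) {
                /* Hypothetical name, not part of this series. */
                if (kvm_tdp_mmu_fast_page_fault(vcpu, gpa, error_code))
                        return RET_PF_RETRY;
        } else {
                if (fast_page_fault(vcpu, gpa, error_code))
                        return RET_PF_RETRY;
        }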
> >
> >         r = mmu_topup_memory_caches(vcpu, false);
> >         if (r)
> > @@ -4139,8 +4115,14 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
> >         r = make_mmu_pages_available(vcpu);
> >         if (r)
> >                 goto out_unlock;
> > -       r = __direct_map(vcpu, gpa, write, map_writable, max_level, pfn,
> > -                        prefault, is_tdp && lpage_disallowed);
> > +
> > +       if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
> > +               r = kvm_tdp_mmu_page_fault(vcpu, write, map_writable, max_level,
> > +                                          gpa, pfn, prefault,
> > +                                          is_tdp && lpage_disallowed);
> > +       else
> > +               r = __direct_map(vcpu, gpa, write, map_writable, max_level, pfn,
> > +                                prefault, is_tdp && lpage_disallowed);
> >
> >  out_unlock:
> >         spin_unlock(&vcpu->kvm->mmu_lock);

...

> > +/*
> > + * Handle a TDP page fault (NPT/EPT violation/misconfiguration) by installing
> > + * page tables and SPTEs to translate the faulting guest physical address.
> > + */
> > +int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu, int write, int map_writable,
> > +                          int max_level, gpa_t gpa, kvm_pfn_t pfn,
> > +                          bool prefault, bool account_disallowed_nx_lpage)
> > +{
> > +       struct tdp_iter iter;
> > +       struct kvm_mmu_memory_cache *pf_pt_cache =
> > +                       &vcpu->arch.mmu_shadow_page_cache;
> > +       u64 *child_pt;
> > +       u64 new_spte;
> > +       int ret;
> > +       int as_id = kvm_arch_vcpu_memslots_id(vcpu);
> > +       gfn_t gfn = gpa >> PAGE_SHIFT;
> > +       int level;
> > +
> > +       if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa)))
> > +               return RET_PF_RETRY;
>
> I feel like we should kill off these silly WARNs in the existing code instead
> of adding more.  If they actually fired, I'm pretty sure they would continue
> firing and spamming the kernel log until the VM is killed, as I don't see how
> restarting the guest will magically fix anything.
>
> > +
> > +       if (WARN_ON(!is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa)))
> > +               return RET_PF_RETRY;
>
> This seems especially gratuitous; this has exactly one caller, which
> explicitly checks is_tdp_mmu_root().  Again, if this fires it will spam the
> kernel log into submission.
>
> > +
> > +       level = kvm_mmu_hugepage_adjust(vcpu, gfn, max_level, &pfn);
> > +
> > +       for_each_tdp_pte_vcpu(iter, vcpu, gfn, gfn + 1) {
> > +               disallowed_hugepage_adjust(iter.old_spte, gfn, iter.level,
> > +                                          &pfn, &level);
> > +
> > +               if (iter.level == level)
> > +                       break;
> > +
> > +               /*
> > +                * If there is an SPTE mapping a large page at a higher level
> > +                * than the target, that SPTE must be cleared and replaced
> > +                * with a non-leaf SPTE.
> > +                */
> > +               if (is_shadow_present_pte(iter.old_spte) &&
> > +                   is_large_pte(iter.old_spte)) {
> > +                       *iter.sptep = 0;
> > +                       handle_changed_spte(vcpu->kvm, as_id, iter.gfn,
> > +                                           iter.old_spte, 0, iter.level);
> > +                       kvm_flush_remote_tlbs_with_address(vcpu->kvm, iter.gfn,
> > +                                       KVM_PAGES_PER_HPAGE(iter.level));
> > +
> > +                       /*
> > +                        * The iter must explicitly re-read the spte here
> > +                        * because the new value is needed before the next
> > +                        * iteration of the loop.
> > +                        */
>
> I think it'd be better to explicitly, and simply, call out that iter.old_spte
> is consumed below.  It's subtle enough to warrant a comment, but the comment
> didn't actually help me.  Maybe something like:
>
>                         /*
>                          * Refresh iter.old_spte, it will trigger the !present
>                          * path below.
>                          */
>

That's a good point; calling out the relation to the present check
below is much clearer.

> > +                       iter.old_spte = READ_ONCE(*iter.sptep);
> > +               }
> > +
> > +               if (!is_shadow_present_pte(iter.old_spte)) {
> > +                       child_pt = kvm_mmu_memory_cache_alloc(pf_pt_cache);
> > +                       clear_page(child_pt);
> > +                       new_spte = make_nonleaf_spte(child_pt,
> > +                                                    !shadow_accessed_mask);
> > +
> > +                       *iter.sptep = new_spte;
> > +                       handle_changed_spte(vcpu->kvm, as_id, iter.gfn,
> > +                                           iter.old_spte, new_spte,
> > +                                           iter.level);
> > +               }
> > +       }
> > +
> > +       if (WARN_ON(iter.level != level))
> > +               return RET_PF_RETRY;
>
> This also seems unnecessary.  Or maybe these are all good candidates for
> KVM_BUG_ON...
>

I've replaced all these warnings with KVM_BUG_ONs.
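For reference, the result looks roughly like this (sketch only; this
assumes a KVM_BUG_ON(cond, kvm) macro along the lines Sean mentions,
i.e. one that evaluates to the condition and marks the VM as bugged, so
the check can't keep re-firing on every fault):

        /* Assumed KVM_BUG_ON() kills the VM, so these can't spam the log. */
        if (KVM_BUG_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa), vcpu->kvm))
                return RET_PF_RETRY;

        if (KVM_BUG_ON(!is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa),
                       vcpu->kvm))
                return RET_PF_RETRY;

        ...

        /* Likewise for the post-loop consistency check. */
        if (KVM_BUG_ON(iter.level != level, vcpu->kvm))
                return RET_PF_RETRY;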
> > +
> > +       ret = page_fault_handle_target_level(vcpu, write, map_writable,
> > +                                            as_id, &iter, pfn, prefault);
> > +
> > +       /* If emulating, flush this vcpu's TLB. */
>
> Why?  It's obvious _what_ the code is doing; the comment should explain _why_.

> > +       if (ret == RET_PF_EMULATE)
> > +               kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
> > +
> > +       return ret;
> > +}
> > diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
> > index cb86f9fe69017..abf23dc0ab7ad 100644
> > --- a/arch/x86/kvm/mmu/tdp_mmu.h
> > +++ b/arch/x86/kvm/mmu/tdp_mmu.h
> > @@ -14,4 +14,8 @@ void kvm_tdp_mmu_put_root_hpa(struct kvm *kvm, hpa_t root_hpa);
> >
> >  bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end);
> >  void kvm_tdp_mmu_zap_all(struct kvm *kvm);
> > +
> > +int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu, int write, int map_writable,
> > +                          int level, gpa_t gpa, kvm_pfn_t pfn, bool prefault,
> > +                          bool lpage_disallowed);
> >  #endif /* __KVM_X86_MMU_TDP_MMU_H */
> > --
> > 2.28.0.709.gb0816b6eb0-goog
> >