Date: Wed, 14 Oct 2020 11:26:47 -0700
In-Reply-To: <20201014182700.2888246-1-bgardon@google.com>
Message-Id: <20201014182700.2888246-8-bgardon@google.com>
References: <20201014182700.2888246-1-bgardon@google.com>
Subject: [PATCH v2 07/20] kvm: x86/mmu: Support zapping SPTEs in the TDP MMU
From: Ben Gardon <bgardon@google.com>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Cannon Matthews, Paolo Bonzini, Peter Xu, Sean Christopherson,
    Peter Shier, Peter Feiner, Junaid Shahid, Jim Mattson, Yulei Zhang,
    Wanpeng Li, Vitaly Kuznetsov, Xiao Guangrong, Ben Gardon

Add functions to zap SPTEs to the TDP MMU. These are needed to tear down
TDP MMU roots properly and to implement other MMU functions which require
tearing down mappings. Future patches will add functions to populate the
page tables, but as of this patch there is not yet any work for these
functions to do.

Tested by running kvm-unit-tests and KVM selftests on an Intel Haswell
machine. This series introduced no new failures.
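For context, here is a minimal sketch of how a caller would be expected to
use the new interface. This is illustration only, not part of the patch:
example_zap_memslot() is a hypothetical helper, and the sketch assumes the
locking rule this patch establishes, namely that kvm_tdp_mmu_zap_gfn_range()
is called under kvm->mmu_lock and returns true when a TLB flush is needed.

/*
 * Illustrative sketch only (hypothetical helper, not added by this series):
 * zap the SPTEs backing a memslot via the new TDP MMU interface.
 */
static void example_zap_memslot(struct kvm *kvm, struct kvm_memory_slot *slot)
{
	bool flush;

	if (!kvm->arch.tdp_mmu_enabled)
		return;

	spin_lock(&kvm->mmu_lock);
	/* Zap [base_gfn, base_gfn + npages); flush before dropping the lock. */
	flush = kvm_tdp_mmu_zap_gfn_range(kvm, slot->base_gfn,
					  slot->base_gfn + slot->npages);
	if (flush)
		kvm_flush_remote_tlbs(kvm);
	spin_unlock(&kvm->mmu_lock);
}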
This series can be viewed in Gerrit at:
https://linux-review.googlesource.com/c/virt/kvm/kvm/+/2538

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c      |  15 +++++
 arch/x86/kvm/mmu/tdp_iter.c |   5 ++
 arch/x86/kvm/mmu/tdp_iter.h |   1 +
 arch/x86/kvm/mmu/tdp_mmu.c  | 109 ++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/mmu/tdp_mmu.h  |   2 +
 5 files changed, 132 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 8bf20723c6177..337ab6823e312 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5787,6 +5787,10 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
 	kvm_reload_remote_mmus(kvm);
 
 	kvm_zap_obsolete_pages(kvm);
+
+	if (kvm->arch.tdp_mmu_enabled)
+		kvm_tdp_mmu_zap_all(kvm);
+
 	spin_unlock(&kvm->mmu_lock);
 }
 
@@ -5827,6 +5831,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 	struct kvm_memslots *slots;
 	struct kvm_memory_slot *memslot;
 	int i;
+	bool flush;
 
 	spin_lock(&kvm->mmu_lock);
 	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
@@ -5846,6 +5851,12 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 		}
 	}
 
+	if (kvm->arch.tdp_mmu_enabled) {
+		flush = kvm_tdp_mmu_zap_gfn_range(kvm, gfn_start, gfn_end);
+		if (flush)
+			kvm_flush_remote_tlbs(kvm);
+	}
+
 	spin_unlock(&kvm->mmu_lock);
 }
 
@@ -6012,6 +6023,10 @@ void kvm_mmu_zap_all(struct kvm *kvm)
 	}
 
 	kvm_mmu_commit_zap_page(kvm, &invalid_list);
+
+	if (kvm->arch.tdp_mmu_enabled)
+		kvm_tdp_mmu_zap_all(kvm);
+
 	spin_unlock(&kvm->mmu_lock);
 }
 
diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c
index b07e9f0c5d4aa..701eb753b701e 100644
--- a/arch/x86/kvm/mmu/tdp_iter.c
+++ b/arch/x86/kvm/mmu/tdp_iter.c
@@ -174,3 +174,8 @@ void tdp_iter_refresh_walk(struct tdp_iter *iter)
 	tdp_iter_start(iter, iter->pt_path[iter->root_level - 1],
 		       iter->root_level, iter->min_level, goal_gfn);
 }
+
+u64 *tdp_iter_root_pt(struct tdp_iter *iter)
+{
+	return iter->pt_path[iter->root_level - 1];
+}
diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index d629a53e1b73f..884ed2c70bfed 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -52,5 +52,6 @@ void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level,
 		    int min_level, gfn_t goal_gfn);
 void tdp_iter_next(struct tdp_iter *iter);
 void tdp_iter_refresh_walk(struct tdp_iter *iter);
+u64 *tdp_iter_root_pt(struct tdp_iter *iter);
 
 #endif /* __KVM_X86_MMU_TDP_ITER_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index f2bd3a6928ce9..9b5cd4a832f1a 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -56,8 +56,13 @@ bool is_tdp_mmu_root(struct kvm *kvm, hpa_t hpa)
 	return sp->tdp_mmu_page && sp->root_count;
 }
 
+static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
+			  gfn_t start, gfn_t end);
+
 void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root)
 {
+	gfn_t max_gfn = 1ULL << (boot_cpu_data.x86_phys_bits - PAGE_SHIFT);
+
 	lockdep_assert_held(&kvm->mmu_lock);
 
 	WARN_ON(root->root_count);
@@ -65,6 +70,8 @@ void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root)
 
 	list_del(&root->link);
 
+	zap_gfn_range(kvm, root, 0, max_gfn);
+
 	free_page((unsigned long)root->spt);
 	kmem_cache_free(mmu_page_header_cache, root);
 }
@@ -155,6 +162,11 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu)
 static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 				u64 old_spte, u64 new_spte, int level);
 
+static int kvm_mmu_page_as_id(struct kvm_mmu_page *sp)
+{
+	return sp->role.smm ? 1 : 0;
+}
+
 /**
  * handle_changed_spte - handle bookkeeping associated with an SPTE change
  * @kvm: kvm instance
@@ -262,3 +274,100 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 {
 	__handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level);
 }
+
+static inline void tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
+				    u64 new_spte)
+{
+	u64 *root_pt = tdp_iter_root_pt(iter);
+	struct kvm_mmu_page *root = sptep_to_sp(root_pt);
+	int as_id = kvm_mmu_page_as_id(root);
+
+	*iter->sptep = new_spte;
+
+	handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte,
+			    iter->level);
+}
+
+#define tdp_root_for_each_pte(_iter, _root, _start, _end) \
+	for_each_tdp_pte(_iter, _root->spt, _root->role.level, _start, _end)
+
+static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
+{
+	if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
+		kvm_flush_remote_tlbs(kvm);
+		cond_resched_lock(&kvm->mmu_lock);
+		tdp_iter_refresh_walk(iter);
+		return true;
+	}
+
+	return false;
+}
+
+/*
+ * Tears down the mappings for the range of gfns, [start, end), and frees the
+ * non-root pages mapping GFNs strictly within that range. Returns true if
+ * SPTEs have been cleared and a TLB flush is needed before releasing the
+ * MMU lock.
+ */
+static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
+			  gfn_t start, gfn_t end)
+{
+	struct tdp_iter iter;
+	bool flush_needed = false;
+
+	tdp_root_for_each_pte(iter, root, start, end) {
+		if (!is_shadow_present_pte(iter.old_spte))
+			continue;
+
+		/*
+		 * If this is a non-last-level SPTE that covers a larger range
+		 * than should be zapped, continue, and zap the mappings at a
+		 * lower level.
+		 */
+		if ((iter.gfn < start ||
+		     iter.gfn + KVM_PAGES_PER_HPAGE(iter.level) > end) &&
+		    !is_last_spte(iter.old_spte, iter.level))
+			continue;
+
+		tdp_mmu_set_spte(kvm, &iter, 0);
+
+		flush_needed = !tdp_mmu_iter_cond_resched(kvm, &iter);
+	}
+	return flush_needed;
+}
+
+/*
+ * Tears down the mappings for the range of gfns, [start, end), and frees the
+ * non-root pages mapping GFNs strictly within that range. Returns true if
+ * SPTEs have been cleared and a TLB flush is needed before releasing the
+ * MMU lock.
+ */
+bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	struct kvm_mmu_page *root;
+	bool flush = false;
+
+	for_each_tdp_mmu_root(kvm, root) {
+		/*
+		 * Take a reference on the root so that it cannot be freed if
+		 * this thread releases the MMU lock and yields in this loop.
+		 */
+		get_tdp_mmu_root(kvm, root);
+
+		flush |= zap_gfn_range(kvm, root, start, end);
+
+		put_tdp_mmu_root(kvm, root);
+	}
+
+	return flush;
+}
+
+void kvm_tdp_mmu_zap_all(struct kvm *kvm)
+{
+	gfn_t max_gfn = 1ULL << (boot_cpu_data.x86_phys_bits - PAGE_SHIFT);
+	bool flush;
+
+	flush = kvm_tdp_mmu_zap_gfn_range(kvm, 0, max_gfn);
+	if (flush)
+		kvm_flush_remote_tlbs(kvm);
+}
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index ac0ef91294420..6de2d007fc03c 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -12,4 +12,6 @@ bool is_tdp_mmu_root(struct kvm *kvm, hpa_t root);
 hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu);
 void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root);
 
+bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end);
+void kvm_tdp_mmu_zap_all(struct kvm *kvm);
 #endif /* __KVM_X86_MMU_TDP_MMU_H */
-- 
2.28.0.1011.ga647a8990f-goog
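
A note on the arithmetic in kvm_tdp_mmu_zap_all(): max_gfn is derived from
the CPU's physical address width, so with 46 physical address bits and 4 KiB
pages, max_gfn = 1ULL << (46 - 12) = 2^34. The standalone sketch below
restates the zap-or-recurse check from zap_gfn_range() in isolation;
pages_per_hpage() and can_zap_at_level() are hypothetical names used only
for illustration, not code from this series.

#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/*
 * x86 long mode: an entry at level N maps 512^(N-1) 4 KiB pages,
 * mirroring what KVM_PAGES_PER_HPAGE() computes.
 */
static gfn_t pages_per_hpage(int level)
{
	return 1ULL << ((level - 1) * 9);
}

/*
 * Restates the condition in zap_gfn_range(): a last-level SPTE can always
 * be zapped, since it maps exactly the GFN being visited. A non-last-level
 * SPTE may only be zapped if the range it maps, [gfn, gfn +
 * pages_per_hpage(level)), lies entirely inside [start, end); otherwise
 * the iterator must descend and zap at a lower level so that GFNs outside
 * the requested range keep their mappings.
 */
static bool can_zap_at_level(gfn_t gfn, int level, bool is_last_level,
			     gfn_t start, gfn_t end)
{
	if (is_last_level)
		return true;

	return gfn >= start && gfn + pages_per_hpage(level) <= end;
}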