Date: Fri, 25 Sep 2020 14:22:47 -0700
In-Reply-To: <20200925212302.3979661-1-bgardon@google.com>
Message-Id: <20200925212302.3979661-8-bgardon@google.com>
References: <20200925212302.3979661-1-bgardon@google.com>
Subject: [PATCH 07/22] kvm: mmu: Support zapping SPTEs in the TDP MMU
From: Ben Gardon <bgardon@google.com>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Cannon Matthews, Paolo Bonzini, Peter Xu, Sean Christopherson,
    Peter Shier, Peter Feiner, Junaid Shahid, Jim Mattson, Yulei Zhang,
    Wanpeng Li, Vitaly Kuznetsov, Xiao Guangrong, Ben Gardon

Add functions to zap SPTEs to the TDP MMU. These are needed to tear
down TDP MMU roots properly and to implement other MMU functions which
require tearing down mappings. Future patches will add functions to
populate the page tables, but as of this patch there will not be any
work for these functions to do.

Tested by running kvm-unit-tests and KVM selftests on an Intel Haswell
machine. This series introduced no new failures.
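For readers following along, here is a minimal, self-contained sketch
of the contract the new entry points follow (everything below is an
illustrative stand-in, not the kernel code itself): zapping a range
[start, end) clears any present SPTEs and tells the caller whether a
TLB flush is needed before the MMU lock is released.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t gfn_t;

#define NUM_SPTES 16
static uint64_t sptes[NUM_SPTES];	/* stand-in for a page table */

/* Stand-in for is_shadow_present_pte(). */
static bool spte_present(uint64_t spte)
{
	return spte != 0;
}

/*
 * Mirrors the shape of zap_gfn_range() below: clear present SPTEs in
 * [start, end) and return true if a TLB flush is needed.
 */
static bool sketch_zap_gfn_range(gfn_t start, gfn_t end)
{
	bool flush_needed = false;

	for (gfn_t gfn = start; gfn < end && gfn < NUM_SPTES; gfn++) {
		if (!spte_present(sptes[gfn]))
			continue;
		sptes[gfn] = 0;		/* clear the SPTE */
		flush_needed = true;	/* TLBs may now hold stale entries */
	}
	return flush_needed;
}

int main(void)
{
	sptes[3] = 0x1234;
	sptes[7] = 0x5678;

	/* Callers flush only if the zap actually cleared SPTEs. */
	if (sketch_zap_gfn_range(0, NUM_SPTES))
		printf("flush TLBs before releasing the MMU lock\n");
	return 0;
}

In the patch itself the flush can also happen eagerly when the walk
yields the MMU lock (see tdp_mmu_iter_cond_resched() below), which is
why zap_gfn_range() recomputes flush_needed on every iteration.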
This series can be viewed in Gerrit at:
	https://linux-review.googlesource.com/c/virt/kvm/kvm/+/2538

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c      |  15 +++++
 arch/x86/kvm/mmu/tdp_iter.c |  17 ++++++
 arch/x86/kvm/mmu/tdp_iter.h |   1 +
 arch/x86/kvm/mmu/tdp_mmu.c  | 106 ++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/mmu/tdp_mmu.h  |   2 +
 5 files changed, 141 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f09081f9137b0..7a17cca19b0c1 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5852,6 +5852,10 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
 	kvm_reload_remote_mmus(kvm);
 
 	kvm_zap_obsolete_pages(kvm);
+
+	if (kvm->arch.tdp_mmu_enabled)
+		kvm_tdp_mmu_zap_all(kvm);
+
 	spin_unlock(&kvm->mmu_lock);
 }
 
@@ -5892,6 +5896,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 	struct kvm_memslots *slots;
 	struct kvm_memory_slot *memslot;
 	int i;
+	bool flush;
 
 	spin_lock(&kvm->mmu_lock);
 	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
@@ -5911,6 +5916,12 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 		}
 	}
 
+	if (kvm->arch.tdp_mmu_enabled) {
+		flush = kvm_tdp_mmu_zap_gfn_range(kvm, gfn_start, gfn_end);
+		if (flush)
+			kvm_flush_remote_tlbs(kvm);
+	}
+
 	spin_unlock(&kvm->mmu_lock);
 }
 
@@ -6077,6 +6088,10 @@ void kvm_mmu_zap_all(struct kvm *kvm)
 	}
 
 	kvm_mmu_commit_zap_page(kvm, &invalid_list);
+
+	if (kvm->arch.tdp_mmu_enabled)
+		kvm_tdp_mmu_zap_all(kvm);
+
 	spin_unlock(&kvm->mmu_lock);
 }
 
diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c
index ee90d62d2a9b1..6c1a38429c81a 100644
--- a/arch/x86/kvm/mmu/tdp_iter.c
+++ b/arch/x86/kvm/mmu/tdp_iter.c
@@ -161,3 +161,20 @@ void tdp_iter_next(struct tdp_iter *iter)
 		done = try_step_side(iter);
 	}
 }
+
+/*
+ * Restart the walk over the paging structure from the root, starting from the
+ * highest gfn the iterator had previously reached. Assumes that the entire
+ * paging structure, except the root page, may have been completely torn down
+ * and rebuilt.
+ */
+void tdp_iter_refresh_walk(struct tdp_iter *iter)
+{
+	gfn_t goal_gfn = iter->goal_gfn;
+
+	if (iter->gfn > goal_gfn)
+		goal_gfn = iter->gfn;
+
+	tdp_iter_start(iter, iter->pt_path[iter->root_level - 1],
+		       iter->root_level, goal_gfn);
+}
diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index b102109778eac..34da3bdada436 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -49,5 +49,6 @@ u64 *spte_to_child_pt(u64 pte, int level);
 void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level,
 		    gfn_t goal_gfn);
 void tdp_iter_next(struct tdp_iter *iter);
+void tdp_iter_refresh_walk(struct tdp_iter *iter);
 
 #endif /* __KVM_X86_MMU_TDP_ITER_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 653507773b42c..d96fc182c8497 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -2,6 +2,7 @@
 
 #include "mmu.h"
 #include "mmu_internal.h"
+#include "tdp_iter.h"
 #include "tdp_mmu.h"
 
 static bool __read_mostly tdp_mmu_enabled = true;
@@ -57,8 +58,13 @@ bool is_tdp_mmu_root(struct kvm *kvm, hpa_t hpa)
 	return root->tdp_mmu_page;
 }
 
+static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
+			  gfn_t start, gfn_t end);
+
 static void free_tdp_mmu_root(struct kvm *kvm, struct kvm_mmu_page *root)
 {
+	gfn_t max_gfn = 1ULL << (boot_cpu_data.x86_phys_bits - PAGE_SHIFT);
+
 	lockdep_assert_held(&kvm->mmu_lock);
 
 	WARN_ON(root->root_count);
@@ -66,6 +72,8 @@ static void free_tdp_mmu_root(struct kvm *kvm, struct kvm_mmu_page *root)
 
 	list_del(&root->link);
 
+	zap_gfn_range(kvm, root, 0, max_gfn);
+
 	free_page((unsigned long)root->spt);
 	kmem_cache_free(mmu_page_header_cache, root);
 }
@@ -193,6 +201,11 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu)
 static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 				u64 old_spte, u64 new_spte, int level);
 
+static int kvm_mmu_page_as_id(struct kvm_mmu_page *sp)
+{
+	return sp->role.smm ? 1 : 0;
+}
+
 /**
  * handle_changed_spte - handle bookkeeping associated with an SPTE change
  * @kvm: kvm instance
@@ -294,3 +307,96 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 		free_page((unsigned long)pt);
 	}
 }
+
+#define for_each_tdp_pte_root(_iter, _root, _start, _end) \
+	for_each_tdp_pte(_iter, _root->spt, _root->role.level, _start, _end)
+
+/*
+ * If the MMU lock is contended or this thread needs to yield, flushes
+ * the TLBs, releases the MMU lock, yields, reacquires the MMU lock,
+ * restarts the tdp_iter's walk from the root, and returns true.
+ * If no yield is needed, returns false.
+ */
+static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
+{
+	if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
+		kvm_flush_remote_tlbs(kvm);
+		cond_resched_lock(&kvm->mmu_lock);
+		tdp_iter_refresh_walk(iter);
+		return true;
+	} else {
+		return false;
+	}
+}
+
+/*
+ * Tears down the mappings for the range of gfns, [start, end), and frees the
+ * non-root pages mapping GFNs strictly within that range. Returns true if
+ * SPTEs have been cleared and a TLB flush is needed before releasing the
+ * MMU lock.
+ */
+static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
+			  gfn_t start, gfn_t end)
+{
+	struct tdp_iter iter;
+	bool flush_needed = false;
+	int as_id = kvm_mmu_page_as_id(root);
+
+	for_each_tdp_pte_root(iter, root, start, end) {
+		if (!is_shadow_present_pte(iter.old_spte))
+			continue;
+
+		/*
+		 * If this is a non-last-level SPTE that covers a larger range
+		 * than should be zapped, continue, and zap the mappings at a
+		 * lower level.
+		 */
+		if ((iter.gfn < start ||
+		     iter.gfn + KVM_PAGES_PER_HPAGE(iter.level) > end) &&
+		    !is_last_spte(iter.old_spte, iter.level))
+			continue;
+
+		*iter.sptep = 0;
+		handle_changed_spte(kvm, as_id, iter.gfn, iter.old_spte, 0,
+				    iter.level);
+
+		flush_needed = !tdp_mmu_iter_cond_resched(kvm, &iter);
+	}
+	return flush_needed;
+}
+
+/*
+ * Tears down the mappings for the range of gfns, [start, end), and frees the
+ * non-root pages mapping GFNs strictly within that range. Returns true if
+ * SPTEs have been cleared and a TLB flush is needed before releasing the
+ * MMU lock.
+ */
+bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	struct kvm_mmu_page *root;
+	bool flush = false;
+
+	for_each_tdp_mmu_root(kvm, root) {
+		/*
+		 * Take a reference on the root so that it cannot be freed if
+		 * this thread releases the MMU lock and yields in this loop.
+		 */
+		get_tdp_mmu_root(kvm, root);
+
+		flush = zap_gfn_range(kvm, root, start, end) || flush;
+
+		put_tdp_mmu_root(kvm, root);
+	}
+
+	return flush;
+}
+
+void kvm_tdp_mmu_zap_all(struct kvm *kvm)
+{
+	gfn_t max_gfn = 1ULL << (boot_cpu_data.x86_phys_bits - PAGE_SHIFT);
+	bool flush;
+
+	flush = kvm_tdp_mmu_zap_gfn_range(kvm, 0, max_gfn);
+	if (flush)
+		kvm_flush_remote_tlbs(kvm);
+}
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 9274debffeaa1..cb86f9fe69017 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -12,4 +12,6 @@ bool is_tdp_mmu_root(struct kvm *kvm, hpa_t root);
 hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu);
 void kvm_tdp_mmu_put_root_hpa(struct kvm *kvm, hpa_t root_hpa);
 
+bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end);
+void kvm_tdp_mmu_zap_all(struct kvm *kvm);
 #endif /* __KVM_X86_MMU_TDP_MMU_H */
-- 
2.28.0.709.gb0816b6eb0-goog