Date: Wed, 14 Oct 2020 11:26:52 -0700
In-Reply-To: <20201014182700.2888246-1-bgardon@google.com>
Message-Id: <20201014182700.2888246-13-bgardon@google.com>
MIME-Version: 1.0
References: <20201014182700.2888246-1-bgardon@google.com>
X-Mailer: git-send-email 2.28.0.1011.ga647a8990f-goog
Subject: [PATCH v2 12/20] kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU
From: Ben Gardon <bgardon@google.com>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Cannon Matthews, Paolo Bonzini, Peter Xu, Sean Christopherson,
	Peter Shier, Peter Feiner, Junaid Shahid, Jim Mattson, Yulei Zhang,
	Wanpeng Li, Vitaly Kuznetsov, Xiao Guangrong, Ben Gardon
Content-Type: text/plain; charset="UTF-8"

In order to interoperate correctly with the rest of KVM and other Linux
subsystems, the TDP MMU must correctly handle various MMU notifiers. Add
hooks to handle the invalidate range family of MMU notifiers.

Tested by running kvm-unit-tests and KVM selftests on an Intel Haswell
machine. This series introduced no new failures.
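As background for the diff below, the new kvm_tdp_mmu_handle_hva_range()
must clamp the notifier's HVA range to each memslot and convert it to a
GFN range before zapping. The following is a minimal userspace model of
just that arithmetic, not kernel code: struct memslot, hva_to_gfn(), and
the sample values are simplified stand-ins for the real kvm_memory_slot
and hva_to_gfn_memslot() definitions.

#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Simplified stand-in for struct kvm_memory_slot. */
struct memslot {
	unsigned long base_gfn;       /* first guest frame number in the slot */
	unsigned long userspace_addr; /* host virtual address backing the slot */
	unsigned long npages;         /* slot size in pages */
};

/* Models hva_to_gfn_memslot(): offset into the slot, in pages. */
static unsigned long hva_to_gfn(unsigned long hva, const struct memslot *slot)
{
	return slot->base_gfn + ((hva - slot->userspace_addr) >> PAGE_SHIFT);
}

int main(void)
{
	struct memslot slot = {
		.base_gfn = 0x100,
		.userspace_addr = 0x7f0000000000UL,
		.npages = 512,
	};
	/* A notifier range [start, end) that overlaps part of the slot. */
	unsigned long start = 0x7f0000003800UL;
	unsigned long end = 0x7f0000100000UL;

	/* Clamp the HVA range to the portion backed by this memslot. */
	unsigned long slot_end = slot.userspace_addr +
				 (slot.npages << PAGE_SHIFT);
	unsigned long hva_start = start > slot.userspace_addr ?
				  start : slot.userspace_addr;
	unsigned long hva_end = end < slot_end ? end : slot_end;

	if (hva_start >= hva_end)
		return 0;	/* no overlap; this slot needs no zapping */

	/*
	 * gfn_end is exclusive. Rounding hva_end up by PAGE_SIZE - 1 before
	 * converting ensures a partially covered final page is included.
	 */
	unsigned long gfn_start = hva_to_gfn(hva_start, &slot);
	unsigned long gfn_end = hva_to_gfn(hva_end + PAGE_SIZE - 1, &slot);

	printf("zap GFNs [0x%lx, 0x%lx)\n", gfn_start, gfn_end);
	return 0;
}

For these sample values the model prints "zap GFNs [0x103, 0x200)": HVA
offset 0x3800 falls mid-page 3, so that partially covered page is
included, and the exclusive end lands on the page after the last byte
invalidated.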
This series can be viewed in Gerrit at:
	https://linux-review.googlesource.com/c/virt/kvm/kvm/+/2538

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c     |  9 ++++-
 arch/x86/kvm/mmu/tdp_mmu.c | 80 +++++++++++++++++++++++++++++++++++---
 arch/x86/kvm/mmu/tdp_mmu.h |  2 +
 3 files changed, 85 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 421a12a247b67..00534133f99fc 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1781,7 +1781,14 @@ static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
 int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end,
 			unsigned flags)
 {
-	return kvm_handle_hva_range(kvm, start, end, 0, kvm_unmap_rmapp);
+	int r;
+
+	r = kvm_handle_hva_range(kvm, start, end, 0, kvm_unmap_rmapp);
+
+	if (kvm->arch.tdp_mmu_enabled)
+		r |= kvm_tdp_mmu_zap_hva_range(kvm, start, end);
+
+	return r;
 }
 
 int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 78d41a1949651..9ec6c26ed6619 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -58,7 +58,7 @@ bool is_tdp_mmu_root(struct kvm *kvm, hpa_t hpa)
 }
 
 static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
-			  gfn_t start, gfn_t end);
+			  gfn_t start, gfn_t end, bool can_yield);
 
 void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root)
 {
@@ -71,7 +71,7 @@ void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root)
 
 	list_del(&root->link);
 
-	zap_gfn_range(kvm, root, 0, max_gfn);
+	zap_gfn_range(kvm, root, 0, max_gfn, false);
 
 	free_page((unsigned long)root->spt);
 	kmem_cache_free(mmu_page_header_cache, root);
@@ -318,9 +318,14 @@ static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
  * non-root pages mapping GFNs strictly within that range. Returns true if
  * SPTEs have been cleared and a TLB flush is needed before releasing the
  * MMU lock.
+ * If can_yield is true, will release the MMU lock and reschedule if the
+ * scheduler needs the CPU or there is contention on the MMU lock. If this
+ * function cannot yield, it will not release the MMU lock or reschedule and
+ * the caller must ensure it does not supply too large a GFN range, or the
+ * operation can cause a soft lockup.
  */
 static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
-			  gfn_t start, gfn_t end)
+			  gfn_t start, gfn_t end, bool can_yield)
 {
 	struct tdp_iter iter;
 	bool flush_needed = false;
@@ -341,7 +346,10 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 
 		tdp_mmu_set_spte(kvm, &iter, 0);
 
-		flush_needed = !tdp_mmu_iter_cond_resched(kvm, &iter);
+		if (can_yield)
+			flush_needed = !tdp_mmu_iter_cond_resched(kvm, &iter);
+		else
+			flush_needed = true;
 	}
 	return flush_needed;
 }
@@ -364,7 +372,7 @@ bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end)
 		 */
 		get_tdp_mmu_root(kvm, root);
 
-		flush |= zap_gfn_range(kvm, root, start, end);
+		flush |= zap_gfn_range(kvm, root, start, end, true);
 
 		put_tdp_mmu_root(kvm, root);
 	}
@@ -502,3 +510,65 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 
 	return ret;
 }
+
+static int kvm_tdp_mmu_handle_hva_range(struct kvm *kvm, unsigned long start,
+		unsigned long end, unsigned long data,
+		int (*handler)(struct kvm *kvm, struct kvm_memory_slot *slot,
+			       struct kvm_mmu_page *root, gfn_t start,
+			       gfn_t end, unsigned long data))
+{
+	struct kvm_memslots *slots;
+	struct kvm_memory_slot *memslot;
+	struct kvm_mmu_page *root;
+	int ret = 0;
+	int as_id;
+
+	for_each_tdp_mmu_root(kvm, root) {
+		/*
+		 * Take a reference on the root so that it cannot be freed if
+		 * this thread releases the MMU lock and yields in this loop.
+		 */
+		get_tdp_mmu_root(kvm, root);
+
+		as_id = kvm_mmu_page_as_id(root);
+		slots = __kvm_memslots(kvm, as_id);
+		kvm_for_each_memslot(memslot, slots) {
+			unsigned long hva_start, hva_end;
+			gfn_t gfn_start, gfn_end;
+
+			hva_start = max(start, memslot->userspace_addr);
+			hva_end = min(end, memslot->userspace_addr +
+				      (memslot->npages << PAGE_SHIFT));
+			if (hva_start >= hva_end)
+				continue;
+			/*
+			 * {gfn(page) | page intersects with [hva_start,
+			 *  hva_end)} = {gfn_start, gfn_start+1, ..., gfn_end-1}.
+			 */
+			gfn_start = hva_to_gfn_memslot(hva_start, memslot);
+			gfn_end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1,
+						     memslot);
+
+			ret |= handler(kvm, memslot, root, gfn_start,
+				       gfn_end, data);
+		}
+
+		put_tdp_mmu_root(kvm, root);
+	}
+
+	return ret;
+}
+
+static int zap_gfn_range_hva_wrapper(struct kvm *kvm,
+				     struct kvm_memory_slot *slot,
+				     struct kvm_mmu_page *root, gfn_t start,
+				     gfn_t end, unsigned long unused)
+{
+	return zap_gfn_range(kvm, root, start, end, false);
+}
+
+int kvm_tdp_mmu_zap_hva_range(struct kvm *kvm, unsigned long start,
+			      unsigned long end)
+{
+	return kvm_tdp_mmu_handle_hva_range(kvm, start, end, 0,
+					    zap_gfn_range_hva_wrapper);
+}
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 4d111a4dd332f..026ceb6284102 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -19,4 +19,6 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 		    int map_writable, int max_level, kvm_pfn_t pfn,
 		    bool prefault, bool is_tdp);
 
+int kvm_tdp_mmu_zap_hva_range(struct kvm *kvm, unsigned long start,
+			      unsigned long end);
 #endif /* __KVM_X86_MMU_TDP_MMU_H */
-- 
2.28.0.1011.ga647a8990f-goog