Date: Wed, 27 Jan 2021 10:23:14 -0800
From: Sean Christopherson
To: David Stevens
Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
    kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Marc Zyngier,
    James Morse, Julien Thierry, Suzuki K Poulose,
    linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu,
    Huacai Chen, Aleksandar Markovic, linux-mips@vger.kernel.org,
    Paul Mackerras, kvm-ppc@vger.kernel.org, Christian Borntraeger,
    Janosch Frank, David Hildenbrand, Cornelia Huck, Claudio Imbrenda
Subject: Re: [PATCH v2] KVM: x86/mmu: consider the hva in mmu_notifier retry
References: <20210127024504.613844-1-stevensd@google.com>
In-Reply-To: <20210127024504.613844-1-stevensd@google.com>

On Wed, Jan 27, 2021, David Stevens wrote:
> From: David Stevens
>
> Track the range being invalidated by mmu_notifier and skip page fault
> retries if the fault address is not affected by the in-progress
> invalidation. Handle concurrent invalidations by finding the minimal
> range which includes all ranges being invalidated. Although the combined
> range may include unrelated addresses and cannot be shrunk as individual
> invalidation operations complete, it is unlikely the marginal gains of
> proper range tracking are worth the additional complexity.
>
> The primary benefit of this change is the reduction in the likelihood of
> extreme latency when handling a page fault due to another thread having
> been preempted while modifying host virtual addresses.
>
> Signed-off-by: David Stevens
> ---
> v1 -> v2:
>  - improve handling of concurrent invalidation requests by unioning
>    ranges, instead of just giving up and using [0, ULONG_MAX).

Ooh, even better.

>  - add lockdep check
>  - code comments and formatting
>
>  arch/powerpc/kvm/book3s_64_mmu_hv.c    |  2 +-
>  arch/powerpc/kvm/book3s_64_mmu_radix.c |  2 +-
>  arch/x86/kvm/mmu/mmu.c                 | 16 ++++++++------
>  arch/x86/kvm/mmu/paging_tmpl.h         |  7 ++++---
>  include/linux/kvm_host.h               | 27 +++++++++++++++++++++++-
>  virt/kvm/kvm_main.c                    | 29 ++++++++++++++++++++++----
>  6 files changed, 67 insertions(+), 16 deletions(-)

...

> @@ -3717,7 +3720,8 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
> 	mmu_seq = vcpu->kvm->mmu_notifier_seq;
> 	smp_rmb();
>
> -	if (try_async_pf(vcpu, prefault, gfn, gpa, &pfn, write, &map_writable))
> +	if (try_async_pf(vcpu, prefault, gfn, gpa, &pfn, &hva,
> +			 write, &map_writable))
> 		return RET_PF_RETRY;
>
> 	if (handle_abnormal_pfn(vcpu, is_tdp ? 0 : gpa, gfn, pfn, ACC_ALL, &r))
> @@ -3725,7 +3729,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
>
> 	r = RET_PF_RETRY;
> 	spin_lock(&vcpu->kvm->mmu_lock);
> -	if (mmu_notifier_retry(vcpu->kvm, mmu_seq))
> +	if (mmu_notifier_retry_hva(vcpu->kvm, mmu_seq, hva))

'hva' will be uninitialized at this point if the gfn did not resolve to a
memslot, i.e. when handling an MMIO page fault.  On the plus side, that's an
opportunity for another optimization, as there is no need to retry MMIO page
faults on mmu_notifier invalidations.  Including the attached patch as a
prereq to this one will avoid consuming an uninitialized 'hva'.

> 		goto out_unlock;
> 	r = make_mmu_pages_available(vcpu);
> 	if (r)

...

>  void kvm_release_pfn_clean(kvm_pfn_t pfn);
>  void kvm_release_pfn_dirty(kvm_pfn_t pfn);
> @@ -1203,6 +1206,28 @@ static inline int mmu_notifier_retry(struct kvm *kvm, unsigned long mmu_seq)
>  		return 1;
>  	return 0;
>  }
> +
> +static inline int mmu_notifier_retry_hva(struct kvm *kvm,
> +					 unsigned long mmu_seq,
> +					 unsigned long hva)
> +{
> +#ifdef CONFIG_LOCKDEP
> +	lockdep_is_held(&kvm->mmu_lock);

No need to manually do the #ifdef, just use lockdep_assert_held() instead of
lockdep_is_held().
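For reference, the assert-based form would look something like this (an
illustrative sketch, not part of the original mail; lockdep_assert_held()
already compiles to a no-op when CONFIG_LOCKDEP is off, so the explicit
#ifdef is unnecessary):

	static inline int mmu_notifier_retry_hva(struct kvm *kvm,
						 unsigned long mmu_seq,
						 unsigned long hva)
	{
		/* Warns (under lockdep) if the caller does not hold mmu_lock. */
		lockdep_assert_held(&kvm->mmu_lock);

		/* Retry if 'hva' falls in a range currently being invalidated. */
		if (unlikely(kvm->mmu_notifier_count) &&
		    kvm->mmu_notifier_range_start <= hva &&
		    hva < kvm->mmu_notifier_range_end)
			return 1;
		/* Retry if an invalidation completed after the fault started. */
		if (kvm->mmu_notifier_seq != mmu_seq)
			return 1;
		return 0;
	}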
> +#endif
> +	/*
> +	 * If mmu_notifier_count is non-zero, then the range maintained by
> +	 * kvm_mmu_notifier_invalidate_range_start contains all addresses that
> +	 * might be being invalidated. Note that it may include some false
> +	 * positives, due to shortcuts when handling concurrent invalidations.
> +	 */
> +	if (unlikely(kvm->mmu_notifier_count) &&
> +	    kvm->mmu_notifier_range_start <= hva &&
> +	    hva < kvm->mmu_notifier_range_end)

Uber nit: I find this easier to read if 'hva' is on the left-hand side for
both checks, i.e.

	if (unlikely(kvm->mmu_notifier_count) &&
	    hva >= kvm->mmu_notifier_range_start &&
	    hva < kvm->mmu_notifier_range_end)

> +		return 1;
> +	if (kvm->mmu_notifier_seq != mmu_seq)
> +		return 1;
> +	return 0;
> +}
>  #endif
>
>  #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING

[-- Attachment: 0001-KVM-x86-mmu-Skip-mmu_notifier-check-when-handling-MM.patch --]

From a1bfdc6fe16582440815cfecc656313dff993003 Mon Sep 17 00:00:00 2001
From: Sean Christopherson
Date: Wed, 27 Jan 2021 10:04:45 -0800
Subject: [PATCH] KVM: x86/mmu: Skip mmu_notifier check when handling MMIO
 page fault

Don't retry a page fault due to an mmu_notifier invalidation when
handling a page fault for a GPA that did not resolve to a memslot, i.e.
an MMIO page fault.  Invalidations from the mmu_notifier signal a change
in a host virtual address (HVA) mapping; without a memslot, there is no
HVA and thus no possibility that the invalidation is relevant to the
page fault being handled.

Note, the MMIO vs. memslot generation checks handle the case where a
pending memslot will create a memslot overlapping the faulting GPA.  The
mmu_notifier checks are orthogonal to memslot updates.

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu/mmu.c         | 2 +-
 arch/x86/kvm/mmu/paging_tmpl.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6d16481aa29d..9ac0a727015d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3725,7 +3725,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 
 	r = RET_PF_RETRY;
 	spin_lock(&vcpu->kvm->mmu_lock);
-	if (mmu_notifier_retry(vcpu->kvm, mmu_seq))
+	if (!is_noslot_pfn(pfn) && mmu_notifier_retry(vcpu->kvm, mmu_seq))
 		goto out_unlock;
 	r = make_mmu_pages_available(vcpu);
 	if (r)
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 50e268eb8e1a..ab54263d857c 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -869,7 +869,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code,
 
 	r = RET_PF_RETRY;
 	spin_lock(&vcpu->kvm->mmu_lock);
-	if (mmu_notifier_retry(vcpu->kvm, mmu_seq))
+	if (!is_noslot_pfn(pfn) && mmu_notifier_retry(vcpu->kvm, mmu_seq))
 		goto out_unlock;
 
 	kvm_mmu_audit(vcpu, AUDIT_PRE_PAGE_FAULT);
-- 
2.30.0.280.ga3ce27912f-goog
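The virt/kvm/kvm_main.c hunk that maintains the range is elided above, but as
a sketch of the unioning approach the commit message describes (the helper
name below is hypothetical; the field names follow the quoted kvm_host.h
hunk), the invalidate_range_start path could track the minimal covering range
like so:

	/*
	 * Hypothetical helper invoked from
	 * kvm_mmu_notifier_invalidate_range_start() with mmu_lock held.
	 */
	static void kvm_track_invalidate_range(struct kvm *kvm,
					       unsigned long start,
					       unsigned long end)
	{
		kvm->mmu_notifier_count++;
		if (likely(kvm->mmu_notifier_count == 1)) {
			/* Sole in-flight invalidation: track it exactly. */
			kvm->mmu_notifier_range_start = start;
			kvm->mmu_notifier_range_end = end;
		} else {
			/*
			 * Concurrent invalidations: widen to the minimal range
			 * covering all of them.  The union may include unrelated
			 * addresses (the "false positives" noted above) and is
			 * not shrunk until the count drops back to zero.
			 */
			kvm->mmu_notifier_range_start =
				min(kvm->mmu_notifier_range_start, start);
			kvm->mmu_notifier_range_end =
				max(kvm->mmu_notifier_range_end, end);
		}
	}

On the invalidate_range_end side, mmu_notifier_count is decremented and
mmu_notifier_seq is bumped, which is what the seq check in
mmu_notifier_retry_hva() observes.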