Date: Fri, 12 Apr 2024 13:28:40 -0700
From: David Matlack
To: James Houghton
Cc: Andrew Morton, Paolo Bonzini, Yu Zhao, Marc Zyngier, Oliver Upton,
	Sean Christopherson, Jonathan Corbet, James Morse, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, "H. Peter Anvin",
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, Shaoqin Huang,
	Gavin Shan, Ricardo Koller, Raghavendra Rao Ananta, Ryan Roberts,
	David Rientjes, Axel Rasmussen, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	kvm@vger.kernel.org, linux-mm@kvack.org,
	linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH v3 3/7] KVM: Add basic bitmap support into kvm_mmu_notifier_test/clear_young
Message-ID:
References: <20240401232946.1837665-1-jthoughton@google.com>
	<20240401232946.1837665-4-jthoughton@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20240401232946.1837665-4-jthoughton@google.com>

On 2024-04-01 11:29 PM, James Houghton wrote:
> Add kvm_arch_prepare_bitmap_age() for architectures to indicate that
> they support bitmap-based aging in kvm_mmu_notifier_test_clear_young()
> and that they do not need KVM to grab the MMU lock for writing. This
> function allows architectures to do other locking or other preparatory
> work that it needs.

There's a lot going on here.
I know it's extra work, but I think the series would be easier to
understand and simpler if you introduced the KVM support for lockless
test/clear_young() first, and then introduced support for the
bitmap-based look-around. Specifically:

 1. Make all test/clear_young() notifiers lockless, i.e. move the
    mmu_lock into the architecture-specific code (kvm_age_gfn() and
    kvm_test_age_gfn()).
 2. Convert KVM/x86's kvm_{test_,}age_gfn() to be lockless for the TDP
    MMU.
 3. Convert KVM/arm64's kvm_{test_,}age_gfn() to hold the mmu_lock in
    read-mode.
 4. Add bitmap-based look-around support to KVM/x86 and KVM/arm64
    (probably 2-3 patches).

>
> If an architecture does not implement kvm_arch_prepare_bitmap_age() or
> is unable to do bitmap-based aging at runtime (and marks the bitmap as
> unreliable):
>  1. If a bitmap was provided, we inform the caller that the bitmap is
>     unreliable (MMU_NOTIFIER_YOUNG_BITMAP_UNRELIABLE).
>  2. If a bitmap was not provided, fall back to the old logic.
>
> Also add logic for architectures to easily use the provided bitmap if
> they are able. The expectation is that the architecture's implementation
> of kvm_gfn_test_age() will use kvm_gfn_record_young(), and
> kvm_gfn_age() will use kvm_gfn_should_age().
>
> Suggested-by: Yu Zhao
> Signed-off-by: James Houghton
> ---
>  include/linux/kvm_host.h | 60 ++++++++++++++++++++++++++
>  virt/kvm/kvm_main.c      | 92 +++++++++++++++++++++++++++++-----------
>  2 files changed, 127 insertions(+), 25 deletions(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 1800d03a06a9..5862fd7b5f9b 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1992,6 +1992,26 @@ extern const struct _kvm_stats_desc kvm_vm_stats_desc[];
>  extern const struct kvm_stats_header kvm_vcpu_stats_header;
>  extern const struct _kvm_stats_desc kvm_vcpu_stats_desc[];
>
> +/*
> + * Architectures that support using bitmaps for kvm_age_gfn() and
> + * kvm_test_age_gfn should return true for kvm_arch_prepare_bitmap_age()
> + * and do any work they need to prepare. The subsequent walk will not
> + * automatically grab the KVM MMU lock, so some architectures may opt
> + * to grab it.
> + *
> + * If true is returned, a subsequent call to kvm_arch_finish_bitmap_age() is
> + * guaranteed.
> + */
> +#ifndef kvm_arch_prepare_bitmap_age
> +static inline bool kvm_arch_prepare_bitmap_age(struct mmu_notifier *mn)

I find the name of these architecture callbacks misleading/confusing. The
lockless path is used even when a bitmap is not provided, i.e. bitmap can
be NULL in between kvm_arch_prepare/finish_bitmap_age().

> +{
> +	return false;
> +}
> +#endif
> +#ifndef kvm_arch_finish_bitmap_age
> +static inline void kvm_arch_finish_bitmap_age(struct mmu_notifier *mn) {}
> +#endif

kvm_arch_finish_bitmap_age() seems unnecessary. I think the KVM/arm64 code
could acquire/release the mmu_lock in read-mode in kvm_test_age_gfn() and
kvm_age_gfn(), right?
> +
>  #ifdef CONFIG_KVM_GENERIC_MMU_NOTIFIER
>  static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn)
>  {
> @@ -2076,9 +2096,16 @@ static inline bool mmu_invalidate_retry_gfn_unsafe(struct kvm *kvm,
>  	return READ_ONCE(kvm->mmu_invalidate_seq) != mmu_seq;
>  }
>
> +struct test_clear_young_metadata {
> +	unsigned long *bitmap;
> +	unsigned long bitmap_offset_end;

bitmap_offset_end is unused.

> +	unsigned long end;
> +	bool unreliable;
> +};
>  union kvm_mmu_notifier_arg {
>  	pte_t pte;
>  	unsigned long attributes;
> +	struct test_clear_young_metadata *metadata;

nit: Maybe s/metadata/test_clear_young/ ?

>  };
>
>  struct kvm_gfn_range {
> @@ -2087,11 +2114,44 @@ struct kvm_gfn_range {
>  	gfn_t end;
>  	union kvm_mmu_notifier_arg arg;
>  	bool may_block;
> +	bool lockless;

Please document this as it's somewhat subtle. A reader might think this
implies the entire operation runs without taking the mmu_lock.

>  };
>  bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range);
>  bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
>  bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
>  bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
> +
> +static inline void kvm_age_set_unreliable(struct kvm_gfn_range *range)
> +{
> +	struct test_clear_young_metadata *args = range->arg.metadata;
> +
> +	args->unreliable = true;
> +}
> +static inline unsigned long kvm_young_bitmap_offset(struct kvm_gfn_range *range,
> +						    gfn_t gfn)
> +{
> +	struct test_clear_young_metadata *args = range->arg.metadata;
> +
> +	return hva_to_gfn_memslot(args->end - 1, range->slot) - gfn;
> +}
> +static inline void kvm_gfn_record_young(struct kvm_gfn_range *range, gfn_t gfn)
> +{
> +	struct test_clear_young_metadata *args = range->arg.metadata;
> +
> +	WARN_ON_ONCE(gfn < range->start || gfn >= range->end);
> +	if (args->bitmap)
> +		__set_bit(kvm_young_bitmap_offset(range, gfn), args->bitmap);
> +}
> +static inline bool kvm_gfn_should_age(struct kvm_gfn_range *range, gfn_t gfn)
> +{
> +	struct test_clear_young_metadata *args = range->arg.metadata;
> +
> +	WARN_ON_ONCE(gfn < range->start || gfn >= range->end);
> +	if (args->bitmap)
> +		return test_bit(kvm_young_bitmap_offset(range, gfn),
> +				args->bitmap);
> +	return true;
> +}
>  #endif
>
>  #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index d0545d88c802..7d80321e2ece 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -550,6 +550,7 @@ struct kvm_mmu_notifier_range {
>  	on_lock_fn_t on_lock;
>  	bool flush_on_ret;
>  	bool may_block;
> +	bool lockless;
>  };
>
>  /*
> @@ -598,6 +599,8 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
>  	struct kvm_memslots *slots;
>  	int i, idx;
>
> +	BUILD_BUG_ON(sizeof(gfn_range.arg) != sizeof(gfn_range.arg.pte));
> +
>  	if (WARN_ON_ONCE(range->end <= range->start))
>  		return r;
>
> @@ -637,15 +640,18 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
>  			gfn_range.start = hva_to_gfn_memslot(hva_start, slot);
>  			gfn_range.end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, slot);
>  			gfn_range.slot = slot;
> +			gfn_range.lockless = range->lockless;
>
>  			if (!r.found_memslot) {
>  				r.found_memslot = true;
> -				KVM_MMU_LOCK(kvm);
> -				if (!IS_KVM_NULL_FN(range->on_lock))
> -					range->on_lock(kvm);
> -
> -				if (IS_KVM_NULL_FN(range->handler))
> -					break;
> +				if (!range->lockless) {
> +					KVM_MMU_LOCK(kvm);
> +					if (!IS_KVM_NULL_FN(range->on_lock))
> +						range->on_lock(kvm);
> +
> +					if (IS_KVM_NULL_FN(range->handler))
> +						break;
> +				}
>  			}
>  			r.ret |= range->handler(kvm, &gfn_range);
>  		}
> @@ -654,7 +660,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
>  	if (range->flush_on_ret && r.ret)
>  		kvm_flush_remote_tlbs(kvm);
>
> -	if (r.found_memslot)
> +	if (r.found_memslot && !range->lockless)
>  		KVM_MMU_UNLOCK(kvm);
>
>  	srcu_read_unlock(&kvm->srcu, idx);
> @@ -682,19 +688,24 @@ static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn,
>  	return __kvm_handle_hva_range(kvm, &range).ret;
>  }
>
> -static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn,
> -							 unsigned long start,
> -							 unsigned long end,
> -							 gfn_handler_t handler)
> +static __always_inline int kvm_handle_hva_range_no_flush(
> +		struct mmu_notifier *mn,
> +		unsigned long start,
> +		unsigned long end,
> +		gfn_handler_t handler,
> +		union kvm_mmu_notifier_arg arg,
> +		bool lockless)
>  {
>  	struct kvm *kvm = mmu_notifier_to_kvm(mn);
>  	const struct kvm_mmu_notifier_range range = {
>  		.start = start,
>  		.end = end,
>  		.handler = handler,
> +		.arg = arg,
>  		.on_lock = (void *)kvm_null_fn,
>  		.flush_on_ret = false,
>  		.may_block = false,
> +		.lockless = lockless,
>  	};
>
>  	return __kvm_handle_hva_range(kvm, &range).ret;
> @@ -909,15 +920,36 @@ static int kvm_mmu_notifier_clear_flush_young(struct mmu_notifier *mn,
>  					kvm_age_gfn);
>  }
>
> -static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn,
> -					struct mm_struct *mm,
> -					unsigned long start,
> -					unsigned long end,
> -					unsigned long *bitmap)
> +static int kvm_mmu_notifier_test_clear_young(struct mmu_notifier *mn,
> +					     struct mm_struct *mm,
> +					     unsigned long start,
> +					     unsigned long end,
> +					     unsigned long *bitmap,
> +					     bool clear)

Perhaps pass in the callback (kvm_test_age_gfn/kvm_age_gfn) instead of
true/false to avoid the naked booleans at the callsites?

>  {
> -	trace_kvm_age_hva(start, end);
> +	if (kvm_arch_prepare_bitmap_age(mn)) {
> +		struct test_clear_young_metadata args = {
> +			.bitmap		= bitmap,
> +			.end		= end,
> +			.unreliable	= false,
> +		};
> +		union kvm_mmu_notifier_arg arg = {
> +			.metadata = &args
> +		};
> +		bool young;
> +
> +		young = kvm_handle_hva_range_no_flush(
> +				mn, start, end,
> +				clear ? kvm_age_gfn : kvm_test_age_gfn,
> +				arg, true);

I suspect the end result will be cleaner if we make all architectures
lockless, i.e. move the mmu_lock acquire/release into the
architecture-specific code.
This could result in more acquire/release calls (one per memslot that
overlaps the provided range) but that should be a single memslot in the
majority of cases I think?

Then unconditionally pass in the metadata structure. Then you don't need
any special casing for the fast path / bitmap path. The only thing needed
is to figure out whether to return MMU_NOTIFIER_YOUNG vs
MMU_NOTIFIER_YOUNG_LOOK_AROUND, and that can be plumbed via
test_clear_young_metadata or by changing gfn_handler_t to return an int
instead of a bool.

> +
> +		kvm_arch_finish_bitmap_age(mn);
>
> -	/* We don't support bitmaps. Don't test or clear anything. */
> +		if (!args.unreliable)
> +			return young ? MMU_NOTIFIER_YOUNG_FAST : 0;
> +	}
> +
> +	/* A bitmap was passed but the architecture doesn't support bitmaps */
>  	if (bitmap)
>  		return MMU_NOTIFIER_YOUNG_BITMAP_UNRELIABLE;
>
> @@ -934,7 +966,21 @@ static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn,
>  	 * cadence. If we find this inaccurate, we might come up with a
>  	 * more sophisticated heuristic later.
>  	 */
> -	return kvm_handle_hva_range_no_flush(mn, start, end, kvm_age_gfn);
> +	return kvm_handle_hva_range_no_flush(
> +			mn, start, end, clear ? kvm_age_gfn : kvm_test_age_gfn,
> +			KVM_MMU_NOTIFIER_NO_ARG, false);

Should this return MMU_NOTIFIER_YOUNG explicitly? This code is assuming
MMU_NOTIFIER_YOUNG == (int)true.

> +}
> +
> +static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn,
> +					struct mm_struct *mm,
> +					unsigned long start,
> +					unsigned long end,
> +					unsigned long *bitmap)
> +{
> +	trace_kvm_age_hva(start, end);
> +
> +	return kvm_mmu_notifier_test_clear_young(mn, mm, start, end, bitmap,
> +						 true);
> }
>
>  static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn,
> @@ -945,12 +991,8 @@ static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn,
>  {
>  	trace_kvm_test_age_hva(start, end);
>
> -	/* We don't support bitmaps. Don't test or clear anything. */
> -	if (bitmap)
> -		return MMU_NOTIFIER_YOUNG_BITMAP_UNRELIABLE;
> -
> -	return kvm_handle_hva_range_no_flush(mn, start, end,
> -					     kvm_test_age_gfn);
> +	return kvm_mmu_notifier_test_clear_young(mn, mm, start, end, bitmap,
> +						 false);
>  }
>
>  static void kvm_mmu_notifier_release(struct mmu_notifier *mn,
> --
> 2.44.0.478.gd926399ef9-goog
>