Date: Wed, 29 May 2024 18:05:06 +0000
In-Reply-To: <20240529180510.2295118-1-jthoughton@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20240529180510.2295118-1-jthoughton@google.com>
X-Mailer: git-send-email 2.45.1.288.g0e0cd299f1-goog
Message-ID: <20240529180510.2295118-4-jthoughton@google.com>
Subject: [PATCH v4 3/7]
KVM: Add lockless memslot walk to KVM
From: James Houghton
To: Andrew Morton, Paolo Bonzini
Cc: Albert Ou, Ankit Agrawal, Anup Patel, Atish Patra, Axel Rasmussen,
 Bibo Mao, Catalin Marinas, David Matlack, David Rientjes, Huacai Chen,
 James Houghton, James Morse, Jonathan Corbet, Marc Zyngier,
 Michael Ellerman, Nicholas Piggin, Oliver Upton, Palmer Dabbelt,
 Paul Walmsley, Raghavendra Rao Ananta, Ryan Roberts,
 Sean Christopherson, Shaoqin Huang, Shuah Khan, Suzuki K Poulose,
 Tianrui Zhao, Will Deacon, Yu Zhao, Zenghui Yu,
 kvm-riscv@lists.infradead.org, kvm@vger.kernel.org,
 kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-mips@vger.kernel.org,
 linux-mm@kvack.org, linux-riscv@lists.infradead.org,
 linuxppc-dev@lists.ozlabs.org, loongarch@lists.linux.dev
Content-Type: text/plain; charset="UTF-8"

Give architectures the flexibility to synchronize as optimally as they
can, instead of always taking the MMU lock for writing.

The immediate application is to allow architectures to implement the
test/clear_young MMU notifiers more cheaply.

Suggested-by: Yu Zhao
Signed-off-by: James Houghton
---
 include/linux/kvm_host.h |  1 +
 virt/kvm/kvm_main.c      | 38 +++++++++++++++++++++++++-------------
 2 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 692c01e41a18..4d7c3e8632e6 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -266,6 +266,7 @@ struct kvm_gfn_range {
 	gfn_t end;
 	union kvm_mmu_notifier_arg arg;
 	bool may_block;
+	bool lockless;
 };
 bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range);
 bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 14841acb8b95..d197b6725cb3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -558,6 +558,7 @@ struct kvm_mmu_notifier_range {
 	on_lock_fn_t on_lock;
 	bool flush_on_ret;
 	bool may_block;
+	bool lockless;
 };
 
 /*
@@ -612,6 +613,10 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
 			 IS_KVM_NULL_FN(range->handler)))
 		return r;
 
+	/* on_lock will never be called for lockless walks */
+	if (WARN_ON_ONCE(range->lockless && !IS_KVM_NULL_FN(range->on_lock)))
+		return r;
+
 	idx = srcu_read_lock(&kvm->srcu);
 
 	for (i = 0; i < kvm_arch_nr_memslot_as_ids(kvm); i++) {
@@ -643,15 +648,18 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
 			gfn_range.start = hva_to_gfn_memslot(hva_start, slot);
 			gfn_range.end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, slot);
 			gfn_range.slot = slot;
+			gfn_range.lockless = range->lockless;
 
 			if (!r.found_memslot) {
 				r.found_memslot = true;
-				KVM_MMU_LOCK(kvm);
-				if (!IS_KVM_NULL_FN(range->on_lock))
-					range->on_lock(kvm);
-
-				if (IS_KVM_NULL_FN(range->handler))
-					break;
+				if (!range->lockless) {
+					KVM_MMU_LOCK(kvm);
+					if (!IS_KVM_NULL_FN(range->on_lock))
+						range->on_lock(kvm);
+
+					if (IS_KVM_NULL_FN(range->handler))
+						break;
+				}
 			}
 			r.ret |= range->handler(kvm, &gfn_range);
 		}
@@ -660,7 +668,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
 	if (range->flush_on_ret && r.ret)
 		kvm_flush_remote_tlbs(kvm);
 
-	if (r.found_memslot)
+	if (r.found_memslot && !range->lockless)
 		KVM_MMU_UNLOCK(kvm);
 
 	srcu_read_unlock(&kvm->srcu, idx);
@@ -686,10 +694,12 @@ static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn,
 	return __kvm_handle_hva_range(kvm, &range).ret;
 }
 
-static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn,
-							 unsigned long start,
-							 unsigned long end,
-							 gfn_handler_t handler)
+static __always_inline int kvm_handle_hva_range_no_flush(
+		struct mmu_notifier *mn,
+		unsigned long start,
+		unsigned long end,
+		gfn_handler_t handler,
+		bool lockless)
 {
 	struct kvm *kvm = mmu_notifier_to_kvm(mn);
 	const struct kvm_mmu_notifier_range range = {
@@ -699,6 +709,7 @@ static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn
 		.on_lock	= (void *)kvm_null_fn,
 		.flush_on_ret	= false,
 		.may_block	= false,
+		.lockless	= lockless,
 	};
 
 	return __kvm_handle_hva_range(kvm, &range).ret;
@@ -889,7 +900,8 @@ static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn,
 	 * cadence. If we find this inaccurate, we might come up with a
 	 * more sophisticated heuristic later.
 	 */
-	return kvm_handle_hva_range_no_flush(mn, start, end, kvm_age_gfn);
+	return kvm_handle_hva_range_no_flush(mn, start, end,
+					     kvm_age_gfn, false);
 }
 
 static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn,
@@ -899,7 +911,7 @@ static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn,
 	trace_kvm_test_age_hva(address);
 
 	return kvm_handle_hva_range_no_flush(mn, address, address + 1,
-					     kvm_test_age_gfn);
+					     kvm_test_age_gfn, false);
 }
 
 static void kvm_mmu_notifier_release(struct mmu_notifier *mn,
-- 
2.45.1.288.g0e0cd299f1-goog
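
For readers who want to see the locking rule in isolation, here is a
minimal userspace sketch of what the walk does with the new flag: the
lock is taken once, on the first matching slot, only when the walk is
not lockless, and it is dropped at the end under the same condition.
Every name below (mock_kvm, mock_gfn_range, mock_handle_range,
mock_age_gfn) is a hypothetical stand-in invented for illustration; none
of it is kernel code, and the counter only models "lock held", it is not
a real lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct kvm; only tracks lock state. */
struct mock_kvm {
	int lock_depth;		/* 1 while the modeled MMU lock is held */
	int lock_acquisitions;	/* total number of times it was taken */
};

/* Hypothetical stand-in for struct kvm_gfn_range. */
struct mock_gfn_range {
	unsigned long start;
	unsigned long end;
	bool lockless;		/* mirrors the field added by the patch */
};

typedef bool (*mock_handler_t)(struct mock_kvm *kvm,
			       struct mock_gfn_range *range);

static void mock_mmu_lock(struct mock_kvm *kvm)
{
	kvm->lock_depth++;
	kvm->lock_acquisitions++;
}

static void mock_mmu_unlock(struct mock_kvm *kvm)
{
	kvm->lock_depth--;
}

/*
 * Hypothetical arch handler. It checks the invariant a real handler
 * would rely on: the lock is held exactly when the walk is not lockless.
 * It reports "young" for any non-empty range.
 */
static bool mock_age_gfn(struct mock_kvm *kvm, struct mock_gfn_range *range)
{
	assert((kvm->lock_depth == 1) == !range->lockless);
	return range->end > range->start;
}

/* Models the walk: nr_slots stands for the number of matching memslots. */
static bool mock_handle_range(struct mock_kvm *kvm, unsigned long start,
			      unsigned long end, int nr_slots,
			      mock_handler_t handler, bool lockless)
{
	bool found_memslot = false;
	bool ret = false;

	for (int i = 0; i < nr_slots; i++) {
		struct mock_gfn_range range = {
			.start = start,
			.end = end,
			.lockless = lockless,
		};

		if (!found_memslot) {
			found_memslot = true;
			/* Take the lock once, and only for locked walks. */
			if (!lockless)
				mock_mmu_lock(kvm);
		}
		ret |= handler(kvm, &range);
	}

	/* Unlock under the same condition the lock was taken. */
	if (found_memslot && !lockless)
		mock_mmu_unlock(kvm);

	return ret;
}
```

Calling mock_handle_range() with lockless set to false takes the modeled
lock once, however many slots match; with lockless set to true the
handler runs without it and has to provide its own synchronization.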