Date: Sun, 04 Jun 2023 09:23:39 +0100
Message-ID: <87sfb7octw.wl-maz@kernel.org>
From: Marc Zyngier
To: Colton Lewis
Cc: kvm@vger.kernel.org, Catalin Marinas, Will Deacon, Oliver Upton,
	James Morse, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
	kvmarm@lists.linux.dev
Subject: Re: [PATCH 3/3] KVM: arm64: Skip break phase when we have FEAT_BBM level 2
In-Reply-To: <20230602170147.1541355-4-coltonlewis@google.com>
References: <20230602170147.1541355-1-coltonlewis@google.com>
	<20230602170147.1541355-4-coltonlewis@google.com>

On Fri, 02 Jun 2023 18:01:47 +0100,
Colton Lewis wrote:
>
> Skip the break phase of break-before-make when the CPU has FEAT_BBM
> level 2. This allows skipping some expensive invalidation and
> serialization and should result in significant performance
> improvements when changing block size.
>
> The ARM manual section D5.10.1 specifically states under heading
> "Support levels for changing block size" that FEAT_BBM Level 2 support
> means changing block size does not break coherency, ordering
> guarantees, or uniprocessor semantics.

I'd like to have that sort of reference in the code itself (spelling
out the revision on the ARM ARM this is taken from, as this section is
in D8.14.2 in DDI0487J.a). I'd also like it to point out that this
only applies when the *output addresses* are the same.

>
> Because a compare-and-exchange operation was used in the break phase
> to serialize access to the PTE, an analogous compare-and-exchange is
> introduced in the make phase to ensure serialization remains even if
> the break phase is skipped and proper handling is introduced to
> account for this function now having a way to fail.
>
> Considering the possibility that the new pte has different permissions
> than the old pte, the minimum necessary tlb invalidations are used.
>
> Signed-off-by: Colton Lewis
> ---
>  arch/arm64/kvm/hyp/pgtable.c | 58 +++++++++++++++++++++++++++++++-----
>  1 file changed, 51 insertions(+), 7 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 8acab89080af9..6778e3df697f7 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -643,6 +643,11 @@ static bool stage2_has_fwb(struct kvm_pgtable *pgt)
>  	return !(pgt->flags & KVM_PGTABLE_S2_NOFWB);
>  }
>
> +static bool stage2_has_bbm_level2(void)
> +{
> +	return cpus_have_const_cap(ARM64_HAS_STAGE2_BBM2);

By the time we look at unmapping things from S2, the capabilities
should be finalised, so this should read cpus_have_final_cap()
instead.

> +}
> +
>  #define KVM_S2_MEMATTR(pgt, attr) PAGE_S2_MEMATTR(attr, stage2_has_fwb(pgt))
>
>  static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot prot,
> @@ -730,7 +735,7 @@ static bool stage2_try_set_pte(const struct kvm_pgtable_visit_ctx *ctx, kvm_pte_
>   * @ctx: context of the visited pte.
>   * @mmu: stage-2 mmu
>   *
> - * Returns: true if the pte was successfully broken.
> + * Returns: true if the pte was successfully broken or there is no need.

No need of what? Why? The rationale should be captured in the comments
below.

>   *
>   * If the removed pte was valid, performs the necessary serialization and TLB
>   * invalidation for the old value. For counted ptes, drops the reference count
> @@ -750,6 +755,10 @@ static bool stage2_try_break_pte(const struct kvm_pgtable_visit_ctx *ctx,
>  		return false;
>  	}
>
> +	/* There is no need to break the pte. */
> +	if (stage2_has_bbm_level2())
> +		return true;
> +
>  	if (!stage2_try_set_pte(ctx, KVM_INVALID_PTE_LOCKED))
>  		return false;
>
> @@ -771,16 +780,45 @@ static bool stage2_try_break_pte(const struct kvm_pgtable_visit_ctx *ctx,
>  	return true;
>  }
>
> -static void stage2_make_pte(const struct kvm_pgtable_visit_ctx *ctx, kvm_pte_t new)
> +static bool stage2_pte_perms_equal(kvm_pte_t p1, kvm_pte_t p2)
> +{
> +	u64 perms1 = p1 & KVM_PGTABLE_PROT_RWX;
> +	u64 perms2 = p2 & KVM_PGTABLE_PROT_RWX;

Huh? The KVM_PGTABLE_PROT_* constants are part of an *enum*, and do
*not* represent the bit layout of the PTE. How did you test this code?

> +
> +	return perms1 == perms2;
> +}
> +
> +/**
> + * stage2_try_make_pte() - Attempts to install a new pte.
> + *
> + * @ctx: context of the visited pte.
> + * @new: new pte to install
> + *
> + * Returns: true if the pte was successfully installed
> + *
> + * If the old pte had different permissions, perform appropriate TLB
> + * invalidation for the old value. For counted ptes, drops the
> + * reference count on the containing table page.
> + */
> +static bool stage2_try_make_pte(const struct kvm_pgtable_visit_ctx *ctx, struct kvm_s2_mmu *mmu, kvm_pte_t new)
>  {
>  	struct kvm_pgtable_mm_ops *mm_ops = ctx->mm_ops;
>
> -	WARN_ON(!stage2_pte_is_locked(*ctx->ptep));
> +	if (!stage2_has_bbm_level2())
> +		WARN_ON(!stage2_pte_is_locked(*ctx->ptep));
> +
> +	if (!stage2_try_set_pte(ctx, new))
> +		return false;
> +
> +	if (kvm_pte_table(ctx->old, ctx->level))
> +		kvm_call_hyp(__kvm_tlb_flush_vmid, mmu);
> +	else if (kvm_pte_valid(ctx->old) && !stage2_pte_perms_equal(ctx->old, new))
> +		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa_nsh, mmu, ctx->addr, ctx->level);

Why a non-shareable invalidation? Nothing in this code captures the
rationale for it. What if the permission change was a *restriction* of
the permission? It should absolutely be global, and not local.

>
>  	if (stage2_pte_is_counted(new))
>  		mm_ops->get_page(ctx->ptep);
>
> -	smp_store_release(ctx->ptep, new);
> +	return true;
>  }
>
>  static void stage2_put_pte(const struct kvm_pgtable_visit_ctx *ctx, struct kvm_s2_mmu *mmu,
> @@ -879,7 +917,8 @@ static int stage2_map_walker_try_leaf(const struct kvm_pgtable_visit_ctx *ctx,
>  	    stage2_pte_executable(new))
>  		mm_ops->icache_inval_pou(kvm_pte_follow(new, mm_ops), granule);
>
> -	stage2_make_pte(ctx, new);
> +	if (!stage2_try_make_pte(ctx, data->mmu, new))
> +		return -EAGAIN;

So we don't have forward-progress guarantees anymore? I'm not sure
this is a change I'm overly fond of.

Thanks,

	M.

--
Without deviation from the norm, progress is not possible.
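
The objection to stage2_pte_perms_equal() can be illustrated with a
minimal userspace sketch in C. This is not kernel code. It assumes
that the abstract protection flags sit in bits 0-2, the way an enum
such as kvm_pgtable_prot would define them, and that a stage-2 leaf
descriptor carries read and write permission in S2AP at bits 6 and 7
and execute-never at bit 54. Every name in the block (PROT_*,
S2_PTE_*, perms_equal_*, check_examples) is a hypothetical stand-in
rather than one of the kernel's identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical abstract protection flags, assumed to occupy bits 0-2
 * in the way an enum such as kvm_pgtable_prot would define them.
 */
enum prot {
	PROT_X   = 1u << 0,
	PROT_W   = 1u << 1,
	PROT_R   = 1u << 2,
	PROT_RWX = PROT_R | PROT_W | PROT_X,
};

/* Assumed stage-2 leaf descriptor bits (illustrative names only). */
#define S2_PTE_VALID	(UINT64_C(1) << 0)
#define S2_PTE_S2AP_R	(UINT64_C(1) << 6)
#define S2_PTE_S2AP_W	(UINT64_C(1) << 7)
#define S2_PTE_XN	(UINT64_C(1) << 54)
#define S2_PTE_PERMS	(S2_PTE_S2AP_R | S2_PTE_S2AP_W | S2_PTE_XN)

/* Mirrors the patch: masks descriptors with the enum-style constant. */
bool perms_equal_enum_mask(uint64_t p1, uint64_t p2)
{
	return (p1 & PROT_RWX) == (p2 & PROT_RWX);
}

/* Masks descriptors with the bits that actually encode permissions. */
bool perms_equal_pte_mask(uint64_t p1, uint64_t p2)
{
	return (p1 & S2_PTE_PERMS) == (p2 & S2_PTE_PERMS);
}

/* A read-only and a read-write descriptor that differ only in S2AP. */
void check_examples(void)
{
	uint64_t ro = S2_PTE_VALID | S2_PTE_S2AP_R;
	uint64_t rw = ro | S2_PTE_S2AP_W;

	/* The enum-style mask only sees bits 0-2 and misses the change. */
	assert(perms_equal_enum_mask(ro, rw));
	/* The descriptor-bit mask tells the two apart. */
	assert(!perms_equal_pte_mask(ro, rw));
}
```

Under these assumptions, masking with the enum-style constant treats a
read-only and a read-write descriptor as having equal permissions, so
the TLB invalidation guarded by that comparison would be skipped.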