From: Raghavendra Rao Ananta
Date: Mon, 24 Jul 2023 09:47:46 -0700
Subject: Re: [PATCH v7 12/12] KVM: arm64: Use TLBI range-based instructions for unmap
To: Shaoqin Huang
Cc: Oliver Upton, Marc Zyngier, James Morse, Suzuki K Poulose, Paolo Bonzini,
 Sean Christopherson, Huacai Chen, Zenghui Yu, Anup Patel, Atish Patra,
 Jing Zhang, Reiji Watanabe, Colton Lewis, David Matlack,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 linux-mips@vger.kernel.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
 kvm@vger.kernel.org
References: <20230722022251.3446223-1-rananta@google.com> <20230722022251.3446223-13-rananta@google.com> <0841aca6-2824-6a1b-a568-119f8bd220de@redhat.com>
In-Reply-To: <0841aca6-2824-6a1b-a568-119f8bd220de@redhat.com>
Content-Type: text/plain; charset="UTF-8"
On Mon, Jul 24, 2023 at 2:35 AM Shaoqin Huang wrote:
>
> Hi Raghavendra,
>
> On 7/22/23 10:22, Raghavendra Rao Ananta wrote:
> > The current implementation of the stage-2 unmap walker traverses
> > the given range and, as part of break-before-make, performs a
> > TLB invalidation with a DSB for every PTE. A large number of these
> > back-to-back operations can cause a performance bottleneck on
> > some systems.
> >
> > Hence, if the system supports FEAT_TLBIRANGE, defer the TLB
> > invalidations until the entire walk is finished, and then
> > use range-based instructions to invalidate the TLBs in one go.
> > Condition deferred TLB invalidation on the system supporting FWB,
> > as the optimization is entirely pointless when the unmap walker
> > needs to perform CMOs.
> >
> > Rename stage2_put_pte() to stage2_unmap_put_pte() as the function
> > now serves the stage-2 unmap walker specifically, rather than
> > acting generic.
> >
> > Signed-off-by: Raghavendra Rao Ananta
> > ---
> >  arch/arm64/kvm/hyp/pgtable.c | 67 ++++++++++++++++++++++++++++++++-----
> >  1 file changed, 58 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > index 5ef098af1736..cf88933a2ea0 100644
> > --- a/arch/arm64/kvm/hyp/pgtable.c
> > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > @@ -831,16 +831,54 @@ static void stage2_make_pte(const struct kvm_pgtable_visit_ctx *ctx, kvm_pte_t n
> >  	smp_store_release(ctx->ptep, new);
> >  }
> >
> > -static void stage2_put_pte(const struct kvm_pgtable_visit_ctx *ctx, struct kvm_s2_mmu *mmu,
> > -			   struct kvm_pgtable_mm_ops *mm_ops)
> > +struct stage2_unmap_data {
> > +	struct kvm_pgtable *pgt;
> > +	bool defer_tlb_flush_init;
> > +};
> > +
> > +static bool __stage2_unmap_defer_tlb_flush(struct kvm_pgtable *pgt)
> > +{
> > +	/*
> > +	 * If FEAT_TLBIRANGE is implemented, defer the individual
> > +	 * TLB invalidations until the entire walk is finished, and
> > +	 * then use the range-based TLBI instructions to do the
> > +	 * invalidations. Condition deferred TLB invalidation on the
> > +	 * system supporting FWB, as the optimization is entirely
> > +	 * pointless when the unmap walker needs to perform CMOs.
> > +	 */
> > +	return system_supports_tlb_range() && stage2_has_fwb(pgt);
> > +}
> > +
> > +static bool stage2_unmap_defer_tlb_flush(struct stage2_unmap_data *unmap_data)
> > +{
> > +	bool defer_tlb_flush = __stage2_unmap_defer_tlb_flush(unmap_data->pgt);
> > +
> > +	/*
> > +	 * Since __stage2_unmap_defer_tlb_flush() is based on alternative
> > +	 * patching and the TLBIs' operations behavior depend on this,
> > +	 * track if there's any change in the state during the unmap sequence.
> > +	 */
> > +	WARN_ON(unmap_data->defer_tlb_flush_init != defer_tlb_flush);
> > +	return defer_tlb_flush;
> > +}
> > +
> > +static void stage2_unmap_put_pte(const struct kvm_pgtable_visit_ctx *ctx,
> > +				 struct kvm_s2_mmu *mmu,
> > +				 struct kvm_pgtable_mm_ops *mm_ops)
> >  {
> > +	struct stage2_unmap_data *unmap_data = ctx->arg;
> > +
> >  	/*
> > -	 * Clear the existing PTE, and perform break-before-make with
> > -	 * TLB maintenance if it was valid.
> > +	 * Clear the existing PTE, and perform break-before-make if it was
> > +	 * valid. Depending on the system support, the TLB maintenance for
> > +	 * the same can be deferred until the entire unmap is completed.
> >  	 */
> >  	if (kvm_pte_valid(ctx->old)) {
> >  		kvm_clear_pte(ctx->ptep);
> > -		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, ctx->addr, ctx->level);
> > +
> > +		if (!stage2_unmap_defer_tlb_flush(unmap_data))
> Why not directly check (unmap_data->defer_tlb_flush_init) here?
>
(Re-sending the reply as the previous one was formatted as HTML and was
blocked by many lists)

No particular reason per se; I was just keeping the logic that determines
whether to defer the flush separate from the WARN_ON() check. Is there an
advantage to checking directly in stage2_unmap_put_pte() that I missed,
or is this purely for readability?

> > +			kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
> > +					ctx->addr, ctx->level);
> Small indent hint. The ctx->addr can align with __kvm_tlb_flush_vmid_ipa.
>
Ah, yes. I'll adjust this if I send out a v8. Thank you.
Raghavendra

> Thanks,
> Shaoqin
> >  	}
> >
> >  	mm_ops->put_page(ctx->ptep);
> > @@ -1070,7 +1108,8 @@ int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size,
> >  static int stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
> >  			       enum kvm_pgtable_walk_flags visit)
> >  {
> > -	struct kvm_pgtable *pgt = ctx->arg;
> > +	struct stage2_unmap_data *unmap_data = ctx->arg;
> > +	struct kvm_pgtable *pgt = unmap_data->pgt;
> >  	struct kvm_s2_mmu *mmu = pgt->mmu;
> >  	struct kvm_pgtable_mm_ops *mm_ops = ctx->mm_ops;
> >  	kvm_pte_t *childp = NULL;
> > @@ -1098,7 +1137,7 @@ static int stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
> >  	 * block entry and rely on the remaining portions being faulted
> >  	 * back lazily.
> >  	 */
> > -	stage2_put_pte(ctx, mmu, mm_ops);
> > +	stage2_unmap_put_pte(ctx, mmu, mm_ops);
> >
> >  	if (need_flush && mm_ops->dcache_clean_inval_poc)
> >  		mm_ops->dcache_clean_inval_poc(kvm_pte_follow(ctx->old, mm_ops),
> > @@ -1112,13 +1151,23 @@ static int stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
> >
> >  int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
> >  {
> > +	int ret;
> > +	struct stage2_unmap_data unmap_data = {
> > +		.pgt = pgt,
> > +		.defer_tlb_flush_init = __stage2_unmap_defer_tlb_flush(pgt),
> > +	};
> >  	struct kvm_pgtable_walker walker = {
> >  		.cb	= stage2_unmap_walker,
> > -		.arg	= pgt,
> > +		.arg	= &unmap_data,
> >  		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
> >  	};
> >
> > -	return kvm_pgtable_walk(pgt, addr, size, &walker);
> > +	ret = kvm_pgtable_walk(pgt, addr, size, &walker);
> > +	if (stage2_unmap_defer_tlb_flush(&unmap_data))
> > +		/* Perform the deferred TLB invalidations */
> > +		kvm_tlb_flush_vmid_range(pgt->mmu, addr, size);
> > +
> > +	return ret;
> > }
> >
> > struct stage2_attr_data {
> > --
> Shaoqin
>
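[Editor's note] For readers outside the thread, the deferral pattern the patch implements can be sketched in plain C. This is a hypothetical simulation, not the kernel code: `has_tlbirange`, `has_fwb`, and `tlbi_ops_for_unmap()` are stand-ins for `system_supports_tlb_range()`, `stage2_has_fwb()`, and the unmap walk, which really issue `__kvm_tlb_flush_vmid_ipa` per PTE or one `kvm_tlb_flush_vmid_range` at the end.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's capability checks; the real
 * code calls system_supports_tlb_range() and stage2_has_fwb(pgt). */
static bool has_tlbirange = true;
static bool has_fwb = true;

/* Mirrors the shape of __stage2_unmap_defer_tlb_flush(): defer the
 * per-PTE invalidations only when range-based TLBI exists AND FWB
 * makes cache maintenance (CMOs) during the walk unnecessary. */
static bool defer_tlb_flush(void)
{
	return has_tlbirange && has_fwb;
}

/* Count the TLBI operations an unmap of nr_ptes valid entries issues:
 * one per PTE without deferral, a single range-based operation
 * (kvm_tlb_flush_vmid_range() in the patch) with it. */
static int tlbi_ops_for_unmap(int nr_ptes)
{
	int ops = 0;

	for (int i = 0; i < nr_ptes; i++) {
		/* clear the PTE (break-before-make) ... */
		if (!defer_tlb_flush())
			ops++;	/* per-PTE __kvm_tlb_flush_vmid_ipa */
	}
	if (defer_tlb_flush())
		ops = 1;	/* one deferred range invalidation */
	return ops;
}
```

With both features present, a 512-entry unmap costs one TLBI instead of 512 back-to-back invalidate+DSB sequences, which is the bottleneck the commit message describes; flipping `has_fwb` to false models the CMO case where deferral is skipped.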