Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp1898615rwd; Fri, 2 Jun 2023 01:35:04 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6I9V1QaXvfR7n86UZlW0ho3VhenroDnNt2hZbTKGnDhNlC57uVIvAazlDwcRyr1r5ueFf7 X-Received: by 2002:a17:902:8601:b0:1a6:6fe3:df8d with SMTP id f1-20020a170902860100b001a66fe3df8dmr1892005plo.8.1685694904196; Fri, 02 Jun 2023 01:35:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685694904; cv=none; d=google.com; s=arc-20160816; b=NXU4gw6zJdzIdTdFGlxV1eoBFaUbBdB61Qv0SCgBn9UIxOiDQICEXhMUDB0JZzFI4L yW8TwwqAGBTpMaGCPKj8nw+4V5zfh//arYgl0bH+OEshwPxBYYBaJiAn7l06Yj9Z/qx5 7Gu+o8sRW1wLQTm2/beSNcsUP2Y/GJZ5/11UjXkj42qYbjMXWoHmoO67Ufie83WMXRKT 91HRuI4wJ+GITkpqQoszrRn1hoCVje4zpHOFkxva/JYz5lS9C3yHiwXEYmULUUdDyJPT ZGBpzAnJ+XZqz81Bwn5CMEEDG+O9JBPGk4cpXxEHE9IcYQ6NVUtvQVTbaiq3CkJuYhuh y3aw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:subject:cc:to:from:message-id :date:dkim-signature; bh=mtGbG1/6hi7jw/EJxFZFrRfacGkXL4dgmDAmlFql3Qc=; b=ADK2cm4DYpKfmrwIF+qWhu04x9eWbJ+Va4Io6pHB3zHPa204TIZbyHCJjWnUjkLwKm ty9ThZ4KV2X0kBm2ot7rRuivZ+szQtqUSk2luTe97Q8wigtsoP14g+pe7zErB1yC47pR ZVTlELqnqV/pKrFfQHHSf34zSVA8Ud7nqlw8puNgLfNnfghmB3j0F1owD2BYE4wnoioU 3PCKa2uadVfwgva6GuJRyrbinJ1k7wgA6SBaK7rOuIIsQKOyq8jdQTLplbPUAZeKr+GK Hzn9dXrQtuYgqIERfahm/o7h93SgLnI49GPaE7EqWVRbBTJPiPau4ZqGZVPM4YG+x/p7 IRcw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=Xh7HYTpo; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x10-20020a170902ec8a00b001aaf4d2eef6si568446plg.370.2023.06.02.01.34.48; Fri, 02 Jun 2023 01:35:04 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=Xh7HYTpo; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234552AbjFBI0o (ORCPT + 99 others); Fri, 2 Jun 2023 04:26:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37444 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234601AbjFBI0E (ORCPT ); Fri, 2 Jun 2023 04:26:04 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CBFD71A7; Fri, 2 Jun 2023 01:25:42 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 5897B64D23; Fri, 2 Jun 2023 08:25:42 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8F8C8C433D2; Fri, 2 Jun 2023 08:25:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1685694341; bh=bNXlHhXedmdPPe/K+SVuaWMGRNYLUdQRKj6I74qlBrs=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=Xh7HYTpo4nMXKVmRIlBMtVQZAaJZlNJh8oodxpQMfUiZ7i03OAMyDN6rcJW+6KY67 nFaj4Vgy3gV2SAs2FFrgj6veNIZOhDoJBNnP+gW0309syLC6bxyNhZjS8BB+8BjgPa vd/Wyj63PV1mkMuLUcHOdc+11P6LnkRWvaMJeMz+rwX4zO5CBLJ3Oldw+R5XCKeh0g J/AdtophrHNobHJnSMOeV0fOALs6427GzA8qHOW81xwLuZH+XAUX/cNXavI4+kQI5+ HKc5qB9ke7zuoPIFMapN9VOS0jgv332F8kDDJ86FZUpZgRQ0zjwzaypZIvIkeg0BBA bd40/xlROVi0A== Received: from 90.4.23.109.rev.sfr.net ([109.23.4.90] helo=wait-a-minute.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1q506N-002H9j-8n; Fri, 02 Jun 2023 09:25:39 +0100 Date: Fri, 02 Jun 2023 09:25:37 +0100 Message-ID: <875y86p8xq.wl-maz@kernel.org> From: Marc Zyngier To: Raghavendra Rao Ananta Cc: Oliver Upton , James Morse , Suzuki K Poulose , Ricardo Koller , Paolo Bonzini , Jing Zhang , Colton Lewis , linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: Re: [PATCH v4 3/6] KVM: arm64: Implement kvm_arch_flush_remote_tlbs_range() In-Reply-To: References: <20230519005231.3027912-1-rananta@google.com> <20230519005231.3027912-4-rananta@google.com> <87v8gbjkzn.wl-maz@kernel.org> <86zg5kc2ho.wl-maz@kernel.org> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/28.2 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-SA-Exim-Connect-IP: 109.23.4.90 X-SA-Exim-Rcpt-To: rananta@google.com, oliver.upton@linux.dev, james.morse@arm.com, suzuki.poulose@arm.com, ricarkol@google.com, pbonzini@redhat.com, jingzhangos@google.com, coltonlewis@google.com, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org, kvm@vger.kernel.org X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false X-Spam-Status: No, score=-7.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 02 Jun 2023 02:37:28 +0100, Raghavendra Rao Ananta wrote: >=20 > On Wed, May 31, 2023 at 1:46=E2=80=AFAM Marc Zyngier wro= te: > > > > On Tue, 30 May 2023 22:22:23 +0100, > > Raghavendra Rao Ananta wrote: > > > > > > On Mon, May 29, 2023 at 7:00=E2=80=AFAM Marc Zyngier = wrote: > > > > > > > > On Fri, 19 May 2023 01:52:28 +0100, > > > > Raghavendra Rao Ananta wrote: > > > > > > > > > > Implement kvm_arch_flush_remote_tlbs_range() for arm64 > > > > > to invalidate the given range in the TLB. > > > > > > > > > > Signed-off-by: Raghavendra Rao Ananta > > > > > --- > > > > > arch/arm64/include/asm/kvm_host.h | 3 +++ > > > > > arch/arm64/kvm/hyp/nvhe/tlb.c | 4 +--- > > > > > arch/arm64/kvm/mmu.c | 11 +++++++++++ > > > > > 3 files changed, 15 insertions(+), 3 deletions(-) > > > > > > > > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/inclu= de/asm/kvm_host.h > > > > > index 81ab41b84f436..343fb530eea9c 100644 > > > > > --- a/arch/arm64/include/asm/kvm_host.h > > > > > +++ b/arch/arm64/include/asm/kvm_host.h > > > > > @@ -1081,6 +1081,9 @@ struct kvm *kvm_arch_alloc_vm(void); > > > > > #define __KVM_HAVE_ARCH_FLUSH_REMOTE_TLBS > > > > > int kvm_arch_flush_remote_tlbs(struct kvm *kvm); > > > > > > > > > > +#define __KVM_HAVE_ARCH_FLUSH_REMOTE_TLBS_RANGE > > > > > +int kvm_arch_flush_remote_tlbs_range(struct kvm *kvm, gfn_t star= t_gfn, u64 pages); > > > > > + > > > > > static inline bool kvm_vm_is_protected(struct kvm *kvm) > > > > > { > > > > > return false; > > > > > diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/n= vhe/tlb.c > > > > > index d4ea549c4b5c4..d2c7c1bc6d441 100644 > > > > > --- a/arch/arm64/kvm/hyp/nvhe/tlb.c > > > > > +++ b/arch/arm64/kvm/hyp/nvhe/tlb.c > > > > > @@ -150,10 +150,8 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s= 2_mmu *mmu, > > > > > return; > > > > > } > > > > > > > > > > - dsb(ishst); > > > > > - > > > > > /* Switch to requested VMID */ > > > > > - __tlb_switch_to_guest(mmu, &cxt); > > > > > + __tlb_switch_to_guest(mmu, &cxt, false); > > > > > > > > This hunk is in the wrong patch, isn't it? > > > > > > > Ah, you are right. It should be part of the previous patch. I think I > > > introduced it accidentally when I rebased the series. I'll remove it > > > in the next spin. > > > > > > > > > > > > > > > > __flush_tlb_range_op(ipas2e1is, start, pages, stride, 0, 0,= false); > > > > > > > > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > > > > > index d0a0d3dca9316..e3673b4c10292 100644 > > > > > --- a/arch/arm64/kvm/mmu.c > > > > > +++ b/arch/arm64/kvm/mmu.c > > > > > @@ -92,6 +92,17 @@ int kvm_arch_flush_remote_tlbs(struct kvm *kvm) > > > > > return 0; > > > > > } > > > > > > > > > > +int kvm_arch_flush_remote_tlbs_range(struct kvm *kvm, gfn_t star= t_gfn, u64 pages) > > > > > +{ > > > > > + phys_addr_t start, end; > > > > > + > > > > > + start =3D start_gfn << PAGE_SHIFT; > > > > > + end =3D (start_gfn + pages) << PAGE_SHIFT; > > > > > + > > > > > + kvm_call_hyp(__kvm_tlb_flush_vmid_range, &kvm->arch.mmu, st= art, end); > > > > > > > > So that's the point that I think is not right. It is the MMU code t= hat > > > > should drive the invalidation method, and not the HYP code. The HYP > > > > code should be as dumb as possible, and the logic should be kept in > > > > the MMU code. > > > > > > > > So when a range invalidation is forwarded to HYP, it's a *valid* ra= nge > > > > invalidation. not something that can fallback to VMID-wide invalida= tion. > > > > > > > I'm guessing that you are referring to patch-2. Do you recommend > > > moving the 'pages >=3D MAX_TLBI_RANGE_PAGES' logic here and simply > > > return an error? How about for the other check: > > > system_supports_tlb_range()? > > > The idea was for __kvm_tlb_flush_vmid_range() to also implement a > > > fallback mechanism in case the system doesn't support the range-based > > > instructions. But if we end up calling __kvm_tlb_flush_vmid_range() > > > from multiple cases, we'd end up duplicating the checks. WDYT? > > > > My take is that there should be a single helper deciding to issue > > either a number of range-based TLBIs depending on start/end, or a > > single VMID-based TLBI. Having multiple calling sites is not a > > problem, and even if that code gets duplicated, big deal. > > > Hypothetically, if I move the check to this patch and return an error > if this situation occurs, since I'm dependending on David's common MMU > code [1], kvm_main.c would end of calling kvm_flush_remote_tlbs() and > we'd be doing a VMID-based TLBI. > One idea would be to issue a WARN_ON() and return 0 so that we don't > issue any TLBIs. Thoughts? Then we should fix the infrastructure. And no, not issuing TLBs is not an acceptable outcome. Might as well panic the kernel, because silent memory corruption is not something I'm willing to entertain. My take is as follows: - either the core code doesn't specify a range, and we blow the whole thing as we do today - or it specifies a range, and we apply range invalidation as permitted by the architecture and the implementation (either a *series* of range invalidation, or a full VMID invalidation). As for the "common MMU" stuff, something such as "flush_remote_tlbs" really shows that this code isn't common at all. It is just x86 creep, and I have zero problem ignoring it. >=20 > > But a hypercall that falls back to global invalidation based on a > > range evaluation error (more than MAX_TLBI_RANGE_PAGES) is papering > > over a latent bug. > > > If I understand this correctly, MAX_TLBI_RANGE_PAGES is specifically > the capacity of the range-based instructions itself, isn't it? Is it > incorrect for the caller to request a higher range be invalidated even > on systems that do not support these instructions? Probably that's why > __flush_tlb_range() falls back to a global flush when the range > request is exceeded? This is indeed the architectural limit. But invalidating the whole VMID isn't necessarily the right thing either. If you can preserve some of the TLBs by only issuing a couple of range invalidation that span *in total* more than MAX_TLBI_RANGE_PAGES. Again, what I'm objecting to is the silent upgrading to VMID-wide invalidation being hidden away in the HYP code. That's just wrong. The HYP code is not the place for abstraction. Thanks, M. --=20 Without deviation from the norm, progress is not possible.