Date: Wed, 17 Mar 2021 14:41:31 +0000
Message-ID: <87a6r1j10k.wl-maz@kernel.org>
From: Marc Zyngier
To: Quentin Perret
Cc: catalin.marinas@arm.com, will@kernel.org, james.morse@arm.com,
    julien.thierry.kdev@gmail.com, suzuki.poulose@arm.com,
    android-kvm@google.com, seanjc@google.com, linux-kernel@vger.kernel.org,
    robh+dt@kernel.org, linux-arm-kernel@lists.infradead.org,
    kernel-team@android.com, kvmarm@lists.cs.columbia.edu, tabba@google.com,
    ardb@kernel.org, mark.rutland@arm.com, dbrazdil@google.com,
    mate.toth-pal@arm.com
Subject: Re: [PATCH 1/2] KVM: arm64: Introduce KVM_PGTABLE_S2_NOFWB Stage-2 flag
In-Reply-To: <20210317141714.383046-2-qperret@google.com>

Hi Quentin,

On Wed, 17 Mar 2021 14:17:13 +0000,
Quentin Perret wrote:
>
> In order to further configure stage-2 page-tables, pass flags to the
> init function using a new enum.
>
> The first of these flags allows to disable FWB even if the hardware
> supports it as we will need to do so for the host stage-2.
>
> Signed-off-by: Quentin Perret
>
> ---
>
> One question is, do we want to use stage2_has_fwb() everywhere, including
> guest-specific paths (e.g. kvm_arch_prepare_memory_region(), ...) ?
>
> That'd make this patch more intrusive, but would make the whole codebase
> work with FWB enabled on a guest by guest basis. I don't see us use that
> anytime soon (other than maybe debug of some sort?) but it'd be good to
> have an agreement.

I'm not sure how useful that would be. We fought long and hard to get
FWB, and I can't see a good reason to disable it for guests unless the
HW was buggy (but in which case that'd be for everyone).

I'd rather keep the changes small for now (this whole series is
invasive enough!).

As for this patch, I only have a few cosmetic comments:

> ---
>  arch/arm64/include/asm/kvm_pgtable.h | 19 +++++++++--
>  arch/arm64/include/asm/pgtable-prot.h | 4 +--
>  arch/arm64/kvm/hyp/pgtable.c | 49 +++++++++++++++++----------
>  3 files changed, 50 insertions(+), 22 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index b93a2a3526ab..7382bdfb6284 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -56,6 +56,15 @@ struct kvm_pgtable_mm_ops {
>  	phys_addr_t (*virt_to_phys)(void *addr);
>  };
>
> +/**
> + * enum kvm_pgtable_stage2_flags - Stage-2 page-table flags.
> + * @KVM_PGTABLE_S2_NOFWB: Don't enforce Normal-WB even if the CPUs have
> + * ARM64_HAS_STAGE2_FWB.
> + */
> +enum kvm_pgtable_stage2_flags {
> +	KVM_PGTABLE_S2_NOFWB = BIT(0),
> +};
> +
>  /**
>   * struct kvm_pgtable - KVM page-table.
>   * @ia_bits: Maximum input address size, in bits.
> @@ -72,6 +81,7 @@ struct kvm_pgtable {
>
>  	/* Stage-2 only */
>  	struct kvm_s2_mmu *mmu;
> +	enum kvm_pgtable_stage2_flags flags;
>  };
>
>  /**
> @@ -201,11 +211,16 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift);
>   * @arch: Arch-specific KVM structure representing the guest virtual
>   * machine.
>   * @mm_ops: Memory management callbacks.
> + * @flags: Stage-2 configuration flags.
>   *
>   * Return: 0 on success, negative error code on failure.
>   */
> -int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
> -			    struct kvm_pgtable_mm_ops *mm_ops);
> +int kvm_pgtable_stage2_init_flags(struct kvm_pgtable *pgt, struct kvm_arch *arch,
> +				  struct kvm_pgtable_mm_ops *mm_ops,
> +				  enum kvm_pgtable_stage2_flags flags);
> +
> +#define kvm_pgtable_stage2_init(pgt, arch, mm_ops) \
> +	kvm_pgtable_stage2_init_flags(pgt, arch, mm_ops, 0)
>
>  /**
>   * kvm_pgtable_stage2_destroy() - Destroy an unused guest stage-2 page-table.
> diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
> index 046be789fbb4..beeb722a82d3 100644
> --- a/arch/arm64/include/asm/pgtable-prot.h
> +++ b/arch/arm64/include/asm/pgtable-prot.h
> @@ -72,10 +72,10 @@ extern bool arm64_use_ng_mappings;
>  #define PAGE_KERNEL_EXEC __pgprot(PROT_NORMAL & ~PTE_PXN)
>  #define PAGE_KERNEL_EXEC_CONT __pgprot((PROT_NORMAL & ~PTE_PXN) | PTE_CONT)
>
> -#define PAGE_S2_MEMATTR(attr) \
> +#define PAGE_S2_MEMATTR(attr, has_fwb) \
>  	({ \
>  		u64 __val; \
> -		if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)) \
> +		if (has_fwb) \
>  			__val = PTE_S2_MEMATTR(MT_S2_FWB_ ## attr); \
>  		else \
>  			__val = PTE_S2_MEMATTR(MT_S2_ ## attr); \
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 3a971df278bd..dee8aaeaf13e 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -507,12 +507,25 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
>  	return vtcr;
>  }
>
> -static int stage2_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep)
> +static bool stage2_has_fwb(struct kvm_pgtable *pgt)
> +{
> +	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> +		return false;
> +
> +	return !(pgt->flags & KVM_PGTABLE_S2_NOFWB);
> +}
> +
> +static int stage2_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep,
> +				struct kvm_pgtable *pgt)

nit: make pgt the first parameter, as it defines the context in which
the rest applies.
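
To illustrate the ordering, here is a rough standalone sketch rather than
the actual kernel code. Everything that is not in the patch is an
assumption made so that it builds in userspace: the cut-down types, the
cpu_has_stage2_fwb variable standing in for
cpus_have_const_cap(ARM64_HAS_STAGE2_FWB), and the ATTR_* placeholder
encodings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for kernel definitions (assumptions of this sketch). */
#define BIT(n)	(1UL << (n))

typedef uint64_t kvm_pte_t;

enum kvm_pgtable_prot {
	KVM_PGTABLE_PROT_X	= BIT(0),
	KVM_PGTABLE_PROT_DEVICE	= BIT(3),
};

enum kvm_pgtable_stage2_flags {
	KVM_PGTABLE_S2_NOFWB	= BIT(0),
};

/* Cut-down structure: only the field this sketch needs. */
struct kvm_pgtable {
	enum kvm_pgtable_stage2_flags flags;
};

/* Hypothetical replacement for cpus_have_const_cap(ARM64_HAS_STAGE2_FWB). */
static bool cpu_has_stage2_fwb = true;

/* Placeholder encodings, chosen only so the four cases are distinguishable. */
enum {
	ATTR_NORMAL = 1,
	ATTR_DEVICE,
	ATTR_FWB_NORMAL,
	ATTR_FWB_DEVICE,
};

static bool stage2_has_fwb(struct kvm_pgtable *pgt)
{
	if (!cpu_has_stage2_fwb)
		return false;

	return !(pgt->flags & KVM_PGTABLE_S2_NOFWB);
}

/* pgt comes first: it is the context in which prot and ptep are interpreted. */
static int stage2_set_prot_attr(struct kvm_pgtable *pgt,
				enum kvm_pgtable_prot prot, kvm_pte_t *ptep)
{
	bool device = prot & KVM_PGTABLE_PROT_DEVICE;
	bool fwb = stage2_has_fwb(pgt);

	if (device)
		*ptep = fwb ? ATTR_FWB_DEVICE : ATTR_DEVICE;
	else
		*ptep = fwb ? ATTR_FWB_NORMAL : ATTR_NORMAL;

	return 0;
}
```

Callers would then read stage2_set_prot_attr(pgt, prot, &attr), which
matches how the other kvm_pgtable_stage2_*() entry points take pgt first.
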
>  {
>  	bool device = prot & KVM_PGTABLE_PROT_DEVICE;
> -	kvm_pte_t attr = device ? PAGE_S2_MEMATTR(DEVICE_nGnRE) :
> -			    PAGE_S2_MEMATTR(NORMAL);
>  	u32 sh = KVM_PTE_LEAF_ATTR_LO_S2_SH_IS;
> +	kvm_pte_t attr;
> +
> +	if (device)
> +		attr = PAGE_S2_MEMATTR(DEVICE_nGnRE, stage2_has_fwb(pgt));
> +	else
> +		attr = PAGE_S2_MEMATTR(NORMAL, stage2_has_fwb(pgt));

Maybe define a new helper:

#define KVM_S2_MEMATTR(pgt, attr) PAGE_S2_MEMATTR(attr, stage2_has_fwb(pgt))

to avoid the constant stage2_has_fwb() repetition.

>
>  	if (!(prot & KVM_PGTABLE_PROT_X))
>  		attr |= KVM_PTE_LEAF_ATTR_HI_S2_XN;
> @@ -748,7 +761,7 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
>  		.arg = &map_data,
>  	};
>
> -	ret = stage2_set_prot_attr(prot, &map_data.attr);
> +	ret = stage2_set_prot_attr(prot, &map_data.attr, pgt);
>  	if (ret)
>  		return ret;
>
> @@ -786,16 +799,13 @@ int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size,
>
>  static void stage2_flush_dcache(void *addr, u64 size)
>  {
> -	if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> -		return;
> -
>  	__flush_dcache_area(addr, size);
>  }

Consider dropping the function altogether and using __flush_dcache_area
directly (assuming the prototypes are identical).

>
> -static bool stage2_pte_cacheable(kvm_pte_t pte)
> +static bool stage2_pte_cacheable(kvm_pte_t pte, struct kvm_pgtable *pgt)

Same comment about pgt being the first argument.
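
Combined with the KVM_S2_MEMATTR() helper suggested earlier, this could
look roughly like the standalone sketch below. It is not the kernel code:
the cut-down struct, the cpu_has_stage2_fwb variable (a stand-in for
cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)) and the numeric MT_S2_* and
mask values are assumptions for illustration, and PAGE_S2_MEMATTR() is
simplified to a plain conditional expression.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t kvm_pte_t;

/* Cut-down stand-ins for the kernel definitions (assumptions). */
#define KVM_PGTABLE_S2_NOFWB	(1U << 0)

struct kvm_pgtable {
	unsigned int flags;
};

/* Hypothetical replacement for cpus_have_const_cap(ARM64_HAS_STAGE2_FWB). */
static bool cpu_has_stage2_fwb = true;

static bool stage2_has_fwb(struct kvm_pgtable *pgt)
{
	if (!cpu_has_stage2_fwb)
		return false;

	return !(pgt->flags & KVM_PGTABLE_S2_NOFWB);
}

/* Illustrative attribute encodings and mask; treat the values as assumed. */
#define MT_S2_NORMAL			0xf
#define MT_S2_DEVICE_nGnRE		0x1
#define MT_S2_FWB_NORMAL		6
#define MT_S2_FWB_DEVICE_nGnRE		1
#define PTE_S2_MEMATTR(t)		((kvm_pte_t)(t) << 2)
#define KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR	((kvm_pte_t)0xf << 2)

/* Simplified form of the two-argument macro from the patch. */
#define PAGE_S2_MEMATTR(attr, has_fwb)				\
	((has_fwb) ? PTE_S2_MEMATTR(MT_S2_FWB_ ## attr)		\
		   : PTE_S2_MEMATTR(MT_S2_ ## attr))

/* The suggested helper: callers no longer spell out stage2_has_fwb(). */
#define KVM_S2_MEMATTR(pgt, attr) PAGE_S2_MEMATTR(attr, stage2_has_fwb(pgt))

/* pgt first, then the PTE being inspected. */
static bool stage2_pte_cacheable(struct kvm_pgtable *pgt, kvm_pte_t pte)
{
	kvm_pte_t memattr = pte & KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR;

	return memattr == KVM_S2_MEMATTR(pgt, NORMAL);
}
```

The same helper would cover both branches in stage2_set_prot_attr(), so
the FWB decision stays in one place.
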
>  {
>  	u64 memattr = pte & KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR;
> -	return memattr == PAGE_S2_MEMATTR(NORMAL);
> +	return memattr == PAGE_S2_MEMATTR(NORMAL, stage2_has_fwb(pgt));
>  }
>
>  static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
> @@ -821,8 +831,8 @@ static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
>
>  		if (mm_ops->page_count(childp) != 1)
>  			return 0;
> -	} else if (stage2_pte_cacheable(pte)) {
> -		need_flush = true;
> +	} else if (stage2_pte_cacheable(pte, pgt)) {
> +		need_flush = !stage2_has_fwb(pgt);
>  	}
>
>  	/*
> @@ -979,10 +989,11 @@ static int stage2_flush_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
>  			       enum kvm_pgtable_walk_flags flag,
>  			       void * const arg)
>  {
> -	struct kvm_pgtable_mm_ops *mm_ops = arg;
> +	struct kvm_pgtable *pgt = arg;
> +	struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops;
>  	kvm_pte_t pte = *ptep;
>
> -	if (!kvm_pte_valid(pte) || !stage2_pte_cacheable(pte))
> +	if (!kvm_pte_valid(pte) || !stage2_pte_cacheable(pte, pgt))
>  		return 0;
>
>  	stage2_flush_dcache(kvm_pte_follow(pte, mm_ops), kvm_granule_size(level));
> @@ -994,17 +1005,18 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
>  	struct kvm_pgtable_walker walker = {
>  		.cb = stage2_flush_walker,
>  		.flags = KVM_PGTABLE_WALK_LEAF,
> -		.arg = pgt->mm_ops,
> +		.arg = pgt,
>  	};
>
> -	if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> +	if (stage2_has_fwb(pgt))
>  		return 0;
>
>  	return kvm_pgtable_walk(pgt, addr, size, &walker);
>  }
>
> -int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
> -			    struct kvm_pgtable_mm_ops *mm_ops)
> +int kvm_pgtable_stage2_init_flags(struct kvm_pgtable *pgt, struct kvm_arch *arch,
> +				  struct kvm_pgtable_mm_ops *mm_ops,
> +				  enum kvm_pgtable_stage2_flags flags)
>  {
>  	size_t pgd_sz;
>  	u64 vtcr = arch->vtcr;
> @@ -1017,6 +1029,7 @@ int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
>  	if (!pgt->pgd)
>  		return -ENOMEM;
>
> +	pgt->flags = flags;

Try and keep the initialisation order similar to the definition of the
structure if possible.

>  	pgt->ia_bits = ia_bits;
>  	pgt->start_level = start_level;
>  	pgt->mm_ops = mm_ops;
> @@ -1101,7 +1114,7 @@ int kvm_pgtable_stage2_find_range(struct kvm_pgtable *pgt, u64 addr,
>  	u32 level;
>  	int ret;
>
> -	ret = stage2_set_prot_attr(prot, &attr);
> +	ret = stage2_set_prot_attr(prot, &attr, pgt);
>  	if (ret)
>  		return ret;
>  	attr &= KVM_PTE_LEAF_S2_COMPAT_MASK;
> --
> 2.31.0.rc2.261.g7f71774620-goog
>

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.