Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp578594pxb; Wed, 24 Feb 2021 09:24:49 -0800 (PST) X-Google-Smtp-Source: ABdhPJxzCfQfZVvxgmopqzl3jAfb40oGZ2zynDWnUCaVENdmm7sO/lH2Z13Djh+H+Tviohu7sVJv X-Received: by 2002:a17:906:af84:: with SMTP id mj4mr32818439ejb.84.1614187489024; Wed, 24 Feb 2021 09:24:49 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1614187489; cv=none; d=google.com; s=arc-20160816; b=ZhgWfmFz90Y66Zv3lFJtPC39ims19hEck/wGbQI5s4B+RhSopnDQxOUb3QJTcAo4Yo mNizsxqoSxZnYz0S5oK13HpWJP2Ic2wHoYDw9DC2NM7cfXOBdo3ZzLOSMHBmo/0TJMnw b1ZIAYqjFKiRqbRAKDHA+1QnQevof3XTkkyBheF3USLcWmheQT7i6TWzZY9Un9SHB6DW D0o68MGT5UjJVw9RXWxbGc8b1uVVSOxlvd0qc72AzjzQfVRcenUt7Z96f7ZYoP7UVgJR rRxqkdbH9WQTan5+s3o2tDzPbSg6ssQN7KZVC1mR6gJm6LQ7bG0P5zUQD6pHuD9fTAoh DEYg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-language:content-transfer-encoding :in-reply-to:mime-version:user-agent:date:message-id:from:references :to:subject; bh=Fs9f14/tvqT7dYPvBDg//N758AEQlz9hLyd/Ot06yN4=; b=hpvO3oPNpN3aeScpuAoavD5YhcBp31vU2o5p2obRPjcWXw/5dygJNjfVetRwT2iHhg Dd299pMgC9uPJCFV7v63bwikJrbOTORh6ygGT8UZBg2Ixgm2dPsMA7RjLLz5FtHc96wV p5LHN8Vo7ymW63KgpRiV3Dkt3HeqphfQrlgQ8f8EuqWZVW0Yyh6g4G5UMiG4VCfRoC/m Hp2zRQUvOW2w7QD8oy8cITE4uoxklpB+aWCNqTbFTrTzTtsQBxEYa8kGG4RwQQoLjZ6Q XAhAoL/C4N6exi7mmdQ/9z+l+lMXzWxqOGdsswmwaQoQ3JurrngOOx82OGUEfZ8xnAAm L9/Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id ja23si1613648ejc.143.2021.02.24.09.24.25; Wed, 24 Feb 2021 09:24:49 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235385AbhBXRWP (ORCPT + 99 others); Wed, 24 Feb 2021 12:22:15 -0500 Received: from foss.arm.com ([217.140.110.172]:40978 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233502AbhBXRVy (ORCPT ); Wed, 24 Feb 2021 12:21:54 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A4A1B101E; Wed, 24 Feb 2021 09:21:07 -0800 (PST) Received: from [192.168.0.110] (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id EC35F3F73B; Wed, 24 Feb 2021 09:21:05 -0800 (PST) Subject: Re: [RFC PATCH 1/4] KVM: arm64: Move the clean of dcache to the map handler To: Yanan Wang , Marc Zyngier , Will Deacon , Catalin Marinas , James Morse , Julien Thierry , Suzuki K Poulose , Gavin Shan , Quentin Perret , kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org References: <20210208112250.163568-1-wangyanan55@huawei.com> <20210208112250.163568-2-wangyanan55@huawei.com> From: Alexandru Elisei Message-ID: <70b2d6c2-709b-d63b-1409-b16dad89b9b6@arm.com> Date: Wed, 24 Feb 2021 17:21:22 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.1 MIME-Version: 1.0 In-Reply-To: <20210208112250.163568-2-wangyanan55@huawei.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Content-Language: en-US Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello, On 2/8/21 11:22 AM, Yanan Wang wrote: > We currently uniformly clean dcache in user_mem_abort() before calling the > fault handlers, if we take a translation fault and the pfn is cacheable. > But if there are concurrent translation faults on the same page or block, > clean of dcache for the first time is necessary while the others are not. > > By moving clean of dcache to the map handler, we can easily identify the > conditions where CMOs are really needed and avoid the unnecessary ones. > As it's a time consuming process to perform CMOs especially when flushing > a block range, so this solution reduces much load of kvm and improve the > efficiency of creating mappings. > > Signed-off-by: Yanan Wang > --- > arch/arm64/include/asm/kvm_mmu.h | 16 -------------- > arch/arm64/kvm/hyp/pgtable.c | 38 ++++++++++++++++++++------------ > arch/arm64/kvm/mmu.c | 14 +++--------- > 3 files changed, 27 insertions(+), 41 deletions(-) > > diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h > index e52d82aeadca..4ec9879e82ed 100644 > --- a/arch/arm64/include/asm/kvm_mmu.h > +++ b/arch/arm64/include/asm/kvm_mmu.h > @@ -204,22 +204,6 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu) > return (vcpu_read_sys_reg(vcpu, SCTLR_EL1) & 0b101) == 0b101; > } > > -static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size) > -{ > - void *va = page_address(pfn_to_page(pfn)); > - > - /* > - * With FWB, we ensure that the guest always accesses memory using > - * cacheable attributes, and we don't have to clean to PoC when > - * faulting in pages. Furthermore, FWB implies IDC, so cleaning to > - * PoU is not required either in this case. > - */ > - if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)) > - return; > - > - kvm_flush_dcache_to_poc(va, size); > -} > - > static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn, > unsigned long size) > { > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c > index 4d177ce1d536..2f4f87021980 100644 > --- a/arch/arm64/kvm/hyp/pgtable.c > +++ b/arch/arm64/kvm/hyp/pgtable.c > @@ -464,6 +464,26 @@ static int stage2_map_set_prot_attr(enum kvm_pgtable_prot prot, > return 0; > } > > +static bool stage2_pte_cacheable(kvm_pte_t pte) > +{ > + u64 memattr = pte & KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR; > + return memattr == PAGE_S2_MEMATTR(NORMAL); > +} > + > +static void stage2_flush_dcache(void *addr, u64 size) > +{ > + /* > + * With FWB, we ensure that the guest always accesses memory using > + * cacheable attributes, and we don't have to clean to PoC when > + * faulting in pages. Furthermore, FWB implies IDC, so cleaning to > + * PoU is not required either in this case. > + */ > + if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)) > + return; > + > + __flush_dcache_area(addr, size); > +} > + > static int stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level, > kvm_pte_t *ptep, > struct stage2_map_data *data) > @@ -495,6 +515,10 @@ static int stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level, > put_page(page); > } > > + /* Flush data cache before installation of the new PTE */ > + if (stage2_pte_cacheable(new)) > + stage2_flush_dcache(__va(phys), granule); This makes sense to me. kvm_pgtable_stage2_map() is protected against concurrent calls by the kvm->mmu_lock, so only one VCPU can change the stage 2 translation table at any given moment. In the case of concurrent translation faults on the same IPA, the first VCPU that will take the lock will create the mapping and do the dcache clean+invalidate. The other VCPUs will return -EAGAIN because the mapping they are trying to install is almost identical* to the mapping created by the first VCPU that took the lock. I have a question. Why are you doing the cache maintenance *before* installing the new mapping? This is what the kernel already does, so I'm not saying it's incorrect, I'm just curious about the reason behind it. *permissions might be different. Thanks, Alex > + > smp_store_release(ptep, new); > get_page(page); > data->phys += granule; > @@ -651,20 +675,6 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size, > return ret; > } > > -static void stage2_flush_dcache(void *addr, u64 size) > -{ > - if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)) > - return; > - > - __flush_dcache_area(addr, size); > -} > - > -static bool stage2_pte_cacheable(kvm_pte_t pte) > -{ > - u64 memattr = pte & KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR; > - return memattr == PAGE_S2_MEMATTR(NORMAL); > -} > - > static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, > enum kvm_pgtable_walk_flags flag, > void * const arg) > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > index 77cb2d28f2a4..d151927a7d62 100644 > --- a/arch/arm64/kvm/mmu.c > +++ b/arch/arm64/kvm/mmu.c > @@ -609,11 +609,6 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, > kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask); > } > > -static void clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size) > -{ > - __clean_dcache_guest_page(pfn, size); > -} > - > static void invalidate_icache_guest_page(kvm_pfn_t pfn, unsigned long size) > { > __invalidate_icache_guest_page(pfn, size); > @@ -882,9 +877,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, > if (writable) > prot |= KVM_PGTABLE_PROT_W; > > - if (fault_status != FSC_PERM && !device) > - clean_dcache_guest_page(pfn, vma_pagesize); > - > if (exec_fault) { > prot |= KVM_PGTABLE_PROT_X; > invalidate_icache_guest_page(pfn, vma_pagesize); > @@ -1144,10 +1136,10 @@ int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte) > trace_kvm_set_spte_hva(hva); > > /* > - * We've moved a page around, probably through CoW, so let's treat it > - * just like a translation fault and clean the cache to the PoC. > + * We've moved a page around, probably through CoW, so let's treat > + * it just like a translation fault and the map handler will clean > + * the cache to the PoC. > */ > - clean_dcache_guest_page(pfn, PAGE_SIZE); > handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &pfn); > return 0; > }