Hello,
This series improves the efficiency of the guest stage-2 page table
code, and includes test results that quantify the benefit.
Description for this series:
We currently perform CMOs of the D-cache and I-cache uniformly in
user_mem_abort() before calling the fault handlers. If we get concurrent
guest faults (e.g. translation faults, permission faults) or some really
unnecessary guest faults caused by BBM, only the CMOs for the first vCPU
are necessary; those for the later vCPUs are not.
By moving CMOs to the fault handlers, we can easily identify conditions
where they are really needed and avoid the unnecessary ones. Performing
CMOs is time-consuming, especially when flushing a block range, so this
solution reduces the load on KVM and improves the efficiency of the
stage-2 page table code.
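The effect of the reordering can be sketched with a toy model. This is only an illustration, not the kernel code: toy_pte_t, toy_stats and the toy_fault_*() helpers are made-up names, and "the entry already equals the new value" stands in for whatever condition the real handler uses to detect a redundant fault.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stage-2 entry: 0 means "not mapped yet". Purely illustrative. */
typedef uint64_t toy_pte_t;

struct toy_stats {
	size_t cmo_count;	/* how many times cache maintenance ran */
};

/* Stand-in for an expensive clean/invalidate over a page or block. */
static void toy_cmo(struct toy_stats *stats)
{
	stats->cmo_count++;
}

/*
 * Old ordering: maintenance runs before the handler looks at the entry,
 * so every faulting vCPU pays for it.
 */
static bool toy_fault_cmo_first(toy_pte_t *ptep, toy_pte_t new,
				struct toy_stats *stats)
{
	toy_cmo(stats);
	if (*ptep == new)
		return false;	/* another vCPU already installed it */
	*ptep = new;
	return true;
}

/*
 * New ordering: the handler checks the entry first and only performs
 * the maintenance when it is really about to install a mapping.
 */
static bool toy_fault_cmo_in_handler(toy_pte_t *ptep, toy_pte_t new,
				     struct toy_stats *stats)
{
	if (*ptep == new)
		return false;	/* redundant fault: skip the CMO */
	toy_cmo(stats);
	*ptep = new;
	return true;
}
```

If four vCPUs fault on the same page one after another, the first variant counts four CMOs while the second counts one.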
We can imagine two specific scenarios which will benefit substantially:
1) During a normal VM startup, this solution will improve the efficiency
of handling guest page faults incurred by vCPUs when the stage-2 page
tables are initially populated.
2) After live migration, the heavy workload is resumed on the
destination VM, but all the stage-2 page tables need to be rebuilt at
that moment. This solution will ease the performance drop during the
resume stage.
The following are test results, originally from v3 [1], that show how
much benefit moving the CMOs brings. We use a KVM selftest to simulate
concurrent guest memory access and measure the execution time that KVM
takes to create new stage-2 mappings, update the existing mappings, and
split/rebuild huge mappings during/after dirty logging.
hardware platform: HiSilicon Kunpeng920 Server
host kernel: Linux mainline v5.12-rc2
test tools: KVM selftest [2]
[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://lore.kernel.org/lkml/[email protected]/

cmdline: ./kvm_page_table_test -m 4 -s anonymous -b 1G -v 80
         (80 vcpus, 1G memory, page mappings(normal 4K))
KVM_CREATE_MAPPINGS: before 104.35s -> after  90.42s  +13.35%
KVM_UPDATE_MAPPINGS: before  78.64s -> after  75.45s  + 4.06%

cmdline: ./kvm_page_table_test -m 4 -s anonymous_thp -b 20G -v 40
         (40 vcpus, 20G memory, block mappings(THP 2M))
KVM_CREATE_MAPPINGS: before  15.66s -> after   6.92s  +55.80%
KVM_UPDATE_MAPPINGS: before 178.80s -> after 123.35s  +31.00%
KVM_REBUILD_BLOCKS:  before 187.34s -> after 131.76s  +30.65%

cmdline: ./kvm_page_table_test -m 4 -s anonymous_hugetlb_1gb -b 20G -v 40
         (40 vcpus, 20G memory, block mappings(HUGETLB 1G))
KVM_CREATE_MAPPINGS: before 104.54s -> after   3.70s  +96.46%
KVM_UPDATE_MAPPINGS: before 174.20s -> after 115.94s  +33.44%
KVM_REBUILD_BLOCKS:  before 103.95s -> after   2.96s  +97.15%
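
The percentages in these results appear to be the relative reduction in execution time, (before - after) / before. That reading is an assumption; it matches the reported figures to within rounding in the last digit. A minimal sketch of the arithmetic (time_reduction_pct is a name made up for this note):

```c
/*
 * Relative reduction in execution time, as a percentage of "before".
 * Assumed formula; it reproduces the figures reported in this cover
 * letter to within rounding.
 */
static double time_reduction_pct(double before_s, double after_s)
{
	return (before_s - after_s) / before_s * 100.0;
}
```

For example, time_reduction_pct(104.35, 90.42) gives roughly 13.35, and time_reduction_pct(104.54, 3.70) gives roughly 96.46.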
---
Changelogs:
v5->v6:
- convert the guest CMO functions into callbacks in kvm_pgtable_mm_ops (Marc)
- drop patch #6 in v5 since we are stuffing topup into mmu_lock section (Quentin)
- rebased on latest kvmarm/tree
- v5: https://lore.kernel.org/lkml/[email protected]/
v4->v5:
- rebased on the latest kvmarm/tree to adapt to the new stage-2 page-table code
- v4: https://lore.kernel.org/lkml/[email protected]
---
Yanan Wang (4):
KVM: arm64: Introduce cache maintenance callbacks for guest stage-2
KVM: arm64: Introduce mm_ops member for structure stage2_attr_data
KVM: arm64: Tweak parameters of guest cache maintenance functions
KVM: arm64: Move guest CMOs to the fault handlers
arch/arm64/include/asm/kvm_mmu.h | 9 ++----
arch/arm64/include/asm/kvm_pgtable.h | 7 +++++
arch/arm64/kvm/hyp/pgtable.c | 47 +++++++++++++++++++++-------
arch/arm64/kvm/mmu.c | 39 ++++++++++-------------
4 files changed, 62 insertions(+), 40 deletions(-)
--
2.23.0
Add an mm_ops member to struct stage2_attr_data, since we will move
I-cache maintenance for guest stage-2 to the permission path and as a
result will need mm_ops for some callbacks.
Signed-off-by: Yanan Wang <[email protected]>
---
arch/arm64/kvm/hyp/pgtable.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index c37c1dc4feaf..d99789432b05 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -861,10 +861,11 @@ int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
}
struct stage2_attr_data {
- kvm_pte_t attr_set;
- kvm_pte_t attr_clr;
- kvm_pte_t pte;
- u32 level;
+ kvm_pte_t attr_set;
+ kvm_pte_t attr_clr;
+ kvm_pte_t pte;
+ u32 level;
+ struct kvm_pgtable_mm_ops *mm_ops;
};
static int stage2_attr_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
@@ -903,6 +904,7 @@ static int stage2_update_leaf_attrs(struct kvm_pgtable *pgt, u64 addr,
struct stage2_attr_data data = {
.attr_set = attr_set & attr_mask,
.attr_clr = attr_clr & attr_mask,
+ .mm_ops = pgt->mm_ops,
};
struct kvm_pgtable_walker walker = {
.cb = stage2_attr_walker,
--
2.23.0
We currently perform CMOs of the D-cache and I-cache uniformly in
user_mem_abort() before calling the fault handlers. If we get concurrent
guest faults (e.g. translation faults, permission faults) or some really
unnecessary guest faults caused by BBM, only the CMOs for the first vCPU
are necessary; those for the later vCPUs are not.
By moving CMOs to the fault handlers, we can easily identify conditions
where they are really needed and avoid the unnecessary ones. Performing
CMOs is time-consuming, especially when flushing a block range, so this
solution reduces the load on KVM and improves the efficiency of the
stage-2 page table code.
We can imagine two specific scenarios which will benefit substantially:
1) During a normal VM startup, this solution will improve the efficiency
of handling guest page faults incurred by vCPUs when the stage-2 page
tables are initially populated.
2) After live migration, the heavy workload is resumed on the
destination VM, but all the stage-2 page tables need to be rebuilt at
that moment. This solution will ease the performance drop during the
resume stage.
Signed-off-by: Yanan Wang <[email protected]>
---
arch/arm64/kvm/hyp/pgtable.c | 37 +++++++++++++++++++++++++++++-------
arch/arm64/kvm/mmu.c | 21 +++++++-------------
2 files changed, 37 insertions(+), 21 deletions(-)
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index d99789432b05..b7b40abe78e8 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -577,12 +577,24 @@ static void stage2_put_pte(kvm_pte_t *ptep, struct kvm_s2_mmu *mmu, u64 addr,
mm_ops->put_page(ptep);
}
+static bool stage2_pte_cacheable(struct kvm_pgtable *pgt, kvm_pte_t pte)
+{
+ u64 memattr = pte & KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR;
+ return memattr == KVM_S2_MEMATTR(pgt, NORMAL);
+}
+
+static bool stage2_pte_executable(kvm_pte_t pte)
+{
+ return !(pte & KVM_PTE_LEAF_ATTR_HI_S2_XN);
+}
+
static int stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
kvm_pte_t *ptep,
struct stage2_map_data *data)
{
kvm_pte_t new, old = *ptep;
u64 granule = kvm_granule_size(level), phys = data->phys;
+ struct kvm_pgtable *pgt = data->mmu->pgt;
struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops;
if (!kvm_block_mapping_supported(addr, end, phys, level))
@@ -606,6 +618,13 @@ static int stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
stage2_put_pte(ptep, data->mmu, addr, level, mm_ops);
}
+ /* Perform CMOs before installation of the guest stage-2 PTE */
+ if (mm_ops->flush_dcache && stage2_pte_cacheable(pgt, new))
+ mm_ops->flush_dcache(mm_ops->phys_to_virt(phys), granule);
+
+ if (mm_ops->flush_icache && stage2_pte_executable(new))
+ mm_ops->flush_icache(mm_ops->phys_to_virt(phys), granule);
+
smp_store_release(ptep, new);
if (stage2_pte_is_counted(new))
mm_ops->get_page(ptep);
@@ -798,12 +817,6 @@ int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size,
return ret;
}
-static bool stage2_pte_cacheable(struct kvm_pgtable *pgt, kvm_pte_t pte)
-{
- u64 memattr = pte & KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR;
- return memattr == KVM_S2_MEMATTR(pgt, NORMAL);
-}
-
static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
enum kvm_pgtable_walk_flags flag,
void * const arg)
@@ -874,6 +887,7 @@ static int stage2_attr_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
{
kvm_pte_t pte = *ptep;
struct stage2_attr_data *data = arg;
+ struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops;
if (!kvm_pte_valid(pte))
return 0;
@@ -888,8 +902,17 @@ static int stage2_attr_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
* but worst-case the access flag update gets lost and will be
* set on the next access instead.
*/
- if (data->pte != pte)
+ if (data->pte != pte) {
+ /*
+ * Invalidate instruction cache before updating the guest
+ * stage-2 PTE if we are going to add executable permission.
+ */
+ if (mm_ops->flush_icache &&
+ stage2_pte_executable(pte) && !stage2_pte_executable(*ptep))
+ mm_ops->flush_icache(kvm_pte_follow(pte, mm_ops),
+ kvm_granule_size(level));
WRITE_ONCE(*ptep, pte);
+ }
return 0;
}
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index b980f8a47cbb..6d97a435a635 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -442,6 +442,8 @@ static struct kvm_pgtable_mm_ops kvm_s2_mm_ops = {
.page_count = kvm_host_page_count,
.phys_to_virt = kvm_host_va,
.virt_to_phys = kvm_host_pa,
+ .flush_dcache = clean_dcache_guest_page,
+ .flush_icache = invalidate_icache_guest_page,
};
/**
@@ -1012,15 +1014,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
if (writable)
prot |= KVM_PGTABLE_PROT_W;
- if (fault_status != FSC_PERM && !device)
- clean_dcache_guest_page(page_address(pfn_to_page(pfn)),
- vma_pagesize);
-
- if (exec_fault) {
+ if (exec_fault)
prot |= KVM_PGTABLE_PROT_X;
- invalidate_icache_guest_page(page_address(pfn_to_page(pfn)),
- vma_pagesize);
- }
if (device)
prot |= KVM_PGTABLE_PROT_DEVICE;
@@ -1218,12 +1213,10 @@ bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
WARN_ON(range->end - range->start != 1);
/*
- * We've moved a page around, probably through CoW, so let's treat it
- * just like a translation fault and clean the cache to the PoC.
- */
- clean_dcache_guest_page(page_address(pfn_to_page(pfn)), PAGE_SIZE);
-
- /*
+ * We've moved a page around, probably through CoW, so let's treat
+ * it just like a translation fault and the map handler will clean
+ * the cache to the PoC.
+ *
* The MMU notifiers will have unmapped a huge PMD before calling
* ->change_pte() (which in turn calls kvm_set_spte_gfn()) and
* therefore we never need to clear out a huge PMD through this
--
2.23.0
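
The permission-path check added to stage2_attr_walker in the patch above boils down to "invalidate the I-cache only when the update turns a non-executable entry into an executable one". A standalone sketch of that predicate follows. The toy_* names are invented for this note, and the position of the execute-never bit (bit 54) is an assumption made for illustration rather than something stated in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t toy_pte_t;

/* Assumed position of the stage-2 execute-never bit, for illustration. */
#define TOY_S2_XN	(UINT64_C(1) << 54)

/* An entry is executable when the execute-never bit is clear. */
static bool toy_pte_executable(toy_pte_t pte)
{
	return !(pte & TOY_S2_XN);
}

/*
 * I-cache invalidation is only needed when the new entry is executable
 * and the old one was not, i.e. when exec permission is being added.
 */
static bool toy_needs_icache_inval(toy_pte_t old, toy_pte_t new)
{
	return toy_pte_executable(new) && !toy_pte_executable(old);
}
```

With this predicate, a permission fault that only adds write permission to an already executable entry, or one that leaves the entry non-executable, would skip the invalidation.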
To prepare for performing guest CMOs in the fault handlers in pgtable.c,
introduce two cache maintenance callbacks in struct kvm_pgtable_mm_ops.
The new callbacks are specific for guest stage-2, so they will only be
initialized in 'struct kvm_pgtable_mm_ops kvm_s2_mm_ops'.
Signed-off-by: Yanan Wang <[email protected]>
---
arch/arm64/include/asm/kvm_pgtable.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index c3674c47d48c..302eca32e0af 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -44,6 +44,11 @@ typedef u64 kvm_pte_t;
* in the current context.
* @virt_to_phys: Convert a virtual address mapped in the current context
* into a physical address.
+ * @flush_dcache: Clean data cache for a guest page address range before
+ * creating the corresponding stage-2 mapping.
+ * @flush_icache: Invalidate instruction cache for a guest page address
+ * range before creating or updating the corresponding
+ * stage-2 mapping.
*/
struct kvm_pgtable_mm_ops {
void* (*zalloc_page)(void *arg);
@@ -54,6 +59,8 @@ struct kvm_pgtable_mm_ops {
int (*page_count)(void *addr);
void* (*phys_to_virt)(phys_addr_t phys);
phys_addr_t (*virt_to_phys)(void *addr);
+ void (*flush_dcache)(void *addr, size_t size);
+ void (*flush_icache)(void *addr, size_t size);
};
/**
--
2.23.0
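
Since the new callbacks are only initialized for guest stage-2, a user of the ops structure has to treat them as optional. The following is a rough sketch of that pattern under stated assumptions: toy_mm_ops, toy_s2_ops, toy_hyp_ops and toy_prepare_mapping are hypothetical names, and the callbacks merely count bytes instead of touching any cache.

```c
#include <stddef.h>

/* Hypothetical ops table with two optional maintenance callbacks. */
struct toy_mm_ops {
	void (*dcache_op)(void *addr, size_t size);	/* may be NULL */
	void (*icache_op)(void *addr, size_t size);	/* may be NULL */
};

/* Byte counters standing in for real cache maintenance. */
static size_t toy_dcache_bytes;
static size_t toy_icache_bytes;

static void toy_dcache(void *addr, size_t size)
{
	(void)addr;
	toy_dcache_bytes += size;
}

static void toy_icache(void *addr, size_t size)
{
	(void)addr;
	toy_icache_bytes += size;
}

/* Guest stage-2 style table: both callbacks are provided. */
static const struct toy_mm_ops toy_s2_ops = {
	.dcache_op = toy_dcache,
	.icache_op = toy_icache,
};

/* A table for another user that leaves the callbacks unset. */
static const struct toy_mm_ops toy_hyp_ops = {
	.dcache_op = NULL,
	.icache_op = NULL,
};

/* Callers check for NULL so that tables without callbacks keep working. */
static void toy_prepare_mapping(const struct toy_mm_ops *ops,
				void *va, size_t size)
{
	if (ops->dcache_op)
		ops->dcache_op(va, size);
	if (ops->icache_op)
		ops->icache_op(va, size);
}
```

Calling toy_prepare_mapping() with toy_s2_ops bumps both counters, while toy_hyp_ops leaves them untouched.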
Adjust the parameter "kvm_pfn_t pfn" of __clean_dcache_guest_page
and __invalidate_icache_guest_page to "void *va", which paves the
way for converting these two guest CMO functions into callbacks in
structure kvm_pgtable_mm_ops. No functional change.
Signed-off-by: Yanan Wang <[email protected]>
---
arch/arm64/include/asm/kvm_mmu.h | 9 ++-------
arch/arm64/kvm/mmu.c | 28 +++++++++++++++-------------
2 files changed, 17 insertions(+), 20 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 25ed956f9af1..6844a7550392 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -187,10 +187,8 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
return (vcpu_read_sys_reg(vcpu, SCTLR_EL1) & 0b101) == 0b101;
}
-static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
+static inline void __clean_dcache_guest_page(void *va, size_t size)
{
- void *va = page_address(pfn_to_page(pfn));
-
/*
* With FWB, we ensure that the guest always accesses memory using
* cacheable attributes, and we don't have to clean to PoC when
@@ -203,16 +201,13 @@ static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
kvm_flush_dcache_to_poc(va, size);
}
-static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn,
- unsigned long size)
+static inline void __invalidate_icache_guest_page(void *va, size_t size)
{
if (icache_is_aliasing()) {
/* any kind of VIPT cache */
__flush_icache_all();
} else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) {
/* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
- void *va = page_address(pfn_to_page(pfn));
-
invalidate_icache_range((unsigned long)va,
(unsigned long)va + size);
}
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 5742ba765ff9..b980f8a47cbb 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -126,6 +126,16 @@ static void *kvm_host_va(phys_addr_t phys)
return __va(phys);
}
+static void clean_dcache_guest_page(void *va, size_t size)
+{
+ __clean_dcache_guest_page(va, size);
+}
+
+static void invalidate_icache_guest_page(void *va, size_t size)
+{
+ __invalidate_icache_guest_page(va, size);
+}
+
/*
* Unmapping vs dcache management:
*
@@ -693,16 +703,6 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
}
-static void clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
-{
- __clean_dcache_guest_page(pfn, size);
-}
-
-static void invalidate_icache_guest_page(kvm_pfn_t pfn, unsigned long size)
-{
- __invalidate_icache_guest_page(pfn, size);
-}
-
static void kvm_send_hwpoison_signal(unsigned long address, short lsb)
{
send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current);
@@ -1013,11 +1013,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
prot |= KVM_PGTABLE_PROT_W;
if (fault_status != FSC_PERM && !device)
- clean_dcache_guest_page(pfn, vma_pagesize);
+ clean_dcache_guest_page(page_address(pfn_to_page(pfn)),
+ vma_pagesize);
if (exec_fault) {
prot |= KVM_PGTABLE_PROT_X;
- invalidate_icache_guest_page(pfn, vma_pagesize);
+ invalidate_icache_guest_page(page_address(pfn_to_page(pfn)),
+ vma_pagesize);
}
if (device)
@@ -1219,7 +1221,7 @@ bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
* We've moved a page around, probably through CoW, so let's treat it
* just like a translation fault and clean the cache to the PoC.
*/
- clean_dcache_guest_page(pfn, PAGE_SIZE);
+ clean_dcache_guest_page(page_address(pfn_to_page(pfn)), PAGE_SIZE);
/*
* The MMU notifiers will have unmapped a huge PMD before calling
--
2.23.0
Hi Marc,
On 2021/6/16 21:21, Marc Zyngier wrote:
> Hi Yanan,
>
> On Wed, 16 Jun 2021 10:51:57 +0100,
> Yanan Wang <[email protected]> wrote:
>> To prepare for performing guest CMOs in the fault handlers in pgtable.c,
>> introduce two cache maintenance callbacks in struct kvm_pgtable_mm_ops.
>>
>> The new callbacks are specific for guest stage-2, so they will only be
>> initialized in 'struct kvm_pgtable_mm_ops kvm_s2_mm_ops'.
>>
>> Signed-off-by: Yanan Wang <[email protected]>
>> ---
>> arch/arm64/include/asm/kvm_pgtable.h | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
>> index c3674c47d48c..302eca32e0af 100644
>> --- a/arch/arm64/include/asm/kvm_pgtable.h
>> +++ b/arch/arm64/include/asm/kvm_pgtable.h
>> @@ -44,6 +44,11 @@ typedef u64 kvm_pte_t;
>> * in the current context.
>> * @virt_to_phys: Convert a virtual address mapped in the current context
>> * into a physical address.
>> + * @flush_dcache: Clean data cache for a guest page address range before
>> + * creating the corresponding stage-2 mapping.
> Please don't reintroduce the word 'flush'. We are really trying to
> move away from it as it doesn't describe what we want to do.
I agree with this. I intended to make the names short and laconic, but this
missed the information about the callback's actual behaviors.
> Here this
> should be 'clean_invalidate_dcache' which, despite being a mouthful,
> describe accurately what we expect it to do.
Sure, I will change the name as you suggested.
> The comment is also missing the invalidate part, and we shouldn't
> assume that this is only used for S2 mapping.
Ok, will refine the comment. I think something like "Clean and invalidate the
data cache for the specified memory address range" may be generic enough.
>> + * @flush_icache: Invalidate instruction cache for a guest page address
>> + * range before creating or updating the corresponding
>> + * stage-2 mapping.
> Same thing here; this should be 'invalidate_icache', and the comment
> cleaned up.
Thanks, I will also correct this part.
Besides the callback names and comments, is there anything else that still
needs some adjustment in the other three patches? :)
Regards,
Yanan
>> */
>> struct kvm_pgtable_mm_ops {
>> void* (*zalloc_page)(void *arg);
>> @@ -54,6 +59,8 @@ struct kvm_pgtable_mm_ops {
>> int (*page_count)(void *addr);
>> void* (*phys_to_virt)(phys_addr_t phys);
>> phys_addr_t (*virt_to_phys)(void *addr);
>> + void (*flush_dcache)(void *addr, size_t size);
>> + void (*flush_icache)(void *addr, size_t size);
>> };
>>
>> /**
> Thanks,
>
> M.
>
On Thu, 17 Jun 2021 07:48:29 +0100,
"wangyanan (Y)" <[email protected]> wrote:
>
> Hi Marc,
>
> On 2021/6/16 21:21, Marc Zyngier wrote:
> > Hi Yanan,
> >
> > On Wed, 16 Jun 2021 10:51:57 +0100,
> > Yanan Wang <[email protected]> wrote:
> >> To prepare for performing guest CMOs in the fault handlers in pgtable.c,
> >> introduce two cache maintenance callbacks in struct kvm_pgtable_mm_ops.
> >>
> >> The new callbacks are specific for guest stage-2, so they will only be
> >> initialized in 'struct kvm_pgtable_mm_ops kvm_s2_mm_ops'.
> >>
> >> Signed-off-by: Yanan Wang <[email protected]>
> >> ---
> >> arch/arm64/include/asm/kvm_pgtable.h | 7 +++++++
> >> 1 file changed, 7 insertions(+)
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> >> index c3674c47d48c..302eca32e0af 100644
> >> --- a/arch/arm64/include/asm/kvm_pgtable.h
> >> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> >> @@ -44,6 +44,11 @@ typedef u64 kvm_pte_t;
> >> * in the current context.
> >> * @virt_to_phys: Convert a virtual address mapped in the current context
> >> * into a physical address.
> >> + * @flush_dcache: Clean data cache for a guest page address range before
> >> + * creating the corresponding stage-2 mapping.
> > Please don't reintroduce the word 'flush'. We are really trying to
> > move away from it as it doesn't describe what we want to do.
> I agree with this. I intended to make the names short and laconic, but this
> missed the information about the callback's actual behaviors.
> > Here this
> > should be 'clean_invalidate_dcache' which, despite being a mouthful,
> > describe accurately what we expect it to do.
> Sure, I will change the name as you suggested.
> > The comment is also missing the invalidate part, and we shouldn't
> > assume that this is only used for S2 mapping.
> Ok, will refine the comment. I think something like "Clean and invalidate the
> data cache for the specified memory address range" may be generic enough.
> >> + * @flush_icache: Invalidate instruction cache for a guest page address
> >> + * range before creating or updating the corresponding
> >> + * stage-2 mapping.
> > Same thing here; this should be 'invalidate_icache', and the comment
> > cleaned up.
> Thanks, I will also correct this part.
>
> Besides the callback names and comments, is there anything else that still
> needs some adjustment in the other three patches? :)
It looks pretty good so far, much nicer than the previous versions.
I have a small nit on the last patch, which should be dead easy to
address. I'm currently running a bunch of tests, hopefully nothing bad
will come out of it.
If you respin it shortly, that nothing fails, and unless someone
shouts, I'll queue it for -next.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
On 2021/6/17 16:03, Marc Zyngier wrote:
> On Thu, 17 Jun 2021 07:48:29 +0100,
> "wangyanan (Y)" <[email protected]> wrote:
>> Hi Marc,
>>
>> On 2021/6/16 21:21, Marc Zyngier wrote:
>>> Hi Yanan,
>>>
>>> On Wed, 16 Jun 2021 10:51:57 +0100,
>>> Yanan Wang <[email protected]> wrote:
>>>> To prepare for performing guest CMOs in the fault handlers in pgtable.c,
>>>> introduce two cache maintenance callbacks in struct kvm_pgtable_mm_ops.
>>>>
>>>> The new callbacks are specific for guest stage-2, so they will only be
>>>> initialized in 'struct kvm_pgtable_mm_ops kvm_s2_mm_ops'.
>>>>
>>>> Signed-off-by: Yanan Wang <[email protected]>
>>>> ---
>>>> arch/arm64/include/asm/kvm_pgtable.h | 7 +++++++
>>>> 1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
>>>> index c3674c47d48c..302eca32e0af 100644
>>>> --- a/arch/arm64/include/asm/kvm_pgtable.h
>>>> +++ b/arch/arm64/include/asm/kvm_pgtable.h
>>>> @@ -44,6 +44,11 @@ typedef u64 kvm_pte_t;
>>>> * in the current context.
>>>> * @virt_to_phys: Convert a virtual address mapped in the current context
>>>> * into a physical address.
>>>> + * @flush_dcache: Clean data cache for a guest page address range before
>>>> + * creating the corresponding stage-2 mapping.
>>> Please don't reintroduce the word 'flush'. We are really trying to
>>> move away from it as it doesn't describe what we want to do.
>> I agree with this. I intended to make the names short and laconic, but this
>> missed the information about the callback's actual behaviors.
>>> Here this
>>> should be 'clean_invalidate_dcache' which, despite being a mouthful,
>>> describe accurately what we expect it to do.
>> Sure, I will change the name as you suggested.
>>> The comment is also missing the invalidate part, and we shouldn't
>>> assume that this is only used for S2 mapping.
>> Ok, will refine the comment. I think something like "Clean and invalidate the
>> data cache for the specified memory address range" may be generic enough.
>>>> + * @flush_icache: Invalidate instruction cache for a guest page address
>>>> + * range before creating or updating the corresponding
>>>> + * stage-2 mapping.
>>> Same thing here; this should be 'invalidate_icache', and the comment
>>> cleaned up.
>> Thanks, I will also correct this part.
>>
>> Besides the callback names and comments, is there anything else that still
>> needs some adjustment in the other three patches? :)
> It looks pretty good so far, much nicer than the previous versions.
>
> I have a small nit on the last patch, which should be dead easy to
> address. I'm currently running a bunch of tests, hopefully nothing bad
> will come out of it.
>
> If you respin it shortly, that nothing fails, and unless someone
> shouts, I'll queue it for -next.
It would be nice, thanks!
I will address the nit and respin the series soon.
Thanks,
Yanan
> Thanks,
>
> M.
>
On Thu, 17 Jun 2021 09:22:51 +0100,
"wangyanan (Y)" <[email protected]> wrote:
>
>
>
> On 2021/6/17 16:03, Marc Zyngier wrote:
> > On Thu, 17 Jun 2021 07:48:29 +0100,
> > "wangyanan (Y)" <[email protected]> wrote:
> >> Hi Marc,
> >>
> >> On 2021/6/16 21:21, Marc Zyngier wrote:
> >>> Hi Yanan,
> >>>
> >>> On Wed, 16 Jun 2021 10:51:57 +0100,
> >>> Yanan Wang <[email protected]> wrote:
> >>>> To prepare for performing guest CMOs in the fault handlers in pgtable.c,
> >>>> introduce two cache maintenance callbacks in struct kvm_pgtable_mm_ops.
> >>>>
> >>>> The new callbacks are specific for guest stage-2, so they will only be
> >>>> initialized in 'struct kvm_pgtable_mm_ops kvm_s2_mm_ops'.
> >>>>
> >>>> Signed-off-by: Yanan Wang <[email protected]>
> >>>> ---
> >>>> arch/arm64/include/asm/kvm_pgtable.h | 7 +++++++
> >>>> 1 file changed, 7 insertions(+)
> >>>>
> >>>> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> >>>> index c3674c47d48c..302eca32e0af 100644
> >>>> --- a/arch/arm64/include/asm/kvm_pgtable.h
> >>>> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> >>>> @@ -44,6 +44,11 @@ typedef u64 kvm_pte_t;
> >>>> * in the current context.
> >>>> * @virt_to_phys: Convert a virtual address mapped in the current context
> >>>> * into a physical address.
> >>>> + * @flush_dcache: Clean data cache for a guest page address range before
> >>>> + * creating the corresponding stage-2 mapping.
> >>> Please don't reintroduce the word 'flush'. We are really trying to
> >>> move away from it as it doesn't describe what we want to do.
> >> I agree with this. I intended to make the names short and laconic, but this
> >> missed the information about the callback's actual behaviors.
> >>> Here this
> >>> should be 'clean_invalidate_dcache' which, despite being a mouthful,
> >>> describe accurately what we expect it to do.
> >> Sure, I will change the name as you suggested.
> >>> The comment is also missing the invalidate part, and we shouldn't
> >>> assume that this is only used for S2 mapping.
> >> Ok, will refine the comment. I think something like "Clean and invalidate the
> >> data cache for the specified memory address range" may be generic enough.
> >>>> + * @flush_icache: Invalidate instruction cache for a guest page address
> >>>> + * range before creating or updating the corresponding
> >>>> + * stage-2 mapping.
> >>> Same thing here; this should be 'invalidate_icache', and the comment
> >>> cleaned up.
> >> Thanks, I will also correct this part.
> >>
> >> Besides the callback names and comments, is there anything else that still
> >> needs some adjustment in the other three patches? :)
> > It looks pretty good so far, much nicer than the previous versions.
> >
> > I have a small nit on the last patch, which should be dead easy to
> > address. I'm currently running a bunch of tests, hopefully nothing bad
> > will come out of it.
> >
> > If you respin it shortly, that nothing fails, and unless someone
> > shouts, I'll queue it for -next.
> It would be nice, thanks!
> I will address the nit and respin the series soon.
By the way, what's the status of your selftest series that originally
came with this series? Are you planning to respin it? It would be
useful to have something that checks for regressions, and that series
did seem to do the trick.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
Hi Marc,
On 2021/6/17 16:44, Marc Zyngier wrote:
> On Thu, 17 Jun 2021 09:22:51 +0100,
> "wangyanan (Y)" <[email protected]> wrote:
>>
>>
>> On 2021/6/17 16:03, Marc Zyngier wrote:
>>> On Thu, 17 Jun 2021 07:48:29 +0100,
>>> "wangyanan (Y)" <[email protected]> wrote:
>>>> Hi Marc,
>>>>
>>>> On 2021/6/16 21:21, Marc Zyngier wrote:
>>>>> Hi Yanan,
>>>>>
>>>>> On Wed, 16 Jun 2021 10:51:57 +0100,
>>>>> Yanan Wang <[email protected]> wrote:
>>>>>> To prepare for performing guest CMOs in the fault handlers in pgtable.c,
>>>>>> introduce two cache maintenance callbacks in struct kvm_pgtable_mm_ops.
>>>>>>
>>>>>> The new callbacks are specific for guest stage-2, so they will only be
>>>>>> initialized in 'struct kvm_pgtable_mm_ops kvm_s2_mm_ops'.
>>>>>>
>>>>>> Signed-off-by: Yanan Wang <[email protected]>
>>>>>> ---
>>>>>> arch/arm64/include/asm/kvm_pgtable.h | 7 +++++++
>>>>>> 1 file changed, 7 insertions(+)
>>>>>>
>>>>>> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
>>>>>> index c3674c47d48c..302eca32e0af 100644
>>>>>> --- a/arch/arm64/include/asm/kvm_pgtable.h
>>>>>> +++ b/arch/arm64/include/asm/kvm_pgtable.h
>>>>>> @@ -44,6 +44,11 @@ typedef u64 kvm_pte_t;
>>>>>> * in the current context.
>>>>>> * @virt_to_phys: Convert a virtual address mapped in the current context
>>>>>> * into a physical address.
>>>>>> + * @flush_dcache: Clean data cache for a guest page address range before
>>>>>> + * creating the corresponding stage-2 mapping.
>>>>> Please don't reintroduce the word 'flush'. We are really trying to
>>>>> move away from it as it doesn't describe what we want to do.
>>>> I agree with this. I intended to make the names short and laconic, but this
>>>> missed the information about the callback's actual behaviors.
>>>>> Here this
>>>>> should be 'clean_invalidate_dcache' which, despite being a mouthful,
>>>>> describe accurately what we expect it to do.
>>>> Sure, I will change the name as you suggested.
>>>>> The comment is also missing the invalidate part, and we shouldn't
>>>>> assume that this is only used for S2 mapping.
>>>> Ok, will refine the comment. I think something like "Clean and invalidate the
>>>> data cache for the specified memory address range" may be generic enough.
>>>>>> + * @flush_icache: Invalidate instruction cache for a guest page address
>>>>>> + * range before creating or updating the corresponding
>>>>>> + * stage-2 mapping.
>>>>> Same thing here; this should be 'invalidate_icache', and the comment
>>>>> cleaned up.
>>>> Thanks, I will also correct this part.
>>>>
>>>> Besides the callback names and comments, is there anything else that still
>>>> needs some adjustment in the other three patches? :)
>>> It looks pretty good so far, much nicer than the previous versions.
>>>
>>> I have a small nit on the last patch, which should be dead easy to
>>> address. I'm currently running a bunch of tests, hopefully nothing bad
>>> will come out of it.
>>>
>>> If you respin it shortly, that nothing fails, and unless someone
>>> shouts, I'll queue it for -next.
>> It would be nice, thanks!
>> I will address the nit and respin the series soon.
> By the way, what's the status of your selftest series that originally
> came with this series? Are you planning to respin it? It would be
> useful to have something that checks for regressions, and that series
> did seem to do the trick.
Actually, they have already gone upstream, as of v5.13-rc1. :)
The path is tools/testing/selftests/kvm/kvm_page_table_test.c, so it
will be much more convenient to test on a 5.13 kernel; you can give it
a try too.
I am using the original test data from v3 in the cover letter because
I think the test results will be almost the same with a different kernel.
Thanks,
Yanan
> Thanks,
>
> M.
>
On Thu, 17 Jun 2021 10:43:23 +0100,
"wangyanan (Y)" <[email protected]> wrote:
>
> Hi Marc,
>
> On 2021/6/17 16:44, Marc Zyngier wrote:
> > By the way, what's the status of your selftest series that originally
> > came with this series? Are you planning to respin it? It would be
> > useful to have something that checks for regressions, and that series
> > did seem to do the trick.
> Actually, they have already gone upstream, as of v5.13-rc1. :)
> The path is tools/testing/selftests/kvm/kvm_page_table_test.c, so it
> will be much more convenient to test on a 5.13 kernel; you can give it
> a try too.
Ah, I missed it! Good stuff.
M.
--
Without deviation from the norm, progress is not possible.
Hi Yanan,
On Wed, Jun 16, 2021 at 10:52 AM Yanan Wang <[email protected]> wrote:
>
> To prepare for performing guest CMOs in the fault handlers in pgtable.c,
> introduce two cache maintenance callbacks in struct kvm_pgtable_mm_ops.
>
> The new callbacks are specific for guest stage-2, so they will only be
> initialized in 'struct kvm_pgtable_mm_ops kvm_s2_mm_ops'.
>
> Signed-off-by: Yanan Wang <[email protected]>
> ---
> arch/arm64/include/asm/kvm_pgtable.h | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index c3674c47d48c..302eca32e0af 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -44,6 +44,11 @@ typedef u64 kvm_pte_t;
> * in the current context.
> * @virt_to_phys: Convert a virtual address mapped in the current context
> * into a physical address.
> + * @flush_dcache: Clean data cache for a guest page address range before
> + * creating the corresponding stage-2 mapping.
> + * @flush_icache: Invalidate instruction cache for a guest page address
> + * range before creating or updating the corresponding
> + * stage-2 mapping.
> */
> struct kvm_pgtable_mm_ops {
> void* (*zalloc_page)(void *arg);
> @@ -54,6 +59,8 @@ struct kvm_pgtable_mm_ops {
> int (*page_count)(void *addr);
> void* (*phys_to_virt)(phys_addr_t phys);
> phys_addr_t (*virt_to_phys)(void *addr);
> + void (*flush_dcache)(void *addr, size_t size);
> + void (*flush_icache)(void *addr, size_t size);
> };
>
Just to add to Marc's comment on naming, flush_dcache is in this case
a clean and invalidate: I see that in patch 4 it eventually does a
civac. So, yes, although it is a mouthful, I think it should be
dcache_clean_inval and not just dcache_clean. An alternative, if it's
acceptable to Marc and the others, is to name the parameters dcmo/icmo
or something like that, where the nature of the maintenance operation
is not necessarily tied to the name.
For reference, this is the patch Marc mentioned, where we're trying to
fix the naming to make it consistent with Arm terminology (Arm doesn't
define what a flush is):
https://lore.kernel.org/linux-arm-kernel/[email protected]/
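
To make the naming point concrete, here is a toy single-line cache model showing how clean, invalidate, and clean+invalidate differ. It is only a sketch: struct toy_line and the toy_* helpers are invented for illustration and do not correspond to any kernel interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* One cache line in front of one byte of backing memory. */
struct toy_line {
	bool valid;	/* line holds a copy of *mem */
	bool dirty;	/* copy is newer than *mem */
	uint8_t cached;	/* value held in the cache */
	uint8_t *mem;	/* backing memory location */
};

/* Write-back cache: stores hit the line and mark it dirty. */
static void toy_write(struct toy_line *l, uint8_t v)
{
	l->cached = v;
	l->valid = true;
	l->dirty = true;
}

/* Loads refill the line from memory when it is not valid. */
static uint8_t toy_read(struct toy_line *l)
{
	if (!l->valid) {
		l->cached = *l->mem;
		l->valid = true;
		l->dirty = false;
	}
	return l->cached;
}

/* Clean: write dirty data back to memory, keep the line. */
static void toy_clean(struct toy_line *l)
{
	if (l->valid && l->dirty) {
		*l->mem = l->cached;
		l->dirty = false;
	}
}

/* Invalidate: drop the line without writing it back. */
static void toy_invalidate(struct toy_line *l)
{
	l->valid = false;
	l->dirty = false;
}

/* Clean and invalidate: write back first, then drop the line. */
static void toy_clean_invalidate(struct toy_line *l)
{
	toy_clean(l);
	toy_invalidate(l);
}
```

In this model a bare invalidate after a write loses the new value, which is why a name that says exactly which operation is performed is more useful than "flush".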
Otherwise:
Reviewed-by: Fuad Tabba <[email protected]>
Cheers,
/fuad
> /**
> --
> 2.23.0
>