2021-01-25 14:24:15

by Yanan Wang

Subject: [PATCH 1/2] KVM: arm64: Distinguish cases of allocating memcache more precisely

With a guest translation fault, we don't need the memcache pages when
only installing a new entry in an existing page table or replacing a
table entry with a block entry. And with a guest permission fault, we
also don't need the memcache pages for a write_fault while dirty logging
is active if the VM is not configured with huge mappings.

The cases where allocations from memcache are required can be much more
precisely distinguished by comparing fault_granule and vma_pagesize.

Signed-off-by: Yanan Wang <[email protected]>
---
arch/arm64/kvm/mmu.c | 25 ++++++++++++-------------
1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7d2257cc5438..8e8549ea1d70 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -820,19 +820,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	gfn = fault_ipa >> PAGE_SHIFT;
 	mmap_read_unlock(current->mm);

-	/*
-	 * Permission faults just need to update the existing leaf entry,
-	 * and so normally don't require allocations from the memcache. The
-	 * only exception to this is when dirty logging is enabled at runtime
-	 * and a write fault needs to collapse a block entry into a table.
-	 */
-	if (fault_status != FSC_PERM || (logging_active && write_fault)) {
-		ret = kvm_mmu_topup_memory_cache(memcache,
-						 kvm_mmu_cache_min_pages(kvm));
-		if (ret)
-			return ret;
-	}
-
 	mmu_seq = vcpu->kvm->mmu_notifier_seq;
 	/*
 	 * Ensure the read of mmu_notifier_seq happens before we call
@@ -898,6 +885,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))
 		prot |= KVM_PGTABLE_PROT_X;

+	/*
+	 * Allocations from the memcache are required only when granule of the
+	 * lookup level where a guest fault happened exceeds the vma_pagesize,
+	 * which means new page tables will be created in the fault handlers.
+	 */
+	if (fault_granule > vma_pagesize) {
+		ret = kvm_mmu_topup_memory_cache(memcache,
+						 kvm_mmu_cache_min_pages(kvm));
+		if (ret)
+			return ret;
+	}
+
 	/*
 	 * Under the premise of getting a FSC_PERM fault, we just need to relax
 	 * permissions only if vma_pagesize equals fault_granule. Otherwise,
--
2.19.1
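
The check added by the patch compares the granule of the level where the guest fault was taken with the size of the mapping about to be installed. The following is a minimal standalone sketch of that comparison, assuming a 4KiB-page configuration. The helper name needs_memcache_topup and the locally defined SZ_* macros are illustrative only and are not part of the patch or of mmu.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mapping sizes for a 4KiB-page configuration (an assumption of this sketch). */
#define SZ_4K (UINT64_C(1) << 12)
#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/*
 * Hypothetical helper mirroring the condition in the patch: new page-table
 * pages are needed only when the faulting level maps more than the size
 * that is about to be installed, so at least one lower-level table has to
 * be created.
 */
static bool needs_memcache_topup(uint64_t fault_granule, uint64_t vma_pagesize)
{
	assert(fault_granule != 0 && vma_pagesize != 0);
	return fault_granule > vma_pagesize;
}
```

Under these assumptions, needs_memcache_topup(SZ_2M, SZ_4K) is true (a page is installed below a level that maps 2MiB, so a new table is needed), while needs_memcache_topup(SZ_4K, SZ_4K) and needs_memcache_topup(SZ_4K, SZ_2M) are false (a leaf entry is installed in an existing table, or a table entry is replaced with a block entry).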


2021-03-08 16:37:08

by Will Deacon

Subject: Re: [PATCH 1/2] KVM: arm64: Distinguish cases of allocating memcache more precisely

On Mon, Jan 25, 2021 at 10:10:43PM +0800, Yanan Wang wrote:
> With a guest translation fault, we don't need the memcache pages when
> only installing a new entry in an existing page table or replacing a
> table entry with a block entry. And with a guest permission fault, we
> also don't need the memcache pages for a write_fault while dirty logging
> is active if the VM is not configured with huge mappings.
>
> The cases where allocations from memcache are required can be much more
> precisely distinguished by comparing fault_granule and vma_pagesize.
>
> Signed-off-by: Yanan Wang <[email protected]>
> ---
> arch/arm64/kvm/mmu.c | 25 ++++++++++++-------------
> 1 file changed, 12 insertions(+), 13 deletions(-)
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 7d2257cc5438..8e8549ea1d70 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -820,19 +820,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	gfn = fault_ipa >> PAGE_SHIFT;
>  	mmap_read_unlock(current->mm);
>
> -	/*
> -	 * Permission faults just need to update the existing leaf entry,
> -	 * and so normally don't require allocations from the memcache. The
> -	 * only exception to this is when dirty logging is enabled at runtime
> -	 * and a write fault needs to collapse a block entry into a table.
> -	 */
> -	if (fault_status != FSC_PERM || (logging_active && write_fault)) {
> -		ret = kvm_mmu_topup_memory_cache(memcache,
> -						 kvm_mmu_cache_min_pages(kvm));
> -		if (ret)
> -			return ret;
> -	}
> -
>  	mmu_seq = vcpu->kvm->mmu_notifier_seq;
>  	/*
>  	 * Ensure the read of mmu_notifier_seq happens before we call
> @@ -898,6 +885,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))
>  		prot |= KVM_PGTABLE_PROT_X;
>
> +	/*
> +	 * Allocations from the memcache are required only when granule of the
> +	 * lookup level where a guest fault happened exceeds the vma_pagesize,
> +	 * which means new page tables will be created in the fault handlers.
> +	 */
> +	if (fault_granule > vma_pagesize) {
> +		ret = kvm_mmu_topup_memory_cache(memcache,
> +						 kvm_mmu_cache_min_pages(kvm));
> +		if (ret)
> +			return ret;
> +	}

This feels like it could bite us in future as the code evolves but people
forget to reconsider this check. Maybe it would be better to extend this
patch so that we handle getting -ENOMEM back and try a second time after
topping up the memcache?

Will
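
The alternative raised in the reply (attempt the mapping first, and only top up and retry when it fails with -ENOMEM) can be modelled in standalone C. This is a rough sketch of the control flow only, not kernel code: struct toy_cache, toy_topup(), toy_map() and toy_map_with_retry() are hypothetical names invented here, and the cache is reduced to a page counter.

```c
#include <assert.h>
#include <errno.h>

#define CACHE_CAPACITY 4

/* Hypothetical stand-in for the memcache: it only counts available pages. */
struct toy_cache {
	int nobjs;
};

/* Refill the cache to at least 'min' pages; never fails in this model. */
static int toy_topup(struct toy_cache *c, int min)
{
	assert(min <= CACHE_CAPACITY);
	if (c->nobjs < min)
		c->nobjs = min;
	return 0;
}

/* Hypothetical map step: consumes 'tables_needed' pages or returns -ENOMEM. */
static int toy_map(struct toy_cache *c, int tables_needed)
{
	if (c->nobjs < tables_needed)
		return -ENOMEM;
	c->nobjs -= tables_needed;
	return 0;
}

/* Map optimistically; on -ENOMEM, top up once and try a second time. */
static int toy_map_with_retry(struct toy_cache *c, int tables_needed, int min)
{
	int ret = toy_map(c, tables_needed);

	if (ret != -ENOMEM)
		return ret;

	ret = toy_topup(c, min);
	if (ret)
		return ret;

	return toy_map(c, tables_needed);
}
```

With this shape, the decision to allocate no longer depends on predicting in advance which faults create tables: a mapping that needs no new table never touches the cache, and one that does triggers a single refill. In the real fault handler the top-up can sleep, so a retry would presumably have to happen outside the MMU lock; the sketch ignores locking entirely.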