This series is an update to the PUD hugepage support previously posted
at [0]. This patchset adds support for PUD hugepages at stage
2. This feature is useful on cores that have support for large sized
TLB mappings (e.g., 1GB for 4K granule).
The biggest change in this version is to replace repeated instances of
walking the page tables to get to a leaf-entry with a function to
return the appropriate entry. This was suggested by Suzuki and should
help reduce the amount of churn resulting from future changes. It also
address other feedback on the previous version.
Support is added to code that is shared between arm and arm64. Dummy
helpers for arm are provided as the port does not support PUD hugepage
sizes.
The patches have been tested on an A57 based system. The patchset is
based on v4.18-rc4. The are a few conflicts with the support for 52
bit IPA[1] due to change in number of parameters for
stage2_pmd_offset().
Thanks,
Punit
v4 -> v5:
* Patch 1 - Drop helper stage2_should_exec() and refactor the
condition to decide if a page table entry should be marked
executable
* Patch 4-6 - Introduce stage2_get_leaf_entry() and use it in this and
latter patches
* Patch 7 - Use stage 2 accessors instead of using the page table
helpers directly
* Patch 7 - Add a note to update the PUD hugepage support when number
of levels of stage 2 tables differs from stage 1
v3 -> v4:
* Patch 1 and 7 - Don't put down hugepages pte if logging is enabled
* Patch 4-5 - Add PUD hugepage support for exec and access faults
* Patch 6 - PUD hugepage support for aging page table entries
v2 -> v3:
* Update vma_pagesize directly if THP [1/4]. Previsouly this was done
indirectly via hugetlb
* Added review tag [4/4]
v1 -> v2:
* Create helper to check if the page should have exec permission [1/4]
* Fix broken condition to detect THP hugepage [1/4]
* Fix in-correct hunk resulting from a rebase [4/4]
[0] https://www.spinics.net/lists/arm-kernel/msg663562.html
[1] https://www.spinics.net/lists/kvm/msg171065.html
Punit Agrawal (7):
KVM: arm/arm64: Share common code in user_mem_abort()
KVM: arm/arm64: Introduce helpers to manupulate page table entries
KVM: arm64: Support dirty page tracking for PUD hugepages
KVM: arm64: Support PUD hugepage in stage2_is_exec()
KVM: arm64: Support handling access faults for PUD hugepages
KVM: arm64: Update age handlers to support PUD hugepages
KVM: arm64: Add support for creating PUD hugepages at stage 2
arch/arm/include/asm/kvm_mmu.h | 60 +++++
arch/arm64/include/asm/kvm_mmu.h | 47 ++++
arch/arm64/include/asm/pgtable-hwdef.h | 4 +
arch/arm64/include/asm/pgtable.h | 9 +
virt/kvm/arm/mmu.c | 289 ++++++++++++++++++-------
5 files changed, 330 insertions(+), 79 deletions(-)
--
2.17.1
This series is an update to the PUD hugepage support previously posted
at [0]. This patchset adds support for PUD hugepages at stage
2. This feature is useful on cores that have support for large sized
TLB mappings (e.g., 1GB for 4K granule).
The biggest change in this version is to replace repeated instances of
walking the page tables to get to a leaf-entry with a function to
return the appropriate entry. This was suggested by Suzuki and should
help reduce the amount of churn resulting from future changes. It also
address other feedback on the previous version.
Support is added to code that is shared between arm and arm64. Dummy
helpers for arm are provided as the port does not support PUD hugepage
sizes.
The patches have been tested on an A57 based system. The patchset is
based on v4.18-rc4. The are a few conflicts with the support for 52
bit IPA[1] due to change in number of parameters for
stage2_pmd_offset().
Thanks,
Punit
v4 -> v5:
* Patch 1 - Drop helper stage2_should_exec() and refactor the
condition to decide if a page table entry should be marked
executable
* Patch 4-6 - Introduce stage2_get_leaf_entry() and use it in this and
latter patches
* Patch 7 - Use stage 2 accessors instead of using the page table
helpers directly
* Patch 7 - Add a note to update the PUD hugepage support when number
of levels of stage 2 tables differs from stage 1
v3 -> v4:
* Patch 1 and 7 - Don't put down hugepages pte if logging is enabled
* Patch 4-5 - Add PUD hugepage support for exec and access faults
* Patch 6 - PUD hugepage support for aging page table entries
v2 -> v3:
* Update vma_pagesize directly if THP [1/4]. Previsouly this was done
indirectly via hugetlb
* Added review tag [4/4]
v1 -> v2:
* Create helper to check if the page should have exec permission [1/4]
* Fix broken condition to detect THP hugepage [1/4]
* Fix in-correct hunk resulting from a rebase [4/4]
[0] https://www.spinics.net/lists/arm-kernel/msg663562.html
[1] https://www.spinics.net/lists/kvm/msg171065.html
Punit Agrawal (7):
KVM: arm/arm64: Share common code in user_mem_abort()
KVM: arm/arm64: Introduce helpers to manupulate page table entries
KVM: arm64: Support dirty page tracking for PUD hugepages
KVM: arm64: Support PUD hugepage in stage2_is_exec()
KVM: arm64: Support handling access faults for PUD hugepages
KVM: arm64: Update age handlers to support PUD hugepages
KVM: arm64: Add support for creating PUD hugepages at stage 2
arch/arm/include/asm/kvm_mmu.h | 60 +++++
arch/arm64/include/asm/kvm_mmu.h | 47 ++++
arch/arm64/include/asm/pgtable-hwdef.h | 4 +
arch/arm64/include/asm/pgtable.h | 9 +
virt/kvm/arm/mmu.c | 289 ++++++++++++++++++-------
5 files changed, 330 insertions(+), 79 deletions(-)
--
2.17.1
The code for operations such as marking the pfn as dirty, and
dcache/icache maintenance during stage 2 fault handling is duplicated
between normal pages and PMD hugepages.
Instead of creating another copy of the operations when we introduce
PUD hugepages, let's share them across the different pagesizes.
Signed-off-by: Punit Agrawal <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
virt/kvm/arm/mmu.c | 67 ++++++++++++++++++++++++++--------------------
1 file changed, 38 insertions(+), 29 deletions(-)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1d90d79706bd..ea3d992e4fb7 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1422,7 +1422,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
unsigned long fault_status)
{
int ret;
- bool write_fault, exec_fault, writable, hugetlb = false, force_pte = false;
+ bool write_fault, writable, hugetlb = false, force_pte = false;
+ bool exec_fault, needs_exec;
unsigned long mmu_seq;
gfn_t gfn = fault_ipa >> PAGE_SHIFT;
struct kvm *kvm = vcpu->kvm;
@@ -1431,7 +1432,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
kvm_pfn_t pfn;
pgprot_t mem_type = PAGE_S2;
bool logging_active = memslot_is_logging(memslot);
- unsigned long flags = 0;
+ unsigned long vma_pagesize, flags = 0;
write_fault = kvm_is_write_fault(vcpu);
exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
@@ -1451,7 +1452,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
return -EFAULT;
}
- if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
+ vma_pagesize = vma_kernel_pagesize(vma);
+ if (vma_pagesize == PMD_SIZE && !logging_active) {
hugetlb = true;
gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
} else {
@@ -1520,28 +1522,45 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
if (mmu_notifier_retry(kvm, mmu_seq))
goto out_unlock;
- if (!hugetlb && !force_pte)
+ if (!hugetlb && !force_pte) {
+ /*
+ * Only PMD_SIZE transparent hugepages(THP) are
+ * currently supported. This code will need to be
+ * updated to support other THP sizes.
+ */
hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
+ if (hugetlb)
+ vma_pagesize = PMD_SIZE;
+ }
+
+ if (writable)
+ kvm_set_pfn_dirty(pfn);
+
+ if (fault_status != FSC_PERM)
+ clean_dcache_guest_page(pfn, vma_pagesize);
- if (hugetlb) {
+ if (exec_fault)
+ invalidate_icache_guest_page(pfn, vma_pagesize);
+
+ /*
+ * If we took an execution fault we have made the
+ * icache/dcache coherent above and should now let the s2
+ * mapping be executable.
+ *
+ * Write faults (!exec_fault && FSC_PERM) are orthogonal to
+ * execute permissions, and we preserve whatever we have.
+ */
+ needs_exec = exec_fault ||
+ (fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
+
+ if (hugetlb && vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd);
- if (writable) {
+ if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
- kvm_set_pfn_dirty(pfn);
- }
- if (fault_status != FSC_PERM)
- clean_dcache_guest_page(pfn, PMD_SIZE);
-
- if (exec_fault) {
+ if (needs_exec)
new_pmd = kvm_s2pmd_mkexec(new_pmd);
- invalidate_icache_guest_page(pfn, PMD_SIZE);
- } else if (fault_status == FSC_PERM) {
- /* Preserve execute if XN was already cleared */
- if (stage2_is_exec(kvm, fault_ipa))
- new_pmd = kvm_s2pmd_mkexec(new_pmd);
- }
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
@@ -1549,21 +1568,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
- kvm_set_pfn_dirty(pfn);
mark_page_dirty(kvm, gfn);
}
- if (fault_status != FSC_PERM)
- clean_dcache_guest_page(pfn, PAGE_SIZE);
-
- if (exec_fault) {
+ if (needs_exec)
new_pte = kvm_s2pte_mkexec(new_pte);
- invalidate_icache_guest_page(pfn, PAGE_SIZE);
- } else if (fault_status == FSC_PERM) {
- /* Preserve execute if XN was already cleared */
- if (stage2_is_exec(kvm, fault_ipa))
- new_pte = kvm_s2pte_mkexec(new_pte);
- }
ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
}
--
2.17.1
Introduce helpers to abstract architectural handling of the conversion
of pfn to page table entries and marking a PMD page table entry as a
block entry.
The helpers are introduced in preparation for supporting PUD hugepages
at stage 2 - which are supported on arm64 but do not exist on arm.
Signed-off-by: Punit Agrawal <[email protected]>
Acked-by: Christoffer Dall <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm/include/asm/kvm_mmu.h | 5 +++++
arch/arm64/include/asm/kvm_mmu.h | 5 +++++
virt/kvm/arm/mmu.c | 8 +++++---
3 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 8553d68b7c8a..d095c2d0b284 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -75,6 +75,11 @@ phys_addr_t kvm_get_idmap_vector(void);
int kvm_mmu_init(void);
void kvm_clear_hyp_idmap(void);
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd) pmd_mkhuge(pmd)
+
static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
{
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index fb9a7127bb75..689def9bb9d5 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -172,6 +172,11 @@ void kvm_clear_hyp_idmap(void);
#define kvm_set_pte(ptep, pte) set_pte(ptep, pte)
#define kvm_set_pmd(pmdp, pmd) set_pmd(pmdp, pmd)
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd) pmd_mkhuge(pmd)
+
static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
{
pte_val(pte) |= PTE_S2_RDWR;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ea3d992e4fb7..e131b7f9b7d7 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1554,8 +1554,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
(fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
if (hugetlb && vma_pagesize == PMD_SIZE) {
- pmd_t new_pmd = pfn_pmd(pfn, mem_type);
- new_pmd = pmd_mkhuge(new_pmd);
+ pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
+
+ new_pmd = kvm_pmd_mkhuge(new_pmd);
+
if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
@@ -1564,7 +1566,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
- pte_t new_pte = pfn_pte(pfn, mem_type);
+ pte_t new_pte = kvm_pfn_pte(pfn, mem_type);
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
--
2.17.1
In preparation for creating PUD hugepages at stage 2, add support for
write protecting PUD hugepages when they are encountered. Write
protecting guest tables is used to track dirty pages when migrating
VMs.
Also, provide trivial implementations of required kvm_s2pud_* helpers
to allow sharing of code with arm32.
Signed-off-by: Punit Agrawal <[email protected]>
Reviewed-by: Christoffer Dall <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm/include/asm/kvm_mmu.h | 16 ++++++++++++++++
arch/arm64/include/asm/kvm_mmu.h | 10 ++++++++++
virt/kvm/arm/mmu.c | 11 +++++++----
3 files changed, 33 insertions(+), 4 deletions(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index d095c2d0b284..c23722f75d5c 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -80,6 +80,22 @@ void kvm_clear_hyp_idmap(void);
#define kvm_pmd_mkhuge(pmd) pmd_mkhuge(pmd)
+/*
+ * The following kvm_*pud*() functionas are provided strictly to allow
+ * sharing code with arm64. They should never be called in practice.
+ */
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+ BUG();
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+ BUG();
+ return false;
+}
+
+
static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
{
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 689def9bb9d5..84051930ddfe 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -239,6 +239,16 @@ static inline bool kvm_s2pmd_exec(pmd_t *pmdp)
return !(READ_ONCE(pmd_val(*pmdp)) & PMD_S2_XN);
}
+static inline void kvm_set_s2pud_readonly(pud_t *pudp)
+{
+ kvm_set_s2pte_readonly((pte_t *)pudp);
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pudp)
+{
+ return kvm_s2pte_readonly((pte_t *)pudp);
+}
+
static inline bool kvm_page_empty(void *ptr)
{
struct page *ptr_page = virt_to_page(ptr);
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index e131b7f9b7d7..ed8f8271c389 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1288,9 +1288,12 @@ static void stage2_wp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
do {
next = stage2_pud_addr_end(addr, end);
if (!stage2_pud_none(*pud)) {
- /* TODO:PUD not supported, revisit later if supported */
- BUG_ON(stage2_pud_huge(*pud));
- stage2_wp_pmds(pud, addr, next);
+ if (stage2_pud_huge(*pud)) {
+ if (!kvm_s2pud_readonly(pud))
+ kvm_set_s2pud_readonly(pud);
+ } else {
+ stage2_wp_pmds(pud, addr, next);
+ }
}
} while (pud++, addr = next, addr != end);
}
@@ -1333,7 +1336,7 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
*
* Called to start logging dirty pages after memory region
* KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
- * all present PMD and PTEs are write protected in the memory region.
+ * all present PUD, PMD and PTEs are write protected in the memory region.
* Afterwards read of dirty page log can be called.
*
* Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
--
2.17.1
KVM only supports PMD hugepages at stage 2. Now that the various page
handling routines are updated, extend the stage 2 fault handling to
map in PUD hugepages.
Addition of PUD hugepage support enables additional page sizes (e.g.,
1G with 4K granule) which can be useful on cores that support mapping
larger block sizes in the TLB entries.
Signed-off-by: Punit Agrawal <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm/include/asm/kvm_mmu.h | 19 +++++++
arch/arm64/include/asm/kvm_mmu.h | 15 +++++
arch/arm64/include/asm/pgtable-hwdef.h | 2 +
arch/arm64/include/asm/pgtable.h | 2 +
virt/kvm/arm/mmu.c | 78 ++++++++++++++++++++++++--
5 files changed, 112 insertions(+), 4 deletions(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 8e1e8aee229e..787baf9ec994 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -77,10 +77,13 @@ void kvm_clear_hyp_idmap(void);
#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) (__pud(0))
#define kvm_pud_pfn(pud) (((pud_val(pud) & PUD_MASK) & PHYS_MASK) >> PAGE_SHIFT)
#define kvm_pmd_mkhuge(pmd) pmd_mkhuge(pmd)
+/* No support for pud hugepages */
+#define kvm_pud_mkhuge(pud) (pud)
/*
* The following kvm_*pud*() functionas are provided strictly to allow
@@ -97,6 +100,22 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
return false;
}
+static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
+{
+ BUG();
+}
+
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+ BUG();
+ return pud;
+}
+
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+ BUG();
+ return pud;
+}
static inline bool kvm_s2pud_exec(pud_t *pud)
{
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index c542052fb199..dd8a23159463 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -171,13 +171,16 @@ void kvm_clear_hyp_idmap(void);
#define kvm_set_pte(ptep, pte) set_pte(ptep, pte)
#define kvm_set_pmd(pmdp, pmd) set_pmd(pmdp, pmd)
+#define kvm_set_pud(pudp, pud) set_pud(pudp, pud)
#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) pfn_pud(pfn, prot)
#define kvm_pud_pfn(pud) pud_pfn(pud)
#define kvm_pmd_mkhuge(pmd) pmd_mkhuge(pmd)
+#define kvm_pud_mkhuge(pud) pud_mkhuge(pud)
static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
{
@@ -191,6 +194,12 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
}
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+ pud_val(pud) |= PUD_S2_RDWR;
+ return pud;
+}
+
static inline pte_t kvm_s2pte_mkexec(pte_t pte)
{
pte_val(pte) &= ~PTE_S2_XN;
@@ -203,6 +212,12 @@ static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
return pmd;
}
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+ pud_val(pud) &= ~PUD_S2_XN;
+ return pud;
+}
+
static inline void kvm_set_s2pte_readonly(pte_t *ptep)
{
pteval_t old_pteval, pteval;
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 10ae592b78b8..e327665e94d1 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
#define PMD_S2_RDWR (_AT(pmdval_t, 3) << 6) /* HAP[2:1] */
#define PMD_S2_XN (_AT(pmdval_t, 2) << 53) /* XN[1:0] */
+#define PUD_S2_RDONLY (_AT(pudval_t, 1) << 6) /* HAP[2:1] */
+#define PUD_S2_RDWR (_AT(pudval_t, 3) << 6) /* HAP[2:1] */
#define PUD_S2_XN (_AT(pudval_t, 2) << 53) /* XN[1:0] */
/*
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 4d9476e420d9..0afc34f94ff5 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -389,6 +389,8 @@ static inline int pmd_protnone(pmd_t pmd)
#define pud_mkyoung(pud) pte_pud(pte_mkyoung(pud_pte(pud)))
#define pud_write(pud) pte_write(pud_pte(pud))
+#define pud_mkhuge(pud) (__pud(pud_val(pud) & ~PUD_TABLE_BIT))
+
#define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
#define __phys_to_pud_val(phys) __phys_to_pte_val(phys)
#define pud_pfn(pud) ((__pud_to_phys(pud) & PUD_MASK) >> PAGE_SHIFT)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index a6d3ac9d7c7a..d8e2497e5353 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -116,6 +116,25 @@ static void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
put_page(virt_to_page(pmd));
}
+/**
+ * stage2_dissolve_pud() - clear and flush huge PUD entry
+ * @kvm: pointer to kvm structure.
+ * @addr: IPA
+ * @pud: pud pointer for IPA
+ *
+ * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+static void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr, pud_t *pudp)
+{
+ if (!stage2_pud_huge(*pudp))
+ return;
+
+ stage2_pud_clear(pudp);
+ kvm_tlb_flush_vmid_ipa(kvm, addr);
+ put_page(virt_to_page(pudp));
+}
+
static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
int min, int max)
{
@@ -993,7 +1012,7 @@ static pmd_t *stage2_get_pmd(struct kvm *kvm, struct kvm_mmu_memory_cache *cache
pmd_t *pmd;
pud = stage2_get_pud(kvm, cache, addr);
- if (!pud)
+ if (!pud || stage2_pud_huge(*pud))
return NULL;
if (stage2_pud_none(*pud)) {
@@ -1038,6 +1057,26 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
return 0;
}
+static int stage2_set_pud_huge(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
+ phys_addr_t addr, const pud_t *new_pudp)
+{
+ pud_t *pudp, old_pud;
+
+ pudp = stage2_get_pud(kvm, cache, addr);
+ VM_BUG_ON(!pudp);
+
+ old_pud = *pudp;
+ if (stage2_pud_present(old_pud)) {
+ stage2_pud_clear(pudp);
+ kvm_tlb_flush_vmid_ipa(kvm, addr);
+ } else {
+ get_page(virt_to_page(pudp));
+ }
+
+ kvm_set_pud(pudp, *new_pudp);
+ return 0;
+}
+
/*
* stage2_get_leaf_entry - walk the stage2 VM page tables and return
* true if a valid and present leaf-entry is found. A pointer to the
@@ -1100,6 +1139,7 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
phys_addr_t addr, const pte_t *new_pte,
unsigned long flags)
{
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte, old_pte;
bool iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
@@ -1108,6 +1148,22 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
VM_BUG_ON(logging_active && !cache);
/* Create stage-2 page table mapping - Levels 0 and 1 */
+ pud = stage2_get_pud(kvm, cache, addr);
+ if (!pud) {
+ /*
+ * Ignore calls from kvm_set_spte_hva for unallocated
+ * address ranges.
+ */
+ return 0;
+ }
+
+ /*
+ * While dirty page logging - dissolve huge PUD, then continue
+ * on to allocate page.
+ */
+ if (logging_active)
+ stage2_dissolve_pud(kvm, addr, pud);
+
pmd = stage2_get_pmd(kvm, cache, addr);
if (!pmd) {
/*
@@ -1500,9 +1556,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
}
vma_pagesize = vma_kernel_pagesize(vma);
- if (vma_pagesize == PMD_SIZE && !logging_active) {
+ if ((vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE) &&
+ !logging_active) {
+ struct hstate *h = hstate_vma(vma);
+
hugetlb = true;
- gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
+ gfn = (fault_ipa & huge_page_mask(h)) >> PAGE_SHIFT;
} else {
/*
* Pages belonging to memslots that don't have the same
@@ -1600,7 +1659,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
needs_exec = exec_fault ||
(fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
- if (hugetlb && vma_pagesize == PMD_SIZE) {
+ if (hugetlb && vma_pagesize == PUD_SIZE) {
+ pud_t new_pud = kvm_pfn_pud(pfn, mem_type);
+
+ new_pud = kvm_pud_mkhuge(new_pud);
+ if (writable)
+ new_pud = kvm_s2pud_mkwrite(new_pud);
+
+ if (needs_exec)
+ new_pud = kvm_s2pud_mkexec(new_pud);
+
+ ret = stage2_set_pud_huge(kvm, memcache, fault_ipa, &new_pud);
+ } else if (hugetlb && vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
new_pmd = kvm_pmd_mkhuge(new_pmd);
--
2.17.1
In preparation for creating larger hugepages at Stage 2, add support
to the age handling notifiers for PUD hugepages when encountered.
Provide trivial helpers for arm32 to allow sharing code.
Signed-off-by: Punit Agrawal <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm/include/asm/kvm_mmu.h | 6 ++++
arch/arm64/include/asm/kvm_mmu.h | 5 ++++
arch/arm64/include/asm/pgtable.h | 1 +
virt/kvm/arm/mmu.c | 51 ++++++++++++++++++--------------
4 files changed, 40 insertions(+), 23 deletions(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index a4298d429efc..8e1e8aee229e 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -110,6 +110,12 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud;
}
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+ BUG();
+ return false;
+}
+
static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
{
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 4d2780c588b0..c542052fb199 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -261,6 +261,11 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud_mkyoung(pud);
}
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+ return pud_young(pud);
+}
+
static inline bool kvm_page_empty(void *ptr)
{
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index a64a5c35beb1..4d9476e420d9 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -385,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
#define pfn_pmd(pfn,prot) __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
#define mk_pmd(page,prot) pfn_pmd(page_to_pfn(page),prot)
+#define pud_young(pud) pte_young(pud_pte(pud))
#define pud_mkyoung(pud) pte_pud(pte_mkyoung(pud_pte(pud)))
#define pud_write(pud) pte_write(pud_pte(pud))
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index d2c705e31584..a6d3ac9d7c7a 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1172,6 +1172,11 @@ static int stage2_pmdp_test_and_clear_young(pmd_t *pmd)
return stage2_ptep_test_and_clear_young((pte_t *)pmd);
}
+static int stage2_pudp_test_and_clear_young(pud_t *pud)
+{
+ return stage2_ptep_test_and_clear_young((pte_t *)pud);
+}
+
/**
* kvm_phys_addr_ioremap - map a device range to guest IPA
*
@@ -1879,42 +1884,42 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
static int kvm_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void *data)
{
- pmd_t *pmd;
- pte_t *pte;
+ pud_t *pud = NULL;
+ pmd_t *pmd = NULL;
+ pte_t *pte = NULL;
+ bool found;
- WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
- pmd = stage2_get_pmd(kvm, NULL, gpa);
- if (!pmd || pmd_none(*pmd)) /* Nothing there */
+ WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+ found = stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte);
+ if (!found)
return 0;
- if (pmd_thp_or_huge(*pmd)) /* THP, HugeTLB */
+ if (pud)
+ return stage2_pudp_test_and_clear_young(pud);
+ else if (pmd)
return stage2_pmdp_test_and_clear_young(pmd);
-
- pte = pte_offset_kernel(pmd, gpa);
- if (pte_none(*pte))
- return 0;
-
- return stage2_ptep_test_and_clear_young(pte);
+ else
+ return stage2_ptep_test_and_clear_young(pte);
}
static int kvm_test_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void *data)
{
- pmd_t *pmd;
- pte_t *pte;
+ pud_t *pud = NULL;
+ pmd_t *pmd = NULL;
+ pte_t *pte = NULL;
+ bool found;
- WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
- pmd = stage2_get_pmd(kvm, NULL, gpa);
- if (!pmd || pmd_none(*pmd)) /* Nothing there */
+ WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+ found = stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte);
+ if (!found)
return 0;
- if (pmd_thp_or_huge(*pmd)) /* THP, HugeTLB */
+ if (pud)
+ return kvm_s2pud_young(*pud);
+ else if (pmd)
return pmd_young(*pmd);
-
- pte = pte_offset_kernel(pmd, gpa);
- if (!pte_none(*pte)) /* Just a page... */
+ else
return pte_young(*pte);
-
- return 0;
}
int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end)
--
2.17.1
In preparation for creating larger hugepages at Stage 2, extend the
access fault handling at Stage 2 to support PUD hugepages when
encountered.
Provide trivial helpers for arm32 to allow sharing of code.
Signed-off-by: Punit Agrawal <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm/include/asm/kvm_mmu.h | 8 ++++++++
arch/arm64/include/asm/kvm_mmu.h | 7 +++++++
arch/arm64/include/asm/pgtable.h | 6 ++++++
virt/kvm/arm/mmu.c | 29 ++++++++++++++++-------------
4 files changed, 37 insertions(+), 13 deletions(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index d05c8986e495..a4298d429efc 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -78,6 +78,8 @@ void kvm_clear_hyp_idmap(void);
#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pud_pfn(pud) (((pud_val(pud) & PUD_MASK) & PHYS_MASK) >> PAGE_SHIFT)
+
#define kvm_pmd_mkhuge(pmd) pmd_mkhuge(pmd)
/*
@@ -102,6 +104,12 @@ static inline bool kvm_s2pud_exec(pud_t *pud)
return false;
}
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+ BUG();
+ return pud;
+}
+
static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
{
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 15bc1be8f82f..4d2780c588b0 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -175,6 +175,8 @@ void kvm_clear_hyp_idmap(void);
#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pud_pfn(pud) pud_pfn(pud)
+
#define kvm_pmd_mkhuge(pmd) pmd_mkhuge(pmd)
static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
@@ -254,6 +256,11 @@ static inline bool kvm_s2pud_exec(pud_t *pudp)
return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
}
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+ return pud_mkyoung(pud);
+}
+
static inline bool kvm_page_empty(void *ptr)
{
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1bdeca8918a6..a64a5c35beb1 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -314,6 +314,11 @@ static inline pte_t pud_pte(pud_t pud)
return __pte(pud_val(pud));
}
+static inline pud_t pte_pud(pte_t pte)
+{
+ return __pud(pte_val(pte));
+}
+
static inline pmd_t pud_pmd(pud_t pud)
{
return __pmd(pud_val(pud));
@@ -380,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
#define pfn_pmd(pfn,prot) __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
#define mk_pmd(page,prot) pfn_pmd(page_to_pfn(page),prot)
+#define pud_mkyoung(pud) pte_pud(pte_mkyoung(pud_pte(pud)))
#define pud_write(pud) pte_write(pud_pte(pud))
#define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index e73909a31e02..d2c705e31584 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1637,33 +1637,36 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
*/
static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
{
- pmd_t *pmd;
- pte_t *pte;
+ pud_t *pud = NULL;
+ pmd_t *pmd = NULL;
+ pte_t *pte = NULL;
kvm_pfn_t pfn;
- bool pfn_valid = false;
+ bool found, pfn_valid = false;
trace_kvm_access_fault(fault_ipa);
spin_lock(&vcpu->kvm->mmu_lock);
- pmd = stage2_get_pmd(vcpu->kvm, NULL, fault_ipa);
- if (!pmd || pmd_none(*pmd)) /* Nothing there */
+ found = stage2_get_leaf_entry(vcpu->kvm, fault_ipa, &pud, &pmd, &pte);
+ if (!found)
goto out;
- if (pmd_thp_or_huge(*pmd)) { /* THP, HugeTLB */
+ if (pud) { /* HugeTLB */
+ *pud = kvm_s2pud_mkyoung(*pud);
+ pfn = kvm_pud_pfn(*pud);
+ pfn_valid = true;
+ goto out;
+ } else if (pmd) { /* THP, HugeTLB */
*pmd = pmd_mkyoung(*pmd);
pfn = pmd_pfn(*pmd);
pfn_valid = true;
goto out;
+ } else {
+ *pte = pte_mkyoung(*pte); /* Just a page... */
+ pfn = pte_pfn(*pte);
+ pfn_valid = true;
}
- pte = pte_offset_kernel(pmd, fault_ipa);
- if (pte_none(*pte)) /* Nothing there either */
- goto out;
-
- *pte = pte_mkyoung(*pte); /* Just a page... */
- pfn = pte_pfn(*pte);
- pfn_valid = true;
out:
spin_unlock(&vcpu->kvm->mmu_lock);
if (pfn_valid)
--
2.17.1
In preparation for creating PUD hugepages at stage 2, add support for
detecting execute permissions on PUD page table entries. Faults due to
lack of execute permissions on page table entries is used to perform
i-cache invalidation on first execute.
Provide trivial implementations of arm32 helpers to allow sharing of
code.
Signed-off-by: Punit Agrawal <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm/include/asm/kvm_mmu.h | 6 ++++
arch/arm64/include/asm/kvm_mmu.h | 5 +++
arch/arm64/include/asm/pgtable-hwdef.h | 2 ++
virt/kvm/arm/mmu.c | 49 +++++++++++++++++++++++---
4 files changed, 57 insertions(+), 5 deletions(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index c23722f75d5c..d05c8986e495 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -96,6 +96,12 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
}
+static inline bool kvm_s2pud_exec(pud_t *pud)
+{
+ BUG();
+ return false;
+}
+
static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
{
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 84051930ddfe..15bc1be8f82f 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -249,6 +249,11 @@ static inline bool kvm_s2pud_readonly(pud_t *pudp)
return kvm_s2pte_readonly((pte_t *)pudp);
}
+static inline bool kvm_s2pud_exec(pud_t *pudp)
+{
+ return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
+}
+
static inline bool kvm_page_empty(void *ptr)
{
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index fd208eac9f2a..10ae592b78b8 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
#define PMD_S2_RDWR (_AT(pmdval_t, 3) << 6) /* HAP[2:1] */
#define PMD_S2_XN (_AT(pmdval_t, 2) << 53) /* XN[1:0] */
+#define PUD_S2_XN (_AT(pudval_t, 2) << 53) /* XN[1:0] */
+
/*
* Memory Attribute override for Stage-2 (MemAttr[3:0])
*/
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ed8f8271c389..e73909a31e02 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1038,23 +1038,62 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
return 0;
}
-static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+/*
+ * stage2_get_leaf_entry - walk the stage2 VM page tables and return
+ * true if a valid and present leaf-entry is found. A pointer to the
+ * leaf-entry is returned in the appropriate level variable - pudpp,
+ * pmdpp, ptepp.
+ */
+static bool stage2_get_leaf_entry(struct kvm *kvm, phys_addr_t addr,
+ pud_t **pudpp, pmd_t **pmdpp, pte_t **ptepp)
{
+ pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
- pmdp = stage2_get_pmd(kvm, NULL, addr);
+ pudp = stage2_get_pud(kvm, NULL, addr);
+ if (!pudp || pud_none(*pudp) || !pud_present(*pudp))
+ return false;
+
+ if (pud_huge(*pudp)) {
+ *pudpp = pudp;
+ return true;
+ }
+
+ pmdp = stage2_pmd_offset(pudp, addr);
if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
return false;
- if (pmd_thp_or_huge(*pmdp))
- return kvm_s2pmd_exec(pmdp);
+ if (pmd_thp_or_huge(*pmdp)) {
+ *pmdpp = pmdp;
+ return true;
+ }
ptep = pte_offset_kernel(pmdp, addr);
if (!ptep || pte_none(*ptep) || !pte_present(*ptep))
return false;
- return kvm_s2pte_exec(ptep);
+ *ptepp = ptep;
+ return true;
+}
+
+static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+{
+ pud_t *pudp = NULL;
+ pmd_t *pmdp = NULL;
+ pte_t *ptep = NULL;
+ bool found;
+
+ found = stage2_get_leaf_entry(kvm, addr, &pudp, &pmdp, &ptep);
+ if (!found)
+ return false;
+
+ if (pudp)
+ return kvm_s2pud_exec(pudp);
+ else if (pmdp)
+ return kvm_s2pmd_exec(pmdp);
+ else
+ return kvm_s2pte_exec(ptep);
}
static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
--
2.17.1
Please ignore this cover letter.
Apologies for the duplicate cover-letter and a somewhat funky threading
(I blame emacs unsaved buffer).
The patches appear to be intact so don't let the threading get in the
way of review.
Punit Agrawal <[email protected]> writes:
> This series is an update to the PUD hugepage support previously posted
> at [0]. This patchset adds support for PUD hugepages at stage
> 2. This feature is useful on cores that have support for large sized
> TLB mappings (e.g., 1GB for 4K granule).
>
> The biggest change in this version is to replace repeated instances of
> walking the page tables to get to a leaf-entry with a function to
> return the appropriate entry. This was suggested by Suzuki and should
> help reduce the amount of churn resulting from future changes. It also
> address other feedback on the previous version.
>
> Support is added to code that is shared between arm and arm64. Dummy
> helpers for arm are provided as the port does not support PUD hugepage
> sizes.
>
> The patches have been tested on an A57 based system. The patchset is
> based on v4.18-rc4. The are a few conflicts with the support for 52
> bit IPA[1] due to change in number of parameters for
> stage2_pmd_offset().
>
> Thanks,
> Punit
>
> v4 -> v5:
>
> * Patch 1 - Drop helper stage2_should_exec() and refactor the
> condition to decide if a page table entry should be marked
> executable
> * Patch 4-6 - Introduce stage2_get_leaf_entry() and use it in this and
> latter patches
> * Patch 7 - Use stage 2 accessors instead of using the page table
> helpers directly
> * Patch 7 - Add a note to update the PUD hugepage support when number
> of levels of stage 2 tables differs from stage 1
>
>
> v3 -> v4:
> * Patch 1 and 7 - Don't put down hugepages pte if logging is enabled
> * Patch 4-5 - Add PUD hugepage support for exec and access faults
> * Patch 6 - PUD hugepage support for aging page table entries
>
> v2 -> v3:
> * Update vma_pagesize directly if THP [1/4]. Previsouly this was done
> indirectly via hugetlb
> * Added review tag [4/4]
>
> v1 -> v2:
> * Create helper to check if the page should have exec permission [1/4]
> * Fix broken condition to detect THP hugepage [1/4]
> * Fix in-correct hunk resulting from a rebase [4/4]
>
> [0] https://www.spinics.net/lists/arm-kernel/msg663562.html
> [1] https://www.spinics.net/lists/kvm/msg171065.html
>
>
> Punit Agrawal (7):
> KVM: arm/arm64: Share common code in user_mem_abort()
> KVM: arm/arm64: Introduce helpers to manupulate page table entries
> KVM: arm64: Support dirty page tracking for PUD hugepages
> KVM: arm64: Support PUD hugepage in stage2_is_exec()
> KVM: arm64: Support handling access faults for PUD hugepages
> KVM: arm64: Update age handlers to support PUD hugepages
> KVM: arm64: Add support for creating PUD hugepages at stage 2
>
> arch/arm/include/asm/kvm_mmu.h | 60 +++++
> arch/arm64/include/asm/kvm_mmu.h | 47 ++++
> arch/arm64/include/asm/pgtable-hwdef.h | 4 +
> arch/arm64/include/asm/pgtable.h | 9 +
> virt/kvm/arm/mmu.c | 289 ++++++++++++++++++-------
> 5 files changed, 330 insertions(+), 79 deletions(-)
On 09/07/18 15:41, Punit Agrawal wrote:
> Introduce helpers to abstract architectural handling of the conversion
> of pfn to page table entries and marking a PMD page table entry as a
> block entry.
>
> The helpers are introduced in preparation for supporting PUD hugepages
> at stage 2 - which are supported on arm64 but do not exist on arm.
>
> Signed-off-by: Punit Agrawal <[email protected]>
> Acked-by: Christoffer Dall <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> ---
Reviewed-by: Suzuki K Poulose <[email protected]>
On 09/07/18 15:41, Punit Agrawal wrote:
> In preparation for creating PUD hugepages at stage 2, add support for
> write protecting PUD hugepages when they are encountered. Write
> protecting guest tables is used to track dirty pages when migrating
> VMs.
>
> Also, provide trivial implementations of required kvm_s2pud_* helpers
> to allow sharing of code with arm32.
>
> Signed-off-by: Punit Agrawal <[email protected]>
> Reviewed-by: Christoffer Dall <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> ---
> arch/arm/include/asm/kvm_mmu.h | 16 ++++++++++++++++
> arch/arm64/include/asm/kvm_mmu.h | 10 ++++++++++
> virt/kvm/arm/mmu.c | 11 +++++++----
> 3 files changed, 33 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index d095c2d0b284..c23722f75d5c 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -80,6 +80,22 @@ void kvm_clear_hyp_idmap(void);
>
> #define kvm_pmd_mkhuge(pmd) pmd_mkhuge(pmd)
>
> +/*
> + * The following kvm_*pud*() functionas are provided strictly to allow
nit: s/functionas/functions/
Reviewed-by: Suzuki K Poulose <[email protected]>
On 09/07/18 15:41, Punit Agrawal wrote:
> In preparation for creating larger hugepages at Stage 2, add support
> to the age handling notifiers for PUD hugepages when encountered.
>
> Provide trivial helpers for arm32 to allow sharing code.
>
> Signed-off-by: Punit Agrawal <[email protected]>
> Cc: Christoffer Dall <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> ---
> arch/arm/include/asm/kvm_mmu.h | 6 ++++
> arch/arm64/include/asm/kvm_mmu.h | 5 ++++
> arch/arm64/include/asm/pgtable.h | 1 +
> virt/kvm/arm/mmu.c | 51 ++++++++++++++++++--------------
> 4 files changed, 40 insertions(+), 23 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index a4298d429efc..8e1e8aee229e 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -110,6 +110,12 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
> return pud;
> }
>
> +static inline bool kvm_s2pud_young(pud_t pud)
> +{
> + BUG();
> + return false;
> +}
> +
> static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
> {
> *pmd = new_pmd;
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 4d2780c588b0..c542052fb199 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -261,6 +261,11 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
> return pud_mkyoung(pud);
> }
>
> +static inline bool kvm_s2pud_young(pud_t pud)
> +{
> + return pud_young(pud);
> +}
> +
> static inline bool kvm_page_empty(void *ptr)
> {
> struct page *ptr_page = virt_to_page(ptr);
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index a64a5c35beb1..4d9476e420d9 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -385,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
> #define pfn_pmd(pfn,prot) __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
> #define mk_pmd(page,prot) pfn_pmd(page_to_pfn(page),prot)
>
> +#define pud_young(pud) pte_young(pud_pte(pud))
> #define pud_mkyoung(pud) pte_pud(pte_mkyoung(pud_pte(pud)))
> #define pud_write(pud) pte_write(pud_pte(pud))
>
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index d2c705e31584..a6d3ac9d7c7a 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1172,6 +1172,11 @@ static int stage2_pmdp_test_and_clear_young(pmd_t *pmd)
> return stage2_ptep_test_and_clear_young((pte_t *)pmd);
> }
>
> +static int stage2_pudp_test_and_clear_young(pud_t *pud)
> +{
> + return stage2_ptep_test_and_clear_young((pte_t *)pud);
> +}
> +
> /**
> * kvm_phys_addr_ioremap - map a device range to guest IPA
> *
> @@ -1879,42 +1884,42 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
>
> static int kvm_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void *data)
> {
> - pmd_t *pmd;
> - pte_t *pte;
> + pud_t *pud = NULL;
> + pmd_t *pmd = NULL;
> + pte_t *pte = NULL;
minor nit: If you are spinning another revision, it would be good to get the
stage2_get_leaf_entry() to set the input parameters to NULL, rather than
expecting NULL and sprinkling this initialisation everywhere.
> + bool found;
>
> - WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
> - pmd = stage2_get_pmd(kvm, NULL, gpa);
> - if (!pmd || pmd_none(*pmd)) /* Nothing there */
> + WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
> + found = stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte);
> + if (!found)
> return 0;
minor nit: You may skip the variable here and below.
> static int kvm_test_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void *data)
> {
> - pmd_t *pmd;
> - pte_t *pte;
> + pud_t *pud = NULL;
> + pmd_t *pmd = NULL;
> + pte_t *pte = NULL;
> + bool found;
>
> - WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
> - pmd = stage2_get_pmd(kvm, NULL, gpa);
> - if (!pmd || pmd_none(*pmd)) /* Nothing there */
> + WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
> + found = stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte);
> + if (!found)
> return 0;
Either way,
Reviewed-by: Suzuki K Poulose <[email protected]>
On 09/07/18 15:41, Punit Agrawal wrote:
> In preparation for creating PUD hugepages at stage 2, add support for
> detecting execute permissions on PUD page table entries. Faults due to
> lack of execute permissions on page table entries is used to perform
> i-cache invalidation on first execute.
>
> Provide trivial implementations of arm32 helpers to allow sharing of
> code.
>
> Signed-off-by: Punit Agrawal <[email protected]>
> Cc: Christoffer Dall <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> ---
> arch/arm/include/asm/kvm_mmu.h | 6 ++++
> arch/arm64/include/asm/kvm_mmu.h | 5 +++
> arch/arm64/include/asm/pgtable-hwdef.h | 2 ++
> virt/kvm/arm/mmu.c | 49 +++++++++++++++++++++++---
> 4 files changed, 57 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index c23722f75d5c..d05c8986e495 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -96,6 +96,12 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
> }
>
>
> +static inline bool kvm_s2pud_exec(pud_t *pud)
> +{
> + BUG();
> + return false;
> +}
> +
> static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
> {
> *pmd = new_pmd;
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 84051930ddfe..15bc1be8f82f 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -249,6 +249,11 @@ static inline bool kvm_s2pud_readonly(pud_t *pudp)
> return kvm_s2pte_readonly((pte_t *)pudp);
> }
>
> +static inline bool kvm_s2pud_exec(pud_t *pudp)
> +{
> + return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
> +}
> +
> static inline bool kvm_page_empty(void *ptr)
> {
> struct page *ptr_page = virt_to_page(ptr);
> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
> index fd208eac9f2a..10ae592b78b8 100644
> --- a/arch/arm64/include/asm/pgtable-hwdef.h
> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
> @@ -193,6 +193,8 @@
> #define PMD_S2_RDWR (_AT(pmdval_t, 3) << 6) /* HAP[2:1] */
> #define PMD_S2_XN (_AT(pmdval_t, 2) << 53) /* XN[1:0] */
>
> +#define PUD_S2_XN (_AT(pudval_t, 2) << 53) /* XN[1:0] */
> +
> /*
> * Memory Attribute override for Stage-2 (MemAttr[3:0])
> */
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index ed8f8271c389..e73909a31e02 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1038,23 +1038,62 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
> return 0;
> }
>
> -static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
> +/*
> + * stage2_get_leaf_entry - walk the stage2 VM page tables and return
> + * true if a valid and present leaf-entry is found. A pointer to the
> + * leaf-entry is returned in the appropriate level variable - pudpp,
> + * pmdpp, ptepp.
> + */
> +static bool stage2_get_leaf_entry(struct kvm *kvm, phys_addr_t addr,
> + pud_t **pudpp, pmd_t **pmdpp, pte_t **ptepp)
> {
> + pud_t *pudp;
> pmd_t *pmdp;
> pte_t *ptep;
nit: As mentioned in the other thread, you may initialize the reference
pointers to NULL to make sure we start clean and avoid the initialization
everywhere this is called.
>
> - pmdp = stage2_get_pmd(kvm, NULL, addr);
> + pudp = stage2_get_pud(kvm, NULL, addr);
> + if (!pudp || pud_none(*pudp) || !pud_present(*pudp))
> + return false;
> +
> + if (pud_huge(*pudp)) {
> + *pudpp = pudp;
> + return true;
> + }
> +
> + pmdp = stage2_pmd_offset(pudp, addr);
> if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
> return false;
>
> - if (pmd_thp_or_huge(*pmdp))
> - return kvm_s2pmd_exec(pmdp);
> + if (pmd_thp_or_huge(*pmdp)) {
> + *pmdpp = pmdp;
> + return true;
> + }
>
> ptep = pte_offset_kernel(pmdp, addr);
> if (!ptep || pte_none(*ptep) || !pte_present(*ptep))
> return false;
>
> - return kvm_s2pte_exec(ptep);
> + *ptepp = ptep;
> + return true;
> +}
> +
> +static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
> +{
> + pud_t *pudp = NULL;
> + pmd_t *pmdp = NULL;
> + pte_t *ptep = NULL;
> + bool found;
> +
> + found = stage2_get_leaf_entry(kvm, addr, &pudp, &pmdp, &ptep);
> + if (!found)
> + return false;
> +
> + if (pudp)
> + return kvm_s2pud_exec(pudp);
> + else if (pmdp)
> + return kvm_s2pmd_exec(pmdp);
> + else
> + return kvm_s2pte_exec(ptep);
> }
Reviewed-by: Suzuki K Poulose <[email protected]>
On 09/07/18 15:41, Punit Agrawal wrote:
> In preparation for creating larger hugepages at Stage 2, extend the
> access fault handling at Stage 2 to support PUD hugepages when
> encountered.
>
> Provide trivial helpers for arm32 to allow sharing of code.
>
> Signed-off-by: Punit Agrawal <[email protected]>
> Cc: Christoffer Dall <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> ---
> arch/arm/include/asm/kvm_mmu.h | 8 ++++++++
> arch/arm64/include/asm/kvm_mmu.h | 7 +++++++
> arch/arm64/include/asm/pgtable.h | 6 ++++++
> virt/kvm/arm/mmu.c | 29 ++++++++++++++++-------------
> 4 files changed, 37 insertions(+), 13 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index d05c8986e495..a4298d429efc 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -78,6 +78,8 @@ void kvm_clear_hyp_idmap(void);
> #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
> #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
>
> +#define kvm_pud_pfn(pud) (((pud_val(pud) & PUD_MASK) & PHYS_MASK) >> PAGE_SHIFT)
Since we don't have PUD in arm32, it would be good to trigger a BUG()
instead of silently doing something wrong.
> @@ -102,6 +104,12 @@ static inline bool kvm_s2pud_exec(pud_t *pud)
> return false;
> }
>
> +static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
> +{
> + BUG();
> + return pud;
> +}
> +
Like this. ^
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index e73909a31e02..d2c705e31584 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1637,33 +1637,36 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> */
> static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
> {
> - pmd_t *pmd;
> - pte_t *pte;
> + pud_t *pud = NULL;
> + pmd_t *pmd = NULL;
> + pte_t *pte = NULL;
> kvm_pfn_t pfn;
> - bool pfn_valid = false;
> + bool found, pfn_valid = false;
nit: You could use pfn_valid instead of a new "found" variable.
>
> trace_kvm_access_fault(fault_ipa);
>
> spin_lock(&vcpu->kvm->mmu_lock);
>
> - pmd = stage2_get_pmd(vcpu->kvm, NULL, fault_ipa);
> - if (!pmd || pmd_none(*pmd)) /* Nothing there */
> + found = stage2_get_leaf_entry(vcpu->kvm, fault_ipa, &pud, &pmd, &pte);
> + if (!found)
> goto out;
>
> - if (pmd_thp_or_huge(*pmd)) { /* THP, HugeTLB */
> + if (pud) { /* HugeTLB */
> + *pud = kvm_s2pud_mkyoung(*pud);
> + pfn = kvm_pud_pfn(*pud);
> + pfn_valid = true;
> + goto out;
You don't need these goto's and the out lable anymore.
> + } else if (pmd) { /* THP, HugeTLB */
> *pmd = pmd_mkyoung(*pmd);
> pfn = pmd_pfn(*pmd);
> pfn_valid = true;
> goto out;
> + } else {
> + *pte = pte_mkyoung(*pte); /* Just a page... */
> + pfn = pte_pfn(*pte);
> + pfn_valid = true;
> }
>
> - pte = pte_offset_kernel(pmd, fault_ipa);
> - if (pte_none(*pte)) /* Nothing there either */
> - goto out;
> -
> - *pte = pte_mkyoung(*pte); /* Just a page... */
> - pfn = pte_pfn(*pte);
> - pfn_valid = true;
> out:
> spin_unlock(&vcpu->kvm->mmu_lock);
> if (pfn_valid)
>
Otherwise look good to me.
Suzuki
On 09/07/18 15:41, Punit Agrawal wrote:
> The code for operations such as marking the pfn as dirty, and
> dcache/icache maintenance during stage 2 fault handling is duplicated
> between normal pages and PMD hugepages.
>
> Instead of creating another copy of the operations when we introduce
> PUD hugepages, let's share them across the different pagesizes.
>
> Signed-off-by: Punit Agrawal <[email protected]>
> Cc: Christoffer Dall <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> ---
> virt/kvm/arm/mmu.c | 67 ++++++++++++++++++++++++++--------------------
> 1 file changed, 38 insertions(+), 29 deletions(-)
>
Reviewed-by: Suzuki K Poulose <[email protected]>
On 09/07/18 15:41, Punit Agrawal wrote:
> KVM only supports PMD hugepages at stage 2. Now that the various page
> handling routines are updated, extend the stage 2 fault handling to
> map in PUD hugepages.
>
> Addition of PUD hugepage support enables additional page sizes (e.g.,
> 1G with 4K granule) which can be useful on cores that support mapping
> larger block sizes in the TLB entries.
>
> Signed-off-by: Punit Agrawal <[email protected]>
> Cc: Christoffer Dall <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> ---
> arch/arm/include/asm/kvm_mmu.h | 19 +++++++
> arch/arm64/include/asm/kvm_mmu.h | 15 +++++
> arch/arm64/include/asm/pgtable-hwdef.h | 2 +
> arch/arm64/include/asm/pgtable.h | 2 +
> virt/kvm/arm/mmu.c | 78 ++++++++++++++++++++++++--
> 5 files changed, 112 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 8e1e8aee229e..787baf9ec994 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -77,10 +77,13 @@ void kvm_clear_hyp_idmap(void);
>
> #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
> #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
> +#define kvm_pfn_pud(pfn, prot) (__pud(0))
>
> #define kvm_pud_pfn(pud) (((pud_val(pud) & PUD_MASK) & PHYS_MASK) >> PAGE_SHIFT)
>
> #define kvm_pmd_mkhuge(pmd) pmd_mkhuge(pmd)
> +/* No support for pud hugepages */
> +#define kvm_pud_mkhuge(pud) (pud)
>
> /*
> * The following kvm_*pud*() functionas are provided strictly to allow
> @@ -97,6 +100,22 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
> return false;
> }
>
> +static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
> +{
> + BUG();
> +}
> +
> +static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
> +{
> + BUG();
> + return pud;
> +}
> +
> +static inline pud_t kvm_s2pud_mkexec(pud_t pud)
> +{
> + BUG();
> + return pud;
> +}
>
> static inline bool kvm_s2pud_exec(pud_t *pud)
> {
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index c542052fb199..dd8a23159463 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -171,13 +171,16 @@ void kvm_clear_hyp_idmap(void);
>
> #define kvm_set_pte(ptep, pte) set_pte(ptep, pte)
> #define kvm_set_pmd(pmdp, pmd) set_pmd(pmdp, pmd)
> +#define kvm_set_pud(pudp, pud) set_pud(pudp, pud)
>
> #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
> #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
> +#define kvm_pfn_pud(pfn, prot) pfn_pud(pfn, prot)
>
> #define kvm_pud_pfn(pud) pud_pfn(pud)
>
> #define kvm_pmd_mkhuge(pmd) pmd_mkhuge(pmd)
> +#define kvm_pud_mkhuge(pud) pud_mkhuge(pud)
>
> static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
> {
> @@ -191,6 +194,12 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
> return pmd;
> }
>
> +static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
> +{
> + pud_val(pud) |= PUD_S2_RDWR;
> + return pud;
> +}
> +
> static inline pte_t kvm_s2pte_mkexec(pte_t pte)
> {
> pte_val(pte) &= ~PTE_S2_XN;
> @@ -203,6 +212,12 @@ static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
> return pmd;
> }
>
> +static inline pud_t kvm_s2pud_mkexec(pud_t pud)
> +{
> + pud_val(pud) &= ~PUD_S2_XN;
> + return pud;
> +}
> +
> static inline void kvm_set_s2pte_readonly(pte_t *ptep)
> {
> pteval_t old_pteval, pteval;
> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
> index 10ae592b78b8..e327665e94d1 100644
> --- a/arch/arm64/include/asm/pgtable-hwdef.h
> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
> @@ -193,6 +193,8 @@
> #define PMD_S2_RDWR (_AT(pmdval_t, 3) << 6) /* HAP[2:1] */
> #define PMD_S2_XN (_AT(pmdval_t, 2) << 53) /* XN[1:0] */
>
> +#define PUD_S2_RDONLY (_AT(pudval_t, 1) << 6) /* HAP[2:1] */
> +#define PUD_S2_RDWR (_AT(pudval_t, 3) << 6) /* HAP[2:1] */
> #define PUD_S2_XN (_AT(pudval_t, 2) << 53) /* XN[1:0] */
>
> /*
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 4d9476e420d9..0afc34f94ff5 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -389,6 +389,8 @@ static inline int pmd_protnone(pmd_t pmd)
> #define pud_mkyoung(pud) pte_pud(pte_mkyoung(pud_pte(pud)))
> #define pud_write(pud) pte_write(pud_pte(pud))
>
> +#define pud_mkhuge(pud) (__pud(pud_val(pud) & ~PUD_TABLE_BIT))
> +
> #define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
> #define __phys_to_pud_val(phys) __phys_to_pte_val(phys)
> #define pud_pfn(pud) ((__pud_to_phys(pud) & PUD_MASK) >> PAGE_SHIFT)
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index a6d3ac9d7c7a..d8e2497e5353 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -116,6 +116,25 @@ static void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
> put_page(virt_to_page(pmd));
> }
>
> +/**
> + * stage2_dissolve_pud() - clear and flush huge PUD entry
> + * @kvm: pointer to kvm structure.
> + * @addr: IPA
> + * @pud: pud pointer for IPA
> + *
> + * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
> + * pages in the range dirty.
> + */
> +static void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr, pud_t *pudp)
> +{
> + if (!stage2_pud_huge(*pudp))
> + return;
> +
> + stage2_pud_clear(pudp);
> + kvm_tlb_flush_vmid_ipa(kvm, addr);
> + put_page(virt_to_page(pudp));
> +}
> +
> static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
> int min, int max)
> {
> @@ -993,7 +1012,7 @@ static pmd_t *stage2_get_pmd(struct kvm *kvm, struct kvm_mmu_memory_cache *cache
> pmd_t *pmd;
>
> pud = stage2_get_pud(kvm, cache, addr);
> - if (!pud)
> + if (!pud || stage2_pud_huge(*pud))
> return NULL;
>
> if (stage2_pud_none(*pud)) {
> @@ -1038,6 +1057,26 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
> return 0;
> }
>
> +static int stage2_set_pud_huge(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> + phys_addr_t addr, const pud_t *new_pudp)
> +{
> + pud_t *pudp, old_pud;
> +
> + pudp = stage2_get_pud(kvm, cache, addr);
> + VM_BUG_ON(!pudp);
> +
> + old_pud = *pudp;
> + if (stage2_pud_present(old_pud)) {
> + stage2_pud_clear(pudp);
> + kvm_tlb_flush_vmid_ipa(kvm, addr);
> + } else {
> + get_page(virt_to_page(pudp));
> + }
> +
> + kvm_set_pud(pudp, *new_pudp);
> + return 0;
> +}
> +
> /*
> * stage2_get_leaf_entry - walk the stage2 VM page tables and return
> * true if a valid and present leaf-entry is found. A pointer to the
> @@ -1100,6 +1139,7 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> phys_addr_t addr, const pte_t *new_pte,
> unsigned long flags)
> {
> + pud_t *pud;
> pmd_t *pmd;
> pte_t *pte, old_pte;
> bool iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
> @@ -1108,6 +1148,22 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> VM_BUG_ON(logging_active && !cache);
>
> /* Create stage-2 page table mapping - Levels 0 and 1 */
> + pud = stage2_get_pud(kvm, cache, addr);
> + if (!pud) {
> + /*
> + * Ignore calls from kvm_set_spte_hva for unallocated
> + * address ranges.
> + */
> + return 0;
> + }
> +
> + /*
> + * While dirty page logging - dissolve huge PUD, then continue
> + * on to allocate page.
Punit,
We don't seem to allocate a page here for the PUD entry, in case if it is dissolved
or empty (i.e, stage2_pud_none(*pud) is true.).
> + */
> + if (logging_active)
> + stage2_dissolve_pud(kvm, addr, pud);
> +
> pmd = stage2_get_pmd(kvm, cache, addr);
> if (!pmd) {
And once you add an entry, pmd is just the matter of getting stage2_pmd_offset() from your pud.
No need to start again from the top-level with stage2_get_pmd().
Cheers
Suzuki
Suzuki K Poulose <[email protected]> writes:
> On 09/07/18 15:41, Punit Agrawal wrote:
>> In preparation for creating PUD hugepages at stage 2, add support for
>> detecting execute permissions on PUD page table entries. Faults due to
>> lack of execute permissions on page table entries is used to perform
>> i-cache invalidation on first execute.
>>
>> Provide trivial implementations of arm32 helpers to allow sharing of
>> code.
>>
>> Signed-off-by: Punit Agrawal <[email protected]>
>> Cc: Christoffer Dall <[email protected]>
>> Cc: Marc Zyngier <[email protected]>
>> Cc: Russell King <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> ---
>> arch/arm/include/asm/kvm_mmu.h | 6 ++++
>> arch/arm64/include/asm/kvm_mmu.h | 5 +++
>> arch/arm64/include/asm/pgtable-hwdef.h | 2 ++
>> virt/kvm/arm/mmu.c | 49 +++++++++++++++++++++++---
>> 4 files changed, 57 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index c23722f75d5c..d05c8986e495 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -96,6 +96,12 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
>> }
>> +static inline bool kvm_s2pud_exec(pud_t *pud)
>> +{
>> + BUG();
>> + return false;
>> +}
>> +
>> static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
>> {
>> *pmd = new_pmd;
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index 84051930ddfe..15bc1be8f82f 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -249,6 +249,11 @@ static inline bool kvm_s2pud_readonly(pud_t *pudp)
>> return kvm_s2pte_readonly((pte_t *)pudp);
>> }
>> +static inline bool kvm_s2pud_exec(pud_t *pudp)
>> +{
>> + return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
>> +}
>> +
>> static inline bool kvm_page_empty(void *ptr)
>> {
>> struct page *ptr_page = virt_to_page(ptr);
>> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
>> index fd208eac9f2a..10ae592b78b8 100644
>> --- a/arch/arm64/include/asm/pgtable-hwdef.h
>> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
>> @@ -193,6 +193,8 @@
>> #define PMD_S2_RDWR (_AT(pmdval_t, 3) << 6) /* HAP[2:1] */
>> #define PMD_S2_XN (_AT(pmdval_t, 2) << 53) /* XN[1:0] */
>> +#define PUD_S2_XN (_AT(pudval_t, 2) << 53) /* XN[1:0]
>> */
>> +
>> /*
>> * Memory Attribute override for Stage-2 (MemAttr[3:0])
>> */
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index ed8f8271c389..e73909a31e02 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1038,23 +1038,62 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
>> return 0;
>> }
>> -static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
>> +/*
>> + * stage2_get_leaf_entry - walk the stage2 VM page tables and return
>> + * true if a valid and present leaf-entry is found. A pointer to the
>> + * leaf-entry is returned in the appropriate level variable - pudpp,
>> + * pmdpp, ptepp.
>> + */
>> +static bool stage2_get_leaf_entry(struct kvm *kvm, phys_addr_t addr,
>> + pud_t **pudpp, pmd_t **pmdpp, pte_t **ptepp)
>> {
>> + pud_t *pudp;
>> pmd_t *pmdp;
>> pte_t *ptep;
>
> nit: As mentioned in the other thread, you may initialize the reference
> pointers to NULL to make sure we start clean and avoid the initialization
> everywhere this is called.
I took the approach to not touch the pointers unless they are being
assigned a valid pointer. I'll initialise the incoming pointers (p*dpp)
before proceeding with the table walk.
Thanks,
Punit
>
>> - pmdp = stage2_get_pmd(kvm, NULL, addr);
>> + pudp = stage2_get_pud(kvm, NULL, addr);
>> + if (!pudp || pud_none(*pudp) || !pud_present(*pudp))
>> + return false;
>> +
>> + if (pud_huge(*pudp)) {
>> + *pudpp = pudp;
>> + return true;
>> + }
>> +
>> + pmdp = stage2_pmd_offset(pudp, addr);
>> if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
>> return false;
>> - if (pmd_thp_or_huge(*pmdp))
>> - return kvm_s2pmd_exec(pmdp);
>> + if (pmd_thp_or_huge(*pmdp)) {
>> + *pmdpp = pmdp;
>> + return true;
>> + }
>> ptep = pte_offset_kernel(pmdp, addr);
>> if (!ptep || pte_none(*ptep) || !pte_present(*ptep))
>> return false;
>> - return kvm_s2pte_exec(ptep);
>> + *ptepp = ptep;
>> + return true;
>> +}
>> +
>> +static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
>> +{
>> + pud_t *pudp = NULL;
>> + pmd_t *pmdp = NULL;
>> + pte_t *ptep = NULL;
>> + bool found;
>> +
>> + found = stage2_get_leaf_entry(kvm, addr, &pudp, &pmdp, &ptep);
>> + if (!found)
>> + return false;
>> +
>> + if (pudp)
>> + return kvm_s2pud_exec(pudp);
>> + else if (pmdp)
>> + return kvm_s2pmd_exec(pmdp);
>> + else
>> + return kvm_s2pte_exec(ptep);
>> }
>
> Reviewed-by: Suzuki K Poulose <[email protected]>
> _______________________________________________
> kvmarm mailing list
> [email protected]
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Suzuki K Poulose <[email protected]> writes:
> On 09/07/18 15:41, Punit Agrawal wrote:
>> Introduce helpers to abstract architectural handling of the conversion
>> of pfn to page table entries and marking a PMD page table entry as a
>> block entry.
>>
>> The helpers are introduced in preparation for supporting PUD hugepages
>> at stage 2 - which are supported on arm64 but do not exist on arm.
>>
>> Signed-off-by: Punit Agrawal <[email protected]>
>> Acked-by: Christoffer Dall <[email protected]>
>> Cc: Marc Zyngier <[email protected]>
>> Cc: Russell King <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> ---
>
> Reviewed-by: Suzuki K Poulose <[email protected]>
Other than the query on Patch 7 I've incorporated all your suggestions
locally.
Thanks a lot for reviewing the patches.
Punit
ps: Just noticed the typo (manupulate) in the subject. I've fixed it up
locally.
On 11/07/18 17:05, Punit Agrawal wrote:
> Suzuki K Poulose <[email protected]> writes:
>
>> On 09/07/18 15:41, Punit Agrawal wrote:
>>> KVM only supports PMD hugepages at stage 2. Now that the various page
>>> handling routines are updated, extend the stage 2 fault handling to
>>> map in PUD hugepages.
>>>
>>> Addition of PUD hugepage support enables additional page sizes (e.g.,
>>> 1G with 4K granule) which can be useful on cores that support mapping
>>> larger block sizes in the TLB entries.
>>>
>>> Signed-off-by: Punit Agrawal <[email protected]>
>>> Cc: Christoffer Dall <[email protected]>
>>> Cc: Marc Zyngier <[email protected]>
>>> Cc: Russell King <[email protected]>
>>> Cc: Catalin Marinas <[email protected]>
>>> Cc: Will Deacon <[email protected]>
>>> ---
>>> arch/arm/include/asm/kvm_mmu.h | 19 +++++++
>>> arch/arm64/include/asm/kvm_mmu.h | 15 +++++
>>> arch/arm64/include/asm/pgtable-hwdef.h | 2 +
>>> arch/arm64/include/asm/pgtable.h | 2 +
>>> virt/kvm/arm/mmu.c | 78 ++++++++++++++++++++++++--
>>> 5 files changed, 112 insertions(+), 4 deletions(-)
>>>
>
> [...]
>
>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>> index a6d3ac9d7c7a..d8e2497e5353 100644
>>> --- a/virt/kvm/arm/mmu.c
>>> +++ b/virt/kvm/arm/mmu.c
>
> [...]
>
>>> @@ -1100,6 +1139,7 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>>> phys_addr_t addr, const pte_t *new_pte,
>>> unsigned long flags)
>>> {
>>> + pud_t *pud;
>>> pmd_t *pmd;
>>> pte_t *pte, old_pte;
>>> bool iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
>>> @@ -1108,6 +1148,22 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>>> VM_BUG_ON(logging_active && !cache);
>>> /* Create stage-2 page table mapping - Levels 0 and 1 */
>>> + pud = stage2_get_pud(kvm, cache, addr);
>>> + if (!pud) {
>>> + /*
>>> + * Ignore calls from kvm_set_spte_hva for unallocated
>>> + * address ranges.
>>> + */
>>> + return 0;
>>> + }
>>> +
>>> + /*
>>> + * While dirty page logging - dissolve huge PUD, then continue
>>> + * on to allocate page.
>>
>> Punit,
>>
>> We don't seem to allocate a page here for the PUD entry, in case if it is dissolved
>> or empty (i.e, stage2_pud_none(*pud) is true.).
>
> I was trying to avoid duplicating the PUD allocation by reusing the
> functionality in stage2_get_pmd().
>
> Does the below updated comment help?
>
> /*
> * While dirty page logging - dissolve huge PUD, it'll be
> * allocated in stage2_get_pmd().
> */
>
> The other option is to duplicate the stage2_pud_none() case from
> stage2_get_pmd() here.
I think the explicit check for stage2_pud_none() suits better here.
That would make it explicit that we are tearing down the entries
from top to bottom. Also, we may be able to short cut for case
where we know we just allocated a PUD page and hence we need another
PMD level page.
Also, you are missing the comment about the assumption that stage2 PUD
level always exist with 4k fixed IPA.
Cheers
Suzuki
Suzuki K Poulose <[email protected]> writes:
> On 09/07/18 15:41, Punit Agrawal wrote:
>> KVM only supports PMD hugepages at stage 2. Now that the various page
>> handling routines are updated, extend the stage 2 fault handling to
>> map in PUD hugepages.
>>
>> Addition of PUD hugepage support enables additional page sizes (e.g.,
>> 1G with 4K granule) which can be useful on cores that support mapping
>> larger block sizes in the TLB entries.
>>
>> Signed-off-by: Punit Agrawal <[email protected]>
>> Cc: Christoffer Dall <[email protected]>
>> Cc: Marc Zyngier <[email protected]>
>> Cc: Russell King <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> ---
>> arch/arm/include/asm/kvm_mmu.h | 19 +++++++
>> arch/arm64/include/asm/kvm_mmu.h | 15 +++++
>> arch/arm64/include/asm/pgtable-hwdef.h | 2 +
>> arch/arm64/include/asm/pgtable.h | 2 +
>> virt/kvm/arm/mmu.c | 78 ++++++++++++++++++++++++--
>> 5 files changed, 112 insertions(+), 4 deletions(-)
>>
[...]
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index a6d3ac9d7c7a..d8e2497e5353 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
[...]
>> @@ -1100,6 +1139,7 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>> phys_addr_t addr, const pte_t *new_pte,
>> unsigned long flags)
>> {
>> + pud_t *pud;
>> pmd_t *pmd;
>> pte_t *pte, old_pte;
>> bool iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
>> @@ -1108,6 +1148,22 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>> VM_BUG_ON(logging_active && !cache);
>> /* Create stage-2 page table mapping - Levels 0 and 1 */
>> + pud = stage2_get_pud(kvm, cache, addr);
>> + if (!pud) {
>> + /*
>> + * Ignore calls from kvm_set_spte_hva for unallocated
>> + * address ranges.
>> + */
>> + return 0;
>> + }
>> +
>> + /*
>> + * While dirty page logging - dissolve huge PUD, then continue
>> + * on to allocate page.
>
> Punit,
>
> We don't seem to allocate a page here for the PUD entry, in case if it is dissolved
> or empty (i.e, stage2_pud_none(*pud) is true.).
I was trying to avoid duplicating the PUD allocation by reusing the
functionality in stage2_get_pmd().
Does the below updated comment help?
/*
* While dirty page logging - dissolve huge PUD, it'll be
* allocated in stage2_get_pmd().
*/
The other option is to duplicate the stage2_pud_none() case from
stage2_get_pmd() here.
What do you think?
Thanks,
Punit
>> + */
>> + if (logging_active)
>> + stage2_dissolve_pud(kvm, addr, pud);
>> +
>> pmd = stage2_get_pmd(kvm, cache, addr);
>> if (!pmd) {
>
> And once you add an entry, pmd is just the matter of getting stage2_pmd_offset() from your pud.
> No need to start again from the top-level with stage2_get_pmd().
>
> Cheers
> Suzuki
>
> _______________________________________________
> kvmarm mailing list
> [email protected]
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Suzuki K Poulose <[email protected]> writes:
> On 11/07/18 17:05, Punit Agrawal wrote:
>> Suzuki K Poulose <[email protected]> writes:
>>
>>> On 09/07/18 15:41, Punit Agrawal wrote:
>>>> KVM only supports PMD hugepages at stage 2. Now that the various page
>>>> handling routines are updated, extend the stage 2 fault handling to
>>>> map in PUD hugepages.
>>>>
>>>> Addition of PUD hugepage support enables additional page sizes (e.g.,
>>>> 1G with 4K granule) which can be useful on cores that support mapping
>>>> larger block sizes in the TLB entries.
>>>>
>>>> Signed-off-by: Punit Agrawal <[email protected]>
>>>> Cc: Christoffer Dall <[email protected]>
>>>> Cc: Marc Zyngier <[email protected]>
>>>> Cc: Russell King <[email protected]>
>>>> Cc: Catalin Marinas <[email protected]>
>>>> Cc: Will Deacon <[email protected]>
>>>> ---
>>>> arch/arm/include/asm/kvm_mmu.h | 19 +++++++
>>>> arch/arm64/include/asm/kvm_mmu.h | 15 +++++
>>>> arch/arm64/include/asm/pgtable-hwdef.h | 2 +
>>>> arch/arm64/include/asm/pgtable.h | 2 +
>>>> virt/kvm/arm/mmu.c | 78 ++++++++++++++++++++++++--
>>>> 5 files changed, 112 insertions(+), 4 deletions(-)
>>>>
>>
>> [...]
>>
>>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>>> index a6d3ac9d7c7a..d8e2497e5353 100644
>>>> --- a/virt/kvm/arm/mmu.c
>>>> +++ b/virt/kvm/arm/mmu.c
>>
>> [...]
>>
>>>> @@ -1100,6 +1139,7 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>>>> phys_addr_t addr, const pte_t *new_pte,
>>>> unsigned long flags)
>>>> {
>>>> + pud_t *pud;
>>>> pmd_t *pmd;
>>>> pte_t *pte, old_pte;
>>>> bool iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
>>>> @@ -1108,6 +1148,22 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>>>> VM_BUG_ON(logging_active && !cache);
>>>> /* Create stage-2 page table mapping - Levels 0 and 1 */
>>>> + pud = stage2_get_pud(kvm, cache, addr);
>>>> + if (!pud) {
>>>> + /*
>>>> + * Ignore calls from kvm_set_spte_hva for unallocated
>>>> + * address ranges.
>>>> + */
>>>> + return 0;
>>>> + }
>>>> +
>>>> + /*
>>>> + * While dirty page logging - dissolve huge PUD, then continue
>>>> + * on to allocate page.
>>>
>>> Punit,
>>>
>>> We don't seem to allocate a page here for the PUD entry, in case if it is dissolved
>>> or empty (i.e, stage2_pud_none(*pud) is true.).
>>
>> I was trying to avoid duplicating the PUD allocation by reusing the
>> functionality in stage2_get_pmd().
>>
>> Does the below updated comment help?
>>
>> /*
>> * While dirty page logging - dissolve huge PUD, it'll be
>> * allocated in stage2_get_pmd().
>> */
>>
>> The other option is to duplicate the stage2_pud_none() case from
>> stage2_get_pmd() here.
>
> I think the explicit check for stage2_pud_none() suits better here.
> That would make it explicit that we are tearing down the entries
> from top to bottom. Also, we may be able to short cut for case
> where we know we just allocated a PUD page and hence we need another
> PMD level page.
Ok, I'll add the PUD allocation code here.
>
> Also, you are missing the comment about the assumption that stage2 PUD
> level always exist with 4k fixed IPA.
Hmm... I'm quite sure I wrote a comment to that effect but can't find it
now. I'll include it in the next version.
Thanks,
Punit
>
> Cheers
> Suzuki
> _______________________________________________
> kvmarm mailing list
> [email protected]
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm