This patch series is follow-up work on (and depends on)
Toshi Kani <[email protected]>'s series "fix memory leak/
panic in ioremap huge pages".
The series has been tested on a 4.9 kernel with a
Cortex-A75 based SoC.
Chintan Pandya (3):
ioremap: Update pgtable free interfaces with addr
arm64: Implement page table free interfaces
Revert "arm64: Enforce BBM for huge IO/VMAP mappings"
From V3->V4:
- Add kernel-doc for 'addr' in the x86 implementation
- Re-order pmd/pud clear and table free
- Avoid a redundant TLB invalidation in one particular case
From V2->V3:
- Use the existing page table free interfaces to do the
arm64-specific work
From V1->V2:
- Rebased my patches on top of "[PATCH v2 1/2] mm/vmalloc:
Add interfaces to free unmapped page table"
- Honored BBM for ARM64
arch/arm64/mm/mmu.c | 51 +++++++++++++++++++++++++++++++++----------
arch/x86/mm/pgtable.c | 6 +++--
include/asm-generic/pgtable.h | 8 +++----
lib/ioremap.c | 4 ++--
4 files changed, 49 insertions(+), 20 deletions(-)
This patch ("mm/vmalloc: Add interfaces to free unmapped
page table") adds following 2 interfaces to free the page
table in case we implement huge mapping.
pud_free_pmd_page() and pmd_free_pte_page()
Some architectures (like arm64) needs to do proper TLB
maintanance after updating pagetable entry even in map.
Why ? Read this,
https://patchwork.kernel.org/patch/10134581/
Pass 'addr' in these interfaces so that proper TLB ops
can be performed.
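For illustration, an architecture that needs such maintenance could use
'addr' roughly as below (a simplified sketch of the arm64 change later
in this series, not the final code):

int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
{
	pte_t *table;

	if (pmd_val(*pmd)) {
		table = __va(pmd_val(*pmd));
		pmd_clear(pmd);
		/* 'addr' identifies the VA range this entry used to map */
		flush_tlb_kernel_range(addr, addr + PMD_SIZE);
		free_page((unsigned long) table);
	}
	return 1;
}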
Signed-off-by: Chintan Pandya <[email protected]>
---
arch/arm64/mm/mmu.c | 4 ++--
arch/x86/mm/pgtable.c | 6 ++++--
include/asm-generic/pgtable.h | 8 ++++----
lib/ioremap.c | 4 ++--
4 files changed, 12 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 2dbb2c9..da98828 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -973,12 +973,12 @@ int pmd_clear_huge(pmd_t *pmdp)
return 1;
}
-int pud_free_pmd_page(pud_t *pud)
+int pud_free_pmd_page(pud_t *pud, unsigned long addr)
{
return pud_none(*pud);
}
-int pmd_free_pte_page(pmd_t *pmd)
+int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
{
return pmd_none(*pmd);
}
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 34cda7e..f640ba3 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -706,11 +706,12 @@ int pmd_clear_huge(pmd_t *pmd)
/**
* pud_free_pmd_page - Clear pud entry and free pmd page.
* @pud: Pointer to a PUD.
+ * @addr: Virtual address associated with pud.
*
* Context: The pud range has been unmaped and TLB purged.
* Return: 1 if clearing the entry succeeded. 0 otherwise.
*/
-int pud_free_pmd_page(pud_t *pud)
+int pud_free_pmd_page(pud_t *pud, unsigned long addr)
{
pmd_t *pmd;
int i;
@@ -733,11 +734,12 @@ int pud_free_pmd_page(pud_t *pud)
/**
* pmd_free_pte_page - Clear pmd entry and free pte page.
* @pmd: Pointer to a PMD.
+ * @addr: Virtual address associated with pmd.
*
* Context: The pmd range has been unmaped and TLB purged.
* Return: 1 if clearing the entry succeeded. 0 otherwise.
*/
-int pmd_free_pte_page(pmd_t *pmd)
+int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
{
pte_t *pte;
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index bfbb44a..e6310f4 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -983,8 +983,8 @@ static inline int p4d_clear_huge(p4d_t *p4d)
int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot);
int pud_clear_huge(pud_t *pud);
int pmd_clear_huge(pmd_t *pmd);
-int pud_free_pmd_page(pud_t *pud);
-int pmd_free_pte_page(pmd_t *pmd);
+int pud_free_pmd_page(pud_t *pud, unsigned long addr);
+int pmd_free_pte_page(pmd_t *pmd, unsigned long addr);
#else /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
static inline int p4d_set_huge(p4d_t *p4d, phys_addr_t addr, pgprot_t prot)
{
@@ -1010,11 +1010,11 @@ static inline int pmd_clear_huge(pmd_t *pmd)
{
return 0;
}
-static inline int pud_free_pmd_page(pud_t *pud)
+static inline int pud_free_pmd_page(pud_t *pud, unsigned long addr)
{
return 0;
}
-static inline int pmd_free_pte_page(pmd_t *pmd)
+static inline int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
{
return 0;
}
diff --git a/lib/ioremap.c b/lib/ioremap.c
index 54e5bba..517f585 100644
--- a/lib/ioremap.c
+++ b/lib/ioremap.c
@@ -92,7 +92,7 @@ static inline int ioremap_pmd_range(pud_t *pud, unsigned long addr,
if (ioremap_pmd_enabled() &&
((next - addr) == PMD_SIZE) &&
IS_ALIGNED(phys_addr + addr, PMD_SIZE) &&
- pmd_free_pte_page(pmd)) {
+ pmd_free_pte_page(pmd, addr)) {
if (pmd_set_huge(pmd, phys_addr + addr, prot))
continue;
}
@@ -119,7 +119,7 @@ static inline int ioremap_pud_range(p4d_t *p4d, unsigned long addr,
if (ioremap_pud_enabled() &&
((next - addr) == PUD_SIZE) &&
IS_ALIGNED(phys_addr + addr, PUD_SIZE) &&
- pud_free_pmd_page(pud)) {
+ pud_free_pmd_page(pud, addr)) {
if (pud_set_huge(pud, phys_addr + addr, prot))
continue;
}
Implement pud_free_pmd_page() and pmd_free_pte_page().
The implementation requires:
1) Freeing the unused next-level page tables
2) Clearing the current pud/pmd entry
3) Invalidating the TLB, which could still hold a
previously valid but now stale entry
Care is taken in (3) to avoid redundant invalidations; see
the sketch below.
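For reference, the pud-level sequence condensed from the diff below: the
child pmd entries are cleared without per-entry invalidation, and a
single PUD-sized flush is issued once at the end:

	/* clear and free every child pmd entry; skip per-entry TLB flush */
	for (i = 0; i < PTRS_PER_PMD; i++)
		__pmd_free_pte_page(&table[i], addr + (i * PMD_SIZE), false);

	pud_clear(pud);
	/* one invalidation covering the whole PUD range is sufficient */
	flush_tlb_kernel_range(addr, addr + PUD_SIZE);
	free_page((unsigned long) table);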
Signed-off-by: Chintan Pandya <[email protected]>
---
arch/arm64/mm/mmu.c | 40 ++++++++++++++++++++++++++++++++++++++--
1 file changed, 38 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index da98828..7be3106 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -45,6 +45,7 @@
#include <asm/memblock.h>
#include <asm/mmu_context.h>
#include <asm/ptdump.h>
+#include <asm/tlbflush.h>
#define NO_BLOCK_MAPPINGS BIT(0)
#define NO_CONT_MAPPINGS BIT(1)
@@ -973,12 +974,47 @@ int pmd_clear_huge(pmd_t *pmdp)
return 1;
}
+static int __pmd_free_pte_page(pmd_t *pmd, unsigned long addr, bool tlb_inv)
+{
+ pmd_t *table;
+
+ if (pmd_val(*pmd)) {
+ table = __va(pmd_val(*pmd));
+ pmd_clear(pmd);
+ /*
+ * FIXME: __flush_tlb_pgtable(&init_mm, addr) is
+ * ideal candidate here, which exactly
+ * flushes intermediate pgtables. But,
+ * this is broken (evident from tests).
+ * So, use safe TLB op unless that is fixed.
+ */
+ if (tlb_inv)
+ flush_tlb_kernel_range(addr, addr + PMD_SIZE);
+
+ free_page((unsigned long) table);
+ }
+ return 1;
+}
+
int pud_free_pmd_page(pud_t *pud, unsigned long addr)
{
- return pud_none(*pud);
+ pmd_t *table;
+ int i;
+
+ if (pud_val(*pud)) {
+ table = __va(pud_val(*pud));
+ for (i = 0; i < PTRS_PER_PMD; i++)
+ __pmd_free_pte_page(&table[i], addr + (i * PMD_SIZE),
+ false);
+
+ pud_clear(pud);
+ flush_tlb_kernel_range(addr, addr + PUD_SIZE);
+ free_page((unsigned long) table);
+ }
+ return 1;
}
int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
{
- return pmd_none(*pmd);
+ return __pmd_free_pte_page(pmd, addr, true);
}
Commit 15122ee2c515a ("arm64: Enforce BBM for huge
IO/VMAP mappings") was a temporary work-around until the
issues with CONFIG_HAVE_ARCH_HUGE_VMAP were fixed.
Revert it, since the previous patches in this series fix the issue.
Signed-off-by: Chintan Pandya <[email protected]>
---
arch/arm64/mm/mmu.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 7be3106..1238717 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -935,10 +935,6 @@ int pud_set_huge(pud_t *pudp, phys_addr_t phys, pgprot_t prot)
pgprot_t sect_prot = __pgprot(PUD_TYPE_SECT |
pgprot_val(mk_sect_prot(prot)));
- /* ioremap_page_range doesn't honour BBM */
- if (pud_present(READ_ONCE(*pudp)))
- return 0;
-
BUG_ON(phys & ~PUD_MASK);
set_pud(pudp, pfn_pud(__phys_to_pfn(phys), sect_prot));
return 1;
@@ -949,10 +945,6 @@ int pmd_set_huge(pmd_t *pmdp, phys_addr_t phys, pgprot_t prot)
pgprot_t sect_prot = __pgprot(PMD_TYPE_SECT |
pgprot_val(mk_sect_prot(prot)));
- /* ioremap_page_range doesn't honour BBM */
- if (pmd_present(READ_ONCE(*pmdp)))
- return 0;
-
BUG_ON(phys & ~PMD_MASK);
set_pmd(pmdp, pfn_pmd(__phys_to_pfn(phys), sect_prot));
return 1;
On Tue, 2018-03-20 at 17:15 +0530, Chintan Pandya wrote:
> The patch "mm/vmalloc: Add interfaces to free unmapped
> page table" adds the following two interfaces to free page
> tables when installing huge mappings:
>
> pud_free_pmd_page() and pmd_free_pte_page()
>
> Some architectures (like arm64) need to do proper TLB
> maintenance after updating a pagetable entry, even on the
> map path. Why? See:
> https://patchwork.kernel.org/patch/10134581/
>
> Pass 'addr' to these interfaces so that the proper TLB ops
> can be performed.
>
> Signed-off-by: Chintan Pandya <[email protected]>
Thanks for the update. The change looks good.
Reviewed-by: Toshi Kani <[email protected]>
-Toshi
On Tue, Mar 20, 2018 at 05:15:13PM +0530, Chintan Pandya wrote:
> +static int __pmd_free_pte_page(pmd_t *pmd, unsigned long addr, bool tlb_inv)
> +{
> + pmd_t *table;
> +
> + if (pmd_val(*pmd)) {
> + table = __va(pmd_val(*pmd));
> + pmd_clear(pmd);
> + /*
> + * FIXME: __flush_tlb_pgtable(&init_mm, addr) is
> + * ideal candidate here, which exactly
> + * flushes intermediate pgtables. But,
> + * this is broken (evident from tests).
> + * So, use safe TLB op unless that is fixed.
> + */
> + if (tlb_inv)
> + flush_tlb_kernel_range(addr, addr + PMD_SIZE);
I don't think that __flush_tlb_pgtable() is broken. It's only valid to
call it for user page tables, since it doesn't affect all ASIDs.
We can add a similar helper for kernel mappings, which affects all ASIDs,
e.g.
static inline void __flush_tlb_kernel_pgtable(unsigned long addr)
{
addr >>= 12;
__tlbi(vaae1is, addr);
dsb(ish);
}
Thanks,
Mark.
On 3/26/2018 3:25 PM, Mark Rutland wrote:
> On Tue, Mar 20, 2018 at 05:15:13PM +0530, Chintan Pandya wrote:
>> +static int __pmd_free_pte_page(pmd_t *pmd, unsigned long addr, bool tlb_inv)
>> +{
>> + pmd_t *table;
>> +
>> + if (pmd_val(*pmd)) {
>> + table = __va(pmd_val(*pmd));
>> + pmd_clear(pmd);
>> + /*
>> + * FIXME: __flush_tlb_pgtable(&init_mm, addr) is
>> + * ideal candidate here, which exactly
>> + * flushes intermediate pgtables. But,
>> + * this is broken (evident from tests).
>> + * So, use safe TLB op unless that is fixed.
>> + */
>> + if (tlb_inv)
>> + flush_tlb_kernel_range(addr, addr + PMD_SIZE);
>
> I don't think that __flush_tlb_pgtable() is broken. It's only valid to
> call it for user page tables, since it doesn't affect all ASIDs.
>
> We can add a similar helper for kernel mappings, which affects all ASIDs,
> e.g.
>
Okay. I will test it and update v5.
> static inline void __flush_tlb_kernel_pgtable(unsigned long addr)
> {
> addr >>= 12;
> __tlbi(vaae1is, addr);
> dsb(ish);
> }
>
> Thanks,
> Mark.
>
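Roughly, v5 would then do something like the below in
__pmd_free_pte_page() (untested sketch, assuming the helper above is
added to asm/tlbflush.h):

	if (pmd_val(*pmd)) {
		table = __va(pmd_val(*pmd));
		pmd_clear(pmd);
		if (tlb_inv)
			/* invalidate only the entry/walk cache for 'addr', all ASIDs */
			__flush_tlb_kernel_pgtable(addr);
		free_page((unsigned long) table);
	}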
Chintan