2016-11-08 17:19:36

by Huang Shijie

[permalink] [raw]
Subject: [PATCH v2 0/2] arm64: fix the bugs found in the hugetlb test

(1) Backgroud
For the arm64, the hugetlb page size can be 32M (PMD + Contiguous bit).
In the 4K page environment, the max page order is 10 (max_order - 1),
so 32M page is the gigantic page.

The arm64 MMU supports a Contiguous bit which is a hint that the PTE
is one of a set of contiguous entries which can be cached in a single
TLB entry. Please refer to the arm64v8 mannul :
DDI0487A_f_armv8_arm.pdf (in page D4-1811)

(2) The bugs
After I tested the libhugetlbfs, I found several bugs in arm64 code.
This patch set has all the bug fixes for the arm64.

(3) The test result in the Softiron and Juno-r1 boards:

This detail test result shows below (both the "make func" & "make stress"):

4KB granule:

1.1) PTE + Contiguous bit : 4K x 16 = 64K (per huge page size)
Test result : PASS

1.2) PMD : 2M x 1 = 2M (per huge page size)
Test result : PASS

1.3) PMD + Contiguous bit : 2M x 16 = 32M (per huge page size)
Test result : PASS

64KB granule:

3.1) PTE + Contiguous bit : 64K x 32 = 2M (per huge page size)
Test result : PASS

3.2) PMD + Contiguous bit : 512M x 32 = 16G (per huge page size)
Test result : no hardware to support this test

v1 -- > v2:
1.) remove the redundant checks for PGD/PUD


Huang Shijie (2):
arm64: hugetlb: remove the wrong pmd check in find_num_contig()
arm64: hugetlb: fix the wrong address for several functions

arch/arm64/mm/hugetlbpage.c | 20 ++++----------------
1 file changed, 4 insertions(+), 16 deletions(-)

--
2.5.5


2016-11-08 09:20:19

by Huang Shijie

[permalink] [raw]
Subject: [PATCH v2 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()

The find_num_contig() will return 1 when the pmd is not present.
It will cause a kernel dead loop in the following scenaro:

1.) pmd entry is not present.

2.) the page fault occurs:
... hugetlb_fault() --> hugetlb_no_page() --> set_huge_pte_at()

3.) set_huge_pte_at() will only set the first PMD entry, since the
find_num_contig just return 1 in this case. So the PMD entries
are all empty except the first one.

4.) when kernel accesses the address mapped by the second PMD entry,
a new page fault occurs:
... hugetlb_fault() --> huge_ptep_set_access_flags()

The second PMD entry is still empty now.

5.) When the kernel returns, the access will cause a page fault again.
The kernel will run like the "4)" above.
We will see a dead loop since here.

The dead loop is caught in the 32M hugetlb page (2M PMD + Contiguous bit).

This patch removes wrong pmd check, and fixes this dead loop.

This patch also removes the redundant checks for PGD/PUD in
the find_num_contig().

Acked-by: Steve Capper <[email protected]>
Signed-off-by: Huang Shijie <[email protected]>
---
arch/arm64/mm/hugetlbpage.c | 12 ------------
1 file changed, 12 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 2e49bd2..b0d3f8b 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -51,20 +51,8 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
*pgsize = PAGE_SIZE;
if (!pte_cont(pte))
return 1;
- if (!pgd_present(*pgd)) {
- VM_BUG_ON(!pgd_present(*pgd));
- return 1;
- }
pud = pud_offset(pgd, addr);
- if (!pud_present(*pud)) {
- VM_BUG_ON(!pud_present(*pud));
- return 1;
- }
pmd = pmd_offset(pud, addr);
- if (!pmd_present(*pmd)) {
- VM_BUG_ON(!pmd_present(*pmd));
- return 1;
- }
if ((pte_t *)pmd == ptep) {
*pgsize = PMD_SIZE;
return CONT_PMDS;
--
2.5.5

2016-11-08 14:09:16

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] arm64: fix the bugs found in the hugetlb test

On Tue, Nov 08, 2016 at 01:44:37PM +0800, Huang Shijie wrote:
> (3) The test result in the Softiron and Juno-r1 boards:
>
> This detail test result shows below (both the "make func" & "make stress"):
>
> 4KB granule:
>
> 1.1) PTE + Contiguous bit : 4K x 16 = 64K (per huge page size)
> Test result : PASS
>
> 1.2) PMD : 2M x 1 = 2M (per huge page size)
> Test result : PASS
>
> 1.3) PMD + Contiguous bit : 2M x 16 = 32M (per huge page size)
> Test result : PASS
>
> 64KB granule:
>
> 3.1) PTE + Contiguous bit : 64K x 32 = 2M (per huge page size)
> Test result : PASS
>
> 3.2) PMD + Contiguous bit : 512M x 32 = 16G (per huge page size)
> Test result : no hardware to support this test

Don't we have support for single (non-contiguous) PMD huge page with 64K
pages (512M per huge page)? I gave it a try and it seems to work (though
without your patches applied ;)).

> Huang Shijie (2):
> arm64: hugetlb: remove the wrong pmd check in find_num_contig()
> arm64: hugetlb: fix the wrong address for several functions

For these patches:

Reviewed-by: Catalin Marinas <[email protected]>

I'm not sure whether Will plans to push them into 4.9. AFAICT, the
contiguous huge pages never worked properly, so we may not count it as a
regression but a new feature. If Will doesn't take them, I'll queue the
patches for 4.10.

BTW, I think we also need to fix this (no functional change though):

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 2e49bd252fe7..0a4c97b618ec 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -323,7 +323,7 @@ __setup("hugepagesz=", setup_hugepagesz);
static __init int add_default_hugepagesz(void)
{
if (size_to_hstate(CONT_PTES * PAGE_SIZE) == NULL)
- hugetlb_add_hstate(CONT_PMD_SHIFT);
+ hugetlb_add_hstate(CONT_PTE_SHIFT);
return 0;
}
arch_initcall(add_default_hugepagesz);

--
Catalin

2016-11-08 16:36:42

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] arm64: fix the bugs found in the hugetlb test

On Tue, Nov 08, 2016 at 02:09:09PM +0000, Catalin Marinas wrote:
> On Tue, Nov 08, 2016 at 01:44:37PM +0800, Huang Shijie wrote:
> > (3) The test result in the Softiron and Juno-r1 boards:
> >
> > This detail test result shows below (both the "make func" & "make stress"):
> >
> > 4KB granule:
> >
> > 1.1) PTE + Contiguous bit : 4K x 16 = 64K (per huge page size)
> > Test result : PASS
> >
> > 1.2) PMD : 2M x 1 = 2M (per huge page size)
> > Test result : PASS
> >
> > 1.3) PMD + Contiguous bit : 2M x 16 = 32M (per huge page size)
> > Test result : PASS
> >
> > 64KB granule:
> >
> > 3.1) PTE + Contiguous bit : 64K x 32 = 2M (per huge page size)
> > Test result : PASS
> >
> > 3.2) PMD + Contiguous bit : 512M x 32 = 16G (per huge page size)
> > Test result : no hardware to support this test
>
> Don't we have support for single (non-contiguous) PMD huge page with 64K
> pages (512M per huge page)? I gave it a try and it seems to work (though
> without your patches applied ;)).
>
> > Huang Shijie (2):
> > arm64: hugetlb: remove the wrong pmd check in find_num_contig()
> > arm64: hugetlb: fix the wrong address for several functions
>
> For these patches:
>
> Reviewed-by: Catalin Marinas <[email protected]>
>
> I'm not sure whether Will plans to push them into 4.9. AFAICT, the
> contiguous huge pages never worked properly, so we may not count it as a
> regression but a new feature. If Will doesn't take them, I'll queue the
> patches for 4.10.

Right, given that it's never worked and the failure only seems to crop up
in synthetic testing, I think you can queue these for 4.10.

Will

2016-11-08 17:19:03

by Huang Shijie

[permalink] [raw]
Subject: [PATCH v2 2/2] arm64: hugetlb: fix the wrong address for several functions

The libhugetlbfs meets several failures since the following functions
do not use the correct address:
huge_ptep_get_and_clear()
huge_ptep_set_access_flags()
huge_ptep_set_wrprotect()
huge_ptep_clear_flush()

This patch fixes the wrong address for them.

Signed-off-by: Huang Shijie <[email protected]>
---
arch/arm64/mm/hugetlbpage.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index b0d3f8b..fd96ba7 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -200,7 +200,7 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
ncontig = find_num_contig(mm, addr, cpte, *cpte, &pgsize);
/* save the 1st pte to return */
pte = ptep_get_and_clear(mm, addr, cpte);
- for (i = 1; i < ncontig; ++i) {
+ for (i = 1, addr += pgsize; i < ncontig; ++i, addr += pgsize) {
/*
* If HW_AFDBM is enabled, then the HW could
* turn on the dirty bit for any of the page
@@ -238,7 +238,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
pfn = pte_pfn(*cpte);
ncontig = find_num_contig(vma->vm_mm, addr, cpte,
*cpte, &pgsize);
- for (i = 0; i < ncontig; ++i, ++cpte) {
+ for (i = 0; i < ncontig; ++i, ++cpte, addr += pgsize) {
changed = ptep_set_access_flags(vma, addr, cpte,
pfn_pte(pfn,
hugeprot),
@@ -261,7 +261,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,

cpte = huge_pte_offset(mm, addr);
ncontig = find_num_contig(mm, addr, cpte, *cpte, &pgsize);
- for (i = 0; i < ncontig; ++i, ++cpte)
+ for (i = 0; i < ncontig; ++i, ++cpte, addr += pgsize)
ptep_set_wrprotect(mm, addr, cpte);
} else {
ptep_set_wrprotect(mm, addr, ptep);
@@ -279,7 +279,7 @@ void huge_ptep_clear_flush(struct vm_area_struct *vma,
cpte = huge_pte_offset(vma->vm_mm, addr);
ncontig = find_num_contig(vma->vm_mm, addr, cpte,
*cpte, &pgsize);
- for (i = 0; i < ncontig; ++i, ++cpte)
+ for (i = 0; i < ncontig; ++i, ++cpte, addr += pgsize)
ptep_clear_flush(vma, addr, cpte);
} else {
ptep_clear_flush(vma, addr, ptep);
--
2.5.5

2016-11-09 02:16:31

by Huang Shijie

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] arm64: fix the bugs found in the hugetlb test

Hi Will & Catalin,

On Tue, Nov 08, 2016 at 04:36:43PM +0000, Will Deacon wrote:
> On Tue, Nov 08, 2016 at 02:09:09PM +0000, Catalin Marinas wrote:
> > On Tue, Nov 08, 2016 at 01:44:37PM +0800, Huang Shijie wrote:
> > > (3) The test result in the Softiron and Juno-r1 boards:
> > >
> > > This detail test result shows below (both the "make func" & "make stress"):
> > >
> > > 4KB granule:
> > >
> > > 1.1) PTE + Contiguous bit : 4K x 16 = 64K (per huge page size)
> > > Test result : PASS
> > >
> > > 1.2) PMD : 2M x 1 = 2M (per huge page size)
> > > Test result : PASS
> > >
> > > 1.3) PMD + Contiguous bit : 2M x 16 = 32M (per huge page size)
> > > Test result : PASS
> > >
> > > 64KB granule:
> > >
> > > 3.1) PTE + Contiguous bit : 64K x 32 = 2M (per huge page size)
> > > Test result : PASS
> > >
> > > 3.2) PMD + Contiguous bit : 512M x 32 = 16G (per huge page size)
> > > Test result : no hardware to support this test
> >
> > Don't we have support for single (non-contiguous) PMD huge page with 64K
> > pages (512M per huge page)? I gave it a try and it seems to work (though
> > without your patches applied ;)).
Yes, it should be okay. This patch set does not affect the the single
PMD huge page.

> >
> > > Huang Shijie (2):
> > > arm64: hugetlb: remove the wrong pmd check in find_num_contig()
> > > arm64: hugetlb: fix the wrong address for several functions
> >
> > For these patches:
> >
> > Reviewed-by: Catalin Marinas <[email protected]>
Thanks you, Catalin.

> >
> > I'm not sure whether Will plans to push them into 4.9. AFAICT, the
> > contiguous huge pages never worked properly, so we may not count it as a
> > regression but a new feature. If Will doesn't take them, I'll queue the
> > patches for 4.10.
>
> Right, given that it's never worked and the failure only seems to crop up
> in synthetic testing, I think you can queue these for 4.10.
Okay. Thank for queue them for 4.10.

Thanks
Huang Shijie