2011-05-25 07:07:56

by KOSAKI Motohiro

Subject: [PATCH 0/4] fix pagewalk minor issues

While reviewing Stephen's numa_maps improvement, I found that the recent pagewalk
changes introduced some minor issues.

This series fixes them. I think only [1/4] needs to be backported to -stable.


KOSAKI Motohiro (4):
pagewalk: fix walk_page_range() to check find_vma() result properly
pagewalk: don't look up vma if walk->hugetlb_entry is unused
pagewalk: add locking-rule comments
pagewalk: fix code comment for THP

include/linux/mm.h | 1 +
mm/pagewalk.c | 49 ++++++++++++++++++++++++++++++++++++++++++-------
2 files changed, 43 insertions(+), 7 deletions(-)

--
1.7.3.1



2011-05-25 07:08:58

by KOSAKI Motohiro

Subject: [PATCH 1/4] pagewalk: fix walk_page_range() to check find_vma() result properly

The documentation of find_vma() says:

/* Look up the first VMA which satisfies addr < vm_end, NULL if none. */
struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
{
(snip)

Thus, the caller should confirm whether the returned vma actually covers the desired address.
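
To illustrate, here is a minimal sketch (not part of this patch; the caller and
variable names are made up) of how a find_vma() result has to be checked before
the vma can be treated as containing addr:

        /* hypothetical caller -- find_vma() only guarantees addr < vm_end */
        struct vm_area_struct *vma = find_vma(mm, addr);

        if (vma && vma->vm_start <= addr) {
                /* addr really lies inside this vma */
        } else {
                /* addr falls in a hole below the returned vma, or past the last vma */
        }

walk_page_range() was missing the vma->vm_start <= addr part of that check.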

Signed-off-by: KOSAKI Motohiro <[email protected]>
---
mm/pagewalk.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index c3450d5..606bbb4 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -176,7 +176,7 @@ int walk_page_range(unsigned long addr, unsigned long end,
* we can't handled it in the same manner as non-huge pages.
*/
vma = find_vma(walk->mm, addr);
- if (vma && is_vm_hugetlb_page(vma)) {
+ if (vma && vma->vm_start <= addr && is_vm_hugetlb_page(vma)) {
if (vma->vm_end < next)
next = vma->vm_end;
/*
--
1.7.3.1


2011-05-25 07:09:53

by KOSAKI Motohiro

Subject: [PATCH 2/4] pagewalk: don't look up vma if walk->hugetlb_entry is unused

Currently, walk_page_range() calls find_vma() on every page table
walk iteration, but this is completely unnecessary if walk->hugetlb_entry
is unused. And we shouldn't assume find_vma() is a lightweight
operation. Thus this patch checks walk->hugetlb_entry and avoids the
find_vma() call when possible.

This patch also makes two cleanups: 1) remove the ugly uninitialized_var()
and 2) remove the #ifdef from the function body.
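
As a usage illustration (a minimal sketch, not from this patch; the helper and
callback names are made up), a walker that never sets ->hugetlb_entry now skips
the per-iteration find_vma() lookup entirely:

        /* hypothetical helper: count present PTEs in [start, end) of @mm */
        static int count_pte(pte_t *pte, unsigned long addr,
                             unsigned long next, struct mm_walk *walk)
        {
                (*(unsigned long *)walk->private)++;
                return 0;
        }

        static unsigned long count_ptes(struct mm_struct *mm,
                                        unsigned long start, unsigned long end)
        {
                unsigned long nr_ptes = 0;
                struct mm_walk walk = {
                        .pte_entry = count_pte,
                        .mm        = mm,
                        .private   = &nr_ptes,
                        /* .hugetlb_entry left NULL: hugetlb_vma() bails out
                         * immediately, so no find_vma() is done per iteration */
                };

                walk_page_range(start, end, &walk);
                return nr_ptes;
        }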

Signed-off-by: KOSAKI Motohiro <[email protected]>
---
mm/pagewalk.c | 43 +++++++++++++++++++++++++++++++++++++------
1 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index 606bbb4..ee4ff87 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -126,7 +126,39 @@ static int walk_hugetlb_range(struct vm_area_struct *vma,

return 0;
}
-#endif
+
+static struct vm_area_struct* hugetlb_vma(unsigned long addr, struct mm_walk *walk)
+{
+ struct vm_area_struct *vma;
+
+ /* We don't need vma lookup at all. */
+ if (!walk->hugetlb_entry)
+ return NULL;
+
+ VM_BUG_ON(!rwsem_is_locked(&walk->mm->mmap_sem));
+ vma = find_vma(walk->mm, addr);
+ if (vma && vma->vm_start <= addr && is_vm_hugetlb_page(vma))
+ return vma;
+
+ return NULL;
+}
+
+#else /* CONFIG_HUGETLB_PAGE */
+static struct vm_area_struct* hugetlb_vma(unsigned long addr, struct mm_walk *walk)
+{
+ return NULL;
+}
+
+static int walk_hugetlb_range(struct vm_area_struct *vma,
+ unsigned long addr, unsigned long end,
+ struct mm_walk *walk)
+{
+ return 0;
+}
+
+#endif /* CONFIG_HUGETLB_PAGE */
+
+

/**
* walk_page_range - walk a memory map's page tables with a callback
@@ -165,18 +197,17 @@ int walk_page_range(unsigned long addr, unsigned long end,

pgd = pgd_offset(walk->mm, addr);
do {
- struct vm_area_struct *uninitialized_var(vma);
+ struct vm_area_struct *vma;

next = pgd_addr_end(addr, end);

-#ifdef CONFIG_HUGETLB_PAGE
/*
* handle hugetlb vma individually because pagetable walk for
* the hugetlb page is dependent on the architecture and
* we can't handled it in the same manner as non-huge pages.
*/
- vma = find_vma(walk->mm, addr);
- if (vma && vma->vm_start <= addr && is_vm_hugetlb_page(vma)) {
+ vma = hugetlb_vma(addr, walk);
+ if (vma) {
if (vma->vm_end < next)
next = vma->vm_end;
/*
@@ -189,7 +220,7 @@ int walk_page_range(unsigned long addr, unsigned long end,
pgd = pgd_offset(walk->mm, next);
continue;
}
-#endif
+
if (pgd_none_or_clear_bad(pgd)) {
if (walk->pte_hole)
err = walk->pte_hole(addr, next, walk);
--
1.7.3.1


2011-05-25 07:10:47

by KOSAKI Motohiro

Subject: [PATCH 3/4] pagewalk: add locking-rule comments

Originally, walk_hugetlb_range() didn't require the caller to take any lock.
But commit d33b9f45bd (mm: hugetlb: fix hugepage memory leak in
walk_page_range) changed the rule, because it added a find_vma() call
in walk_hugetlb_range().

Any commit that changes a locking rule should also update the documentation.
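
To make the rule concrete, here is a sketch (hypothetical caller, not from this
patch) of a walk that does use ->hugetlb_entry and therefore must hold mmap_sem:

        /* walk.hugetlb_entry != NULL, so the walk may call find_vma() */
        down_read(&mm->mmap_sem);
        walk_page_range(start, end, &walk);
        up_read(&mm->mmap_sem);

Without mmap_sem held at least for read, the find_vma() inside
walk_page_range() could race with the vma list changing under it.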

Signed-off-by: KOSAKI Motohiro <[email protected]>
---
include/linux/mm.h | 1 +
mm/pagewalk.c | 3 +++
2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index dd87a78..7337b66 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -921,6 +921,7 @@ unsigned long unmap_vmas(struct mmu_gather **tlb,
* @pte_entry: if set, called for each non-empty PTE (4th-level) entry
* @pte_hole: if set, called for each hole at all levels
* @hugetlb_entry: if set, called for each hugetlb entry
+ * *Caution*: The caller must hold mmap_sem() if it's used.
*
* (see walk_page_range for more details)
*/
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index ee4ff87..f792940 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -181,6 +181,9 @@ static int walk_hugetlb_range(struct vm_area_struct *vma,
*
* If any callback returns a non-zero value, the walk is aborted and
* the return value is propagated back to the caller. Otherwise 0 is returned.
+ *
+ * walk->mm->mmap_sem must be held for at least read if walk->hugetlb_entry
+ * is !NULL.
*/
int walk_page_range(unsigned long addr, unsigned long end,
struct mm_walk *walk)
--
1.7.3.1


2011-05-25 07:11:38

by KOSAKI Motohiro

Subject: [PATCH 4/4] pagewalk: fix code comment for THP

Commit bae9c19bf1 (thp: split_huge_page_mm/vma) changed the locking behavior
of walk_page_range(). Thus this patch updates the comment too.

Signed-off-by: KOSAKI Motohiro <[email protected]>
---
mm/pagewalk.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index f792940..2f5cf10 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -176,7 +176,8 @@ static int walk_hugetlb_range(struct vm_area_struct *vma,
* associated range, and a copy of the original mm_walk for access to
* the ->private or ->mm fields.
*
- * No locks are taken, but the bottom level iterator will map PTE
+ * Usually no locks are taken, but splitting transparent huge page may
+ * take page table lock. And the bottom level iterator will map PTE
* directories from highmem if necessary.
*
* If any callback returns a non-zero value, the walk is aborted and
--
1.7.3.1