2009-09-21 15:08:20

by KOSAKI Motohiro

[permalink] [raw]
Subject: a patch drop request in -mm

Mel,

Today, my test found following patch makes false-positive warning.
because, truncate can free the pages
although the pages are mlock()ed.

So, I think following patch should be dropped.
.. or, do you think truncate should clear PG_mlock before free the page?

Can I ask your patch intention?


=============================================================
commit 7a06930af46eb39351cbcdc1ab98701259f9a72c
Author: Mel Gorman <[email protected]>
Date: Tue Aug 25 00:43:07 2009 +0200

When a page is freed with the PG_mlocked set, it is considered an
unexpected but recoverable situation. A counter records how often this
event happens but it is easy to miss that this event has occured at
all. This patch warns once when PG_mlocked is set to prompt debuggers
to check the counter to see how often it is happening.

Signed-off-by: Mel Gorman <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 28c2f3e..251fd73 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -494,6 +494,11 @@ static inline void __free_one_page(struct page *page,
*/
static inline void free_page_mlock(struct page *page)
{
+ WARN_ONCE(1, KERN_WARNING
+ "Page flag mlocked set for process %s at pfn:%05lx\n"
+ "page:%p flags:%#lx\n",
+ current->comm, page_to_pfn(page),
+ page, page->flags|__PG_MLOCKED);
__dec_zone_page_state(page, NR_MLOCK);
__count_vm_event(UNEVICTABLE_MLOCKFREED);
}


2009-09-21 15:08:30

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: a patch drop request in -mm

2009/9/22 KOSAKI Motohiro <[email protected]>:
> Mel,
>
> Today, my test found following patch makes false-positive warning.
> because, truncate can free the pages
> although the pages are mlock()ed.
>
> So, I think following patch should be dropped.
> .. or, do you think truncate should clear PG_mlock before free the page?
>
> Can I ask your patch intention?

stacktrace is here.


------------[ cut here ]------------
WARNING: at mm/page_alloc.c:502 free_page_mlock+0x84/0xce()
Hardware name: PowerEdge T105
Page flag mlocked set for process dd at pfn:172d4b
page:ffffea00096a6678 flags:0x700000000400000
Modules linked in: fuse usbhid bridge stp llc nfsd lockd nfs_acl
exportfs sunrpc cpufreq_ondemand powernow_k8 freq_table dm_multipath
kvm_amd kvm serio_raw e1000e i2c_nforce2 i2c_core tg3 dcdbas sr_mod
cdrom pata_acpi sata_nv uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
Pid: 27030, comm: dd Tainted: G W 2.6.31-rc9-mm1 #13
Call Trace:
[<ffffffff8105fd76>] warn_slowpath_common+0x8d/0xbb
[<ffffffff8105fe31>] warn_slowpath_fmt+0x50/0x66
[<ffffffff81102483>] ? mempool_alloc+0x80/0x146
[<ffffffff811060fb>] free_page_mlock+0x84/0xce
[<ffffffff8110640a>] free_hot_cold_page+0x105/0x20b
[<ffffffff81106597>] __pagevec_free+0x87/0xb2
[<ffffffff8110ad61>] release_pages+0x17c/0x1e8
[<ffffffff810a24b8>] ? trace_hardirqs_on_caller+0x32/0x17b
[<ffffffff8112ff82>] free_pages_and_swap_cache+0x72/0xa3
[<ffffffff8111f2f4>] tlb_flush_mmu+0x46/0x68
[<ffffffff8111f935>] unmap_vmas+0x61f/0x85b
[<ffffffff810a24b8>] ? trace_hardirqs_on_caller+0x32/0x17b
[<ffffffff8111fc2b>] zap_page_range+0xba/0xf9
[<ffffffff8111fce4>] unmap_mapping_range_vma+0x7a/0xff
[<ffffffff8111ff2f>] unmap_mapping_range+0x1c6/0x26d
[<ffffffff8110c407>] truncate_pagecache+0x49/0x85
[<ffffffff8117bd84>] simple_setsize+0x44/0x64
[<ffffffff8110b856>] vmtruncate+0x25/0x5f
[<ffffffff81170558>] inode_setattr+0x4a/0x83
[<ffffffff811e817b>] ext4_setattr+0x26b/0x314
[<ffffffff8117088f>] ? notify_change+0x19c/0x31d
[<ffffffff811708ac>] notify_change+0x1b9/0x31d
[<ffffffff81150556>] do_truncate+0x7b/0xac
[<ffffffff811606c1>] ? get_write_access+0x59/0x76
[<ffffffff81163019>] may_open+0x1c0/0x1d3
[<ffffffff811638bd>] do_filp_open+0x4c3/0x998
[<ffffffff81171d80>] ? alloc_fd+0x4a/0x14b
[<ffffffff81171e5b>] ? alloc_fd+0x125/0x14b
[<ffffffff8114f472>] do_sys_open+0x6f/0x14f
[<ffffffff8114f5bf>] sys_open+0x33/0x49
[<ffffffff8100bf72>] system_call_fastpath+0x16/0x1b
---[ end trace e76f92f117e9e06e ]---

2009-09-21 15:22:15

by Mel Gorman

[permalink] [raw]
Subject: Re: a patch drop request in -mm

On Tue, Sep 22, 2009 at 12:00:51AM +0900, KOSAKI Motohiro wrote:
> Mel,
>
> Today, my test found following patch makes false-positive warning.
> because, truncate can free the pages
> although the pages are mlock()ed.
>
> So, I think following patch should be dropped.
> .. or, do you think truncate should clear PG_mlock before free the page?
>

Is there a reason that truncate cannot clear PG_mlock before freeing the
page?

> Can I ask your patch intention?


Locked pages being freed to the page allocator were considered
unexpected and a counter was in place to determine how often that
situation occurred. However, I considered it unlikely that the counter
would be noticed so the warning was put in place to catch what class of
pages were getting freed locked inappropriately. I think a few anomolies
have been cleared up since. Ultimately, it should have been safe to
delete the check.

>
>
> =============================================================
> commit 7a06930af46eb39351cbcdc1ab98701259f9a72c
> Author: Mel Gorman <[email protected]>
> Date: Tue Aug 25 00:43:07 2009 +0200
>
> When a page is freed with the PG_mlocked set, it is considered an
> unexpected but recoverable situation. A counter records how often this
> event happens but it is easy to miss that this event has occured at
> all. This patch warns once when PG_mlocked is set to prompt debuggers
> to check the counter to see how often it is happening.
>
> Signed-off-by: Mel Gorman <[email protected]>
> Reviewed-by: Christoph Lameter <[email protected]>
> Acked-by: Johannes Weiner <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 28c2f3e..251fd73 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -494,6 +494,11 @@ static inline void __free_one_page(struct page *page,
> */
> static inline void free_page_mlock(struct page *page)
> {
> + WARN_ONCE(1, KERN_WARNING
> + "Page flag mlocked set for process %s at pfn:%05lx\n"
> + "page:%p flags:%#lx\n",
> + current->comm, page_to_pfn(page),
> + page, page->flags|__PG_MLOCKED);
> __dec_zone_page_state(page, NR_MLOCK);
> __count_vm_event(UNEVICTABLE_MLOCKFREED);
> }
>

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2009-09-21 17:33:54

by Johannes Weiner

[permalink] [raw]
Subject: Re: a patch drop request in -mm

Hi,

On Tue, Sep 22, 2009 at 12:08:28AM +0900, KOSAKI Motohiro wrote:
> 2009/9/22 KOSAKI Motohiro <[email protected]>:
> > Mel,
> >
> > Today, my test found following patch makes false-positive warning.
> > because, truncate can free the pages
> > although the pages are mlock()ed.
> >
> > So, I think following patch should be dropped.
> > .. or, do you think truncate should clear PG_mlock before free the page?
> >
> > Can I ask your patch intention?
>
> stacktrace is here.
>
>
> ------------[ cut here ]------------
> WARNING: at mm/page_alloc.c:502 free_page_mlock+0x84/0xce()
> Hardware name: PowerEdge T105
> Page flag mlocked set for process dd at pfn:172d4b
> page:ffffea00096a6678 flags:0x700000000400000
> Modules linked in: fuse usbhid bridge stp llc nfsd lockd nfs_acl
> exportfs sunrpc cpufreq_ondemand powernow_k8 freq_table dm_multipath
> kvm_amd kvm serio_raw e1000e i2c_nforce2 i2c_core tg3 dcdbas sr_mod
> cdrom pata_acpi sata_nv uhci_hcd ohci_hcd ehci_hcd usbcore [last
> unloaded: scsi_wait_scan]
> Pid: 27030, comm: dd Tainted: G W 2.6.31-rc9-mm1 #13
> Call Trace:
> [<ffffffff8105fd76>] warn_slowpath_common+0x8d/0xbb
> [<ffffffff8105fe31>] warn_slowpath_fmt+0x50/0x66
> [<ffffffff81102483>] ? mempool_alloc+0x80/0x146
> [<ffffffff811060fb>] free_page_mlock+0x84/0xce
> [<ffffffff8110640a>] free_hot_cold_page+0x105/0x20b
> [<ffffffff81106597>] __pagevec_free+0x87/0xb2
> [<ffffffff8110ad61>] release_pages+0x17c/0x1e8
> [<ffffffff810a24b8>] ? trace_hardirqs_on_caller+0x32/0x17b
> [<ffffffff8112ff82>] free_pages_and_swap_cache+0x72/0xa3
> [<ffffffff8111f2f4>] tlb_flush_mmu+0x46/0x68
> [<ffffffff8111f935>] unmap_vmas+0x61f/0x85b
> [<ffffffff810a24b8>] ? trace_hardirqs_on_caller+0x32/0x17b
> [<ffffffff8111fc2b>] zap_page_range+0xba/0xf9
> [<ffffffff8111fce4>] unmap_mapping_range_vma+0x7a/0xff
> [<ffffffff8111ff2f>] unmap_mapping_range+0x1c6/0x26d
> [<ffffffff8110c407>] truncate_pagecache+0x49/0x85
> [<ffffffff8117bd84>] simple_setsize+0x44/0x64
> [<ffffffff8110b856>] vmtruncate+0x25/0x5f

This calls unmap_mapping_range() before actually munlocking the page.

Other unmappers like do_munmap() and exit_mmap() munlock explicitely
before unmapping.

We could do the same here but I would argue that mlock lifetime
depends on actual userspace mappings and then move the munlocking a
few levels down into the unmapping guts to make this implicit.

Because truncation makes sure pages get unmapped, this is handled too.

Below is roughly outlined and untested demonstration patch. What do
you think?

> [<ffffffff81170558>] inode_setattr+0x4a/0x83
> [<ffffffff811e817b>] ext4_setattr+0x26b/0x314
> [<ffffffff8117088f>] ? notify_change+0x19c/0x31d
> [<ffffffff811708ac>] notify_change+0x1b9/0x31d
> [<ffffffff81150556>] do_truncate+0x7b/0xac
> [<ffffffff811606c1>] ? get_write_access+0x59/0x76
> [<ffffffff81163019>] may_open+0x1c0/0x1d3
> [<ffffffff811638bd>] do_filp_open+0x4c3/0x998
> [<ffffffff81171d80>] ? alloc_fd+0x4a/0x14b
> [<ffffffff81171e5b>] ? alloc_fd+0x125/0x14b
> [<ffffffff8114f472>] do_sys_open+0x6f/0x14f
> [<ffffffff8114f5bf>] sys_open+0x33/0x49
> [<ffffffff8100bf72>] system_call_fastpath+0x16/0x1b
> ---[ end trace e76f92f117e9e06e ]---

---

diff --git a/mm/internal.h b/mm/internal.h
index f290c4d..0d3c6c6 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -67,10 +67,6 @@ extern long mlock_vma_pages_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end);
extern void munlock_vma_pages_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end);
-static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
-{
- munlock_vma_pages_range(vma, vma->vm_start, vma->vm_end);
-}
#endif

/*
diff --git a/mm/memory.c b/mm/memory.c
index aede2ce..f8c5ac6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -971,7 +971,7 @@ unsigned long unmap_vmas(struct mmu_gather **tlbp,

mmu_notifier_invalidate_range_start(mm, start_addr, end_addr);
for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next) {
- unsigned long end;
+ unsigned long end, nr_pages;

start = max(vma->vm_start, start_addr);
if (start >= vma->vm_end)
@@ -980,8 +980,15 @@ unsigned long unmap_vmas(struct mmu_gather **tlbp,
if (end <= vma->vm_start)
continue;

+ nr_pages = (end - start) >> PAGE_SHIFT;
+
+ if (vma->vm_flags & VM_LOCKED) {
+ mm->locked_vm -= nr_pages;
+ munlock_vma_pages_range(vma, start, end);
+ }
+
if (vma->vm_flags & VM_ACCOUNT)
- *nr_accounted += (end - start) >> PAGE_SHIFT;
+ *nr_accounted += nr_pages;

if (unlikely(is_pfn_mapping(vma)))
untrack_pfn_vma(vma, 0, 0);
diff --git a/mm/mmap.c b/mm/mmap.c
index 8101de4..02189f3 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1921,20 +1921,6 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len)
vma = prev? prev->vm_next: mm->mmap;

/*
- * unlock any mlock()ed ranges before detaching vmas
- */
- if (mm->locked_vm) {
- struct vm_area_struct *tmp = vma;
- while (tmp && tmp->vm_start < end) {
- if (tmp->vm_flags & VM_LOCKED) {
- mm->locked_vm -= vma_pages(tmp);
- munlock_vma_pages_all(tmp);
- }
- tmp = tmp->vm_next;
- }
- }
-
- /*
* Remove the vma's, and unmap the actual pages
*/
detach_vmas_to_be_unmapped(mm, vma, prev, end);
@@ -2089,15 +2075,6 @@ void exit_mmap(struct mm_struct *mm)
/* mm's last user has gone, and its about to be pulled down */
mmu_notifier_release(mm);

- if (mm->locked_vm) {
- vma = mm->mmap;
- while (vma) {
- if (vma->vm_flags & VM_LOCKED)
- munlock_vma_pages_all(vma);
- vma = vma->vm_next;
- }
- }
-
arch_exit_mmap(mm);

vma = mm->mmap;
diff --git a/mm/truncate.c b/mm/truncate.c
index ccc3ecf..a4e3b8f 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -104,7 +104,6 @@ truncate_complete_page(struct address_space *mapping, struct page *page)

cancel_dirty_page(page, PAGE_CACHE_SIZE);

- clear_page_mlock(page);
remove_from_page_cache(page);
ClearPageMappedToDisk(page);
page_cache_release(page); /* pagecache ref */
@@ -129,7 +128,6 @@ invalidate_complete_page(struct address_space *mapping, struct page *page)
if (page_has_private(page) && !try_to_release_page(page, 0))
return 0;

- clear_page_mlock(page);
ret = remove_mapping(mapping, page);

return ret;
@@ -348,7 +346,6 @@ invalidate_complete_page2(struct address_space *mapping, struct page *page)
if (PageDirty(page))
goto failed;

- clear_page_mlock(page);
BUG_ON(page_has_private(page));
__remove_from_page_cache(page);
spin_unlock_irq(&mapping->tree_lock);

2009-09-21 18:02:30

by Hugh Dickins

[permalink] [raw]
Subject: Re: a patch drop request in -mm

On Mon, 21 Sep 2009, Johannes Weiner wrote:
>
> This calls unmap_mapping_range() before actually munlocking the page.
>
> Other unmappers like do_munmap() and exit_mmap() munlock explicitely
> before unmapping.
>
> We could do the same here but I would argue that mlock lifetime
> depends on actual userspace mappings and then move the munlocking a
> few levels down into the unmapping guts to make this implicit.
>
> Because truncation makes sure pages get unmapped, this is handled too.
>
> Below is roughly outlined and untested demonstration patch. What do
> you think?

That certainly looks appealing, but is it actually correct?

I'm thinking that munlock_vma_pages_range() clears VM_LOCKED
from vm_flags, which would be incorrect in the truncation case;
and that the VM_NONLINEAR truncation case only zaps certain
pages in the larger range that it is applied to.

Hugh

> diff --git a/mm/internal.h b/mm/internal.h
> index f290c4d..0d3c6c6 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -67,10 +67,6 @@ extern long mlock_vma_pages_range(struct vm_area_struct *vma,
> unsigned long start, unsigned long end);
> extern void munlock_vma_pages_range(struct vm_area_struct *vma,
> unsigned long start, unsigned long end);
> -static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
> -{
> - munlock_vma_pages_range(vma, vma->vm_start, vma->vm_end);
> -}
> #endif
>
> /*
> diff --git a/mm/memory.c b/mm/memory.c
> index aede2ce..f8c5ac6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -971,7 +971,7 @@ unsigned long unmap_vmas(struct mmu_gather **tlbp,
>
> mmu_notifier_invalidate_range_start(mm, start_addr, end_addr);
> for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next) {
> - unsigned long end;
> + unsigned long end, nr_pages;
>
> start = max(vma->vm_start, start_addr);
> if (start >= vma->vm_end)
> @@ -980,8 +980,15 @@ unsigned long unmap_vmas(struct mmu_gather **tlbp,
> if (end <= vma->vm_start)
> continue;
>
> + nr_pages = (end - start) >> PAGE_SHIFT;
> +
> + if (vma->vm_flags & VM_LOCKED) {
> + mm->locked_vm -= nr_pages;
> + munlock_vma_pages_range(vma, start, end);
> + }
> +
> if (vma->vm_flags & VM_ACCOUNT)
> - *nr_accounted += (end - start) >> PAGE_SHIFT;
> + *nr_accounted += nr_pages;
>
> if (unlikely(is_pfn_mapping(vma)))
> untrack_pfn_vma(vma, 0, 0);
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 8101de4..02189f3 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1921,20 +1921,6 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len)
> vma = prev? prev->vm_next: mm->mmap;
>
> /*
> - * unlock any mlock()ed ranges before detaching vmas
> - */
> - if (mm->locked_vm) {
> - struct vm_area_struct *tmp = vma;
> - while (tmp && tmp->vm_start < end) {
> - if (tmp->vm_flags & VM_LOCKED) {
> - mm->locked_vm -= vma_pages(tmp);
> - munlock_vma_pages_all(tmp);
> - }
> - tmp = tmp->vm_next;
> - }
> - }
> -
> - /*
> * Remove the vma's, and unmap the actual pages
> */
> detach_vmas_to_be_unmapped(mm, vma, prev, end);
> @@ -2089,15 +2075,6 @@ void exit_mmap(struct mm_struct *mm)
> /* mm's last user has gone, and its about to be pulled down */
> mmu_notifier_release(mm);
>
> - if (mm->locked_vm) {
> - vma = mm->mmap;
> - while (vma) {
> - if (vma->vm_flags & VM_LOCKED)
> - munlock_vma_pages_all(vma);
> - vma = vma->vm_next;
> - }
> - }
> -
> arch_exit_mmap(mm);
>
> vma = mm->mmap;
> diff --git a/mm/truncate.c b/mm/truncate.c
> index ccc3ecf..a4e3b8f 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -104,7 +104,6 @@ truncate_complete_page(struct address_space *mapping, struct page *page)
>
> cancel_dirty_page(page, PAGE_CACHE_SIZE);
>
> - clear_page_mlock(page);
> remove_from_page_cache(page);
> ClearPageMappedToDisk(page);
> page_cache_release(page); /* pagecache ref */
> @@ -129,7 +128,6 @@ invalidate_complete_page(struct address_space *mapping, struct page *page)
> if (page_has_private(page) && !try_to_release_page(page, 0))
> return 0;
>
> - clear_page_mlock(page);
> ret = remove_mapping(mapping, page);
>
> return ret;
> @@ -348,7 +346,6 @@ invalidate_complete_page2(struct address_space *mapping, struct page *page)
> if (PageDirty(page))
> goto failed;
>
> - clear_page_mlock(page);
> BUG_ON(page_has_private(page));
> __remove_from_page_cache(page);
> spin_unlock_irq(&mapping->tree_lock);
>

2009-09-24 00:40:43

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: a patch drop request in -mm

> On Tue, Sep 22, 2009 at 12:00:51AM +0900, KOSAKI Motohiro wrote:
> > Mel,
> >
> > Today, my test found following patch makes false-positive warning.
> > because, truncate can free the pages
> > although the pages are mlock()ed.
> >
> > So, I think following patch should be dropped.
> > .. or, do you think truncate should clear PG_mlock before free the page?
>
> Is there a reason that truncate cannot clear PG_mlock before freeing the
> page?

CC to Lee.
IIRC, Lee tried it at first. but after some trouble, he decided change free_hot_cold_page().
but unfortunately, I don't recall the reason ;-)

Lee, Can you recall it?


> > Can I ask your patch intention?
>
> Locked pages being freed to the page allocator were considered
> unexpected and a counter was in place to determine how often that
> situation occurred. However, I considered it unlikely that the counter
> would be noticed so the warning was put in place to catch what class of
> pages were getting freed locked inappropriately. I think a few anomolies
> have been cleared up since. Ultimately, it should have been safe to
> delete the check.

OK. it seems reasonable. so, I only hope no see linus tree output false-positive warnings.
Thus, I propse

- don't merge this patch to linus tree
- but, no drop from -mm
it be holded in mm until this issue fixed.
- I'll working on fixing this issue.

I think this is enough fair.


Hannes, I'm sorry. I haven't review your patch. I'm too busy now. please gime me more
sevaral time.


2009-09-24 01:47:14

by Lee Schermerhorn

[permalink] [raw]
Subject: Re: a patch drop request in -mm

On Thu, 2009-09-24 at 09:40 +0900, KOSAKI Motohiro wrote:
> > On Tue, Sep 22, 2009 at 12:00:51AM +0900, KOSAKI Motohiro wrote:
> > > Mel,
> > >
> > > Today, my test found following patch makes false-positive warning.
> > > because, truncate can free the pages
> > > although the pages are mlock()ed.
> > >
> > > So, I think following patch should be dropped.
> > > .. or, do you think truncate should clear PG_mlock before free the page?
> >
> > Is there a reason that truncate cannot clear PG_mlock before freeing the
> > page?
>
> CC to Lee.
> IIRC, Lee tried it at first. but after some trouble, he decided change free_hot_cold_page().
> but unfortunately, I don't recall the reason ;-)
>
> Lee, Can you recall it?

Well, truncation does call clear_page_mlock() for this purpose. This
should always succeed in clearing PG_mlock, altho' I suppose it could be
set from somewhere else after that? Looking at the 2.6.31 sources, I
see that there is a call to page_cache_release() in
truncate_inode_pages_range() that doesn't have a corresponding
clear_page_mlock() associated with it. Perhaps we missed this one, or
it's been added since w/o munlocking the page.

If you can eliminate the false positive, I think it would be good to
keep the warning in place. There might be other "leaks" of mlocked
pages that aren't as benign as this. But, keeping it in -mm until it's
sorted out sound reasonable to me


>
>
> > > Can I ask your patch intention?
> >
> > Locked pages being freed to the page allocator were considered
> > unexpected and a counter was in place to determine how often that
> > situation occurred. However, I considered it unlikely that the counter
> > would be noticed so the warning was put in place to catch what class of
> > pages were getting freed locked inappropriately. I think a few anomolies
> > have been cleared up since. Ultimately, it should have been safe to
> > delete the check.
>
> OK. it seems reasonable. so, I only hope no see linus tree output false-positive warnings.
> Thus, I propse
>
> - don't merge this patch to linus tree
> - but, no drop from -mm
> it be holded in mm until this issue fixed.
> - I'll working on fixing this issue.
>
> I think this is enough fair.
>
>
> Hannes, I'm sorry. I haven't review your patch. I'm too busy now. please gime me more
> sevaral time.
>
>
>

2009-09-24 09:09:21

by Mel Gorman

[permalink] [raw]
Subject: Re: a patch drop request in -mm

On Thu, Sep 24, 2009 at 09:40:34AM +0900, KOSAKI Motohiro wrote:
> > On Tue, Sep 22, 2009 at 12:00:51AM +0900, KOSAKI Motohiro wrote:
> > > Mel,
> > >
> > > Today, my test found following patch makes false-positive warning.
> > > because, truncate can free the pages
> > > although the pages are mlock()ed.
> > >
> > > So, I think following patch should be dropped.
> > > .. or, do you think truncate should clear PG_mlock before free the page?
> >
> > Is there a reason that truncate cannot clear PG_mlock before freeing the
> > page?
>
> CC to Lee.
> IIRC, Lee tried it at first. but after some trouble, he decided change free_hot_cold_page().
> but unfortunately, I don't recall the reason ;-)
>
> Lee, Can you recall it?
>
>
> > > Can I ask your patch intention?
> >
> > Locked pages being freed to the page allocator were considered
> > unexpected and a counter was in place to determine how often that
> > situation occurred. However, I considered it unlikely that the counter
> > would be noticed so the warning was put in place to catch what class of
> > pages were getting freed locked inappropriately. I think a few anomolies
> > have been cleared up since. Ultimately, it should have been safe to
> > delete the check.
>
> OK. it seems reasonable. so, I only hope no see linus tree output false-positive warnings.
> Thus, I propse
>
> - don't merge this patch to linus tree
> - but, no drop from -mm
> it be holded in mm until this issue fixed.
> - I'll working on fixing this issue.
>
> I think this is enough fair.
>

I'm afraid I'm just about to run out the door and will be offline until
Tuesday at the very least. I haven't had the chance to review the patch.
However, I have no problem with this patch not being merged to Linus's tree
if it remains in -mm to catch this and other false positives.

> Hannes, I'm sorry. I haven't review your patch. I'm too busy now. please gime me more
> sevaral time.
>

It'll be Tuesday at the very earliest before I get a chance to review.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2009-09-24 10:38:01

by Johannes Weiner

[permalink] [raw]
Subject: Re: a patch drop request in -mm

Hi,

On Thu, Sep 24, 2009 at 10:09:23AM +0100, Mel Gorman wrote:
> On Thu, Sep 24, 2009 at 09:40:34AM +0900, KOSAKI Motohiro wrote:
> > > On Tue, Sep 22, 2009 at 12:00:51AM +0900, KOSAKI Motohiro wrote:
> > > > Mel,
> > > >
> > > > Today, my test found following patch makes false-positive warning.
> > > > because, truncate can free the pages
> > > > although the pages are mlock()ed.
> > > >
> > > > So, I think following patch should be dropped.
> > > > .. or, do you think truncate should clear PG_mlock before free the page?
> > >
> > > Is there a reason that truncate cannot clear PG_mlock before freeing the
> > > page?
> >
> > CC to Lee.
> > IIRC, Lee tried it at first. but after some trouble, he decided change free_hot_cold_page().
> > but unfortunately, I don't recall the reason ;-)
> >
> > Lee, Can you recall it?
> >
> >
> > > > Can I ask your patch intention?
> > >
> > > Locked pages being freed to the page allocator were considered
> > > unexpected and a counter was in place to determine how often that
> > > situation occurred. However, I considered it unlikely that the counter
> > > would be noticed so the warning was put in place to catch what class of
> > > pages were getting freed locked inappropriately. I think a few anomolies
> > > have been cleared up since. Ultimately, it should have been safe to
> > > delete the check.
> >
> > OK. it seems reasonable. so, I only hope no see linus tree output false-positive warnings.
> > Thus, I propse
> >
> > - don't merge this patch to linus tree
> > - but, no drop from -mm
> > it be holded in mm until this issue fixed.
> > - I'll working on fixing this issue.
> >
> > I think this is enough fair.
> >
>
> I'm afraid I'm just about to run out the door and will be offline until
> Tuesday at the very least. I haven't had the chance to review the patch.
> However, I have no problem with this patch not being merged to Linus's tree
> if it remains in -mm to catch this and other false positives.
>
> > Hannes, I'm sorry. I haven't review your patch. I'm too busy now. please gime me more
> > sevaral time.
> >
>
> It'll be Tuesday at the very earliest before I get a chance to review.

Hugh already pointed out its defects, so the patch as it stands is not
usable.

The problem, if I understood it correctly, is that truncation munlocks
page cache pages but we also unmap (and free) their private COWs,
which can still be mlocked.

So my patch moved the munlocking from truncation to unmap code to make
sure we catch the cows, but for nonlinear unmapping we also want
non-linear munlocking, where my patch is broken.

Perhaps we can do page-wise munlocking in zap_pte_range(), where
zap_details are taken into account. Hopefully, I will have time on
the weekend to look further into it.

Thanks,
Hannes