2012-06-14 01:12:21

by Minchan Kim

Subject: [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin

Commit d179e84ba fixed the problem[1] in vmscan.c, but the same problem
exists here. Let's fix it.

[1] http://comments.gmane.org/gmane.linux.kernel.mm/65844

The description below is copied from d179e84ba's changelog.

"It is unsafe to run page_count during the physical pfn scan because
compound_head could trip on a dangling pointer when reading
page->first_page if the compound page is being freed by another CPU."

* changelog from v1
- Add comment about skipping tail pages of THP - Andrea
- fix typo - Wanpeng Li
- based on next-20120613

Cc: Andrea Arcangeli <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: Wanpeng Li <[email protected]>
Signed-off-by: Minchan Kim <[email protected]>
---
mm/page_alloc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 266f267..543cc2d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5496,11 +5496,18 @@ __count_immobile_pages(struct zone *zone, struct page *page, int count)
continue;

page = pfn_to_page(check);
- if (!page_count(page)) {
+ /*
+ * We can't use page_count without pinning the page
+ * because another CPU can free the compound page.
+ * This check already skips compound tails of THP
+ * because their page->_count is zero at all times.
+ */
+ if (!atomic_read(&page->_count)) {
if (PageBuddy(page))
iter += (1 << page_order(page)) - 1;
continue;
}
+
if (!PageLRU(page))
found++;
/*
--
1.7.9.5
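
For readers following the thread, the difference between page_count() and a raw atomic_read(&page->_count) can be modeled in userspace. The following is only a minimal sketch: struct toy_page and the toy_* helpers are hypothetical stand-ins, not the kernel's definitions, and a single-threaded model cannot reproduce the race itself. It only shows which accessor follows the first_page pointer that may dangle while another CPU frees the compound page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct page. */
struct toy_page {
	int count;                   /* models page->_count */
	bool tail;                   /* models PageTail() */
	struct toy_page *first_page; /* models page->first_page (tails only) */
};

/* Models compound_head(): a tail page is redirected to its head page. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	return page->tail ? page->first_page : page;
}

/* Models page_count(): dereferences first_page for tail pages. */
static int toy_page_count(struct toy_page *page)
{
	return toy_compound_head(page)->count;
}

/* Models atomic_read(&page->_count): touches only the page itself. */
static int toy_raw_count(const struct toy_page *page)
{
	return page->count;
}

/* Small self-check showing how the two accessors differ on a tail page. */
int toy_demo(void)
{
	struct toy_page head = { .count = 1, .tail = false, .first_page = NULL };
	struct toy_page tail = { .count = 0, .tail = true, .first_page = &head };

	assert(toy_page_count(&tail) == 1); /* follows first_page to the head */
	assert(toy_raw_count(&tail) == 0);  /* tail's own count stays zero */
	return 0;
}
```

In this model the raw read never touches first_page, which is why the pfn scan can use it without holding a pin on the page; tail pages simply read as zero and are skipped.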


2012-06-14 01:12:34

by Minchan Kim

Subject: [PATCH v2 2/2] mm: clean up __count_immobile_pages

The name __count_immobile_pages is rather awkward.
This patch renames the function to something clearer and adds a comment.

* changelog from v1
- document the page flag race in the function comment
- update the commit changelog

Cc: Andrea Arcangeli <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michal Hocko <[email protected]>
Acked-by: KAMEZAWA Hiroyuki <[email protected]>
Signed-off-by: Minchan Kim <[email protected]>
---
mm/page_alloc.c | 34 ++++++++++++++++++----------------
1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 543cc2d..dc7f8c5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5467,26 +5467,28 @@ void set_pageblock_flags_group(struct page *page, unsigned long flags,
}

/*
- * This is designed as sub function...plz see page_isolation.c also.
- * set/clear page block's type to be ISOLATE.
- * page allocater never alloc memory from ISOLATE block.
+ * This function checks whether pageblock includes unmovable pages or not.
+ * If @count is not zero, it is okay to include less @count unmovable pages
+ *
+ * PageLRU check without isolation or lru_lock could race so that
+ * MIGRATE_MOVABLE block might include unmovable pages. It means you can't
+ * expect this function should be exact.
*/
-
-static int
-__count_immobile_pages(struct zone *zone, struct page *page, int count)
+static bool
+__has_unmovable_pages(struct zone *zone, struct page *page, int count)
{
unsigned long pfn, iter, found;
int mt;

/*
* For avoiding noise data, lru_add_drain_all() should be called
- * If ZONE_MOVABLE, the zone never contains immobile pages
+ * If ZONE_MOVABLE, the zone never contains unmovable pages
*/
if (zone_idx(zone) == ZONE_MOVABLE)
- return true;
+ return false;
mt = get_pageblock_migratetype(page);
if (mt == MIGRATE_MOVABLE || is_migrate_cma(mt))
- return true;
+ return false;

pfn = page_to_pfn(page);
for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
@@ -5524,9 +5526,9 @@ __count_immobile_pages(struct zone *zone, struct page *page, int count)
* page at boot.
*/
if (found > count)
- return false;
+ return true;
}
- return true;
+ return false;
}

bool is_pageblock_removable_nolock(struct page *page)
@@ -5550,7 +5552,7 @@ bool is_pageblock_removable_nolock(struct page *page)
zone->zone_start_pfn + zone->spanned_pages <= pfn)
return false;

- return __count_immobile_pages(zone, page, 0);
+ return !__has_unmovable_pages(zone, page, 0);
}

int set_migratetype_isolate(struct page *page)
@@ -5589,12 +5591,12 @@ int set_migratetype_isolate(struct page *page)
* FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
* We just check MOVABLE pages.
*/
- if (__count_immobile_pages(zone, page, arg.pages_found))
+ if (!__has_unmovable_pages(zone, page, arg.pages_found))
ret = 0;
-
/*
- * immobile means "not-on-lru" paes. If immobile is larger than
- * removable-by-driver pages reported by notifier, we'll fail.
+ * Unmovable means "not-on-lru" pages. If Unmovable pages are
+ * larger than removable-by-driver pages reported by notifier,
+ * we'll fail.
*/

out:
--
1.7.9.5
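
The inverted return convention and the buddy-skip step in the scan loop can be illustrated with a small userspace model. This is a sketch under stated assumptions: struct blk_page and both toy_* functions are hypothetical, a plain array stands in for a pageblock, and the zone and migratetype early exits are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state; not the kernel's struct page. */
struct blk_page {
	int count;      /* models page->_count; 0 means free or a THP tail */
	bool lru;       /* models PageLRU() */
	bool buddy;     /* models PageBuddy() */
	unsigned order; /* models page_order(); meaningful only when buddy */
};

/* New convention: true when more than `count` in-use, non-LRU pages exist. */
static bool toy_has_unmovable_pages(const struct blk_page *block,
				    size_t nr, size_t count)
{
	size_t iter, found = 0;

	for (iter = 0; iter < nr; iter++) {
		const struct blk_page *page = &block[iter];

		if (!page->count) {
			/* A free buddy block of 2^order pages is skipped at once. */
			if (page->buddy)
				iter += ((size_t)1 << page->order) - 1;
			continue;
		}
		if (!page->lru)
			found++;
		if (found > count)
			return true;
	}
	return false;
}

/* Old convention (__count_immobile_pages): true meant "block is fine". */
static bool toy_count_immobile_pages(const struct blk_page *block,
				     size_t nr, size_t count)
{
	return !toy_has_unmovable_pages(block, nr, count);
}

/* Self-check: one in-use non-LRU page is tolerated only when count >= 1. */
int blk_demo(void)
{
	struct blk_page block[2] = {
		{ .count = 1, .lru = true },
		{ .count = 1, .lru = false },
	};

	assert(toy_has_unmovable_pages(block, 2, 0));
	assert(!toy_has_unmovable_pages(block, 2, 1));
	assert(toy_count_immobile_pages(block, 2, 1));
	return 0;
}
```

Callers that used the old helper directly as a "safe to isolate" predicate now negate the new one, which matches the changes to is_pageblock_removable_nolock() and set_migratetype_isolate() in the patch.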

2012-06-14 01:14:09

by Minchan Kim

Subject: Re: [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin

I forgot to Cc Bartlomiej. Sorry!

On 06/14/2012 10:12 AM, Minchan Kim wrote:

> Commit d179e84ba fixed the problem[1] in vmscan.c, but the same problem
> exists here. Let's fix it.
>
> [1] http://comments.gmane.org/gmane.linux.kernel.mm/65844
>
> The description below is copied from d179e84ba's changelog.
>
> "It is unsafe to run page_count during the physical pfn scan because
> compound_head could trip on a dangling pointer when reading
> page->first_page if the compound page is being freed by another CPU."
>
> * changelog from v1
> - Add comment about skipping tail pages of THP - Andrea
> - fix typo - Wanpeng Li
> - based on next-20120613
>
> Cc: Andrea Arcangeli <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: KAMEZAWA Hiroyuki <[email protected]>
> Cc: Wanpeng Li <[email protected]>
> Signed-off-by: Minchan Kim <[email protected]>
> ---
> mm/page_alloc.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 266f267..543cc2d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5496,11 +5496,18 @@ __count_immobile_pages(struct zone *zone, struct page *page, int count)
> continue;
>
> page = pfn_to_page(check);
> - if (!page_count(page)) {
> + /*
> + * We can't use page_count without pinning the page
> + * because another CPU can free the compound page.
> + * This check already skips compound tails of THP
> + * because their page->_count is zero at all times.
> + */
> + if (!atomic_read(&page->_count)) {
> if (PageBuddy(page))
> iter += (1 << page_order(page)) - 1;
> continue;
> }
> +
> if (!PageLRU(page))
> found++;
> /*



--
Kind regards,
Minchan Kim

2012-06-14 01:15:12

by Minchan Kim

Subject: Re: [PATCH v2 2/2] mm: clean up __count_immobile_pages

I forgot to Cc Bartlomiej. Sorry!

On 06/14/2012 10:12 AM, Minchan Kim wrote:

> The name __count_immobile_pages is rather awkward.
> This patch renames the function to something clearer and adds a comment.
>
> * changelog from v1
> - document the page flag race in the function comment
> - update the commit changelog
>
> Cc: Andrea Arcangeli <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Acked-by: KAMEZAWA Hiroyuki <[email protected]>
> Signed-off-by: Minchan Kim <[email protected]>
> ---
> mm/page_alloc.c | 34 ++++++++++++++++++----------------
> 1 file changed, 18 insertions(+), 16 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 543cc2d..dc7f8c5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5467,26 +5467,28 @@ void set_pageblock_flags_group(struct page *page, unsigned long flags,
> }
>
> /*
> - * This is designed as sub function...plz see page_isolation.c also.
> - * set/clear page block's type to be ISOLATE.
> - * page allocater never alloc memory from ISOLATE block.
> + * This function checks whether pageblock includes unmovable pages or not.
> + * If @count is not zero, it is okay to include less @count unmovable pages
> + *
> + * PageLRU check without isolation or lru_lock could race so that
> + * MIGRATE_MOVABLE block might include unmovable pages. It means you can't
> + * expect this function should be exact.
> */
> -
> -static int
> -__count_immobile_pages(struct zone *zone, struct page *page, int count)
> +static bool
> +__has_unmovable_pages(struct zone *zone, struct page *page, int count)
> {
> unsigned long pfn, iter, found;
> int mt;
>
> /*
> * For avoiding noise data, lru_add_drain_all() should be called
> - * If ZONE_MOVABLE, the zone never contains immobile pages
> + * If ZONE_MOVABLE, the zone never contains unmovable pages
> */
> if (zone_idx(zone) == ZONE_MOVABLE)
> - return true;
> + return false;
> mt = get_pageblock_migratetype(page);
> if (mt == MIGRATE_MOVABLE || is_migrate_cma(mt))
> - return true;
> + return false;
>
> pfn = page_to_pfn(page);
> for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
> @@ -5524,9 +5526,9 @@ __count_immobile_pages(struct zone *zone, struct page *page, int count)
> * page at boot.
> */
> if (found > count)
> - return false;
> + return true;
> }
> - return true;
> + return false;
> }
>
> bool is_pageblock_removable_nolock(struct page *page)
> @@ -5550,7 +5552,7 @@ bool is_pageblock_removable_nolock(struct page *page)
> zone->zone_start_pfn + zone->spanned_pages <= pfn)
> return false;
>
> - return __count_immobile_pages(zone, page, 0);
> + return !__has_unmovable_pages(zone, page, 0);
> }
>
> int set_migratetype_isolate(struct page *page)
> @@ -5589,12 +5591,12 @@ int set_migratetype_isolate(struct page *page)
> * FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
> * We just check MOVABLE pages.
> */
> - if (__count_immobile_pages(zone, page, arg.pages_found))
> + if (!__has_unmovable_pages(zone, page, arg.pages_found))
> ret = 0;
> -
> /*
> - * immobile means "not-on-lru" paes. If immobile is larger than
> - * removable-by-driver pages reported by notifier, we'll fail.
> + * Unmovable means "not-on-lru" pages. If Unmovable pages are
> + * larger than removable-by-driver pages reported by notifier,
> + * we'll fail.
> */
>
> out:



--
Kind regards,
Minchan Kim

2012-06-14 02:22:36

by Kamezawa Hiroyuki

Subject: Re: [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin

(2012/06/14 10:12), Minchan Kim wrote:
> Commit d179e84ba fixed the problem[1] in vmscan.c, but the same problem
> exists here. Let's fix it.
>
> [1] http://comments.gmane.org/gmane.linux.kernel.mm/65844
>
> The description below is copied from d179e84ba's changelog.
>
> "It is unsafe to run page_count during the physical pfn scan because
> compound_head could trip on a dangling pointer when reading
> page->first_page if the compound page is being freed by another CPU."
>
> * changelog from v1
> - Add comment about skipping tail pages of THP - Andrea
> - fix typo - Wanpeng Li
> - based on next-20120613
>
> Cc: Andrea Arcangeli <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: KAMEZAWA Hiroyuki <[email protected]>
> Cc: Wanpeng Li <[email protected]>
> Signed-off-by: Minchan Kim <[email protected]>

Reviewed-by: KAMEZAWA Hiroyuki <[email protected]>