2012-10-17 12:03:21

by Wen Congyang

[permalink] [raw]
Subject: [PATCH v2 0/5] bugfix for memory hotplug

From: Wen Congyang <[email protected]>

Wen Congyang (5):
memory-hotplug: skip HWPoisoned page when offlining pages
memory-hotplug: update mce_bad_pages when removing the memory
memory-hotplug: auto offline page_cgroup when onlining memory block
failed
memory-hotplug: fix NR_FREE_PAGES mismatch
memory-hotplug: allocate zone's pcp before onlining pages

include/linux/page-isolation.h | 10 ++++++----
mm/memory-failure.c | 2 +-
mm/memory_hotplug.c | 14 ++++++++------
mm/page_alloc.c | 37 ++++++++++++++++++++++++++++---------
mm/page_cgroup.c | 3 +++
mm/page_isolation.c | 27 ++++++++++++++++++++-------
mm/sparse.c | 21 +++++++++++++++++++++
7 files changed, 87 insertions(+), 27 deletions(-)


2012-10-17 12:03:25

by Wen Congyang

[permalink] [raw]
Subject: [PATCH v2 3/5] memory-hotplug: auto offline page_cgroup when onlining memory block failed

From: Wen Congyang <[email protected]>

When a memory block is onlined, we will try allocate memory on that node
to store page_cgroup. If onlining the memory block failed, we don't
offline the page cgroup, and we have no chance to offline this page cgroup
unless the memory block is onlined successfully again. It will cause
that we can't hot-remove the memory device on that node, because some
memory is used to store page cgroup. If onlining the memory block
is failed, there is no need to stort page cgroup for this memory. So
auto offline page_cgroup when onlining memory block failed.

CC: David Rientjes <[email protected]>
CC: Jiang Liu <[email protected]>
CC: Len Brown <[email protected]>
CC: Benjamin Herrenschmidt <[email protected]>
CC: Paul Mackerras <[email protected]>
CC: Christoph Lameter <[email protected]>
Cc: Minchan Kim <[email protected]>
CC: Andrew Morton <[email protected]>
CC: KOSAKI Motohiro <[email protected]>
CC: Yasuaki Ishimatsu <[email protected]>
Signed-off-by: Wen Congyang <[email protected]>
---
mm/page_cgroup.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
index 5ddad0c..44db00e 100644
--- a/mm/page_cgroup.c
+++ b/mm/page_cgroup.c
@@ -251,6 +251,9 @@ static int __meminit page_cgroup_callback(struct notifier_block *self,
mn->nr_pages, mn->status_change_nid);
break;
case MEM_CANCEL_ONLINE:
+ offline_page_cgroup(mn->start_pfn,
+ mn->nr_pages, mn->status_change_nid);
+ break;
case MEM_GOING_OFFLINE:
break;
case MEM_ONLINE:
--
1.7.1

2012-10-17 12:03:24

by Wen Congyang

[permalink] [raw]
Subject: [PATCH v2 4/5] memory-hotplug: fix NR_FREE_PAGES mismatch

From: Wen Congyang <[email protected]>

NR_FREE_PAGES will be wrong after offlining pages. We add/dec NR_FREE_PAGES
like this now:
1. mova all pages in buddy system to MIGRATE_ISOLATE, and dec NR_FREE_PAGES
2. don't add NR_FREE_PAGES when it is freed and the migratetype is MIGRATE_ISOLATE
3. dec NR_FREE_PAGES when offlining isolated pages.
4. add NR_FREE_PAGES when undoing isolate pages.

When we come to step 3, all pages are in MIGRATE_ISOLATE list, and NR_FREE_PAGES
are right. When we come to step4, all pages are not in buddy system, so we don't
change NR_FREE_PAGES in this step, but we change NR_FREE_PAGES in step3. So
NR_FREE_PAGES is wrong after offlining pages. So there is no need to change
NR_FREE_PAGES in step3.

This patch also fixs a problem in step2: if the migratetype is MIGRATE_ISOLATE,
we should not add NR_FRR_PAGES when we remove pages from pcppages.

CC: David Rientjes <[email protected]>
CC: Jiang Liu <[email protected]>
CC: Len Brown <[email protected]>
CC: Benjamin Herrenschmidt <[email protected]>
CC: Paul Mackerras <[email protected]>
CC: Christoph Lameter <[email protected]>
Cc: Minchan Kim <[email protected]>
CC: Andrew Morton <[email protected]>
CC: KOSAKI Motohiro <[email protected]>
CC: Yasuaki Ishimatsu <[email protected]>
Signed-off-by: Wen Congyang <[email protected]>
---
mm/page_alloc.c | 10 +++++-----
1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e33d0fb..9aa9490 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -667,11 +667,13 @@ static void free_pcppages_bulk(struct zone *zone, int count,
/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
__free_one_page(page, zone, 0, mt);
trace_mm_page_pcpu_drain(page, 0, mt);
- if (is_migrate_cma(mt))
- __mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
+ if (likely(mt != MIGRATE_ISOLATE)) {
+ __mod_zone_page_state(zone, NR_FREE_PAGES, 1);
+ if (is_migrate_cma(mt))
+ __mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
+ }
} while (--to_free && --batch_free && !list_empty(list));
}
- __mod_zone_page_state(zone, NR_FREE_PAGES, count);
spin_unlock(&zone->lock);
}

@@ -6006,8 +6008,6 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
list_del(&page->lru);
rmv_page_order(page);
zone->free_area[order].nr_free--;
- __mod_zone_page_state(zone, NR_FREE_PAGES,
- - (1UL << order));
for (i = 0; i < (1 << order); i++)
SetPageReserved((page+i));
pfn += (1 << order);
--
1.7.1

2012-10-17 12:03:23

by Wen Congyang

[permalink] [raw]
Subject: [PATCH v2 5/5] memory-hotplug: allocate zone's pcp before onlining pages

From: Wen Congyang <[email protected]>

We use __free_page() to put a page to buddy system when onlining pages.
__free_page() will store NR_FREE_PAGES in zone's pcp.vm_stat_diff, so we
should allocate zone's pcp before onlining pages, otherwise we will lose
some free pages.

CC: David Rientjes <[email protected]>
CC: Jiang Liu <[email protected]>
CC: Len Brown <[email protected]>
CC: Benjamin Herrenschmidt <[email protected]>
CC: Paul Mackerras <[email protected]>
CC: Christoph Lameter <[email protected]>
Cc: Minchan Kim <[email protected]>
CC: Andrew Morton <[email protected]>
CC: KOSAKI Motohiro <[email protected]>
CC: Yasuaki Ishimatsu <[email protected]>
Signed-off-by: Wen Congyang <[email protected]>
---
mm/memory_hotplug.c | 10 ++++++----
1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index ec899a2..eb4c132 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -505,12 +505,16 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
* So, zonelist must be updated after online.
*/
mutex_lock(&zonelists_mutex);
- if (!populated_zone(zone))
+ if (!populated_zone(zone)) {
need_zonelists_rebuild = 1;
+ build_all_zonelists(NULL, zone);
+ }

ret = walk_system_ram_range(pfn, nr_pages, &onlined_pages,
online_pages_range);
if (ret) {
+ if (need_zonelists_rebuild)
+ zone_pcp_reset(zone);
mutex_unlock(&zonelists_mutex);
printk(KERN_DEBUG "online_pages [mem %#010llx-%#010llx] failed\n",
(unsigned long long) pfn << PAGE_SHIFT,
@@ -525,9 +529,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
zone->zone_pgdat->node_present_pages += onlined_pages;
if (onlined_pages) {
node_set_state(zone_to_nid(zone), N_HIGH_MEMORY);
- if (need_zonelists_rebuild)
- build_all_zonelists(NULL, zone);
- else
+ if (!need_zonelists_rebuild)
zone_pcp_update(zone);
}

--
1.7.1

2012-10-17 12:04:19

by Wen Congyang

[permalink] [raw]
Subject: [PATCH v2 1/5] memory-hotplug: skip HWPoisoned page when offlining pages

From: Wen Congyang <[email protected]>

hwpoisoned may be set when we offline a page by the sysfs interface
/sys/devices/system/memory/soft_offline_page or
/sys/devices/system/memory/hard_offline_page. If we don't clear
this flag when onlining pages, this page can't be freed, and will
not in free list. So we can't offline these pages again. So we
should skip such page when offlining pages.

CC: David Rientjes <[email protected]>
CC: Jiang Liu <[email protected]>
CC: Len Brown <[email protected]>
CC: Benjamin Herrenschmidt <[email protected]>
CC: Paul Mackerras <[email protected]>
CC: Christoph Lameter <[email protected]>
Cc: Minchan Kim <[email protected]>
CC: Andrew Morton <[email protected]>
CC: KOSAKI Motohiro <[email protected]>
CC: Yasuaki Ishimatsu <[email protected]>
Signed-off-by: Wen Congyang <[email protected]>
---
include/linux/page-isolation.h | 10 ++++++----
mm/memory-failure.c | 2 +-
mm/memory_hotplug.c | 4 ++--
mm/page_alloc.c | 27 +++++++++++++++++++++++----
mm/page_isolation.c | 27 ++++++++++++++++++++-------
5 files changed, 52 insertions(+), 18 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 76a9539..a92061e 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -2,7 +2,8 @@
#define __LINUX_PAGEISOLATION_H


-bool has_unmovable_pages(struct zone *zone, struct page *page, int count);
+bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
+ bool skip_hwpoisoned_pages);
void set_pageblock_migratetype(struct page *page, int migratetype);
int move_freepages_block(struct zone *zone, struct page *page,
int migratetype);
@@ -21,7 +22,7 @@ int move_freepages(struct zone *zone,
*/
int
start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
- unsigned migratetype);
+ unsigned migratetype, bool skip_hwpoisoned_pages);

/*
* Changes MIGRATE_ISOLATE to MIGRATE_MOVABLE.
@@ -34,12 +35,13 @@ undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
/*
* Test all pages in [start_pfn, end_pfn) are isolated or not.
*/
-int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn);
+int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
+ bool skip_hwpoisoned_pages);

/*
* Internal functions. Changes pageblock's migrate type.
*/
-int set_migratetype_isolate(struct page *page);
+int set_migratetype_isolate(struct page *page, bool skip_hwpoisoned_pages);
void unset_migratetype_isolate(struct page *page, unsigned migratetype);
struct page *alloc_migrate_target(struct page *page, unsigned long private,
int **resultp);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 6c5899b..1abffee 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1385,7 +1385,7 @@ static int get_any_page(struct page *p, unsigned long pfn, int flags)
* Isolate the page, so that it doesn't get reallocated if it
* was free.
*/
- set_migratetype_isolate(p);
+ set_migratetype_isolate(p, true);
/*
* When the target page is a free hugepage, just remove it
* from free hugepage list.
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 56b758a..ec899a2 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -854,7 +854,7 @@ check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
{
int ret;
long offlined = *(long *)data;
- ret = test_pages_isolated(start_pfn, start_pfn + nr_pages);
+ ret = test_pages_isolated(start_pfn, start_pfn + nr_pages, true);
offlined = nr_pages;
if (!ret)
*(long *)data += offlined;
@@ -901,7 +901,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
nr_pages = end_pfn - start_pfn;

/* set above range as isolated */
- ret = start_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+ ret = start_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE, true);
if (ret)
goto out;

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bb90971..e33d0fb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5575,7 +5575,8 @@ void set_pageblock_flags_group(struct page *page, unsigned long flags,
* MIGRATE_MOVABLE block might include unmovable pages. It means you can't
* expect this function should be exact.
*/
-bool has_unmovable_pages(struct zone *zone, struct page *page, int count)
+bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
+ bool skip_hwpoisoned_pages)
{
unsigned long pfn, iter, found;
int mt;
@@ -5610,6 +5611,13 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count)
continue;
}

+ /*
+ * The HWPoisoned page may be not in buddy system, and
+ * page_count() is not 0.
+ */
+ if (skip_hwpoisoned_pages && PageHWPoison(page))
+ continue;
+
if (!PageLRU(page))
found++;
/*
@@ -5652,7 +5660,7 @@ bool is_pageblock_removable_nolock(struct page *page)
zone->zone_start_pfn + zone->spanned_pages <= pfn)
return false;

- return !has_unmovable_pages(zone, page, 0);
+ return !has_unmovable_pages(zone, page, 0, true);
}

#ifdef CONFIG_CMA
@@ -5823,7 +5831,8 @@ int alloc_contig_range(unsigned long start, unsigned long end,
*/

ret = start_isolate_page_range(pfn_max_align_down(start),
- pfn_max_align_up(end), migratetype);
+ pfn_max_align_up(end), migratetype,
+ false);
if (ret)
goto done;

@@ -5862,7 +5871,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
}

/* Make sure the range is really isolated. */
- if (test_pages_isolated(outer_start, end)) {
+ if (test_pages_isolated(outer_start, end, false)) {
pr_warn("alloc_contig_range test_pages_isolated(%lx, %lx) failed\n",
outer_start, end);
ret = -EBUSY;
@@ -5977,6 +5986,16 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
continue;
}
page = pfn_to_page(pfn);
+ /*
+ * The HWPoisoned page may be not in buddy system, and
+ * page_count() is not 0.
+ */
+ if (unlikely(!PageBuddy(page) && PageHWPoison(page))) {
+ pfn++;
+ SetPageReserved(page);
+ continue;
+ }
+
BUG_ON(page_count(page));
BUG_ON(!PageBuddy(page));
order = page_order(page);
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index f2f5b48..9d2264e 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -30,7 +30,7 @@ static void restore_pageblock_isolate(struct page *page, int migratetype)
zone->nr_pageblock_isolate--;
}

-int set_migratetype_isolate(struct page *page)
+int set_migratetype_isolate(struct page *page, bool skip_hwpoisoned_pages)
{
struct zone *zone;
unsigned long flags, pfn;
@@ -66,7 +66,8 @@ int set_migratetype_isolate(struct page *page)
* FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
* We just check MOVABLE pages.
*/
- if (!has_unmovable_pages(zone, page, arg.pages_found))
+ if (!has_unmovable_pages(zone, page, arg.pages_found,
+ skip_hwpoisoned_pages))
ret = 0;

/*
@@ -134,7 +135,7 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
* Returns 0 on success and -EBUSY if any part of range cannot be isolated.
*/
int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
- unsigned migratetype)
+ unsigned migratetype, bool skip_hwpoisoned_pages)
{
unsigned long pfn;
unsigned long undo_pfn;
@@ -147,7 +148,8 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
pfn < end_pfn;
pfn += pageblock_nr_pages) {
page = __first_valid_page(pfn, pageblock_nr_pages);
- if (page && set_migratetype_isolate(page)) {
+ if (page &&
+ set_migratetype_isolate(page, skip_hwpoisoned_pages)) {
undo_pfn = pfn;
goto undo;
}
@@ -190,7 +192,8 @@ int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
* Returns 1 if all pages in the range are isolated.
*/
static int
-__test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
+__test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
+ bool skip_hwpoisoned_pages)
{
struct page *page;

@@ -220,6 +223,14 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
else if (page_count(page) == 0 &&
get_freepage_migratetype(page) == MIGRATE_ISOLATE)
pfn += 1;
+ else if (skip_hwpoisoned_pages && PageHWPoison(page)) {
+ /*
+ * The HWPoisoned page may be not in buddy
+ * system, and page_count() is not 0.
+ */
+ pfn++;
+ continue;
+ }
else
break;
}
@@ -228,7 +239,8 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn)
return 1;
}

-int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
+int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
+ bool skip_hwpoisoned_pages)
{
unsigned long pfn, flags;
struct page *page;
@@ -251,7 +263,8 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
/* Check all pages are free or Marked as ISOLATED */
zone = page_zone(page);
spin_lock_irqsave(&zone->lock, flags);
- ret = __test_page_isolated_in_pageblock(start_pfn, end_pfn);
+ ret = __test_page_isolated_in_pageblock(start_pfn, end_pfn,
+ skip_hwpoisoned_pages);
spin_unlock_irqrestore(&zone->lock, flags);
return ret ? 0 : -EBUSY;
}
--
1.7.1

2012-10-17 12:04:17

by Wen Congyang

[permalink] [raw]
Subject: [PATCH v2 2/5] memory-hotplug: update mce_bad_pages when removing the memory

From: Wen Congyang <[email protected]>

When we hotremove a memory device, we will free the memory to store
struct page. If the page is hwpoisoned page, we should decrease
mce_bad_pages.

CC: David Rientjes <[email protected]>
CC: Jiang Liu <[email protected]>
CC: Len Brown <[email protected]>
CC: Benjamin Herrenschmidt <[email protected]>
CC: Paul Mackerras <[email protected]>
CC: Christoph Lameter <[email protected]>
Cc: Minchan Kim <[email protected]>
CC: Andrew Morton <[email protected]>
CC: KOSAKI Motohiro <[email protected]>
CC: Yasuaki Ishimatsu <[email protected]>
Signed-off-by: Wen Congyang <[email protected]>
---
mm/sparse.c | 21 +++++++++++++++++++++
1 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/mm/sparse.c b/mm/sparse.c
index fac95f2..24072e4 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -773,6 +773,23 @@ out:
return ret;
}

+#ifdef CONFIG_MEMORY_FAILURE
+static void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
+{
+ int i;
+
+ if (!memmap)
+ return;
+
+ for (i = 0; i < PAGES_PER_SECTION; i++) {
+ if (PageHWPoison(&memmap[i])) {
+ atomic_long_sub(1, &mce_bad_pages);
+ ClearPageHWPoison(&memmap[i]);
+ }
+ }
+}
+#endif
+
void sparse_remove_one_section(struct zone *zone, struct mem_section *ms)
{
struct page *memmap = NULL;
@@ -786,6 +803,10 @@ void sparse_remove_one_section(struct zone *zone, struct mem_section *ms)
ms->pageblock_flags = NULL;
}

+#ifdef CONFIG_MEMORY_FAILURE
+ clear_hwpoisoned_pages(memmap, PAGES_PER_SECTION);
+#endif
+
free_section_usemap(memmap, usemap);
}
#endif
--
1.7.1

2012-10-17 12:22:40

by Ni zhan Chen

[permalink] [raw]
Subject: Re: [PATCH v2 0/5] bugfix for memory hotplug

On 10/17/2012 08:08 PM, [email protected] wrote:
> From: Wen Congyang <[email protected]>
>
> Wen Congyang (5):
> memory-hotplug: skip HWPoisoned page when offlining pages
> memory-hotplug: update mce_bad_pages when removing the memory
> memory-hotplug: auto offline page_cgroup when onlining memory block
> failed
> memory-hotplug: fix NR_FREE_PAGES mismatch
> memory-hotplug: allocate zone's pcp before onlining pages

Oops, why you don't write changelog?

>
> include/linux/page-isolation.h | 10 ++++++----
> mm/memory-failure.c | 2 +-
> mm/memory_hotplug.c | 14 ++++++++------
> mm/page_alloc.c | 37 ++++++++++++++++++++++++++++---------
> mm/page_cgroup.c | 3 +++
> mm/page_isolation.c | 27 ++++++++++++++++++++-------
> mm/sparse.c | 21 +++++++++++++++++++++
> 7 files changed, 87 insertions(+), 27 deletions(-)
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to [email protected]. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
>

2012-10-17 15:12:10

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 2/5] memory-hotplug: update mce_bad_pages when removing the memory

Hi Wen,

> +#ifdef CONFIG_MEMORY_FAILURE
> +static void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
> +{
> + int i;
> +
> + if (!memmap)
> + return;

I guess free_section_usemap() does the same thing.

> + for (i = 0; i < PAGES_PER_SECTION; i++) {
> + if (PageHWPoison(&memmap[i])) {
> + atomic_long_sub(1, &mce_bad_pages);
> + ClearPageHWPoison(&memmap[i]);
> + }
> + }
> +}
> +#endif
> +
> void sparse_remove_one_section(struct zone *zone, struct mem_section *ms)
> {
> struct page *memmap = NULL;
> @@ -786,6 +803,10 @@ void sparse_remove_one_section(struct zone *zone, struct mem_section *ms)
> ms->pageblock_flags = NULL;
> }
>
> +#ifdef CONFIG_MEMORY_FAILURE
> + clear_hwpoisoned_pages(memmap, PAGES_PER_SECTION);
> +#endif
> +
> free_section_usemap(memmap, usemap);
> }
> #endif

But why put the call outside the "if (ms->section_mem_map)" block? If
you put it inside, then you don't have to check for !memmap in
clear_hwpoisoned_pages().

Also, we really frown on #ifdefs scattered throughout code. I'd suggest
either:

+static void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
+{
+#ifdef CONFIG_MEMORY_FAILURE
... existing code
+#endif /* CONFIG_MEMORY_FAILURE */
+}

or

+#ifdef CONFIG_MEMORY_FAILURE
+static void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
+{
... existing code
+}
+#else
+static void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
+{}
+#endif /* CONFIG_MEMORY_FAILURE */

and keep the #ifdef out of sparse_remove_one_section().

2012-10-17 15:26:41

by Wen Congyang

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] memory-hotplug: auto offline page_cgroup when onlining memory block failed

At 2012/10/17 20:08, [email protected] Wrote:
> From: Wen Congyang<[email protected]>
>
> When a memory block is onlined, we will try allocate memory on that node
> to store page_cgroup. If onlining the memory block failed, we don't
> offline the page cgroup, and we have no chance to offline this page cgroup
> unless the memory block is onlined successfully again. It will cause
> that we can't hot-remove the memory device on that node, because some
> memory is used to store page cgroup. If onlining the memory block
> is failed, there is no need to stort page cgroup for this memory. So
> auto offline page_cgroup when onlining memory block failed.
>
> CC: David Rientjes<[email protected]>
> CC: Jiang Liu<[email protected]>
> CC: Len Brown<[email protected]>
> CC: Benjamin Herrenschmidt<[email protected]>
> CC: Paul Mackerras<[email protected]>
> CC: Christoph Lameter<[email protected]>
> Cc: Minchan Kim<[email protected]>
> CC: Andrew Morton<[email protected]>
> CC: KOSAKI Motohiro<[email protected]>
> CC: Yasuaki Ishimatsu<[email protected]>
> Signed-off-by: Wen Congyang<[email protected]>

This patch has been acked by kosaki.
I forgot to add "Acked-by: KOSAKI Motohiro <[email protected]>"


> ---
> mm/page_cgroup.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
> index 5ddad0c..44db00e 100644
> --- a/mm/page_cgroup.c
> +++ b/mm/page_cgroup.c
> @@ -251,6 +251,9 @@ static int __meminit page_cgroup_callback(struct notifier_block *self,
> mn->nr_pages, mn->status_change_nid);
> break;
> case MEM_CANCEL_ONLINE:
> + offline_page_cgroup(mn->start_pfn,
> + mn->nr_pages, mn->status_change_nid);
> + break;
> case MEM_GOING_OFFLINE:
> break;
> case MEM_ONLINE:

2012-10-17 15:31:24

by Wen Congyang

[permalink] [raw]
Subject: Re: [PATCH v2 0/5] bugfix for memory hotplug

At 2012/10/17 20:22, Ni zhan Chen Wrote:
> On 10/17/2012 08:08 PM, [email protected] wrote:
>> From: Wen Congyang <[email protected]>
>>
>> Wen Congyang (5):
>> memory-hotplug: skip HWPoisoned page when offlining pages
>> memory-hotplug: update mce_bad_pages when removing the memory
>> memory-hotplug: auto offline page_cgroup when onlining memory block
>> failed
>> memory-hotplug: fix NR_FREE_PAGES mismatch
>> memory-hotplug: allocate zone's pcp before onlining pages
>
> Oops, why you don't write changelog?

I forgot to add it. Here is the changelog:

Patch 1: updated according to kosaki's suggestion

Patch 2: new patch, and update mce_bad_pages when removing memory.

Patch 4: new patch, and fix a NR_FREE_PAGES mismatch, and this bug
cause oom in my test.

Patch 5: new patch, and fix a new bug. When repeating to online/offline
pages, the free pages will continue to decrease.

>
>>
>> include/linux/page-isolation.h | 10 ++++++----
>> mm/memory-failure.c | 2 +-
>> mm/memory_hotplug.c | 14 ++++++++------
>> mm/page_alloc.c | 37 ++++++++++++++++++++++++++++---------
>> mm/page_cgroup.c | 3 +++
>> mm/page_isolation.c | 27 ++++++++++++++++++++-------
>> mm/sparse.c | 21 +++++++++++++++++++++
>> 7 files changed, 87 insertions(+), 27 deletions(-)
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to [email protected]. For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2012-10-18 22:20:11

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v2 2/5] memory-hotplug: update mce_bad_pages when removing the memory

On Wed, 17 Oct 2012 08:09:55 -0700
Dave Hansen <[email protected]> wrote:

> Hi Wen,
>
> > +#ifdef CONFIG_MEMORY_FAILURE
> > +static void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
> > +{
> > + int i;
> > +
> > + if (!memmap)
> > + return;
>
> I guess free_section_usemap() does the same thing.

What does this observation mean?

> > + for (i = 0; i < PAGES_PER_SECTION; i++) {
> > + if (PageHWPoison(&memmap[i])) {
> > + atomic_long_sub(1, &mce_bad_pages);
> > + ClearPageHWPoison(&memmap[i]);
> > + }
> > + }
> > +}
> > +#endif
> > +
> > void sparse_remove_one_section(struct zone *zone, struct mem_section *ms)
> > {
> > struct page *memmap = NULL;
>
> ..
>
> and keep the #ifdef out of sparse_remove_one_section().

yup.

--- a/mm/sparse.c~memory-hotplug-update-mce_bad_pages-when-removing-the-memory-fix
+++ a/mm/sparse.c
@@ -788,6 +788,10 @@ static void clear_hwpoisoned_pages(struc
}
}
}
+#else
+static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
+{
+}
#endif

void sparse_remove_one_section(struct zone *zone, struct mem_section *ms)
@@ -803,10 +807,7 @@ void sparse_remove_one_section(struct zo
ms->pageblock_flags = NULL;
}

-#ifdef CONFIG_MEMORY_FAILURE
clear_hwpoisoned_pages(memmap, PAGES_PER_SECTION);
-#endif
-
free_section_usemap(memmap, usemap);
}
#endif
_

2012-10-19 13:43:00

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 2/5] memory-hotplug: update mce_bad_pages when removing the memory

On 10/18/2012 03:20 PM, Andrew Morton wrote:
> On Wed, 17 Oct 2012 08:09:55 -0700
> Dave Hansen <[email protected]> wrote:
>>> +#ifdef CONFIG_MEMORY_FAILURE
>>> +static void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
>>> +{
>>> + int i;
>>> +
>>> + if (!memmap)
>>> + return;
>>
>> I guess free_section_usemap() does the same thing.
>
> What does this observation mean?

sparse_remove_one_section() has an if(ms->section_mem_map) statement.
Inside that if() block is the only place in the function where 'memmap'
can get set.

Currently, sparse_remove_one_section() calls in to free_section_usemap()
ouside of that if() block. With this patch new call to
clear_hwpoisoned_pages() is done in the same place, both passing 'memmap'.

However, both free_section_usemap() and clear_hwpoisoned_pages() check
'memmap' for NULL and immediately return if so. That's a bit silly
since it could hide garbage coming back from sparse_decode_mem_map().
Seems like we should just call them both inside that if() block, or
reorganize sparse_remove_one_section(), maybe like this:

void sparse_remove_one_section(struct zone *zone, struct mem_section *ms)
{
struct page *memmap = NULL;
unsigned long *usemap = NULL;

if (!ms->section_mem_map)
return;

usemap = ms->pageblock_flags;
memmap = sparse_decode_mem_map(ms->section_mem_map,
__section_nr(ms));
ms->section_mem_map = 0;
ms->pageblock_flags = NULL;

free_section_usemap(memmap, usemap);
clear_hwpoisoned_pages(usemap, ...);
}