From: Wu Fengguang <[email protected]>
If memory corruption hits the free buddy pages, we can safely ignore them.
No one will access them until page allocation time, then prep_new_page()
will automatically check and isolate PG_hwpoison page for us (for 0-order
allocation).
This patch expands prep_new_page() to check every component page in a high
order page allocation, in order to completely stop PG_hwpoison pages from
being recirculated.
Note that the common case -- only allocating a single page, doesn't
do any more work than before. Allocating > order 0 does a bit more work,
but that's relatively uncommon.
This simple implementation may drop some innocent neighbor pages, hopefully
it is not a big problem because the event should be rare enough.
This patch adds some runtime costs to high order page users.
[AK: Improved description]
v2: Andi Kleen:
Port to -mm code
Move check into separate function.
Don't dump stack in bad_pages for hwpoisoned pages.
Signed-off-by: Wu Fengguang <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
---
mm/page_alloc.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
--- sound-2.6.orig/mm/page_alloc.c
+++ sound-2.6/mm/page_alloc.c
@@ -233,6 +233,12 @@ static void bad_page(struct page *page)
static unsigned long nr_shown;
static unsigned long nr_unshown;
+ /* Don't complain about poisoned pages */
+ if (PageHWPoison(page)) {
+ __ClearPageBuddy(page);
+ return;
+ }
+
/*
* Allow a burst of 60 reports, then keep quiet for that minute;
* or allow a steady drip of one report per second.
@@ -646,7 +652,7 @@ static inline void expand(struct zone *z
/*
* This page is about to be returned from the page allocator
*/
-static int prep_new_page(struct page *page, int order, gfp_t gfp_flags)
+static inline int check_new_page(struct page *page)
{
if (unlikely(page_mapcount(page) |
(page->mapping != NULL) |
@@ -655,6 +661,18 @@ static int prep_new_page(struct page *pa
bad_page(page);
return 1;
}
+ return 0;
+}
+
+static int prep_new_page(struct page *page, int order, gfp_t gfp_flags)
+{
+ int i;
+
+ for (i = 0; i < (1 << order); i++) {
+ struct page *p = page + i;
+ if (unlikely(check_new_page(p)))
+ return 1;
+ }
set_page_private(page, 0);
set_page_refcounted(page);
--
On Mon, 15 Jun 2009 10:45:30 +0800
Wu Fengguang <[email protected]> wrote:
> From: Wu Fengguang <[email protected]>
>
> If memory corruption hits the free buddy pages, we can safely ignore them.
> No one will access them until page allocation time, then prep_new_page()
> will automatically check and isolate PG_hwpoison page for us (for 0-order
> allocation).
>
> This patch expands prep_new_page() to check every component page in a high
> order page allocation, in order to completely stop PG_hwpoison pages from
> being recirculated.
>
> Note that the common case -- only allocating a single page, doesn't
> do any more work than before. Allocating > order 0 does a bit more work,
> but that's relatively uncommon.
>
> This simple implementation may drop some innocent neighbor pages, hopefully
> it is not a big problem because the event should be rare enough.
>
> This patch adds some runtime costs to high order page users.
>
> [AK: Improved description]
>
> v2: Andi Kleen:
> Port to -mm code
> Move check into separate function.
> Don't dump stack in bad_pages for hwpoisoned pages.
>
> Signed-off-by: Wu Fengguang <[email protected]>
> Signed-off-by: Andi Kleen <[email protected]>
>
> ---
> mm/page_alloc.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> --- sound-2.6.orig/mm/page_alloc.c
> +++ sound-2.6/mm/page_alloc.c
> @@ -233,6 +233,12 @@ static void bad_page(struct page *page)
> static unsigned long nr_shown;
> static unsigned long nr_unshown;
>
> + /* Don't complain about poisoned pages */
> + if (PageHWPoison(page)) {
> + __ClearPageBuddy(page);
> + return;
> + }
Hmm ? why __ClearPageBuddy() is necessary ?
Thanks,
-Kame
> +
> /*
> * Allow a burst of 60 reports, then keep quiet for that minute;
> * or allow a steady drip of one report per second.
> @@ -646,7 +652,7 @@ static inline void expand(struct zone *z
> /*
> * This page is about to be returned from the page allocator
> */
> -static int prep_new_page(struct page *page, int order, gfp_t gfp_flags)
> +static inline int check_new_page(struct page *page)
> {
> if (unlikely(page_mapcount(page) |
> (page->mapping != NULL) |
> @@ -655,6 +661,18 @@ static int prep_new_page(struct page *pa
> bad_page(page);
> return 1;
> }
> + return 0;
> +}
> +
> +static int prep_new_page(struct page *page, int order, gfp_t gfp_flags)
> +{
> + int i;
> +
> + for (i = 0; i < (1 << order); i++) {
> + struct page *p = page + i;
> + if (unlikely(check_new_page(p)))
> + return 1;
> + }
>
> set_page_private(page, 0);
> set_page_refcounted(page);
>
> --
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to [email protected]. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
>
On Mon, Jun 15, 2009 at 05:41:12PM +0800, KAMEZAWA Hiroyuki wrote:
> On Mon, 15 Jun 2009 10:45:30 +0800
> Wu Fengguang <[email protected]> wrote:
>
> > From: Wu Fengguang <[email protected]>
> >
> > If memory corruption hits the free buddy pages, we can safely ignore them.
> > No one will access them until page allocation time, then prep_new_page()
> > will automatically check and isolate PG_hwpoison page for us (for 0-order
> > allocation).
> >
> > This patch expands prep_new_page() to check every component page in a high
> > order page allocation, in order to completely stop PG_hwpoison pages from
> > being recirculated.
> >
> > Note that the common case -- only allocating a single page, doesn't
> > do any more work than before. Allocating > order 0 does a bit more work,
> > but that's relatively uncommon.
> >
> > This simple implementation may drop some innocent neighbor pages, hopefully
> > it is not a big problem because the event should be rare enough.
> >
> > This patch adds some runtime costs to high order page users.
> >
> > [AK: Improved description]
> >
> > v2: Andi Kleen:
> > Port to -mm code
> > Move check into separate function.
> > Don't dump stack in bad_pages for hwpoisoned pages.
> >
> > Signed-off-by: Wu Fengguang <[email protected]>
> > Signed-off-by: Andi Kleen <[email protected]>
> >
> > ---
> > mm/page_alloc.c | 20 +++++++++++++++++++-
> > 1 file changed, 19 insertions(+), 1 deletion(-)
> >
> > --- sound-2.6.orig/mm/page_alloc.c
> > +++ sound-2.6/mm/page_alloc.c
> > @@ -233,6 +233,12 @@ static void bad_page(struct page *page)
> > static unsigned long nr_shown;
> > static unsigned long nr_unshown;
> >
> > + /* Don't complain about poisoned pages */
> > + if (PageHWPoison(page)) {
> > + __ClearPageBuddy(page);
> > + return;
> > + }
>
> Hmm ? why __ClearPageBuddy() is necessary ?
Because this page is considered to be "allocated" out of the buddy
system, even though we fail the allocation here.
The page is now owned by no one, especially not owned by the buddy
allocator.
Thanks,
Fengguang
On Mon, 15 Jun 2009 18:16:20 +0800
Wu Fengguang <[email protected]> wrote:
> On Mon, Jun 15, 2009 at 05:41:12PM +0800, KAMEZAWA Hiroyuki wrote:
> > On Mon, 15 Jun 2009 10:45:30 +0800
> > Wu Fengguang <[email protected]> wrote:
> >
> > > From: Wu Fengguang <[email protected]>
> > >
> > > If memory corruption hits the free buddy pages, we can safely ignore them.
> > > No one will access them until page allocation time, then prep_new_page()
> > > will automatically check and isolate PG_hwpoison page for us (for 0-order
> > > allocation).
> > >
> > > This patch expands prep_new_page() to check every component page in a high
> > > order page allocation, in order to completely stop PG_hwpoison pages from
> > > being recirculated.
> > >
> > > Note that the common case -- only allocating a single page, doesn't
> > > do any more work than before. Allocating > order 0 does a bit more work,
> > > but that's relatively uncommon.
> > >
> > > This simple implementation may drop some innocent neighbor pages, hopefully
> > > it is not a big problem because the event should be rare enough.
> > >
> > > This patch adds some runtime costs to high order page users.
> > >
> > > [AK: Improved description]
> > >
> > > v2: Andi Kleen:
> > > Port to -mm code
> > > Move check into separate function.
> > > Don't dump stack in bad_pages for hwpoisoned pages.
> > >
> > > Signed-off-by: Wu Fengguang <[email protected]>
> > > Signed-off-by: Andi Kleen <[email protected]>
> > >
> > > ---
> > > mm/page_alloc.c | 20 +++++++++++++++++++-
> > > 1 file changed, 19 insertions(+), 1 deletion(-)
> > >
> > > --- sound-2.6.orig/mm/page_alloc.c
> > > +++ sound-2.6/mm/page_alloc.c
> > > @@ -233,6 +233,12 @@ static void bad_page(struct page *page)
> > > static unsigned long nr_shown;
> > > static unsigned long nr_unshown;
> > >
> > > + /* Don't complain about poisoned pages */
> > > + if (PageHWPoison(page)) {
> > > + __ClearPageBuddy(page);
> > > + return;
> > > + }
> >
> > Hmm ? why __ClearPageBuddy() is necessary ?
>
> Because this page is considered to be "allocated" out of the buddy
> system, even though we fail the allocation here.
>
> The page is now owned by no one, especially not owned by the buddy
> allocator.
>
I just wonder "why __ClearPageBuddy() is necessary."
When bad_page() is called, a page is removed from buddy allocator and no
PG_buddy flag at all....I'm sorry if you added bad_page() caller in buddy allocator.
Buddy Allocator I call here is just 2 functions.
- __free_one_page()
- expand()
Bye,
-Kame
> Thanks,
> Fengguang
>
On Tue, Jun 16, 2009 at 07:52:22AM +0800, KAMEZAWA Hiroyuki wrote:
> On Mon, 15 Jun 2009 18:16:20 +0800
> Wu Fengguang <[email protected]> wrote:
>
> > On Mon, Jun 15, 2009 at 05:41:12PM +0800, KAMEZAWA Hiroyuki wrote:
> > > On Mon, 15 Jun 2009 10:45:30 +0800
> > > Wu Fengguang <[email protected]> wrote:
> > >
> > > > From: Wu Fengguang <[email protected]>
> > > >
> > > > If memory corruption hits the free buddy pages, we can safely ignore them.
> > > > No one will access them until page allocation time, then prep_new_page()
> > > > will automatically check and isolate PG_hwpoison page for us (for 0-order
> > > > allocation).
> > > >
> > > > This patch expands prep_new_page() to check every component page in a high
> > > > order page allocation, in order to completely stop PG_hwpoison pages from
> > > > being recirculated.
> > > >
> > > > Note that the common case -- only allocating a single page, doesn't
> > > > do any more work than before. Allocating > order 0 does a bit more work,
> > > > but that's relatively uncommon.
> > > >
> > > > This simple implementation may drop some innocent neighbor pages, hopefully
> > > > it is not a big problem because the event should be rare enough.
> > > >
> > > > This patch adds some runtime costs to high order page users.
> > > >
> > > > [AK: Improved description]
> > > >
> > > > v2: Andi Kleen:
> > > > Port to -mm code
> > > > Move check into separate function.
> > > > Don't dump stack in bad_pages for hwpoisoned pages.
> > > >
> > > > Signed-off-by: Wu Fengguang <[email protected]>
> > > > Signed-off-by: Andi Kleen <[email protected]>
> > > >
> > > > ---
> > > > mm/page_alloc.c | 20 +++++++++++++++++++-
> > > > 1 file changed, 19 insertions(+), 1 deletion(-)
> > > >
> > > > --- sound-2.6.orig/mm/page_alloc.c
> > > > +++ sound-2.6/mm/page_alloc.c
> > > > @@ -233,6 +233,12 @@ static void bad_page(struct page *page)
> > > > static unsigned long nr_shown;
> > > > static unsigned long nr_unshown;
> > > >
> > > > + /* Don't complain about poisoned pages */
> > > > + if (PageHWPoison(page)) {
> > > > + __ClearPageBuddy(page);
> > > > + return;
> > > > + }
> > >
> > > Hmm ? why __ClearPageBuddy() is necessary ?
> >
> > Because this page is considered to be "allocated" out of the buddy
> > system, even though we fail the allocation here.
> >
> > The page is now owned by no one, especially not owned by the buddy
> > allocator.
> >
> I just wonder "why __ClearPageBuddy() is necessary."
>
> When bad_page() is called, a page is removed from buddy allocator and no
> PG_buddy flag at all....I'm sorry if you added bad_page() caller in buddy allocator.
You are right. But I didn't add bad_page() callers either :)
> Buddy Allocator I call here is just 2 functions.
> - __free_one_page()
> - expand()
Right. Then the original __ClearPageBuddy() call in bad_page() is
questionable, I guess this line was there just for the sake of safety
(ie. the buddy allocator itself goes wrong):
sound-2.6/mm/page_alloc.c
@@ -269,7 +269,6 @@ static void bad_page(struct page *page)
dump_stack();
out:
/* Leave bad fields for debug, except PageBuddy could make trouble */
===> __ClearPageBuddy(page);
add_taint(TAINT_BAD_PAGE);
}
Thanks,
Fengguang
On Tue, 16 Jun 2009, Wu Fengguang wrote:
>
> Right. Then the original __ClearPageBuddy() call in bad_page() is
> questionable, I guess this line was there just for the sake of safety
> (ie. the buddy allocator itself goes wrong):
>
> sound-2.6/mm/page_alloc.c
>
> @@ -269,7 +269,6 @@ static void bad_page(struct page *page)
> dump_stack();
> out:
> /* Leave bad fields for debug, except PageBuddy could make trouble */
> ===> __ClearPageBuddy(page);
> add_taint(TAINT_BAD_PAGE);
> }
I didn't put that in for the case of the buddy allocator going wrong
(not sure if there could be such a case - I don't mean that the buddy
allocator is provably perfect! but how would it get here if it were
wrong?). No, I put that in for the case when the flag bits in struct
page have themselves got corrupted somehow, and hence we arrive at
bad_page(): most of the bits are best left as they are, to provide
maximum debug info; but leaving PageBuddy set there might conceivably
allow this corrupted struct page to get paired up with its buddy later,
and so freed for reuse, when we're trying to make sure it's never reused.
Hugh
On Tue, Jun 16, 2009 at 07:29:45PM +0800, Hugh Dickins wrote:
> On Tue, 16 Jun 2009, Wu Fengguang wrote:
> >
> > Right. Then the original __ClearPageBuddy() call in bad_page() is
> > questionable, I guess this line was there just for the sake of safety
> > (ie. the buddy allocator itself goes wrong):
> >
> > sound-2.6/mm/page_alloc.c
> >
> > @@ -269,7 +269,6 @@ static void bad_page(struct page *page)
> > dump_stack();
> > out:
> > /* Leave bad fields for debug, except PageBuddy could make trouble */
> > ===> __ClearPageBuddy(page);
> > add_taint(TAINT_BAD_PAGE);
> > }
>
> I didn't put that in for the case of the buddy allocator going wrong
> (not sure if there could be such a case - I don't mean that the buddy
> allocator is provably perfect! but how would it get here if it were
> wrong?). No, I put that in for the case when the flag bits in struct
> page have themselves got corrupted somehow, and hence we arrive at
> bad_page(): most of the bits are best left as they are, to provide
> maximum debug info; but leaving PageBuddy set there might conceivably
> allow this corrupted struct page to get paired up with its buddy later,
> and so freed for reuse, when we're trying to make sure it's never reused.
Hugh, thank you for the detailed explanations! You are always informative :)
Thanks,
Fengguang