[PATCH]: fix the bug of __free_pages() of mm/page_alloc.c
=========================================================
The buddy allocator's __free_pages() function seems to be buggy.
The following codes are from kernel 2.6.10:
fastcall void __free_pages(struct page *page, unsigned int order)
{
if (!PageReserved(page) && put_page_testzero(page)) {
if (order == 0)
free_hot_page(page);
else
__free_pages_ok(page, order);
}
}
As you know, before truely freeing all pages, this function calls
put_page_testzero(page) to
drop the refcount of the pages.
But, in fact the macro put_page_testzero(page) **only** drops **one**
page's refcount.
Therefore, if (order > 0), the refcounts of (page+1) ..
(page+(1<<order)-1) are unchanged!
This will cause __free_pages_ok() to dump stack, because it finds some
pages' page_count()
are not zero!
The following is the OOPS of this bug (I make kernel print verbose info):
slab cache: size-32768 is to be destroyed!
__free_pages: order=3, we are freeing page: 001d8740-001d8858
page 001d8740 count: 1
page 001d8768 count: 1
page 001d8790 count: 1
page 001d87b8 count: 1
page 001d87e0 count: 1
page 001d8808 count: 1
page 001d8830 count: 1
page 001d8858 count: 1
put_page_testzero() has been called!
page 001d8740 count: 0
page 001d8768 count: 1
page 001d8790 count: 1
page 001d87b8 count: 1
page 001d87e0 count: 1
page 001d8808 count: 1
page 001d8830 count: 1
page 001d8858 count: 1
Bad page state at __free_pages_ok (in process 'hello', mem_map
00139000, order 3, page 001d8768, i
1)
flags:0x20000000 mapping:00000000 mapped:0 count:1
Backtrace:
Call Trace:
[<00034b9c>] __free_pages_ok+0x12c/0x1f0
[<000358d0>] __free_pages+0xd0/0x1a0
[<000359f4>] free_pages+0x54/0x70
[<00038df4>] slab_destroy+0x114/0x1e0
[<0003a0b4>] reap_timer_fnc+0x194/0x300
[<0001e058>] do_timer+0x58/0x600
[<0001de7c>] run_timer_softirq+0x10c/0x200
[<0001ace4>] do_softirq+0xb4/0x140
[<00124548>] init_bio+0x8/0x200
[<0000be68>] do_IRQ+0x208/0x280
[<00124548>] init_bio+0x8/0x200
[<000e22f8>] handle_IRQ+0x58/0x60
[<000d7344>] blkdev_ioctl+0x304/0x8f0
[<00122ca0>] free_bootmem_core+0xa0/0xf0
[<00124548>] init_bio+0x8/0x200
[<0001101c>] try_to_wake_up+0x14c/0x250
[<00011140>] wake_up_process+0x20/0x30
[<000370a0>] pdflush_operation+0xc0/0xe0
[<00036ab4>] wb_timer_fn+0x24/0x70
[<0001de7c>] run_timer_softirq+0x10c/0x200
[<00123a88>] free_area_init_node+0x158/0x450
[<00090bb8>] group_reserve_blocks+0x8/0x90
[<0000ee00>] do_gettimeofday+0xb0/0x160
......
Please confirm me! This is a patch to fix this bug:
---------------------------------------------------
--- linux-2.6.10.orig/mm/page_alloc.c 2004-12-25 05:33:51.000000000 +0800
+++ linux-2.6.10/mm/page_alloc.c 2005-01-20 21:31:07.000000000 +0800
@@ -786,9 +786,19 @@
free_hot_cold_page(pvec->pages[i], pvec->cold);
}
+static inline int drop_pages_testzero(struct page *page, unsigned int order)
+{
+ int i, ret = 1;
+ struct page *pg = page;
+
+ for (i = 0; i < (1 << order); i++, pg++)
+ ret &= put_page_testzero(pg);
+ return ret;
+}
+
fastcall void __free_pages(struct page *page, unsigned int order)
{
- if (!PageReserved(page) && put_page_testzero(page)) {
+ if (!PageReserved(page) && drop_pages_testzero(page, order)) {
if (order == 0)
free_hot_page(page);
else
--
Rongkai Zhan
On Thu, Jan 20, 2005 at 09:34:17PM +0800, zhan rongkai wrote:
> [PATCH]: fix the bug of __free_pages() of mm/page_alloc.c
> =========================================================
>
> The buddy allocator's __free_pages() function seems to be buggy.
>
> The following codes are from kernel 2.6.10:
>
> fastcall void __free_pages(struct page *page, unsigned int order)
> {
> if (!PageReserved(page) && put_page_testzero(page)) {
> if (order == 0)
> free_hot_page(page);
> else
> __free_pages_ok(page, order);
> }
> }
>
> As you know, before truely freeing all pages, this function calls
> put_page_testzero(page) to
> drop the refcount of the pages.
>
> But, in fact the macro put_page_testzero(page) **only** drops **one**
> page's refcount.
> Therefore, if (order > 0), the refcounts of (page+1) ..
> (page+(1<<order)-1) are unchanged!
> This will cause __free_pages_ok() to dump stack, because it finds some
> pages' page_count()
> are not zero!
When you allocate a page with order > 0, the first 0-order page has a
refcount of 1, and the remaining 0-order pages have a refcount of 0.
If you're triggering this check, I suspect you're fiddling about with
the individual pages (using get_page on them individually?) which is
a no-no.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core
On Thu, 20 Jan 2005 14:31:34 +0000, Russell King
<[email protected]> wrote:
> On Thu, Jan 20, 2005 at 09:34:17PM +0800, zhan rongkai wrote:
> > [PATCH]: fix the bug of __free_pages() of mm/page_alloc.c
> > =========================================================
> >
> > The buddy allocator's __free_pages() function seems to be buggy.
> >
> > The following codes are from kernel 2.6.10:
> >
> > fastcall void __free_pages(struct page *page, unsigned int order)
> > {
> > if (!PageReserved(page) && put_page_testzero(page)) {
> > if (order == 0)
> > free_hot_page(page);
> > else
> > __free_pages_ok(page, order);
> > }
> > }
> >
> > As you know, before truely freeing all pages, this function calls
> > put_page_testzero(page) to
> > drop the refcount of the pages.
> >
> > But, in fact the macro put_page_testzero(page) **only** drops **one**
> > page's refcount.
> > Therefore, if (order > 0), the refcounts of (page+1) ..
> > (page+(1<<order)-1) are unchanged!
> > This will cause __free_pages_ok() to dump stack, because it finds some
> > pages' page_count()
> > are not zero!
>
> When you allocate a page with order > 0, the first 0-order page has a
> refcount of 1, and the remaining 0-order pages have a refcount of 0.
Thank you for telling me this point.
> If you're triggering this check, I suspect you're fiddling about with
> the individual pages (using get_page on them individually?) which is
> a no-no.
>
> --
> Russell King
>
Oh, I forget to tell you that my CPU has no MMU, sorry:-)
Let's see the function set_page_refs() which is called by
prep_new_page() function:
static inline void set_page_refs(struct page *page, int order)
{
#ifdef CONFIG_MMU
set_page_count(page, 1);
#else
int i;
/*
* We need to reference all the pages for this order, otherwise if
* anyone accesses one of the pages with (get/put) it will be freed.
*/
for (i = 0; i < (1 << order); i++)
set_page_count(page+i, 1);
#endif /* CONFIG_MMU */
}
We can see that it sets all pages' refcount to 1 when there is no MMU.
My previous patch is wrong. Here is new one:
--------------------------------------------
--- linux-2.6.10.orig/mm/page_alloc.c 2004-12-25 05:33:51.000000000 +0800
+++ linux-2.6.10/mm/page_alloc.c 2005-01-21 11:34:57.000000000 +0800
@@ -787,8 +787,23 @@
}
fastcall void __free_pages(struct page *page, unsigned int order)
-{
- if (!PageReserved(page) && put_page_testzero(page)) {
+{
+ if (!PageReserved(page)) {
+#ifdef CONFIG_MMU
+ if (!put_page_testzero(page))
+ return;
+#else
+ int i, result = 1;
+
+ /*
+ * We need to de-reference all the pages for this order -- see
set_page_refs()
+ */
+ for (i = 0; i < (1 << order); i++)
+ result &= put_page_testzero(page);
+ if (!result)
+ BUG();
+#endif /* CONFIG_MMU */
+
if (order == 0)
free_hot_page(page);
else
--
Rongkai Zhan
--- linux-2.6.10.orig/mm/page_alloc.c 2004-12-25 05:33:51.000000000 +0800
+++ linux-2.6.10/mm/page_alloc.c 2005-01-21 11:43:44.000000000 +0800
@@ -788,7 +788,22 @@
fastcall void __free_pages(struct page *page, unsigned int order)
{
- if (!PageReserved(page) && put_page_testzero(page)) {
+ if (!PageReserved(page)) {
+#ifdef CONFIG_MMU
+ if (!put_page_testzero(page))
+ return;
+#else
+ int i, result = 1;
+
+ /*
+ * We need to de-reference all the pages for this order -- see
set_page_refs()
+ */
+ for (i = 0; i < (1 << order); i++)
+ result &= put_page_testzero(page+i);
+ if (!result)
+ BUG();
+#endif /* CONFIG_MMU */
+
if (order == 0)
free_hot_page(page);
else
--
Rongkai Zhan
--- linux-2.6.10.orig/mm/page_alloc.c 2004-12-25 05:33:51.000000000 +0800
+++ linux-2.6.10/mm/page_alloc.c 2005-01-21 11:46:58.000000000 +0800
@@ -788,7 +788,22 @@
fastcall void __free_pages(struct page *page, unsigned int order)
{
- if (!PageReserved(page) && put_page_testzero(page)) {
+ if (!PageReserved(page)) {
+#ifdef CONFIG_MMU
+ if (!put_page_testzero(page))
+ return;
+#else
+ int i, result = 1;
+
+ /*
+ * We need to de-reference all the pages for this order -- see
set_page_refs()
+ */
+ for (i = 0; i < (1 << order); i++)
+ result &= put_page_testzero(page+i);
+ if (!result)
+ BUG();
+#endif /* CONFIG_MMU */
+
if (order == 0)
free_hot_page(page);
else
On Fri, 21 Jan 2005 11:40:52 +0800, zhan rongkai <[email protected]> wrote:
> --- linux-2.6.10.orig/mm/page_alloc.c 2004-12-25 05:33:51.000000000 +0800
> +++ linux-2.6.10/mm/page_alloc.c 2005-01-21 11:43:44.000000000 +0800
> @@ -788,7 +788,22 @@
>
> fastcall void __free_pages(struct page *page, unsigned int order)
> {
> - if (!PageReserved(page) && put_page_testzero(page)) {
> + if (!PageReserved(page)) {
> +#ifdef CONFIG_MMU
> + if (!put_page_testzero(page))
> + return;
> +#else
> + int i, result = 1;
> +
> + /*
> + * We need to de-reference all the pages for this order -- see
> set_page_refs()
> + */
> + for (i = 0; i < (1 << order); i++)
> + result &= put_page_testzero(page+i);
> + if (!result)
> + BUG();
> +#endif /* CONFIG_MMU */
> +
> if (order == 0)
> free_hot_page(page);
> else
>
> --
> Rongkai Zhan
>
--
Rongkai Zhan