Hi,
I experienced several kernel BUGs while running Linux kernel version 2.4.19
on a single-CPU s390 machine with 2GB of RAM and 256MB of swap space. All of
these BUGs happened in page_alloc.c, in the function __free_pages_ok(). That
function contains the check
if (page->mapping) BUG();
which is exactly what triggered: a page had a mapping, but __free_pages_ok()
got called anyway. Looking at the backtrace, I was able to see that this
specific BUG() occurred when page_cache_release() was called from the
function try_to_swap_out().
It looks to me as if this function itself has a bug: after the drop_pte
label there is a check whether the current page has a mapping. If it does,
there is a jump back to the drop_pte label, where page_cache_release() gets
called without any further checking. That will result in the BUG() described
above if page_count(page) == 1.
Here is the output of the kernel (I removed all inline statements in
vmscan.c):
kernel BUG at page_alloc.c:91!
illegal operation: 0001
CPU: 0 Not tainted
80042730 00000001 013c578c 6ce26e00
00000020 575a0001 6ce26e00 00000000
013c578c 80042388 80042730 6c7e13c8
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
Call Trace: [<000430d2>] [<00040eec>] [<00041088>] [<00041132>]
[<000411da>] [<000412cc>] [<000413b0>] [<00041646>]
Warning (Oops_read): Code line not seen, dumping what data is available
Trace; 000430d2 <__free_pages+52/58>
Trace; 00040eec <try_to_swap_out+224/284>
Trace; 00041088 <swap_out_pmd+13c/178>
Trace; 00041132 <swap_out_pgd+6e/a0>
Trace; 000411da <swap_out_vma+76/bc>
Trace; 000412cc <swap_out_mm+ac/d0>
Trace; 000413b0 <swap_out+c0/150>
Trace; 00041646 <shrink_cache+206/5c8>
regards,
Heiko
On Monday 02 September 2002 10:26, Heiko Carstens wrote:
> Looks to me that this function itself has a bug: after the drop_pte label
> it is checked if the current page has a mapping. If this is true there is
> a jump to the drop_pte label, where without any further checking
> page_cache_release() gets called which will result in the above described
> BUG() if page_count(page) == 1.
It's not a bug in itself. The pte was cleared just above, so the reference
being dropped corresponds to the pte that was cleared. Because the page
has a mapping, there is still at least one count on the page that got there
when the page was put in the page cache, so the page won't be freed just
yet. (No, this code is not a model of clarity.)
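To make the reference accounting above concrete, here is a minimal userspace sketch. It is not kernel code: struct model_page, model_page_release() and the result names are hypothetical stand-ins that model only page_count(page) and page->mapping as discussed in this thread.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page; only the two fields
 * discussed in the thread are modelled. */
struct model_page {
    int count;            /* models page_count(page) */
    const void *mapping;  /* models page->mapping; non-NULL while in the page cache */
};

enum release_result {
    STILL_REFERENCED,   /* count stayed above zero, nothing is freed */
    FREED_OK,           /* last reference dropped, no mapping: normal free */
    FREED_WITH_MAPPING  /* last reference dropped while a mapping is set:
                         * the case caught by if (page->mapping) BUG(); */
};

/* Models page_cache_release(): drop one reference, free on the last one. */
static enum release_result model_page_release(struct model_page *page)
{
    if (--page->count > 0)
        return STILL_REFERENCED;
    return page->mapping ? FREED_WITH_MAPPING : FREED_OK;
}
```

With a count of 2 (one reference for the pte, one for the page cache), releasing the pte reference returns STILL_REFERENCED. Only a page that already has a count of 1 while its mapping is still set reaches FREED_WITH_MAPPING, which is the situation the report describes.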
Chances are, you've run into the subtle double-free race I've been working
on for the last few days. Would you like to try this patch and see if it
makes a difference?
http://nl.linux.org/~phillips/patches/lru.race-2.4.19
--
Daniel
Hi Daniel,
>> Looks to me that this function itself has a bug: after the drop_pte
>> label it is checked if the current page has a mapping. If this is true
>> there is ...
>Chances are, you've run into the subtle double-free race I've been working
>on for the last few days. Would you like to try this patch and see if it
>makes a difference?
>http://nl.linux.org/~phillips/patches/lru.race-2.4.19
Thanks for the patch but unfortunately it doesn't change the behaviour at
all. This BUG is still 100% reproducible by just having 1 process which
allocates memory chunks of 256KB and after each allocation writes to each
of the pages in order to make them dirty.
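A reproducer along the lines described above might look like the following sketch. The actual test program was not posted, so the use of malloc(), the one-byte write per page, and the name dirty_chunks() are all assumptions.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK_SIZE (256 * 1024)  /* chunk size given in the report */

/*
 * Allocate up to nchunks chunks and write one byte into every page of
 * each chunk so that every page becomes dirty. The chunks are
 * deliberately never freed, since the point is to build up memory
 * pressure. Returns the number of pages dirtied and stops early if an
 * allocation fails.
 */
static size_t dirty_chunks(size_t nchunks)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page_size = ps > 0 ? (size_t)ps : 4096;  /* fallback value is an assumption */
    size_t pages = 0;

    for (size_t i = 0; i < nchunks; i++) {
        /* volatile keeps the compiler from optimising the stores away */
        volatile unsigned char *chunk = malloc(CHUNK_SIZE);
        if (chunk == NULL)
            break;
        for (size_t off = 0; off < CHUNK_SIZE; off += page_size) {
            chunk[off] = 1;
            pages++;
        }
    }
    return pages;
}
```

Calling dirty_chunks() from main() with a chunk count large enough to exceed RAM plus swap should push the machine into the swap-out path named in the backtrace.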
regards,
Heiko
On Monday 02 September 2002 14:54, Heiko Carstens wrote:
> Hi Daniel,
>
> >> Looks to me that this function itself has a bug: after the drop_pte
> >> label it is checked if the current page has a mapping. If this is true
> >> there is ...
> >Chances are, you've run into the subtle double-free race I've been working
> >on for the last few days. Would you like to try this patch and see if it
> >makes a difference?
> >http://nl.linux.org/~phillips/patches/lru.race-2.4.19
>
> Thanks for the patch but unfortunately it doesn't change the behaviour at
> all. This BUG is still 100% reproducible by just having 1 process which
> allocates memory chunks of 256KB and after each allocation writes to each
> of the pages in order to make them dirty.
Um, no smp --> no free race anyway. But try the following instead, to
start narrowing down the possibilities:
--- ./vmscan.c 2002-09-02 21:15:17.000000000 +0200
+++ mm/vmscan.c 2002-09-02 21:33:24.000000000 +0200
@@ -82,7 +82,7 @@
 	 */
 	if (PageSwapCache(page)) {
 		entry.val = page->index;
-		swap_duplicate(entry);
+		BUG_ON(!swap_duplicate(entry));
 set_swap_pte:
 		set_pte(page_table, swp_entry_to_pte(entry));
 drop_pte:
@@ -109,8 +109,10 @@
 	 * Basically, this just makes it possible for us to do
 	 * some real work in the future in "refill_inactive()".
 	 */
-	if (page->mapping)
+	if (page->mapping) {
+		BUG_ON(page_count(page) == 1);
 		goto drop_pte;
+	}
 	if (!PageDirty(page))
 		goto drop_pte;
Hi,
>> Thanks for the patch but unfortunately it doesn't change the behaviour at
>> all. This BUG is still 100% reproducible by just having 1 process which
>> allocates memory chunks of 256KB and after each allocation writes to each
>> of the pages in order to make them dirty.
>Um, no smp --> no free race anyway. But try the following instead, to
>start narrowing down the possibilities:
Still the same BUG in __free_pages_ok happens, or in other words both of
your checks didn't catch the error...
Any other ideas?
Regards,
Heiko
On Tuesday 03 September 2002 19:16, Heiko Carstens wrote:
> Hi,
>
> >> Thanks for the patch but unfortunately it doesn't change the behaviour at
> >> all. This BUG is still 100% reproducible by just having 1 process which
> >> allocates memory chunks of 256KB and after each allocation writes to each
> >> of the pages in order to make them dirty.
> >Um, no smp --> no free race anyway. But try the following instead, to
> >start narrowing down the possibilities:
>
> Still the same BUG in __free_pages_ok happens, or in other words both of
> your checks didn't catch the error...
My intention was to verify which one of the two possible execution paths
was taken, and also to verify that swap_duplicate doesn't see any problem
(there's a missing error check here). Note that we also definitively
eliminated your original theory since we didn't arrive at the
page_cache_release via the if (page->mapping) path.
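The narrowing-down logic can be sketched as a small userspace model of the tests visible in the diff context. It covers only the three branches shown there, in the order they appear. The enum, both function names, and the NOT_COVERED case are hypothetical and do not claim to describe the rest of try_to_swap_out().

```c
#include <stdbool.h>

/* Hypothetical labels for how control reaches the drop_pte label. */
enum drop_route {
    VIA_SWAP_CACHE,  /* PageSwapCache(page): falls through set_swap_pte into drop_pte */
    VIA_MAPPING,     /* page->mapping is set: goto drop_pte */
    VIA_CLEAN,       /* !PageDirty(page): goto drop_pte */
    NOT_COVERED      /* anything else: code not shown in the diff */
};

/* Apply the three tests in the order they appear in the diff context. */
static enum drop_route classify_route(bool in_swap_cache, bool has_mapping,
                                      bool dirty)
{
    if (in_swap_cache)
        return VIA_SWAP_CACHE;
    if (has_mapping)
        return VIA_MAPPING;
    if (!dirty)
        return VIA_CLEAN;
    return NOT_COVERED;
}

/* Would the added BUG_ON(page_count(page) == 1) have fired on this route? */
static bool mapping_check_fires(enum drop_route route, int count)
{
    return route == VIA_MAPPING && count == 1;
}
```

In this model the added check can only fire on the VIA_MAPPING route with a count of exactly 1. Since it stayed silent, the original theory (a page with a mapping and a count of 1 entering through that branch) does not match what happened.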
> Any other ideas?
Have you trimmed your config down to the absolute minimum?
Is there any such thing as kdb for S390?
--
Daniel