Subject: 2.4.19 BUG in page_alloc.c:91

ksymoops 2.4.4 on i686 2.4.19. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.19/ (default)
-m /boot/System.map-2.4.19 (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Reading Oops report from the terminal
Aug 7 19:23:30 manic kernel: kernel BUG at page_alloc.c:89!
Aug 7 19:23:30 manic kernel: invalid operand: 0000
Aug 7 19:23:30 manic kernel: CPU: 0
Aug 7 19:23:30 manic kernel: EIP: 0010:[<c012c331>] Tainted: P
Using defaults from ksymoops -t elf32-i386 -a i386
Aug 7 19:23:30 manic kernel: EFLAGS: 00010286
Aug 7 19:23:30 manic kernel: eax: 00000000 ebx: c116980c ecx: c116980c edx: 00000000
Aug 7 19:23:30 manic kernel: esi: 00000000 edi: cc5a3e00 ebp: c2b3fae0 esp: e7fe1f44
Aug 7 19:23:30 manic kernel: ds: 0018 es: 0018 ss: 0018
Aug 7 19:23:30 manic kernel: Process kupdated (pid: 6, stackpage=e7fe1000)
Aug 7 19:23:30 manic kernel: Stack: e7fe0000 00000246 c1167b58 c0125a53 00000000 c116980c c116980c c116980c
Aug 7 19:23:30 manic kernel: c2b3fad0 00000000 c2b3fae0 c01256b0 c016b320 00000004 c2b3fa20 c2b3fa20
Aug 7 19:23:30 manic kernel: e7d86c60 c0143316 c2b3fad0 e7d86c00 e7fe1f9c e7fe1f9c 00000000 00000000
Aug 7 19:23:30 manic kernel: Call Trace: [<c0125a53>] [<c01256b0>] [<c016b320>] [<c0143316>] [<c0113490>]
Aug 7 19:23:30 manic kernel: [<c0135355>] [<c0135600>] [<c0105000>] [<c0105000>] [<c0107066>] [<c0135510>]
Aug 7 19:23:30 manic kernel: Code: 0f 0b 59 00 b4 74 25 c0 8b 4b 08 85 c9 74 08 0f 0b 5b 00 b4

>>EIP; c012c331 <__free_pages_ok+21/270> <=====
Trace; c0125a53 <___wait_on_page+b3/c0>
Trace; c01256b0 <filemap_fdatasync+80/90>
Trace; c016b320 <reiserfs_writepage+0/30>
Trace; c0143316 <sync_unlocked_inodes+96/170>
Trace; c0113490 <process_timeout+0/50>
Trace; c0135355 <sync_old_buffers+5/40>
Trace; c0135600 <kupdate+f0/110>
Trace; c0105000 <_stext+0/0>
Trace; c0105000 <_stext+0/0>
Trace; c0107066 <kernel_thread+26/30>
Trace; c0135510 <kupdate+0/110>
Code; c012c331 <__free_pages_ok+21/270>
00000000 <_EIP>:
Code; c012c331 <__free_pages_ok+21/270> <=====
0: 0f 0b ud2a <=====
Code; c012c333 <__free_pages_ok+23/270>
2: 59 pop %ecx
Code; c012c334 <__free_pages_ok+24/270>
3: 00 b4 74 25 c0 8b 4b add %dh,0x4b8bc025(%esp,%esi,2)
Code; c012c33b <__free_pages_ok+2b/270>
a: 08 85 c9 74 08 0f or %al,0xf0874c9(%ebp)
Code; c012c341 <__free_pages_ok+31/270>
10: 0b 5b 00 or 0x0(%ebx),%ebx
Code; c012c344 <__free_pages_ok+34/270>
13: b4 00 mov $0x0,%ah


1 warning issued. Results may not be reliable.
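
Note on the disassembly above: the instructions ksymoops prints after the ud2a are probably not real code. Assuming the i386 BUG() macro in this kernel emits ud2 followed by a 16-bit line number and a 32-bit pointer to the file-name string (my understanding of the 2.4 definition, not something stated in this thread), those bytes are data. A minimal Python sketch that decodes them from a "Code:" line follows; decode_bug_record is a name made up for this illustration.

```python
import struct


def decode_bug_record(code_line):
    """Decode the data that follows ud2 in an i386 oops 'Code:' line.

    Assumed layout: ud2 (0f 0b), 16-bit line number, 32-bit pointer to
    the file-name string, all little-endian.
    """
    raw = bytes(int(tok, 16) for tok in code_line.split())
    if raw[:2] != b"\x0f\x0b":
        raise ValueError("Code: bytes do not start with ud2 (0f 0b)")
    # "<HI" = little-endian, unsigned 16-bit then unsigned 32-bit, no padding
    line_no, file_ptr = struct.unpack_from("<HI", raw, 2)
    return line_no, file_ptr


# Bytes copied from the "Code:" line of the oops above.
code = "0f 0b 59 00 b4 74 25 c0 8b 4b 08 85 c9 74 08 0f 0b 5b 00 b4"
line_no, file_ptr = decode_bug_record(code)
print(f"BUG at line {line_no}, file name string at {file_ptr:#010x}")
```

Under that assumption the first record decodes to line 89, which agrees with "kernel BUG at page_alloc.c:89!" in the log, and the second "0f 0b 5b 00" a few bytes later would be the next BUG() check, at line 91 (0x5b).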


Attachments:
oops.txt (3.03 kB)

2002-08-08 02:47:09

by Rik van Riel

Subject: Re: 2.4.19 BUG in page_alloc.c:91

On Wed, 7 Aug 2002, Anthony Russo., a.k.a. Stupendous Man wrote:

> My info: Pentium III PC, kernel 2.4.19 vanilla, redhat 7.3, reiserfs.
>
> It's not pretty. Let me know what I can do to help.
>
> Aug 7 19:23:30 manic kernel: kernel BUG at page_alloc.c:89!

> Aug 7 19:23:30 manic kernel: EIP: 0010:[<c012c331>] Tainted: P

Which proprietary module have you loaded ?

kind regards,

Rik
--
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/ http://distro.conectiva.com/

Subject: Re: 2.4.19 BUG in page_alloc.c:91

I don't believe I have any "proprietary" modules loaded, but
here is the output of lsmod:


Module Size Used by Tainted: P
nls_iso8859-1 3488 1 (autoclean)
nls_cp437 5120 1 (autoclean)
vfat 11804 1 (autoclean)
fat 36024 0 (autoclean) [vfat]
smbfs 36960 1 (autoclean)
ppp_async 7840 0 (unused)
sb 8992 1 (autoclean)
apm 11976 2
ppa 10688 0 (unused)
rtc 7484 0 (unused)
ppp_synctty 6176 1
ppp_generic 19212 3 [ppp_async ppp_synctty]
slhc 6028 0 [ppp_generic]
n_hdlc 7168 1
sb_lib 39456 0 [sb]
uart401 7744 0 [sb_lib]
sound 69292 1 [sb_lib uart401]
soundcore 6276 5 [sb_lib sound]
isa-pnp 36892 0 [sb]
nfsd 74496 8 (autoclean)
lockd 54976 1 (autoclean) [nfsd]
sunrpc 72628 1 (autoclean) [nfsd lockd]
parport_pc 17444 2 (autoclean)
lp 7872 0 (autoclean)
ipt_ttl 1152 1 (autoclean)
ipt_limit 1568 35 (autoclean)
ipt_unclean 7520 3 (autoclean)
ip_nat_irc 3680 0 (unused)
ip_nat_ftp 4448 0 (unused)
ipt_state 1088 7 (autoclean)
iptable_mangle 2688 0 (unused)
ipt_LOG 4160 1
ipt_MASQUERADE 2560 1
ipt_TOS 1568 0 (unused)
ipt_REDIRECT 1280 0 (unused)
iptable_nat 23860 3 [ip_nat_irc ip_nat_ftp ipt_MASQUERADE ipt_REDIRECT]
ipt_REJECT 3488 0 (unused)
ip_conntrack_irc 3552 0 [ip_nat_irc]
ip_conntrack_ftp 4832 0 [ip_nat_ftp]
ip_conntrack 26636 4 [ip_nat_irc ip_nat_ftp ipt_state ipt_MASQUERADE ipt_REDIRECT iptable_nat ip_conntrack_irc ip_conntrack_ftp]
iptable_filter 2368 1 (autoclean)
ip_tables 15808 14 [ipt_ttl ipt_limit ipt_unclean ipt_state iptable_mangle ipt_LOG ipt_MASQUERADE ipt_TOS ipt_REDIRECT iptable_nat ipt_REJECT iptable_filter]
fa312 5900 1
usb-uhci 24260 0 (unused)
usbcore 77344 1 [usb-uhci]



-- tony


2002-08-08 07:52:21

by Willy Tarreau

Subject: Re: 2.4.19 BUG in page_alloc.c:91

On Wed, Aug 07, 2002 at 10:54:56PM -0400, Anthony Russo., a.k.a. Stupendous Man wrote:
> I don't believe I have any "proprietary" modules loaded, but
> here is the output of lsmod:

But a proprietary module *has* been loaded and then removed before your
lsmod, right? Wouldn't it be nvidia's? The oops resembles the one many
nvidia users experience.

Regards,
Willy

2002-08-08 11:15:21

by Alan

Subject: Re: 2.4.19 BUG in page_alloc.c:91

On Thu, 2002-08-08 at 03:45, Anthony Russo., a.k.a. Stupendous Man
wrote:

> Aug 7 19:23:30 manic kernel: EIP: 0010:[<c012c331>] Tainted: P
> Aug 7 19:23:30 manic kernel: EFLAGS: 00010286

Duplicate the problem from a cold boot without ever having loaded
whatever module set the taint flag (i.e. one that wasn't a standard GPL one).

2002-08-08 12:17:48

by Alan

Subject: Re: [OT] 2.4.19 BUG in page_alloc.c:91

On Thu, 2002-08-08 at 13:19, Andreas Steinmetz wrote:
> # insmod xircom_tulip_cb
> Using /lib/modules/2.4.19/kernel/drivers/net/pcmcia/xircom_tulip_cb.o
> Warning: loading
> /lib/modules/2.4.19/kernel/drivers/net/pcmcia/xircom_tulip_cb.o will taint the
> kernel: non-GPL license - GPL v2

Sounds like it's missing the license tag - OK, that would explain it. I'll
take a look at that.

> That's no criticism of the nvidia handling (I don't own such a card), it is
> more the new "P=must be proprietary" attitude I don't exactly like.

It's the only way I can filter bugs efficiently enough. The alternative
is that I only look at bug reports from Red Hat paying customers, and that
isn't helpful to anyone.
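
As an illustration of the kind of filtering described here, the sketch below pulls the BUG location and the taint state out of the syslog lines of an oops. It is only a sketch: summarize_oops and the two regular expressions are made up for this example, and they only cover the line shapes that appear in this thread.

```python
import re

# Patterns for the two line shapes seen in the oopses posted in this thread.
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")
TAINT_RE = re.compile(r"EIP:.*?(Not tainted|Tainted: (\S+))")


def summarize_oops(lines):
    """Return the BUG file/line and taint state found in oops log lines."""
    summary = {"file": None, "line": None, "tainted": None, "flags": ""}
    for text in lines:
        m = BUG_RE.search(text)
        if m:
            summary["file"], summary["line"] = m.group(1), int(m.group(2))
        m = TAINT_RE.search(text)
        if m:
            summary["tainted"] = m.group(1) != "Not tainted"
            summary["flags"] = m.group(2) or ""
    return summary


report = [
    "Aug 7 19:23:30 manic kernel: kernel BUG at page_alloc.c:89!",
    "Aug 7 19:23:30 manic kernel: EIP: 0010:[<c012c331>] Tainted: P",
]
print(summarize_oops(report))
```

A report whose summary has "tainted" set to True would be the kind asked to reproduce from a cold boot without the offending module.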

2002-08-08 12:15:44

by Andreas Steinmetz

Subject: Re: [OT] 2.4.19 BUG in page_alloc.c:91

On 08 Aug 2002 13:38:56 +0100 Alan Cox <[email protected]> wrote:

Rik and Alan,
not exactly related but please don't be too persistent about "Tainted: P".
Just try to insmod xircom_tulip_cb of a stock 2.4.19 kernel and, surprise:

# insmod xircom_tulip_cb
Using /lib/modules/2.4.19/kernel/drivers/net/pcmcia/xircom_tulip_cb.o
Warning: loading
/lib/modules/2.4.19/kernel/drivers/net/pcmcia/xircom_tulip_cb.o will taint the
kernel: non-GPL license - GPL v2

Trying to load the module, even with no card inserted, results
in "Tainted: P" (modutils-2.4.14).

As long as the current stock kernel isn't completely compliant with the new
rules (e.g. drivers/net/pcmcia/xircom_tulip_cb.c) and as long as it cannot
be assumed that everybody is running the latest modutils, it should not be
assumed by default that anybody has loaded a proprietary module.

That's no criticism of the nvidia handling (I don't own such a card), it is
more the new "P=must be proprietary" attitude I don't exactly like.


Subject: Re: 2.4.19 BUG in page_alloc.c:91

I'm sorry, but I don't have an NVidia card, and I definitely didn't
load and unload that module. It must be something else.

AFAIK, the only proprietary module that is loaded is the fa312,
which is a driver for my ethernet card, which is still loaded
and has never caused any problems for the 1.5 years I've used it.

This problem is new with the 2.4.19 kernel.

-- tony


--
"Surrender to the Void."
<http://162.83.145.190:8080/%7Eapr/surrenderToTheVoid.mp3> -- John Lennon


Subject: Re: 2.4.19 BUG in page_alloc.c:91


Is there a way to tell which module it is that is setting the taint flag?
I can load each module one by one and check after each if the taint flag
is set, but I just need to know how to tell it is set.

Once I can do that (assuming it's a module I can live without) I will
duplicate from cold boot.

Thank you.

>
>Duplicate the problem from a cold boot without ever having loaded
>whatever module set the taint flag (i.e. one that wasn't a standard GPL one).
>
>
>
>

-- tony
"Surrender to the Void."
<http://162.83.145.190:8080/%7Eapr/surrenderToTheVoid.mp3> -- John Lennon


2002-08-08 16:03:50

by Willy Tarreau

Subject: Re: 2.4.19 BUG in page_alloc.c:91

On Thu, Aug 08, 2002 at 11:45:15AM -0400, Anthony Russo., a.k.a. Stupendous Man wrote:

> Is there a way to tell which module it is that is setting the taint flag?
> I can load each module one by one and check after each if the taint flag
> is set, but I just need to know how to tell it is set.

Modinfo could help you by telling you the licence for each module.
In the worst case, manually unload them all, and reload them one at a time.
Modprobe will issue a warning when loading such a module.

BTW, my apologies for wrongly suspecting a removed nvidia driver ;-)

Regards,
Willy
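
On the "how to tell it is set" part of the question: one possibility, not mentioned in this thread and therefore an assumption, is that the kernel exposes the taint value as an integer in /proc/sys/kernel/tainted, with 0 meaning untainted and bit 0 standing for a proprietary module. A hedged Python sketch that could be run after loading each module; parse_taint and read_taint are illustrative names:

```python
from pathlib import Path

# Assumption: this file exists on the running kernel and holds one integer.
TAINT_FILE = Path("/proc/sys/kernel/tainted")


def parse_taint(text):
    """Interpret the taint value; bit 0 is assumed to mean 'proprietary'."""
    value = int(text.strip())
    return {
        "value": value,
        "tainted": value != 0,
        "proprietary": bool(value & 1),
    }


def read_taint(path=TAINT_FILE):
    """Return the parsed taint state, or None if it cannot be read."""
    try:
        return parse_taint(path.read_text())
    except (OSError, ValueError):
        return None


print(read_taint())
```

If the file is not available, the modinfo and modprobe approach described above remains the way to go.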

Subject: Re: 2.4.19 BUG in page_alloc.c:91

Thank you very much. As I suspected, the tainted driver is
the one I use for my NetGear ethernet card (not nvidia :)
I have switched to using the standard netgear driver that
comes with linux and won't taint the kernel. I am now
rebooting and if the problem reoccurs I will follow up
with an email.

Thank you all for the support.

-- tony


--
"Surrender to the Void."
<http://162.83.145.190:8080/%7Eapr/surrenderToTheVoid.mp3> -- John Lennon


2002-08-08 18:20:41

by Alan

Subject: Re: 2.4.19 BUG in page_alloc.c:91

On Thu, 2002-08-08 at 16:42, Anthony Russo., a.k.a. Stupendous Man
wrote:
> AFAIK, the only proprietary module that is loaded is the fa312,
> which is a driver for my ethernet card, which is still loaded
> and has never caused any problems for the 1.5 years I've used it.

You should be able to swap the fa312 driver for the matching open source
driver anyway. If I remember rightly, isn't the 312 a Realtek?

Subject: Re: 2.4.19 BUG in page_alloc.c:91

It's a national semi chip ... the module name is natsemi.o,
and indeed I am using it without incident now in place of fa312.o.

All of my modules are now GPL according to modinfo, so if the problem
reoccurs now that I've rebooted we'll know if it's real.

-- tony


--
"Surrender to the Void."
<http://162.83.143.240:8080/%7Eapr/surrenderToTheVoid.mp3> -- John Lennon


2002-08-08 18:41:18

by Alan

Subject: Re: 2.4.19 BUG in page_alloc.c:91

On Thu, 2002-08-08 at 19:26, Anthony Russo., a.k.a. Stupendous Man
wrote:
> It's a national semi chip ... the module name is natsemi.o,
> and indeed I am using it without incident now in place of fa312.o.
>
> All of my modules are now GPL according to modinfo, so if the problem
> reoccurs now that I've rebooted we'll know if it's real.

Not only is it real, but I have every line of code on your box in front
of me if it does happen with that module. That helps no end in bug
hunting

2002-08-09 09:11:30

by Keith Owens

Subject: Re: [OT] 2.4.19 BUG in page_alloc.c:91

On Thu, 8 Aug 2002 14:19:48 +0200 (CEST),
Andreas Steinmetz <[email protected]> wrote:
>On 08 Aug 2002 13:38:56 +0100 Alan Cox <[email protected]> wrote:
>Rik and Alan,
>not exactly related but please don't be too persistent about "Tainted: P".
>Just try to insmod xircom_tulip_cb of a stock 2.4.19 kernel and, surprise:
># insmod xircom_tulip_cb
>Using /lib/modules/2.4.19/kernel/drivers/net/pcmcia/xircom_tulip_cb.o
>Warning: loading
>/lib/modules/2.4.19/kernel/drivers/net/pcmcia/xircom_tulip_cb.o will taint the
>kernel: non-GPL license - GPL v2

Upgrade to modutils >= 2.4.17. Somebody added 'GPL v2' to a module
license without checking if it was acceptable or not. The definitive
list of acceptable licenses is in include/linux/module.h.
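
To make the mechanism concrete, here is a small Python sketch of a license check of this kind. The license strings in ACCEPTED_OLD are an assumption written from memory; as said above, include/linux/module.h is the definitive list. The function name taints and both set names are hypothetical.

```python
# Assumption: strings accepted as GPL-compatible by an older modutils.
ACCEPTED_OLD = {
    "GPL",
    "GPL and additional rights",
    "Dual BSD/GPL",
    "Dual MPL/GPL",
}
# A newer modutils (>= 2.4.17 per this message) also accepts "GPL v2".
ACCEPTED_NEW = ACCEPTED_OLD | {"GPL v2"}


def taints(license_string, accepted):
    """True if loading a module with this license tag would taint.

    A missing tag (None) is treated as tainting, as is any string
    that is not in the accepted set.
    """
    return license_string is None or license_string not in accepted


for name, accepted in (("old", ACCEPTED_OLD), ("new", ACCEPTED_NEW)):
    print(name, "modutils, 'GPL v2' taints:", taints("GPL v2", accepted))
```

With the older set the "GPL v2" string is rejected, which would match the "non-GPL license - GPL v2" warning quoted above.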

2002-08-14 15:15:38

by Ingo Saitz

Subject: Re: 2.4.19 BUG in page_alloc.c:91

I got the same bug this morning. Untainted 2.4.19 vanilla kernel.
I just noticed it because my kernel was creating Zombies ~10 hours
later.

My system: Debian unstable, P2 Celeron (Mendocino), ext3, bttv card,
lirc 0.6.5 and ati.2 driver from gatos.sf.net installed (does not
require kernel modules).

Aug 14 05:42:31 pinguin kernel: kernel BUG at page_alloc.c:91!
Aug 14 05:42:31 pinguin kernel: invalid operand: 0000
Aug 14 05:42:31 pinguin kernel: CPU: 0
Aug 14 05:42:31 pinguin kernel: EIP: 0010:[__free_pages_ok+45/612] Not tainted
Aug 14 05:42:31 pinguin kernel: EFLAGS: 00010282
Aug 14 05:42:31 pinguin kernel: eax: c1301bc0 ebx: c100ffdc ecx: c100ffdc edx: c02cfc40
Aug 14 05:42:31 pinguin kernel: esi: 00000000 edi: 00000011 ebp: 00000200 esp: c15c7f18
Aug 14 05:42:31 pinguin kernel: ds: 0018 es: 0018 ss: 0018
Aug 14 05:42:31 pinguin kernel: Process kswapd (pid: 5, stackpage=c15c7000)
Aug 14 05:42:31 pinguin kernel: Stack: d3bfb620 c100ffdc 00000011 00000200 c01350dc c100ffdc 000001d0 00000011
Aug 14 05:42:31 pinguin kernel: 00000200 c013365c d3bfb620 c100ffdc c012b652 c012c583 c012b68b 00000020
Aug 14 05:42:31 pinguin kernel: 000001d0 00000020 00000006 00000006 c15c6000 00002a19 000001d0 c02cfdd4
Aug 14 05:42:31 pinguin kernel: Call Trace: [try_to_free_buffers+144/228] [try_to_release_page+68/72] [shrink_cache+474/748] [__free_pages+27/28] [shrink_cache+531/748]
Aug 14 05:42:31 pinguin kernel: [shrink_caches+88/128] [try_to_free_pages+55/88] [kswapd_balance_pgdat+67/140] [kswapd_balance+18/40] [kswapd+153/188] [kernel_thread+40/56]
Aug 14 05:42:31 pinguin kernel:
Aug 14 05:42:31 pinguin kernel: Code: 0f 0b 5b 00 d3 b3 27 c0 89 d8 2b 05 b0 14 33 c0 69 c0 a3 8b

Ingo
--
echo "I love windows" | tr wisd \\buxi

2002-08-27 12:43:31

by Salvador Eduardo Tropea

Subject: Re: 2.4.19 BUG in page_alloc.c:91

I can't find a final conclusion about this topic.
What I found is that it happens some hours after the kernel killed a
process that was eating the memory.
Example:

Aug 26 13:07:22 ice kernel: Out of Memory: Killed process 18432 (mozilla-bin).
Aug 26 13:07:22 ice kernel: Out of Memory: Killed process 18443 (mozilla-bin).
Aug 26 13:07:22 ice kernel: Out of Memory: Killed process 18444 (mozilla-bin).
Aug 26 13:07:22 ice kernel: Out of Memory: Killed process 18445 (mozilla-bin).
Aug 26 13:07:22 ice kernel: Out of Memory: Killed process 18447 (mozilla-bin).
Aug 26 13:07:22 ice kernel: Out of Memory: Killed process 30410 (mozilla-bin).
....
Aug 27 06:26:00 ice kernel: kernel BUG at page_alloc.c:91!
Aug 27 06:26:00 ice kernel: invalid operand: 0000
Aug 27 06:26:00 ice kernel: CPU: 0
Aug 27 06:26:00 ice kernel: EIP: 0010:[__free_pages_ok+45/624] Tainted: P
Aug 27 06:26:00 ice kernel: EFLAGS: 00010286
Aug 27 06:26:00 ice kernel: eax: c103a528 ebx: c1105760 ecx: c1105760 edx: c0228e9c
Aug 27 06:26:00 ice kernel: esi: 00000000 edi: 00000016 ebp: 0000005e esp: c7f93f18
Aug 27 06:26:00 ice kernel: ds: 0018 es: 0018 ss: 0018
Aug 27 06:26:00 ice kernel: Process kswapd (pid: 4, stackpage=c7f93000)
Aug 27 06:26:00 ice kernel: Stack: c22d28c0 c1105760 00000016 0000005e c013755c c1105760 000001d0 00000016
Aug 27 06:26:00 ice kernel: 0000005e c0135a09 c22d28c0 c1105760 c012d482 c012e3fb c012d4bb 00000020
Aug 27 06:26:00 ice kernel: 000001d0 00000020 00000006 00000006 c7f92000 000003a4 000001d0 c0229034
Aug 27 06:26:00 ice kernel: Call Trace: [try_to_free_buffers+140/224] [try_to_release_page+73/80] [shrink_cache+482/768] [__free_pages+27/32] [shrink_cache+539/768]
Aug 27 06:26:00 ice kernel: [shrink_caches+86/128] [try_to_free_pages+60/96] [kswapd_balance_pgdat+67/144] [kswapd_balance+22/48] [kswapd+157/192] [kernel_thread+40/64]

At 6:25 cron started the regular locate database update and other
tasks. I guess this made the kernel reduce the size of the cache and fail.
I saw reports that it happens with "non-tainted" kernels and with or without
the nVidia driver, so perhaps it's a real bug, introduced in 2.4.19, in the
mechanism the kernel uses to kill a process that is eating all the memory.
I can file a full bug report, but perhaps one of you knows the
conclusion of this thread and I'm just too stupid to find it.
Note that it doesn't happen, even for days, as long as the kernel doesn't
kill mozilla-bin. Once the kernel kills this process it is just a matter of
hours before the BUG, and from that point on the cache size won't be reduced.
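
To put a number on "a matter of hours", the gap between the first OOM kill and the BUG in the log above can be computed from the syslog timestamps. This is a minimal Python sketch: syslog_time is a made-up helper, the year 2002 is taken from the date of this post because syslog lines carry no year, and %b assumes an English locale.

```python
from datetime import datetime


def syslog_time(line, year=2002):
    """Parse the 'Mon DD HH:MM:SS' prefix of a syslog line."""
    month, day, clock = line.split()[:3]
    # The year is supplied explicitly because syslog omits it.
    return datetime.strptime(f"{year} {month} {day} {clock}",
                             "%Y %b %d %H:%M:%S")


oom = "Aug 26 13:07:22 ice kernel: Out of Memory: Killed process 18432 (mozilla-bin)."
bug = "Aug 27 06:26:00 ice kernel: kernel BUG at page_alloc.c:91!"

gap = syslog_time(bug) - syslog_time(oom)
print(gap)  # 17:18:38
```

For this log the gap is a little over 17 hours.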

SET

--
Salvador Eduardo Tropea (SET). (Electronics Engineer)
Visit my home page: http://welcome.to/SetSoft or
http://www.geocities.com/SiliconValley/Vista/6552/
Alternative e-mail: [email protected] [email protected]
Address: Curapaligue 2124, Caseros, 3 de Febrero
Buenos Aires, (1678), ARGENTINA Phone: +(5411) 4759 0013