2002-03-25 00:56:51

by Arjan Opmeer

Subject: Anyone else seen VM related oops on 2.4.18?


Hi,

Are there other people that are suffering from a VM related oops on kernel
2.4.18?

The complication is that shortly before the oopses started I upgraded both
the kernel to 2.4.18 and the NVidia driver module to the latest
version.

Yes, I know that using a closed source module taints the kernel and that I
cannot expect much help from the kernel hacker community, but I am just
trying to find out whether the kernel or the driver upgrade is causing this
problem.

Another problem is that the oops occurs during the morning cronjobs, like
updatedb, when I am not at the machine to capture a better oops trace. For
anyone interested in taking a look anyway, this is what it looks like in the
log:

invalid operand: 0000
CPU: 0
EIP: 0010:[__free_pages_ok+40/500] Tainted: P
EFLAGS: 00010282
eax: c12bc79c ebx: c1311980 ecx: c1311980 edx: c01e1520
esi: c1311980 edi: 00000000 ebp: 00000b08 esp: c1437f28
ds: 0018 es: 0018 ss: 0018
Process kswapd (pid: 4, stackpage=c1437000)
Stack: c2a0e440 c1311980 0000001e 00000b08 c1311980 000001d0 c2a0e440 c1311980
c0127a72 c01287f3 c0127aab 00000020 000001d0 00000020 00000006 00000006
c1436000 0000011a 000001d0 c01e1648 c0127ca1 00000006 0000001b 00000006
Call Trace: [shrink_cache+454/724] [__free_pages+27/28] [shrink_cache+511/724] [shrink_caches+93/132] [try_to_free_pages+52/84]
[kswapd_balance_pgdat+67/140] [kswapd_balance+18/40] [kswapd+153/188] [kernel_thread+40/56]

Code: 0f 0b 89 d8 2b 05 ec 8e 23 c0 c1 f8 06 3b 05 e0 8e 23 c0 72

invalid operand: 0000
CPU: 0
EIP: 0010:[__free_pages_ok+40/500] Tainted: P
EFLAGS: 00010282
eax: c1320d1c ebx: c13c33c0 ecx: c13c33c0 edx: c01e1520
esi: c13c33c0 edi: 00000000 ebp: 00000b12 esp: cdf67e20
ds: 0018 es: 0018 ss: 0018
Process find (pid: 31508, stackpage=cdf67000)
Stack: c2a0e140 c13c33c0 00000020 00000b12 c13c33c0 000001d2 c2a0e140 c13c33c0
c0127a72 c01287f3 c0127aab 00000020 000001d2 00000020 00000006 00000006
cdf66000 0000011b 000001d2 c01e1648 c0127ca1 00000006 0000001b 00000006
Call Trace: [shrink_cache+454/724] [__free_pages+27/28] [shrink_cache+511/724] [shrink_caches+93/132] [try_to_free_pages+52/84]
[balance_classzone+78/364] [__alloc_pages+254/352] [read_cache_page+61/288] [_alloc_pages+22/24] [read_cache_page+84/288] [ext2_get_page+29/112]
[ext2_readpage+0/20] [ext2_readdir+222/504] [vfs_readdir+89/124] [filldir64+0/276] [sys_getdents64+79/179] [filldir64+0/276]
[sys_fcntl64+127/136] [system_call+51/64]
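
For readers who want to compare several such reports, a small script can boil each oops down to the faulting symbol, the process, and the call chain. The following is only a sketch: it assumes the log layout shown in the excerpt above (each report starting with "invalid operand:" and symbols printed as [name+offset/size]), and the function name summarize_oopses is made up for this illustration.

```python
import re

# Bracketed entries of the form [symbol+offset/size], as printed in the log above.
SYMBOL = re.compile(r"\[([A-Za-z_]\w*)\+(\d+)/(\d+)\]")
PROCESS = re.compile(r"Process (\S+) \(pid: (\d+)")


def summarize_oopses(log_text):
    """Split a log into oops reports and summarize each one.

    Assumes every report begins with a line starting 'invalid operand:'.
    """
    reports = []
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("invalid operand:"):
            current = {"eip": None, "process": None, "pid": None, "trace": []}
            reports.append(current)
        elif current is None:
            continue  # text before the first report
        elif line.startswith("EIP:"):
            m = SYMBOL.search(line)
            if m:
                current["eip"] = m.group(1)
        elif line.startswith("Process"):
            m = PROCESS.match(line)
            if m:
                current["process"] = m.group(1)
                current["pid"] = int(m.group(2))
        elif line.startswith("Call Trace:") or line.startswith("["):
            # The trace wraps over several lines; continuation lines start with "[".
            current["trace"].extend(m.group(1) for m in SYMBOL.finditer(line))
    return reports
```

Run over the two reports above, this would make it easy to see that both fault in __free_pages_ok and both arrive there through shrink_cache, one from kswapd and one from a find process.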


2002-03-25 01:02:00

by Rik van Riel

Subject: Re: Anyone else seen VM related oops on 2.4.18?

On Mon, 25 Mar 2002, Arjan Opmeer wrote:

> Are there other people that are suffering from a VM related oops on kernel
> 2.4.18?
>
> ... NVidia driver ...
>
> ... I am just trying to find out whether the kernel or the driver
> upgrade is causing this problem.

Well, can you reproduce the problem without the NVidia driver ?

regards,

Rik
--
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/ http://distro.conectiva.com/

2002-03-25 01:12:04

by Arjan Opmeer

Subject: Re: Anyone else seen VM related oops on 2.4.18?

On Sun, Mar 24, 2002 at 10:01:09PM -0300, Rik van Riel wrote:
> On Mon, 25 Mar 2002, Arjan Opmeer wrote:
>
> > Are there other people that are suffering from a VM related oops on
> > kernel 2.4.18?
> >
> > ... I am just trying to find out whether the kernel or the driver
> > upgrade is causing this problem.
>
> Well, can you reproduce the problem without the NVidia driver ?

I am waiting... :)

It usually takes a week or so before the oops occurs. I am now running
without the NVidia driver and will see what happens...


Arjan

2002-03-25 01:54:19

by Andrew Morton

Subject: Re: Anyone else seen VM related oops on 2.4.18?

Arjan Opmeer wrote:
>
> On Sun, Mar 24, 2002 at 10:01:09PM -0300, Rik van Riel wrote:
> > On Mon, 25 Mar 2002, Arjan Opmeer wrote:
> >
> > > Are there other people that are suffering from a VM related oops on
> > > kernel 2.4.18?
> > >
> > > ... I am just trying to find out whether the kernel or the driver
> > > upgrade is causing this problem.
> >
> > Well, can you reproduce the problem without the NVidia driver ?
>
> I am waiting... :)
>

Please ensure that the kernel was built with `verbose BUG reporting',
under the kernel hacking menu.

And when it happens again, make sure that you take note
of the line number at which it's hitting the BUG(). It'll
be `Kernel BUG at page_alloc.c:NNN'.
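
Since the oops tends to happen unattended, it may help to scan the saved logs for that line afterwards. A minimal sketch, assuming the message has the form "kernel BUG at <file>:<line>!" (the exact capitalization and the trailing "!" are treated as optional here); the function name find_bug_sites is hypothetical:

```python
import re

# Assumed message shape: "kernel BUG at page_alloc.c:NNN!"
# Case is ignored and the trailing "!" is optional.
BUG_LINE = re.compile(r"kernel BUG at (\S+?):(\d+)!?", re.IGNORECASE)


def find_bug_sites(log_text):
    """Return (file, line) pairs for every BUG report found in the log text."""
    return [(m.group(1), int(m.group(2))) for m in BUG_LINE.finditer(log_text)]


if __name__ == "__main__":
    # Made-up line number, for illustration only.
    print(find_bug_sites("Mar 25 06:25:01 host kernel: kernel BUG at page_alloc.c:123!"))
```

The file and line number can then be looked up in the source tree the running kernel was built from.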


2002-03-25 03:16:53

by Arjan Opmeer

Subject: Re: Anyone else seen VM related oops on 2.4.18?

On Sun, Mar 24, 2002 at 05:52:26PM -0800, Andrew Morton wrote:
>
> Please ensure that the kernel was built with `verbose BUG reporting',
> under the kernel hacking menu.

Okay. Just built a new kernel with that option activated. Had some trouble
with unresolved symbols in essential modules, which only building from a
freshly untarred kernel tree seemed to solve.

Running that kernel now and waiting again... :)


Arjan

2002-03-27 10:06:37

by Stas Sergeev

Subject: Re: Anyone else seen VM related oops on 2.4.18?

Hello.

Arjan Opmeer wrote:
> Are there other people that are suffering from a VM related oops on kernel
> 2.4.18?
Yes:(
I've been seeing that oops constantly since installing a
new video card, a Radeon 7500 AGP.
Before that I had a PCI video card.
When DRI is enabled, the whole box hangs after
~10 minutes of using OpenGL, and if DRI is disabled
and radeon.o is unloaded, I get VM-related oopses.
Exactly the same: invalid operand in __free_pages_ok().
I still have a lot of them in my system log, but I am
afraid I don't have the matching System.map for
running them through ksymoops.
I tried switching to the -ac tree and what I get is just
about the same oops:
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0203.3/0308.html

BUT: Andrea's VM patches seem to cure that!
I have only been running them for two days, so anything is
still possible, but before that it used to oops
about every half an hour.
So a thousand thanks go to Andrea:)

Btw, I've seen exactly the same reports on the DRI
mailing lists. They involved different video
cards, but the common factor was that all the reporters
had an AMD 751 Irongate as host bridge.
I also have it.
What is your north bridge?
It really seems strange to me that a video card
or the host bridge would cause the VM to oops, but who knows...
Anyway, it is definitely not NVidia driver related:(

If anyone wants me to reproduce this oops and run it
through ksymoops, feel free to ask. I am ready to do just
about anything to get this problem fixed, otherwise
I just can't use my new cool Radeon...
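
On the missing System.map: the reason the matching file matters is that decoding an oops comes down to mapping each raw address to the nearest symbol at or below it, and those addresses shift with every build. A rough sketch of that lookup, assuming the usual three-column "address type name" layout of System.map; the function names and the sample addresses in the usage note are made up for illustration and this is not how ksymoops itself is implemented:

```python
import bisect


def load_system_map(text):
    """Parse 'address type name' lines into a list sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not a plain symbol line
        addr, _symbol_type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) of the closest symbol at or below address, or None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        return None  # address lies below the first symbol
    base, name = symbols[i]
    return name, address - base
```

For example, with a hypothetical map containing "c0100000 T func_a" and "c0100100 T func_b", resolve(symbols, 0xc0100010) gives ("func_a", 16). With a System.map from a different build the same address would land on the wrong symbol, which is why old log entries cannot be decoded reliably.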

2002-03-27 11:12:30

by Mike Galbraith

Subject: Re: Anyone else seen VM related oops on 2.4.18?

On Wed, 27 Mar 2002, Stas Sergeev wrote:

> Hello.
>
> Arjan Opmeer wrote:
> > Are there other people that are suffering from a VM related oops on kernel
> > 2.4.18?
> Yes:(
> I've been seeing that oops constantly since installing a
> new video card, a Radeon 7500 AGP.
> Before that I had a PCI video card.
> When DRI is enabled, the whole box hangs after
> ~10 minutes of using OpenGL, and if DRI is disabled
> and radeon.o is unloaded, I get VM-related oopses.

You can use dri with your 7500? Same processor as 8500 cards?
If so, which X sources are you using?

I bought an 8500 Evil Master II specifically because I saw Radeon
support in the kernel and X source. Much to my chagrin, I can't
use dri because the source (4.2.0) says dri is not yet implemented
for that processor.

-Mike

2002-03-27 12:29:51

by Stas Sergeev

Subject: Re: Anyone else seen VM related oops on 2.4.18?

Hello.

Mike Galbraith wrote:
> You can use dri with your 7500?
Yes. And it works really great until
it either locks up the system or gets killed
by an oops.

> Same processor as 8500 cards?
No idea.

> If so, which X sources are you using?
Mine works with 4.2.0's DRI, with the latest
DRI from dri.sourceforge.net and with the
alternative drivers from gatos.sourceforge.net.
They all lock up my system, probably due to
the AMD Irongate.

> I bought an 8500 Evil Master II specifically because I saw Radeon
> support in the kernel and X source. Much to my chagrin, I can't
> use dri because the source (4.2.0) says dri is not yet implemented
> for that processor.
I was also confused by the fact that 7500
support doesn't exist in 4.1.0, but 4.2.0
really does support it.
Anyway, visit gatos.sourceforge.net and try
the drivers from there.