2002-01-24 19:06:26

by Bernd Schubert

Subject: kernel BUG at slab.c:1200!

Hi all,

we have a machine here that is quite unstable. With 2.2.16 it crashed quite often, and now, after a big system
update, it also crashes with 2.4.17.
But at least /var/log/messages shows the following errors:

Jan 24 17:16:19 noether kernel: kernel BUG at slab.c:1200!
Jan 24 17:16:19 noether kernel: invalid operand: 0000
Jan 24 17:16:19 noether kernel: CPU: 1
Jan 24 17:16:19 noether kernel: EIP: 0010:[<c0137421>] Not tainted
Jan 24 17:16:19 noether kernel: EFLAGS: 00010082
Jan 24 17:16:19 noether kernel: eax: 0000001b ebx: c3217760 ecx: c03235ac edx: 000118e7
Jan 24 17:16:19 noether kernel: esi: 00000000 edi: f764976c ebp: f7634013 esp: f7bfbf10
Jan 24 17:16:19 noether kernel: ds: 0018 es: 0018 ss: 0018
Jan 24 17:16:19 noether kernel: Process kswapd (pid: 5, stackpage=f7bfb000)
Jan 24 17:16:19 noether kernel: Stack: c02b9ff7 000004b0 f764976c f7634013 f7635013 c3217760 c0138780 c3217760
Jan 24 17:16:19 noether kernel: f764976c f7634013 00000020 000001d0 00000020 00000006 f7ba2000 00001000
Jan 24 17:16:19 noether kernel: f7634013 c3217784 c03a9bc8 c3217770 00000006 c32231a4 c322316c 00000000
Jan 24 17:16:19 noether kernel: Call Trace: [<c0138780>] [<c013a1d8>] [<c013a27c>] [<c013a313>] [<c013a36e>]
Jan 24 17:16:19 noether kernel: [<c013a47d>] [<c01056e4>]
Jan 24 17:16:19 noether kernel:
Jan 24 17:16:19 noether kernel: Code: 0f 0b 83 c4 08 8b 5f 14 83 fb ff 74 23 89 f6 39 f3 75 14 68

The machine is a dual-processor Pentium III-800 (the kernel boot log is attached).

About half an hour before the crash we got the following message:

Jan 24 16:35:00 noether kernel: mtrr: Serverworks LE detected. Write-combining disabled.
Jan 24 16:35:00 noether kernel: mtrr: your processor doesn't support write-combining
Jan 24 16:35:00 noether kernel: mtrr: Serverworks LE detected. Write-combining disabled.
Jan 24 16:35:00 noether kernel: mtrr: your processor doesn't support write-combining

I'm currently running a memory check, but I'm not sure whether memtest86 can detect any errors, since ECC RAM is installed.
The machine has 2GB RAM + 1GB swap. I hope that I have compiled the kernel with 4GB support
(but the line "1023MB HIGHMEM available" from boot.msg confuses me).
The machine crashed while a memory allocation test was running.

Perhaps one of you can help me to debug the problem (please tell me if you need further information).

Thanks in advance,

Bernd


--
Bernd Schubert
Physikalisch Chemisches Institut
Abt. Theoretische Chemie
INF 229, 69120 Heidelberg
Tel.: 06221/54-5210
e-mail: [email protected]



Attachments:
boot.msg.gz (4.29 kB)

2002-01-24 19:40:58

by Bernd Schubert

Subject: Re: kernel BUG at slab.c:1200!

On Thursday 24 January 2002 20:17, you wrote:
> > Jan 24 17:16:19 noether kernel: kernel BUG at slab.c:1200!
>
> so the slab is corrupted. you need to decode the oops,
> so the backtrace can be seen. this is described in the faq.

OK, I'll try to do it tomorrow, but I think it's a RAM problem: memtest86 just found several errors and it hasn't finished yet.

Bernd
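
Editorial note: before decoding an oops like the one above, it can help to pull the raw code addresses out of the syslog text. The following is a minimal Python sketch, not part of the original thread. It assumes the lines carry the usual "date host kernel:" syslog prefix seen in the report; the function names are hypothetical.

```python
import re

# Matches the "Mon DD HH:MM:SS host kernel:" prefix added by syslog
# (assumed format, based on the log lines quoted in this thread).
SYSLOG_PREFIX = re.compile(r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+kernel:\s?")
# Matches bracketed code addresses such as [<c0138780>].
ADDRESS = re.compile(r"\[<([0-9a-fA-F]{8})>\]")


def strip_syslog_prefix(lines):
    """Return the lines with the syslog prefix and trailing newline removed."""
    return [SYSLOG_PREFIX.sub("", line.rstrip("\n")) for line in lines]


def extract_addresses(lines):
    """Return (eip, call_trace) from an oops, with addresses as integers.

    eip is None if no EIP line is found. The call trace may continue on
    following lines, so collection stops at the first line without addresses.
    """
    eip = None
    trace = []
    in_trace = False
    for line in strip_syslog_prefix(lines):
        if line.startswith("EIP:"):
            match = ADDRESS.search(line)
            if match:
                eip = int(match.group(1), 16)
            in_trace = False
        elif line.startswith("Call Trace:"):
            in_trace = True
            trace.extend(int(a, 16) for a in ADDRESS.findall(line))
        elif in_trace:
            found = ADDRESS.findall(line)
            if found:
                trace.extend(int(a, 16) for a in found)
            else:
                in_trace = False
    return eip, trace
```

Feeding it the lines from /var/log/messages gives the faulting address and the list of return addresses, which can then be matched against symbols.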

2002-01-25 13:44:56

by Robert Love

Subject: Re: kernel BUG at slab.c:1200!

On Thu, 2002-01-24 at 14:06, Bernd Schubert wrote:

> we have a machine here that is quite unstable. With 2.2.16 it crashed quite often, and now, after a big system
> update, it also crashes with 2.4.17.
> But at least /var/log/messages shows the following errors:

Hi. The BUG is caused by extra checks on free that are present because of
the slab debugging you have enabled. I suspect you wouldn't notice a thing
if you turned debugging off ;)

That said, you need to run the oops through ksymoops so we can get a
back trace and see who the offending caller is.

Robert Love
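
Editorial note: the core of turning oops addresses into a readable backtrace is a nearest-symbol lookup against the System.map of the running kernel. The sketch below illustrates only that idea in Python. It is not how ksymoops is implemented, the function names are hypothetical, and it assumes the usual three-column "address type name" System.map layout. It gives meaningful results only if the System.map matches the kernel that crashed, and it does not cover code in modules.

```python
import bisect


def load_system_map(lines):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) for the closest symbol at or below address.

    Returns None if the address lies below the first symbol.
    """
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return None
    start, name = symbols[index]
    return name, address - start
```

With the symbol list loaded once from System.map, each address from the "Call Trace:" line can be passed to resolve() to get a name plus offset, which is the kind of information the list needs to identify the offending caller.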

2002-01-28 16:51:41

by Bernd Schubert

Subject: Re: kernel BUG at slab.c:1200!

Hi,

we have removed a faulty RAM module but now have other problems (and don't see
the reason why): lots of kernel messages about defective hardware and doubly
reserved IRQs. The kernel still boots, but the machine is now very, very slow.
I think it's time to call the warranty hotline about hardware damage.
Sorry, but currently the machine is definitely not in a state to do any kernel
debugging. When it's up and running again, we will try to run ksymoops.

Thanks for your help,

Bernd

On Friday 25 January 2002 14:49, Robert Love wrote:
> On Thu, 2002-01-24 at 14:06, Bernd Schubert wrote:
> > we have a machine here that is quite unstable. With 2.2.16 it crashed
> > quite often, and now, after a big system update, it also crashes
> > with 2.4.17.
> > But at least /var/log/messages shows the following errors:
>
> Hi. The BUG is caused by extra checks on free that are present because of
> the slab debugging you have enabled. I suspect you wouldn't notice a thing
> if you turned debugging off ;)
>
> That said, you need to run the oops through ksymoops so we can get a
> back trace and see who the offending caller is.
>
> Robert Love