2003-01-18 12:53:52

by Peter Karlsson

Subject: 2.4.20 kernel crashes while scanning partition list

I am installing Debian Linux on my new Athlon XP PC, and I have
problems with the 2.4.20 kernel crashing on boot. I can successfully
boot using the 2.4.18 kernel build that is included with the Debian 3.0
install CDs.

The problem seems to stem from my use of the Promise FastTrak133 RAID
controller (yes, I know they say it sucks, but it's what is integrated
on the MB). In 2.4.18, I can get ataraid working, although I have to
specify the address to the IDE controller manually. 2.4.20 picks that
up automatically, but crashes when enumerating the HD partitions
(reproducible every time).

The crash output looks like this (I had to write it down by hand, hopefully
there will be no typos):

{{{output from enumerating partitions on /dev/ataraid/d0}}}
Unable to handle kernel NULL pointer dereference at virtual address 0000000
printing eip: 00000000
*pde = 00000000
Oops: 00000000
CPU: 0
EIP: 0010:[<00000000>] Not tainted
EFLAGS: 00010282
eax: 00000000 ebx: 00000000 ecx: c038c740 edx: 00000000
esi: f7e80f40 edi: 0040f5c8 ebp: 06ed3cf9 esp: c1c17d94
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 1, stackpage=c1c17000)
Stack: c01e5ddc c038c740 00000000 f7e80f40 00000002 00000000 f7e80dc0 c1a7b3d0
c01e5e3e 00000000 f7e80f40 00000000 00000006 c01351b0 00000000 f7e80f40
c1a7b3d0 d7fbcdb4 00000000 f7ed0d30 00001000 00000004 00000400 f7e80f40
Call trace: [<c01e5ddc>] [<c01e5e3e>] [<c01351b0>] [<c012580c>] [<c0137d6f>]
[<c0137cd0>] [<c0127c75>] [<c0152162>] [<c0137d60>] [<c01522a3>]
[<c0116315>] [<c015278b>] [<c014538e>] [<c013811d>] [<c0151f84>]
[<c01345f8>] [<c0296373>] [<c02093a8>] [<c02094ac>] [<c0209437>]
[<c01520f7>] [<c0152036>] [<c0208af4>] [<c0105037>] [<c01055b8>]
Code: Bad EIP value.
<0>Kernel panic: Attempted to kill init!

The corresponding System.map file can be downloaded at
<URL:http://www.softwolves.pp.se/tmp/2.4.20/System.map-2.4.20> and a
partition list from sfdisk -l can be found at
<URL:http://www.softwolves.pp.se/tmp/2.4.20/partlist.txt>.
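
For readers who want to turn the raw call-trace addresses into symbol names by hand, the lookup against System.map can be sketched roughly as below. This is only a minimal sketch of the nearest-symbol search, not a replacement for ksymoops. The three map entries and their symbol names are hypothetical placeholders, not taken from the real System.map-2.4.20; in practice the lines would be read from that file.

```python
import bisect


def load_system_map(lines):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        address, _kind, name = parts
        symbols.append((int(address, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) for the nearest symbol at or below address."""
    addresses = [a for a, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None  # address lies below the first known symbol
    base, name = symbols[index]
    return name, address - base


# Hypothetical excerpt: these names are invented for illustration only.
SAMPLE_MAP = """\
c0135100 T example_read_page
c01e5d00 T example_request_fn
c01e5e00 t example_unplug
""".splitlines()

symbols = load_system_map(SAMPLE_MAP)

# First three addresses from the call trace in the oops above.
for addr in (0xC01E5DDC, 0xC01E5E3E, 0xC01351B0):
    name, offset = resolve(symbols, addr)
    print(f"{addr:08x} {name}+0x{offset:x}")
```

With the real map file, `load_system_map(open("System.map-2.4.20"))` would take the place of the sample excerpt.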

The motherboard is an MSI KT3Ultra2 ("VIA KT333 chipset based"), the CPU
is an Athlon XP2200+, and the machine has 1 Gbyte of RAM.

Any help in getting this resolved would be greatly appreciated! I
really would like to move on from my old K6 machine...

Please Cc replies to me, I do not subscribe to the list due to its
volume.
--
\\//
Peter - http://www.softwolves.pp.se/

I do not read or respond to mail with HTML attachments.


2003-01-19 01:00:32

by Joshua Kwan

Subject: Re: 2.4.20 kernel crashes while scanning partition list

I ran your oops log through ksymoops; it doesn't look like you made an
error writing it down :)

I have no clue what is going on, though - I'm just attaching it to see
if the list can make any sense of it.

Attached (ksymoops.log) is the aforementioned file.

Regards
Josh

On Sat, Jan 18, 2003 at 02:02:40PM +0100, Peter Karlsson wrote:
> I am installing Debian Linux on my new Athlon XP PC, and I have
> problems with the 2.4.20 kernel crashing on boot. I can successfully
> boot using the 2.4.18 kernel build that is included with the Debian 3.0
> install cds.

--
.-`-.-`-.-=============----->
Joshua Kwan [email protected]
[email protected]



2003-01-26 10:44:50

by Peter Karlsson

Subject: Re: 2.4.20 kernel crashes while scanning partition list

Me, last week:

> I am installing Debian Linux on my new Athlon XP PC, and I have
> problems with the 2.4.20 kernel crashing on boot.

So, no one has any good ideas on what might be wrong with the 2.4.20
kernel? I must say that it's quite irritating to have a brand new
computer just standing around unusable because I can't run Linux on it
:-(

--
\\//
Peter - http://www.softwolves.pp.se/

I do not read or respond to mail with HTML attachments.

2003-01-26 15:12:27

by Andrew Walrond

Subject: Re: 2.4.20 kernel crashes while scanning partition list

Sorry, I don't have a copy of your first email, so apologies if you've
already addressed these questions.

Is it just the 2.4.20 kernel? Have you tried others?
Can we see some dmesg output? If it's not getting through the kernel
boot, this would involve setting up a serial link to another machine
(there are HOWTOs for this).
Any other clues? 'It crashes' isn't enough to go on.

Andrew

Peter Karlsson wrote:
> Me, last week:
>
>
>>I am installing Debian Linux on my new Athlon XP PC, and I have
>>problems with the 2.4.20 kernel crashing on boot.
>
>
> So, no one has any good ideas on what might be wrong with the 2.4.20
> kernel? I must say that it's quite irritating to have a brand new
> computer just standing around unusable because I can't run Linux on it
> :-(
>


2003-01-26 15:19:07

by Douglas McNaught

Subject: Re: 2.4.20 kernel crashes while scanning partition list

Peter Karlsson <[email protected]> writes:

> Me, last week:
>
> > I am installing Debian Linux on my new Athlon XP PC, and I have
> > problems with the 2.4.20 kernel crashing on boot.
>
> So, no one has any good ideas on what might be wrong with the 2.4.20
> kernel? I must say that it's quite irritating to have a brand new
> computer just standing around unusable because I can't run Linux on it
> :-(

Have you tried 2.4.21pre? It may have fixes to handle newer hardware.

Does the machine pass memtest86?

-Doug

2003-01-26 17:30:46

by Peter Karlsson

Subject: Re: 2.4.20 kernel crashes while scanning partition list

Doug McNaught:

> Have you tried 2.4.21pre? It may have fixes to handle newer hardware.

Not yet, I did try 2.4.19, to see if that worked better, but got the
same crash. 2.4.18 seems to be different, however, since it does not
autodetect the drives (and does not crash when I pass the IDE addresses
manually).

> Does the machine pass memtest86?

No errors were reported.


Mark Hahn:

> well, posting an undecoded oops is one mistake.

Someone decoded it and posted a decoded trace. Did that not reach the
list properly?

> can you dispense with the silly hardware dumb-raid, and just use
> kernel software raid? (faster, more robust).

I'd prefer to use the on-board RAID to be compatible with other OSes.
And, according to people I have asked, the Linux software-RAID is not
very reliable either, so I don't feel especially inclined to switch.

> I decoded your oops, below. it's not much use without someone also
> running objdump -D drivers/block/ll_rw_blk.o on your system and
> figuring out how the eip=0 happened. it's obviously some callback.

If I run this (I can boot the machine using 2.4.18), what should I
look for? I haven't debugged the kernel before, it has just always
worked for me, on all the machines I've tried it. I've only ever had
the kernel crash once on me before since 1996.

> it's also somewhat odd that printk is in the backtrace.

That might be a typo on my part.
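
There may be another explanation besides a typo. As far as I understand it, the 2.4 i386 oops code builds its "Call trace" by walking the stack and printing every word that looks like a kernel text address, so stale return addresses left over from earlier calls can show up in the trace. The sketch below applies that idea to the stack dump from the oops. The text range bounds are assumptions picked for illustration; the real bounds would come from the _stext and _etext entries in System.map.

```python
# Stack words copied from the "Stack:" lines of the oops.
STACK_DUMP = """\
c01e5ddc c038c740 00000000 f7e80f40 00000002 00000000 f7e80dc0 c1a7b3d0
c01e5e3e 00000000 f7e80f40 00000000 00000006 c01351b0 00000000 f7e80f40
c1a7b3d0 d7fbcdb4 00000000 f7ed0d30 00001000 00000004 00000400 f7e80f40
"""

# Assumed kernel text range, for illustration only (not from the real map).
TEXT_START = 0xC0100000
TEXT_END = 0xC0300000


def text_candidates(dump, start, end):
    """Return stack words that fall inside the [start, end) text range."""
    words = [int(word, 16) for word in dump.split()]
    return [word for word in words if start <= word < end]


candidates = text_candidates(STACK_DUMP, TEXT_START, TEXT_END)
print(" ".join(f"[<{word:08x}>]" for word in candidates))
# prints: [<c01e5ddc>] [<c01e5e3e>] [<c01351b0>]
```

Under these assumed bounds, the three surviving words are exactly the first three entries of the call trace above. The top word, c01e5ddc, would be the return address pushed by the call that jumped to address 0.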

> there are no printk's before the oops?

The last thing that happens before the crash is that it tries to
enumerate the hard disk partitions. It lists a few but then crashes.

--
\\//
Peter - http://www.softwolves.pp.se/

I do not read or respond to mail with HTML attachments.


2003-01-26 20:52:00

by Peter Karlsson

Subject: Re: 2.4.20 kernel crashes while scanning partition list

Andrew Walrond:

> Sorry I don't have a copy of your first email, so apologies if you've
> already addressed these questions

My previous message is archived at
<URL:http://www.uwsg.indiana.edu/hypermail/linux/kernel/0301.2/0519.html>

> Is it just the 2.4.20 kernel? Have you tried others?

2.4.19 and 2.4.20 crash; 2.4.18 as distributed with Debian 3.0's
installation system works fine, but it needs to be told where the IDE
channels are. It seems that the change in 2.4.18->2.4.19 that
introduces support for recognizing where the FastTrak card has its IDE
channels somehow causes the crash. However, manually specifying the IDE
channels on the command line as for 2.4.18 does not make 2.4.19/20 stop
crashing.

> Can we see some dmesg output?

Please see the previous post.

--
\\//
Peter - http://www.softwolves.pp.se/

I do not read or respond to mail with HTML attachments.


2003-01-27 05:43:45

by Peter Karlsson

Subject: Re: 2.4.20 kernel crashes while scanning partition list

Tomas Szepe:

> I have yet to hear about a scenario where Linux raid performs poorly.
> What exactly did these people mean who told you md wasn't reliable?

The problem was not performance. According to what I read, the problem
was in how it handles failing hard disks and such. And since I want to
use raid to improve security (by using mirroring), not performance,
that's quite important.

--
\\//
Peter - http://www.softwolves.pp.se/

I do not read or respond to mail with HTML attachments.

2003-02-01 13:25:14

by Peter Karlsson

Subject: Re: 2.4.20 kernel crashes while scanning partition list

Doug McNaught:

> Have you tried 2.4.21pre? It may have fixes to handle newer hardware.

I tried the 2.4.21pre4, and it crashes as well, seemingly in the exact
same spot as the other kernels. I also tried 2.5.59, but I couldn't get
it to find the array at all (it does list the IDE channels, though).

Here's a summary of what happens:

Kernel          Finds IDE?   Finds array?   Finds partitions?
------          ----------   ------------   -----------------
2.4.18          No           Yes*           Yes, and boots fine
2.4.19          Yes          Yes            Crashes when enumerating
2.4.20          Yes          Yes            Crashes when enumerating
2.4.21pre4**    Yes          No             No partitions found
2.4.21pre4***   Yes          Yes            Crashes when enumerating
2.5.59          Yes          No             No partitions found

  * When fed IDE channels on the command line.
 ** With configuration copied from 2.4.20.
*** Enabled the non-ataraid options for Promise FastTrak.

The configuration for 2.4.19--21 should be the same. 2.4.18 is the
pre-built version from Debian's boot-floppies. The sources for
2.4.21pre4 and 2.5.59 are pristine kernel.org sources. (My 2.5.59
configuration might have been flawed, however.)

As before, all relevant debugging material may be found at
<URL:http://www.softwolves.pp.se/tmp/2.4.20>


I could really use some help debugging. I can set up an account on my
machine (I have a cable tv connection), and set up a null modem cable
from the problematic machine if that would be helpful. I don't really
know much about kernel debugging, especially when it crashes before the
system has actually started up.

--
\\//
Peter

I do not read or respond to mail with HTML attachments.


2003-02-01 14:22:32

by Peter Karlsson

Subject: Re: 2.4.20 kernel crashes while scanning partition list

John Bradford:

> As you've tested all these different kernels, why not submit this as a
> bug in my bug database, http://grabjohn.com/kernelbugdatabase/ ,

Done. It is reported as bug 33 in your database. However, I was a bit
uncertain about what status to set for 2.4.18, but I set it to
"working" as the crash itself does not occur once I specify things
manually.

--
\\//
Peter - http://www.softwolves.pp.se/

I do not read or respond to mail with HTML attachments.