2007-10-14 18:46:06

by Dave Milter

Subject: linux-2.6.23-mm1 crashed

I built linux-2.6.23-mm1 and tried to boot it under qemu,
and it crashed with a trace like this:
do_page_fault
error_code
lock_acquire
_spin_lock_irqsave
gdth_timeout
run_timer_softirq
__do_softirq
do_softirq

I have a screenshot, but I am not sure whether it is acceptable to attach it
when I am also sending a copy to lkml.
The kernel config is attached.
I applied all three patches from hot-fixes.


Attachments:
(No filename) (368.00 B)
config (45.65 kB)

2007-10-14 19:22:16

by Andrew Morton

Subject: Re: linux-2.6.23-mm1 crashed

On Sun, 14 Oct 2007 22:45:47 +0400 "Dave Milter" <[email protected]> wrote:

> I built linux-2.6.23-mm1 and tried to boot it under qemu,
> and it crashed with a trace like this:
> do_page_fault
> error_code
> lock_acquire
> _spin_lock_irqsave
> gdth_timeout
> run_timer_softirq
> __do_softirq
> do_softirq
>
> I have a screenshot, but I am not sure whether it is acceptable to attach it
> when I am also sending a copy to lkml.
> The kernel config is attached.
> I applied all three patches from hot-fixes.
>

The screenshot is here: http://userweb.kernel.org/~akpm/crash.png

It would appear that gdth_timeout() is passing a bad pointer into
spin_lock_irqsave().

2007-10-14 19:24:49

by Dave Milter

Subject: Re: linux-2.6.23-mm1 crashed

By the way, because the oops happens at an early stage of boot,
you do not need a real disk image to reproduce this bug;
something like this is enough:
1) cd /tmp/ && qemu-img create hda.img 10M
2) cd linux/mm/source/code
3) qemu -kernel arch/i386/boot/bzImage -hda /tmp/hda.img

If you add "-s" to the qemu options, you can then do:
gdb vmlinux
(gdb) target remote localhost:1234
(gdb) br gdth_timeout
(gdb) continue

On 10/14/07, Dave Milter <[email protected]> wrote:
> I built linux-2.6.23-mm1 and tried to boot it under qemu,
> and it crashed with a trace like this:
> do_page_fault
> error_code
> lock_acquire
> _spin_lock_irqsave
> gdth_timeout
> run_timer_softirq
> __do_softirq
> do_softirq
>
> I have a screenshot, but I am not sure whether it is acceptable to attach it
> when I am also sending a copy to lkml.
> The kernel config is attached.
> I applied all three patches from hot-fixes.
>
>

2007-10-14 19:31:19

by Andrew Morton

Subject: Re: linux-2.6.23-mm1 crashed


(please don't top-post! edited...)

On Sun, 14 Oct 2007 23:24:39 +0400 "Dave Milter" <[email protected]> wrote:

> On 10/14/07, Dave Milter <[email protected]> wrote:
> > I built linux-2.6.23-mm1 and tried to boot it under qemu,
> > and it crashed with a trace like this:
> > do_page_fault
> > error_code
> > lock_acquire
> > _spin_lock_irqsave
> > gdth_timeout
> > run_timer_softirq
> > __do_softirq
> > do_softirq
> >
> > I have a screenshot, but I am not sure whether it is acceptable to attach it
> > when I am also sending a copy to lkml.
> > The kernel config is attached.
> > I applied all three patches from hot-fixes.
> >
>
> By the way, because the oops happens at an early stage of boot,
> you do not need a real disk image to reproduce this bug;
> something like this is enough:
> 1) cd /tmp/ && qemu-img create hda.img 10M
> 2) cd linux/mm/source/code
> 3) qemu -kernel arch/i386/boot/bzImage -hda /tmp/hda.img
>
> If you add "-s" to the qemu options, you can then do:
> gdb vmlinux
> (gdb) target remote localhost:1234
> (gdb) br gdth_timeout
> (gdb) continue
>

I didn't notice that qemu was involved. Does qemu have an emulator for the
gdth hardware?

2007-10-14 22:26:42

by James Bottomley

Subject: Re: linux-2.6.23-mm1 crashed

On Sun, 2007-10-14 at 12:21 -0700, Andrew Morton wrote:
> On Sun, 14 Oct 2007 22:45:47 +0400 "Dave Milter" <[email protected]> wrote:
>
> > I built linux-2.6.23-mm1 and tried to boot it under qemu,
> > and it crashed with a trace like this:
> > do_page_fault
> > error_code
> > lock_acquire
> > _spin_lock_irqsave
> > gdth_timeout
> > run_timer_softirq
> > __do_softirq
> > do_softirq
> >
> > I have a screenshot, but I am not sure whether it is acceptable to attach it
> > when I am also sending a copy to lkml.
> > The kernel config is attached.
> > I applied all three patches from hot-fixes.
> >
>
> The screenshot is here: http://userweb.kernel.org/~akpm/crash.png
>
> It would appear that gdth_timeout() is passing a bad pointer into
> spin_lock_irqsave().

There's a bug in the gdth rework in that the instance can be deleted
from the list before the actual timer is stopped. This can be worked
around, I think, by the following patch, although we really should be
stopping the timer from firing when the list goes empty.

James

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index e8010a7..7fa22be 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -3793,6 +3793,9 @@ static void gdth_timeout(ulong data)
 	gdth_ha_str *ha;
 	ulong flags;
 
+	if (list_empty(&gdth_instances))
+		return;
+
 	ha = list_first_entry(&gdth_instances, gdth_ha_str, list);
 	spin_lock_irqsave(&ha->smp_lock, flags);




2007-10-16 05:44:37

by Dave Milter

Subject: Re: linux-2.6.23-mm1 crashed

On 10/14/07, Andrew Morton <[email protected]> wrote:
> I didn't notice that qemu was involved. Does qemu have an emulator for the
> gdth hardware?
>

I think not; the kernel just probes whether the hardware exists, and hangs after that.