2006-02-18 00:17:19

by Carl-Daniel Hailfinger

[permalink] [raw]
Subject: oops during boot of 2.6.16-rc3-git7 on AMD64

Hi,

vanilla 2.6.16-rc3-git7 gives me the following oops during boot (most
of the time while mounting all filesystems) on my amd64 machine:

(hand-written, no serial interface available)
Unable to handle kernel NULL pointer dereference at 00000008
rip: run_timer_softirq+322
process udev
Call trace:
__do_softirq+68
call_softirq+30
do_softirq+46
do_IRQ+61
ret_from_intr+0
EOI

Since the oops happens in an interrupt every time I got it, it
locked the machine hard with a kernel panic message.

The same machine runs fine with 2.6.15-git1 and previous kernels.
switch:~ # cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 31
model name : AMD Athlon(tm) 64 Processor 3200+
stepping : 0
cpu MHz : 1000.000
cache size : 512 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow lahf_lm
bogomips : 2012.44
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

Modules loaded:
cls_u32 sch_htb ipt_physdev iptable_filter ip_tables ebtable_filter
ebt_arpreply ebt_arp ebt_ip ebtable_nat ebtables bridge cpufreq_userspace
powernow_k8 freq_table thermal processor fan button battery ac ipv6
floppy snd_pcm_oss snd_mixer_oss twofish cryptoloop edd joydev st sr_mod
video1394 raw1394 capability commoncap sg ohci1394 ieee1394 sky2 ehci_hcd
snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm snd_timer snd soundcore
snd_page_alloc forcedeth ohci_hcd usbcore parport_pc lp parport reiserfs
ide_cd cdrom ide_disk dm_snapshot dm_mod sata_sil sata_nv libata amd74xx
ide_core sd_mod scsi_mod


Regards,
Carl-Daniel
--
http://www.hailfinger.org/


2006-02-18 22:11:28

by Andi Kleen

[permalink] [raw]
Subject: Re: oops during boot of 2.6.16-rc3-git7 on AMD64

Carl-Daniel Hailfinger <[email protected]> writes:

> Hi,
>
> vanilla 2.6.16-rc3-git7 gives me the following oops during boot (most
> of the time while mounting all filesystems) on my amd64 machine:
>
> (hand-written, no serial interface available)
> Unable to handle kernel NULL pointer dereference at 00000008
> rip: run_timer_softirq+322
> process udev
> Call trace:
> __do_softirq+68



Looks like someone is corrupting the timer list. Nasty. Can you do
a binary search?

Or alternatively remove as many drivers as possible and narrow down
which one is to blame.

-Andi

2006-02-20 19:42:12

by Carl-Daniel Hailfinger

[permalink] [raw]
Subject: Re: oops during boot of 2.6.16-rc3-git7 on AMD64

Andi Kleen schrieb:
> Carl-Daniel Hailfinger <[email protected]> writes:
>
>
>>Hi,
>>
>>vanilla 2.6.16-rc3-git7 gives me the following oops during boot (most
>>of the time while mounting all filesystems) on my amd64 machine:
>>
>>(hand-written, no serial interface available)
>>Unable to handle kernel NULL pointer dereference at 00000008
>>rip: run_timer_softirq+322
>>process udev
>>Call trace:
>>__do_softirq+68
>
>
> Looks like someone is corrupting the timer list. Nasty. Can you do
> a binary search?
>
> Or alternatively remove as many drivers as possible and narrow down
> which one is to blame.

Yes, will do that.


Regards,
Carl-Daniel
--
http://www.hailfinger.org/

2006-03-04 16:01:14

by Adrian Bunk

[permalink] [raw]
Subject: Re: oops during boot of 2.6.16-rc3-git7 on AMD64

On Mon, Feb 20, 2006 at 08:42:08PM +0100, Carl-Daniel Hailfinger wrote:
> Andi Kleen schrieb:
> > Carl-Daniel Hailfinger <[email protected]> writes:
> >
> >
> >>Hi,
> >>
> >>vanilla 2.6.16-rc3-git7 gives me the following oops during boot (most
> >>of the time while mounting all filesystems) on my amd64 machine:
> >>
> >>(hand-written, no serial interface available)
> >>Unable to handle kernel NULL pointer dereference at 00000008
> >>rip: run_timer_softirq+322
> >>process udev
> >>Call trace:
> >>__do_softirq+68
> >
> >
> > Looks like someone is corrupting the timer list. Nasty. Can you do
> > a binary search?
> >
> > Or alternatively remove as many drivers as possible and narrow down
> > which one is to blame.
>
> Yes, will do that.

What is the status of this problem?

> Regards,
> Carl-Daniel

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed