2002-06-17 04:54:38

by kk maddowx

Subject: 2.4.18 kernel panics before and after boot

I am running kernel version 2.4.18 and have
experienced severe kernel panics. Sometimes the panic
occurs at boot; other times the box boots but
invariably panics at some point. Ksymoops reports the
following from my oops file:


<1>Unable to handle kernel NULL pointer dereference at virtual address 00000400
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<00000400>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: 00000001 ebx: c02e4e40 ecx: c8816000 edx: 00000004
esi: 00000004 edi: 0000001b ebp: 000003d9 esp: c1263f28
ds: 0018 es: 0018 ss: 0018
Process kswapd (pid: 4, stackpage=c1263000)
Stack: c11db840 c0127b63 c02e4e40 0181987c c1262000 0000005e 000001d0 c02897e8
c0300320 c12f51e0 c1207180 00000000 00000020 000001d0 00000006 00002e94
c0127cd6 00000006 00000010 c02897e8 00000006 000001d0 c02897e8 00000000
Call Trace: [<c0127b63>] [<c0127cd6>] [<c0127d40>] [<c0127dd4>] [<c0127e36>]
[<c0127f51>] [<c0127eb0>] [<c010552b>]
Code: Bad EIP value.

>>EIP; 00000400 Before first symbol <=====
Trace; c0127b63 <shrink_cache+2b3/2f0>
Trace; c0127cd6 <shrink_caches+56/90>
Trace; c0127d40 <try_to_free_pages+30/50>
Trace; c0127dd4 <kswapd_balance_pgdat+44/90>
Trace; c0127e36 <kswapd_balance+16/30>
Trace; c0127f51 <kswapd+a1/c0>
Trace; c0127eb0 <kswapd+0/c0>
Trace; c010552b <kernel_thread+2b/40>
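
For anyone reading the decoded trace: ksymoops prints each return address as <symbol+offset/size>, so subtracting the offset from the address gives the start of the function. A minimal Python sketch of that arithmetic follows. The regular expression and the parse_trace name are only illustrative and are not part of ksymoops.

```python
import re

# Trace lines copied verbatim from the ksymoops output above.
TRACE = """\
Trace; c0127b63 <shrink_cache+2b3/2f0>
Trace; c0127cd6 <shrink_caches+56/90>
Trace; c0127d40 <try_to_free_pages+30/50>
Trace; c0127dd4 <kswapd_balance_pgdat+44/90>
Trace; c0127e36 <kswapd_balance+16/30>
Trace; c0127f51 <kswapd+a1/c0>
Trace; c0127eb0 <kswapd+0/c0>
Trace; c010552b <kernel_thread+2b/40>
"""

# Illustrative pattern: address, symbol name, offset into symbol, symbol size.
LINE_RE = re.compile(r"Trace; ([0-9a-f]{8}) <(\w+)\+([0-9a-f]+)/([0-9a-f]+)>")


def parse_trace(text):
    """Return (address, symbol, offset, size) tuples in the order printed."""
    frames = []
    for match in LINE_RE.finditer(text):
        addr, sym, off, size = match.groups()
        frames.append((int(addr, 16), sym, int(off, 16), int(size, 16)))
    return frames


frames = parse_trace(TRACE)
for addr, sym, off, size in frames:
    # address minus offset is where the function itself begins
    print(f"{sym:<22} addr={addr:08x} start={addr - off:08x} +{off:#x}/{size:#x}")
```

With the lines above, both kswapd entries resolve to the same start address, c0127eb0, which is a quick consistency check on the decode.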

Distro: Redhat 7.0
kernel version: 2.4.18
compiler: gcc version 2.96 20000731
cpu:

processor : 0
vendor_id : AuthenticAMD
cpu family : 5
model : 8
model name : AMD-K6(tm) 3D processor
stepping : 0
cpu MHz : 333.522
cache size : 64 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr mce cx8 sep
mmx 3dnow
bogomips : 665.19

modules: 3com Boomerang 3c59x, adaptec aic7xxx.

The SCSI card has only a tape drive attached.
The disks are all IDE, using ext2. More relevant
hardware details follow:

ALI15X3: IDE controller on PCI bus 00 dev 78
ALI15X3: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:pio, hdd:pio
keyboard: Timeout - AT keyboard not present?
keyboard: Timeout - AT keyboard not present?
hda: WDC AC313000R, ATA DISK drive
hdb: CREATIVEDVD-ROM DVD2240E 12/24/97, ATAPI CDROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: WDC AC313000R, 12416MB w/512kB Cache, CHS=1582/255/63, (U)DMA
hdb: ATAPI 24X DVD-ROM drive, 256kB Cache
Uniform CD-ROM driver Revision: 3.11
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077

I did a Google search and found similar posts but no
replies as to possible causes. I am posting this in
the hope of finding a common cause for these panics
with 2.4.18. Please let me know if any other
information would be helpful. I would appreciate any
feedback.

TIA
kk

__________________________________________________
Do You Yahoo!?
Yahoo! - Official partner of 2002 FIFA World Cup
http://fifaworldcup.yahoo.com


2002-06-17 16:56:00

by kk maddowx

Subject: Re: 2.4.18 kernel panics before and after boot

Unfortunately I could not get memtest to work. I added
the lines:

label=memtest
image=/boot/memtest

to lilo.conf and ran lilo.
I can see the selection for memtest but it won't accept
it as a bootable image. I did swap the memory out and
still receive kernel panics with known working memory.

However, if I boot my old 2.2.20 kernel I never see a
panic, either at boot or afterwards, which makes me
think the memory is OK. Here is the dmesg from a
successful 2.4 boot, if that helps:



LILO
Loading 2.4..................
Linux version 2.4.18a (root@birdbrain) (gcc version 2.96 20000731 (Red Hat Linux 7.0)) #1 Thu Jun 13 01:54:35 EDT 2002
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 0000000008000000 (usable)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000fffe0000 - 0000000100000000 (reserved)
On node 0 totalpages: 32768
zone(0): 4096 pages.
zone(1): 28672 pages.
zone(2): 0 pages.
Kernel command line: auto BOOT_IMAGE=2.4 ro root=305 BOOT_FILE=/boot/vmlinuz-2.4.18a console=ttyS0,9600n8 console=tty0
Initializing CPU#0
Detected 333.523 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 665.19 BogoMIPS
Memory: 126532k/131072k available (1263k kernel code, 4152k reserved, 396k data, 220k init, 0k highmem)
Dentry-cache hash table entries: 16384 (order: 5, 131072 bytes)
Inode-cache hash table entries: 8192 (order: 4, 65536 bytes)
Mount-cache hash table entries: 2048 (order: 2, 16384 bytes)
Buffer-cache hash table entries: 8192 (order: 3, 32768 bytes)
Page-cache hash table entries: 32768 (order: 5, 131072 bytes)
CPU: L1 I Cache: 32K (32 bytes/line), D cache 32K (32 bytes/line)
CPU: AMD-K6(tm) 3D processor stepping 00
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
PCI: PCI BIOS revision 2.10 entry at 0xfdb11, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Using IRQ router ALI [10b9/1533] at 00:07.0
PCI: Hardcoded IRQ 14 for device 00:0f.0
isapnp: Scanning for PnP cards...
isapnp: SB audio device quirk - increasing port range
isapnp: AWE32 quirk - adding two ports
isapnp: Card 'Creative SB AWE64 PnP'
isapnp: 1 Plug & Play card detected total
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
Starting kswapd
VFS: Diskquotas version dquot_6.4.0 initialized
Detected PS/2 Mouse Port.
pty: 256 Unix98 ptys configured
Serial driver version 5.05c (2001-07-08) with MANY_PORTS SHARE_IRQ SERIAL_PCI ISAPNP enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
block: 128 slots per queue, batch=32
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ALI15X3: IDE controller on PCI bus 00 dev 78
PCI: Hardcoded IRQ 14 for device 00:0f.0
ALI15X3: chipset revision 32
ALI15X3: not 100% native mode: will probe irqs later
ALI15X3: simplex device: DMA disabled
ide0: ALI15X3 Bus-Master DMA disabled (BIOS)
ALI15X3: simplex device: DMA disabled
ide1: ALI15X3 Bus-Master DMA disabled (BIOS)
hda: WDC AC313000R, ATA DISK drive
hdb: CREATIVEDVD-ROM DVD2240E 12/24/97, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: 25429824 sectors (13020 MB) w/512KiB Cache, CHS=1582/255/63
hdb: ATAPI 24X DVD-ROM drive, 256kB Cache
Uniform CD-ROM driver Revision: 3.12
Partition check:
hda: hda1 hda2 < hda5 hda6 >
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
Linux agpgart interface v0.99 (c) Jeff Hartmann
agpgart: Maximum main memory to use for agp memory: 96M
agpgart: Detected Ali M1541 chipset
agpgart: AGP aperture is 64M @ 0xe0000000
[drm] Initialized tdfx 1.0.0 20010216 on minor 0
[drm] AGP 0.99 on ALi M1541 @ 0xe0000000 64MB
[drm] Initialized radeon 1.1.1 20010405 on minor 1
SCSI subsystem driver Revision: 1.00
request_module[scsi_hostadapter]: Root fs not mounted
Linux Kernel Card Services 3.1.22
options: [pci] [cardbus] [pm]
usb.c: registered new driver hub
uhci.c: USB Universal Host Controller Interface driver v1.1
Initializing USB Mass Storage driver...
usb.c: registered new driver usb-storage
USB Mass Storage support registered.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 1024 buckets, 8Kbytes
TCP: Hash tables configured (established 8192 bind 8192)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
ds: no socket drivers loaded!
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 220k freed
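
The memory figures in this log can be cross-checked by hand. Below is a rough Python sketch of that check, assuming the usual 4 KiB i386 page size. The E820 list is copied from the BIOS map in the log, and the variable names are only illustrative.

```python
PAGE_SIZE = 4096  # assumption: standard i386 page size in bytes

# (start, end, type) ranges copied from the BIOS-e820 lines in the log.
E820 = [
    (0x0000000000000000, 0x000000000009FC00, "usable"),
    (0x000000000009FC00, 0x00000000000A0000, "reserved"),
    (0x00000000000E0000, 0x0000000000100000, "reserved"),
    (0x0000000000100000, 0x0000000008000000, "usable"),
    (0x00000000FEC00000, 0x00000000FEC01000, "reserved"),
    (0x00000000FEE00000, 0x00000000FEE01000, "reserved"),
    (0x00000000FFFE0000, 0x0000000100000000, "reserved"),
]

# Zone sizes copied from the zone(0)/zone(1)/zone(2) lines in the log.
ZONE_PAGES = [4096, 28672, 0]

usable_bytes = sum(end - start for start, end, kind in E820 if kind == "usable")
top_of_ram = max(end for start, end, kind in E820 if kind == "usable")
total_pages = top_of_ram // PAGE_SIZE

print("usable bytes :", usable_bytes)
print("top of RAM   :", hex(top_of_ram))
print("total pages  :", total_pages)
print("total KiB    :", total_pages * PAGE_SIZE // 1024)
print("sum of zones :", sum(ZONE_PAGES))
```

The top of usable RAM, 0x08000000, divided by the page size gives 32768 pages. That is consistent with the "totalpages" line and with the 131072k in the "Memory:" line, and the three zone sizes add up to the same total.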


I have noticed that if the kernel does decide to panic
on boot it will happen after the "Freeing unused
kernel memory" message is printed. Do you have any ideas
what might be causing this? TIA


--- Kristian Peters <[email protected]>
wrote:
> Hello.
>
> I suspect bad ram. Could you verify with memtest86
> that your ram is ok ?
>
> *Kristian
>
> kk maddowx <[email protected]> wrote:
> > >>EIP; 00000400 Before first symbol <=====
> > Trace; c0127b63 <shrink_cache+2b3/2f0>
> > Trace; c0127cd6 <shrink_caches+56/90>
> > Trace; c0127d40 <try_to_free_pages+30/50>
> > Trace; c0127dd4 <kswapd_balance_pgdat+44/90>
> > Trace; c0127e36 <kswapd_balance+16/30>
> > Trace; c0127f51 <kswapd+a1/c0>
> > Trace; c0127eb0 <kswapd+0/c0>
> > Trace; c010552b <kernel_thread+2b/40>
>
> :... [snd.science] ...:
> :: _o)
> :: http://www.korseby.net /\\
> :: http://gsmp.sf.net _\_V
> :.........................:



2002-06-17 18:23:57

by Sebastian Szonyi

Subject: Re: 2.4.18 kernel panics before and after boot



On Mon, 17 Jun 2002, kk maddowx wrote:

> Date: Mon, 17 Jun 2002 09:52:44 -0700 (PDT)
> From: kk maddowx <[email protected]>
> To: Kristian Peters <[email protected]>
> Cc: [email protected]
> Subject: Re: 2.4.18 kernel panics before and after boot
>
> Unfortunately I could not get memtest to work. I added
> the lines:
>
> label=memtest
> image=/boot/memtest
>
> to lilo.conf and ran lilo.

Try swapping the two lines.

> I can see the selection for memtest but it won't accept
> it as a bootable image. I did swap the memory out and
> still receive kernel panics with known working memory.
>
> However if I boot from my old 2.2.20 kernel I will
> never see a panic or experience a panic after boot
> making me think the memory is ok. Here is the dmesg
> from a successful 2.4 boot if that helps:
>
>
>
> LILO
> Loading 2.4..................
> Linux version 2.4.18a (root@birdbrain) (gcc version
> 2.96 20000731 (Red Hat Linux
> 7.0)) #1 Thu Jun 13 01:54:35 EDT 2002

Linux 2.4.18a?
Never heard of it :-)

gcc 2.96?
Use 2.95.3 or 2.95.4.

See $(kernel_root)/Documentation/Changes to find out what you need
to compile this kernel (where $(kernel_root) is wherever
your kernel tree is located, for example /usr/src/linux).

>
> I have noticed that if the kernel does decide to panic
> on boot it will happen after the "Freeing unused
> kernel memory" message is printed. Do you have any ideas
> what might be causing this? TIA
>
>

An unsupported filesystem, perhaps (e.g. your root
filesystem is xfs and you don't have xfs support in
the kernel).

It could be many things.

> --- Kristian Peters <[email protected]>
> wrote:
> > Hello.
> >
> > I suspect bad ram. Could you verify with memtest86
> > that your ram is ok ?
> >
> > *Kristian
> >
> > kk maddowx <[email protected]> wrote:
> > > >>EIP; 00000400 Before first symbol <=====
> > > Trace; c0127b63 <shrink_cache+2b3/2f0>
> > > Trace; c0127cd6 <shrink_caches+56/90>
> > > Trace; c0127d40 <try_to_free_pages+30/50>
> > > Trace; c0127dd4 <kswapd_balance_pgdat+44/90>
> > > Trace; c0127e36 <kswapd_balance+16/30>
> > > Trace; c0127f51 <kswapd+a1/c0>
> > > Trace; c0127eb0 <kswapd+0/c0>
> > > Trace; c010552b <kernel_thread+2b/40>
> >
> > :... [snd.science] ...:
> > :: _o)
> > :: http://www.korseby.net /\\
> > :: http://gsmp.sf.net _\_V
> > :.........................:
>
>
> __________________________________________________
> Do You Yahoo!?
> Yahoo! - Official partner of 2002 FIFA World Cup
> http://fifaworldcup.yahoo.com
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2002-06-18 07:27:15

by Kristian Peters

Subject: Re: 2.4.18 kernel panics before and after boot

kk maddowx <[email protected]> wrote:
> Unfortunately I could not get memtest to work. I added
> the lines:
>
> label=memtest
> image=/boot/memtest

It should be the reverse order:

image=/boot/memtest
label=memtest
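
The ordering matters because lilo.conf is read as sections: an image= (or other=) line opens a per-image section, and label= is an option inside that section. Here is a small Python sketch of a checker for that rule, as an illustration only. misplaced_labels is a hypothetical helper, not part of LILO, and it ignores every other lilo.conf option.

```python
def misplaced_labels(conf_text):
    """Return 1-based line numbers of label= lines that do not belong to a
    freshly opened image=/other= section."""
    has_section = False        # has any image=/other= line been seen yet?
    section_has_label = False  # does the current section already have a label?
    bad = []
    for lineno, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key = line.split("=", 1)[0].strip().lower()
        if key in ("image", "other"):
            has_section = True
            section_has_label = False
        elif key == "label":
            if not has_section or section_has_label:
                bad.append(lineno)
            section_has_label = True
    return bad


as_posted = "label=memtest\nimage=/boot/memtest\n"
corrected = "image=/boot/memtest\nlabel=memtest\n"

print(misplaced_labels(as_posted))   # label comes before its image line
print(misplaced_labels(corrected))   # nothing flagged
```

If an earlier section already precedes the two lines (for example image=/boot/vmlinuz-2.4.18a with label=2.4, as the boot log suggests), the stray label=memtest would be flagged as a second label in that earlier section.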

Please check that /boot/memtest exists.

> I have noticed that if the kernel does decide to panic
> on boot it will happen after the "Freeing unused
> kernel memory" message is printed. Do you have any ideas
> what might be causing this? TIA

Bad memory. ;-) Linux 2.4 just happens to hit that particular bad spot early on, where 2.2 does not.

If memtest can't find any corruption, it would be worth trying an official kernel from Red Hat.

*Kristian

:... [snd.science] ...:
:: _o)
:: http://www.korseby.net /\\
:: http://gsmp.sf.net _\_V
:.........................: