2004-10-11 14:34:49

by Harald Dunkel

Subject: 2.6.9-rc4: Aiee on amd64

Hi folks,

I installed 2.6.9-rc4 this morning, but it died at boot time
(a lot of hex output and something about "Aiee" :-). I tried
to redirect syslog to another host, but the error message did
not show up in the foreign log files.

Any idea how to catch this message? The problem seems to be
reproducible, and I would be glad to help.


Regards

Harri


2004-10-11 14:51:53

by Jesper Juhl

Subject: Re: 2.6.9-rc4: Aiee on amd64

On Mon, 11 Oct 2004, Harald Dunkel wrote:

> Hi folks,
>
> I installed 2.6.9-rc4 this morning, but it died at boot time
> (a lot of hex output and something about "Aiee" :-). I tried
> to redirect syslog to another host, but the error message did
> not show up in the foreign log files.
>
> Any idea how to catch this message? The problem seems to be
> reproducible, and I would be glad to help.
>
Serial console or pen and paper.

--
Jesper Juhl

2004-10-11 15:00:32

by Andi Kleen

Subject: Re: 2.6.9-rc4: Aiee on amd64

"Harald Dunkel" <[email protected]> writes:

> Hi folks,
>
> I installed 2.6.9-rc4 this morning, but it died at boot time
> (a lot of hex output and something about "Aiee" :-). I tried
> to redirect syslog to another host, but the error message did
> not show up in the foreign log files.
>
> Any idea how to catch this message? The problem seems to be
> reproducible, and I would be glad to help.

Use a null modem cable to the other system and boot with

    console=ttyS0,<baudrate>

Then use some logging terminal program (e.g. kermit)
on the other side to catch the output.
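
As a rough illustration of the "logging terminal program" step, here is
a minimal Python sketch. It is not what kermit does internally. It
assumes the capturing machine runs Linux, that the null modem cable is
attached to /dev/ttyS0 on that machine, and that both ends run at 115200
baud; the device path, the speed and the function names are assumptions
made for illustration only.

```python
import os
import termios
import tty


def open_serial(path="/dev/ttyS0", speed=termios.B115200):
    """Open a serial port in raw 8N1 mode (path and speed are assumed values)."""
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    tty.setraw(fd)                                # no echo, no line editing, 8 data bits
    attrs = termios.tcgetattr(fd)
    attrs[2] |= termios.CLOCAL | termios.CREAD    # ignore modem lines, enable receiver
    attrs[2] &= ~termios.CSTOPB                   # one stop bit
    attrs[4] = attrs[5] = speed                   # input and output speed
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    return fd


def capture(fd, log, chunk=4096):
    """Copy bytes from fd into the binary file object log until EOF.

    Every chunk is flushed immediately so that the text of a crash is on
    disk even if the capture is interrupted. Returns the byte count.
    """
    total = 0
    while True:
        data = os.read(fd, chunk)
        if not data:
            break
        log.write(data)
        log.flush()
        total += len(data)
    return total
```

Calling capture(open_serial(), open("boot.log", "ab")) on the logging
machine would then run until interrupted, since a serial line normally
never reports end of file.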

Before doing this, verify on a working system that the cable works.

If that is not possible you can use a digital camera in the worst
case and put the jpeg somewhere.

-Andi

2004-10-12 06:57:36

by Harald Dunkel

Subject: USB Problem (was: 2.6.9-rc4: Aiee on amd64)

Harald Dunkel wrote:
> Hi folks,
>
> I installed 2.6.9-rc4 this morning, but it died at boot time
> (a lot of hex output and something about "Aiee" :-). I tried
> to redirect syslog to another host, but the error message did
> not show up in the foreign log files.
>
> Any idea how to catch this message? The problem seems to be
> reproducible, and I would be glad to help.
>

I disabled ehci_hcd.ko in the boot procedure and loaded
it manually when syslog was running. This was written into
kern.log:

:
ACPI: PCI interrupt 0000:00:02.2[C] -> GSI 11 (level, low) -> IRQ 11
ehci_hcd 0000:00:02.2: nVidia Corporation nForce3 USB 2.0
PCI: Setting latency timer of device 0000:00:02.2 to 64
ehci_hcd 0000:00:02.2: irq 11, pci mem ffffff0000b98000
ehci_hcd 0000:00:02.2: new USB bus registered, assigned bus number 1
PCI: cache line size of 64 is not supported by device 0000:00:02.2
ehci_hcd 0000:00:02.2: USB 2.0 enabled, EHCI 1.00, driver 2004-May-10
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 6 ports detected
usb 1-3: new high speed USB device using address 2
ub: sizeof ub_scsi_cmd 88 ub_dev 1112
uba: device 2 capacity nsec 50 bsize 512
uba: made changed
uba: device 2 capacity nsec 50 bsize 512
uba: device 2 capacity nsec 50 bsize 512
/dev/ub/a:end_request: I/O error, dev uba, sector 0
Buffer I/O error on device uba, logical block 0
end_request: I/O error, dev uba, sector 2
Buffer I/O error on device uba, logical block 1
end_request: I/O error, dev uba, sector 4
Buffer I/O error on device uba, logical block 2
end_request: I/O error, dev uba, sector 6
Buffer I/O error on device uba, logical block 3
end_request: I/O error, dev uba, sector 6
Buffer I/O error on device uba, logical block 3
end_request: I/O error, dev uba, sector 4
Buffer I/O error on device uba, logical block 2
end_request: I/O error, dev uba, sector 2
Buffer I/O error on device uba, logical block 1
end_request: I/O error, dev uba, sector 0
Buffer I/O error on device uba, logical block 0
unable to read partition table
/dev/ub/a:end_request: I/O error, dev uba, sector 2
Buffer I/O error on device uba, logical block 1
end_request: I/O error, dev uba, sector 4
Buffer I/O error on device uba, logical block 2
end_request: I/O error, dev uba, sector 6
Buffer I/O error on device uba, logical block 3
end_request: I/O error, dev uba, sector 0
Buffer I/O error on device uba, logical block 0
unable to read partition table
usbcore: registered new driver ub
usb 1-4: new high speed USB device using address 3


If I keep ehci_hcd in the boot procedure, then the
kernel dies immediately after printing these messages.
In my test environment there was no crash, though.


Regards

Harri

2004-10-15 13:37:09

by Alexander Nyberg

Subject: Re: 2.6.9-rc4: Aiee on amd64

> I installed 2.6.9-rc4 this morning, but it died at boot time
> (a lot of hex output and something about "Aiee" :-). I tried
> to redirect syslog to another host, but the error message did
> not show up in the foreign log files.
>
> Any idea how to catch this message? The problem seems to be
> reproducible, and I would be glad to help.

You need to use netconsole, serial console or some other technique to
get the panic info over to another machine. Please mail again with info
on what you do to get the panic and the info that comes out of the panic
itself. I'm sending you the netconsole text.

-----------------------------------------------

started by Ingo Molnar <[email protected]>, 2001.09.17
2.6 port and netpoll api by Matt Mackall <[email protected]>, Sep 9 2003

Please send bug reports to Matt Mackall <[email protected]>

This module logs kernel printk messages over UDP allowing debugging of
problems where disk logging fails and serial consoles are impractical.

It can be used either built-in or as a module. As a built-in,
netconsole initializes immediately after NIC cards and will bring up
the specified interface as soon as possible. While this doesn't allow
capture of early kernel panics, it does capture most of the boot
process.

It takes a string configuration parameter "netconsole" in the
following format:


netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]

where
        src-port      source for UDP packets (defaults to 6665)
        src-ip        source IP to use (interface address)
        dev           network interface (eth0)
        tgt-port      port for logging agent (6666)
        tgt-ip        IP address for logging agent
        tgt-macaddr   ethernet MAC address for logging agent (broadcast)

Examples:

 linux netconsole=4444@10.0.0.1/eth1,9353@10.0.0.2/12:34:56:78:9a:bc

or

insmod netconsole netconsole=@/,@10.0.0.2/
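
To make the field order in the format string easier to see, the
parameter can also be assembled programmatically. The following is only
a sketch: the helper name netconsole_param and its keyword arguments are
hypothetical and not part of the kernel or of any tool. Fields left
empty fall back to the defaults listed in the table above.

```python
def netconsole_param(tgt_ip, src_port=None, src_ip="", dev="",
                     tgt_port=None, tgt_mac=""):
    """Build netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr].

    Hypothetical helper for illustration. Only tgt_ip is mandatory;
    every other field may stay empty so that the kernel default applies.
    """
    if not tgt_ip:
        raise ValueError("tgt-ip is the only mandatory field")
    sport = "" if src_port is None else str(src_port)
    tport = "" if tgt_port is None else str(tgt_port)
    # Source half and target half are separated by a comma.
    source = f"{sport}@{src_ip}/{dev}"
    target = f"{tport}@{tgt_ip}/{tgt_mac}"
    return f"netconsole={source},{target}"


if __name__ == "__main__":
    # Shortest form: everything defaulted except the target address.
    print(netconsole_param("10.0.0.2"))
```

With only the target address given, the helper produces the same string
as the insmod example above.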

Built-in netconsole starts immediately after the TCP stack is
initialized and attempts to bring up the supplied dev at the supplied
address.

The remote host can run either 'netcat -u -l -p <port>' or syslogd.
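
If neither netcat nor syslogd is at hand, a few lines of Python can play
the role of the logging agent. This is a minimal sketch under stated
assumptions, not a replacement for either tool: it assumes the default
target port 6666 from the table above, and the function names
open_listener and log_packets are made up for this example.

```python
import socket
import sys


def open_listener(port=6666, addr=""):
    """Bind a UDP socket on the given port (6666 is the documented default)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((addr, port))
    return sock


def log_packets(sock, out, max_packets=None):
    """Write each received datagram to the text stream out, prefixed by sender IP.

    Runs forever when max_packets is None. Output is flushed after every
    packet so nothing is lost if the listener is killed. Returns the
    number of packets handled.
    """
    seen = 0
    while max_packets is None or seen < max_packets:
        data, (src_ip, _src_port) = sock.recvfrom(65535)
        out.write(f"{src_ip}: {data.decode('ascii', errors='replace')}")
        out.flush()
        seen += 1
    return seen
```

Running log_packets(open_listener(), sys.stdout) on the remote host would
print kernel messages as they arrive; redirecting the output to a file
keeps a copy of the panic text.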

WARNING: the default target ethernet setting uses the broadcast
ethernet address to send packets, which can cause increased load on
other systems on the same ethernet segment.

NOTE: the network device (eth1 in the above case) can run any kind
of other network traffic; netconsole is not intrusive. Netconsole
might cause slight delays in other traffic if the volume of kernel
messages is high, but should have no other impact.

Netconsole was designed to be as instantaneous as possible, to
enable the logging of even the most critical kernel bugs. It works
from IRQ contexts as well, and does not enable interrupts while
sending packets. Due to these unique needs, configuration can not
be more automatic, and some fundamental limitations will remain:
only IP networks, UDP packets and ethernet devices are supported.