2006-01-08 11:43:18

by folkert

Subject: [2.6.15] running tcpdump on 3c905b causes freeze (reproducable)

Hi,

My system freezes (crashes) when I run tcpdump on the interface
connected to a 3c905b card. I've tried swapping the card for another
3c905b card, but that did not help. Two out of three times the last
message on the console is "Transmit error, Tx status register 82".
SysRq+T doesn't work. It is not only tcpdump: any program that puts the
interface into promiscuous mode makes the system crash. All other
interfaces (eth0 and eth2) are fine. I tried starting the system with
'debug=7' added to the modprobe line for the module, but then the system
crashes with a "vortex_error() status=0x8081".
The tcpdump problem is reproducible: the system always crashes
when I run tcpdump on that interface.
This seems to be the only problem with that system: no other crashes,
segfaults, or anything else out of the ordinary.
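
For reference, the status word in the vortex_error() message can be split into its flag bits. The sketch below is a minimal decoder, assuming the bit names of the `vortex_status` enum in the 3c59x driver source and assuming that bits 13-15 hold the currently selected register window (which would match the last column of the vortex-diag window dump further down). `decode_vortex_status` is a hypothetical helper name, not part of any existing tool, and the output only names the bits; it does not by itself explain the freeze.

```shell
#!/bin/sh
# Decode a 3c59x interrupt-status word into named flag bits.
# ASSUMPTION: masks and names follow the vortex_status enum in 3c59x.c;
# check them against the driver source of the kernel you are running.
decode_vortex_status() {
    status=$(( $1 ))
    # ASSUMPTION: the top three bits carry the selected register window.
    out="window=$(( (status >> 13) & 7 ))"
    for entry in \
        0x0001:IntLatch 0x0002:HostError 0x0004:TxComplete \
        0x0008:TxAvailable 0x0010:RxComplete 0x0020:RxEarly \
        0x0040:IntReq 0x0080:StatsFull 0x0100:DMADone \
        0x0200:DownComplete 0x0400:UpComplete \
        0x0800:DMAInProgress 0x1000:CmdInProgress
    do
        mask=${entry%%:*}   # text before the colon
        name=${entry#*:}    # text after the colon
        if [ $(( status & $mask )) -ne 0 ]; then
            out="$out $name"
        fi
    done
    echo "$out"
}

# The value reported with debug=7:
decode_vortex_status 0x8081
```

Under these assumptions, 0x8081 reads as register window 4 with the IntLatch and StatsFull bits set.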

kernel 2.6.15
3.2GHz P4, HT enabled, 2GB ram

[ 13.422920] ACPI: PCI Interrupt 0000:02:09.0[A] -> GSI 21 (level, low) -> IRQ 16
[ 13.423004] 3c59x: Donald Becker and others. http://www.scyld.com/network/vortex.html
[ 13.423051] 0000:02:09.0: 3Com PCI 3c905B Cyclone 100baseTx at f8882400. Vers LK1.1.19
[ 13.445519] ACPI: PCI Interrupt 0000:02:09.0[A] -> GSI 21 (level, low) -> IRQ 16

02:09.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
Subsystem: 3Com Corporation 3C905B Fast Etherlink XL 10/100
Flags: bus master, medium devsel, latency 64, IRQ 16
I/O ports at d880 [size=128]
Memory at feaff400 (32-bit, non-prefetchable) [size=128]
Expansion ROM at fe700000 [disabled] [size=128K]
Capabilities: [dc] Power Management version 1
00: b7 10 55 90 17 01 10 02 30 00 00 02 04 40 00 00
10: 81 d8 00 00 00 f4 af fe 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 b7 10 55 90
30: 00 00 ac fe dc 00 00 00 00 00 00 00 05 01 0a 0a

UTP connected to a switching hub

vortex-diag.c:v2.16 1/12/2004 Donald Becker ([email protected])
http://www.scyld.com/diag/index.html
Index #1: Found a 3c905B Cyclone 100baseTx adapter at 0xd880.
Station address 00:50:da:df:1d:3a.
Receive mode is 0x07: Normal unicast and all multicast.
The Vortex chip may be active, so FIFO registers will not be read.
To see all register values use the '-f' flag.
Initial window 4, registers values by window:
Window 0: 0000 0000 0000 0000 f5f5 00bf 0000 0000.
Window 1: FIFO FIFO 0000 0000 0000 0000 0000 2000.
Window 2: 5000 dfda 3a1d 0000 0000 0000 000a 4000.
Window 3: 0000 0180 05ea 0020 000a 0800 0800 6000.
Window 4: 0000 0000 0000 0cd8 0003 8880 0000 8000.
Window 5: 1ffc 0000 0000 0600 0807 06ce 06c6 a000.
Window 6: 0000 0000 0000 da00 1000 4a47 52cf c000.
Window 7: 0000 0000 0000 0000 0000 0000 0000 e000.
Vortex chip registers at 0xd880
0xD890: **FIFO** 00000000 0000001c *STATUS*
0xD8A0: 00000020 00000000 00080000 00000004
0xD8B0: 00000000 e7be1842 377a9110 00080004
0xD8C0: 008b890f 00000000 00000000 00000000
0xD8D0: 00000000 00000000 00000000 00000000
0xD8E0: 00000000 00000000 00000000 00000000
0xD8F0: 00009000 00000000 01600160 00000000
DMA control register is 00000020.
Tx list starts at 00000000.
Tx FIFO thresholds: min. burst 256 bytes, priority with 128 bytes to empty.
Rx FIFO thresholds: min. burst 256 bytes, priority with 128 bytes to full.
Poll period Tx 00 ns., Rx 0 ns.
Maximum burst recorded Tx 352, Rx 352.
Indication enable is 06c6, interrupt enable is 06ce.
No interrupt sources are pending.
Transceiver/media interfaces available: 100baseTx 10baseT.
Transceiver type in use: Autonegotiate.
MAC settings: full-duplex.
Station address set to 00:50:da:df:1d:3a.
Configuration options 000a.
EEPROM format 64x16, configuration table at offset 0:
00: 0050 dadf 1d3a 9055 002d 0036 4258 6d50
0x08: 2971 0000 0050 dadf 1d3a 0010 0000 0022
0x10: 32a2 0000 0000 0180 0000 0000 0000 10b7
0x18: 9055 000a 0000 0000 0000 0000 0000 0000
0x20: 00ea 0000 0000 0000 0000 0000 0000 0000
0x28: 0000 0000 0000 0000 0000 0000 0000 0000
...

The word-wide EEPROM checksum is 0x30f7.
Saved EEPROM settings of a 3Com Vortex/Boomerang:
3Com Node Address 00:50:DA:DF:1D:3A (used as a unique ID only).
OEM Station address 00:50:DA:DF:1D:3A (used as the ethernet address).
Device ID 9055, Manufacturer ID 6d50.
Manufacture date (MM/DD/YYYY) 1/13/2000, division 6, product XB.
No BIOS ROM is present.
Transceiver selection: Autonegotiate.
Options: negotiated duplex, link beat required.
PCI bus requested settings -- minimum grant 10, maximum latency 10 (250ns units).
PCI Subsystem IDs: Vendor 10b7 Device 9055.
100baseTx 10baseT.
Vortex format checksum is incorrect (82 vs. 10b7).
Cyclone format checksum is correct (0xea vs. 0xea).
Hurricane format checksum is correct (0xea vs. 0xea).


mii-diag.c:v2.11 3/21/2005 Donald Becker ([email protected])
http://www.scyld.com/diag/index.html
Using the new SIOCGMIIPHY value on PHY 24 (BMCR 0x3000).
The autonegotiated capability is 01e0.
The autonegotiated media type is 100baseTx-FD.
Basic mode control register 0x3000: Auto-negotiation enabled.
You have link beat, and everything is working OK.
This transceiver is capable of 100baseTx-FD 100baseTx 10baseT-FD 10baseT.
Able to perform Auto-negotiation, negotiation complete.
Your link partner advertised 45e1: Flow-control 100baseTx-FD 100baseTx 10baseT-FD 10baseT, w/ 802.3X flow control.
End of basic transceiver information.

MII PHY #24 transceiver registers:
3000 786d 0000 0000 01e1 45e1 0005 2801
0000 0000 0000 0000 0000 0000 0000 0000
8000 0afb f5ff 0000 0000 0005 2001 0000
0000 2050 0003 1c11 019a 1000 0000 0000


           CPU0       CPU1
  0:      39275      36446    IO-APIC-edge   timer
  1:          8          0    IO-APIC-edge   i8042
  4:        375        293    IO-APIC-edge   serial
  7:          0          0    IO-APIC-edge   parport0
  9:          0          0    IO-APIC-level  acpi
 12:        101          0    IO-APIC-edge   i8042
 14:       8133       6029    IO-APIC-edge   ide0
 15:         26          0    IO-APIC-edge   ide1
 16:      59024         10    IO-APIC-level  eth0, eth1
 17:          0          0    IO-APIC-level  uhci_hcd:usb5
 18:     294950     389876    IO-APIC-level  ehci_hcd:usb1, uhci_hcd:usb8, wcfxo
 19:        841       1620    IO-APIC-level  ehci_hcd:usb2, bttv0
 20:      13095      10832    IO-APIC-level  uhci_hcd:usb3, uhci_hcd:usb6
 21:        125          0    IO-APIC-level  uhci_hcd:usb4
 22:      76446      65337    IO-APIC-level  uhci_hcd:usb7
 23:      92265      73603    IO-APIC-level  Intel ICH5
NMI:          0          0
LOC:      75664      75663
ERR:          0
MIS:         10



Folkert van Heusden

--
Try MultiTail! Multiple windows with logfiles, filtered with regular
expressions, colored output, etc. etc. http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Get your PGP/GPG key signed at http://www.biglumber.com!
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com


2006-01-09 12:11:34

by Andrew Morton

Subject: Re: [2.6.15] running tcpdump on 3c905b causes freeze (reproducable)

Folkert van Heusden <[email protected]> wrote:
>
> My system freezes (crashes) when I run tcpdump on the interface
> connected to a 3c905b card.

Works for me with a 3c980-TX. I can dig out a 905b.

Please send the exact commands which you're using to demonstrate this -
sufficient info for me to get as close as possible to what you're doing.

Have you tried enabling the NMI watchdog? Enable CONFIG_X86_LOCAL_APIC and
boot with `nmi_watchdog=1' on the command line, make sure that the NMI line
of /proc/interrupts is incrementing.

2006-01-09 14:45:25

by folkert

Subject: Re: [2.6.15] running tcpdump on 3c905b causes freeze (reproducable)

> > My system freezes (crashes) when I run tcpdump on the interface
> > connected to a 3c905b card.
> Works for me with a 3c980-TX. I can dig out a 905b.
> Please send the exact commands which you're using to demonstrate this -
> sufficient info for me to get as close as possible to what you're doing.

The exact command is:
tcpdump -i eth1

Yes, it is that simple. It is not only tcpdump that triggers this; iftop
does as well.

> Have you tried enabling the NMI watchdog? Enable CONFIG_X86_LOCAL_APIC and
> boot with `nmi_watchdog=1' on the command line, make sure that the NMI line
> of /proc/interrupts is incrementing.

I'll give it a try. I've added it to the append-line in the lilo config.
Am now compiling the kernel.


Folkert van Heusden


2006-01-09 19:37:57

by folkert

Subject: Re: [2.6.15] running tcpdump on 3c905b causes freeze (reproducable)

> > Have you tried enabling the NMI watchdog? Enable CONFIG_X86_LOCAL_APIC and
> > boot with `nmi_watchdog=1' on the command line, make sure that the NMI line
> > of /proc/interrupts is incrementing.
> I'll give it a try. I've added it to the append-line in the lilo config.
> Am now compiling the kernel.

No change. Well, that is: the last message on the console is now
"setting eth1 to promiscuous mode".


Folkert van Heusden


2006-01-10 06:48:40

by Andrew Morton

Subject: Re: [2.6.15] running tcpdump on 3c905b causes freeze (reproducable)

Folkert van Heusden <[email protected]> wrote:
>
> > > Have you tried enabling the NMI watchdog? Enable CONFIG_X86_LOCAL_APIC and
> > > boot with `nmi_watchdog=1' on the command line, make sure that the NMI line
> > > of /proc/interrupts is incrementing.
> > I'll give it a try. I've added it to the append-line in the lilo config.
> > Am now compiling the kernel.
>
> No change. Well, that is: the last message on the console now is
> "setting eth1 to promiscues mode".
>

Did you confirm that the NMI counters in /proc/interrupts are incrementing?

2006-01-10 14:27:28

by folkert

Subject: Re: [2.6.15] running tcpdump on 3c905b causes freeze (reproducable)

> > > > Have you tried enabling the NMI watchdog? Enable CONFIG_X86_LOCAL_APIC and
> > > > boot with `nmi_watchdog=1' on the command line, make sure that the NMI line
> > > > of /proc/interrupts is incrementing.
> > > I'll give it a try. I've added it to the append-line in the lilo config.
> > > Am now compiling the kernel.
> > No change. Well, that is: the last message on the console now is
> > "setting eth1 to promiscues mode".
> Did you confirm that the NMI counters in /proc/interrupts are incrementing?

Yes:
root@muur:/home/folkert# for i in `seq 1 5` ; do cat /proc/interrupts | grep NMI ; sleep 1 ; done
NMI: 6949080 6949067
NMI: 6949182 6949169
NMI: 6949284 6949271
NMI: 6949386 6949373
NMI: 6949488 6949475
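
The samples above can be reduced to a per-CPU increase, which makes it easier to see that both counters advance at the same rate. This is a minimal sketch, assuming the 2.6-era `NMI:` line format shown here (a label followed only by per-CPU counts); `nmi_delta` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the per-CPU increase between two "NMI:" lines of /proc/interrupts.
# ASSUMPTION: each line is "NMI:" followed only by numeric per-CPU counts.
nmi_delta() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= NF; i++) first[i] = $i }
        NR == 2 {
            for (i = 2; i <= NF; i++)
                printf "%s%d", (i > 2 ? " " : ""), $i - first[i]
            print ""
        }'
}

# First and last sample from the loop above (taken roughly 4 seconds apart):
nmi_delta "NMI: 6949080 6949067" "NMI: 6949488 6949475"

# On a live system, something like:
#   a=$(grep NMI /proc/interrupts); sleep 1; b=$(grep NMI /proc/interrupts)
#   nmi_delta "$a" "$b"
```

A non-zero increase on every CPU is what "the NMI line is incrementing" amounts to; a zero would suggest the watchdog is not ticking on that CPU.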


Folkert van Heusden


2006-01-14 13:24:17

by folkert

Subject: Re: [2.6.15] running tcpdump on 3c905b causes freeze (reproducable)

Hi,

> > > > > Have you tried enabling the NMI watchdog? Enable CONFIG_X86_LOCAL_APIC and
> > > > > boot with `nmi_watchdog=1' on the command line, make sure that the NMI line
> > > > > of /proc/interrupts is incrementing.
> > > > I'll give it a try. I've added it to the append-line in the lilo config.
> > > > Am now compiling the kernel.
> > > No change. Well, that is: the last message on the console now is
> > > "setting eth1 to promiscues mode".
> > Did you confirm that the NMI counters in /proc/interrupts are incrementing?
> Yes:
> root@muur:/home/folkert# for i in `seq 1 5` ; do cat /proc/interrupts | grep NMI ; sleep 1 ; done
> NMI: 6949080 6949067
> NMI: 6949182 6949169
> NMI: 6949284 6949271
> NMI: 6949386 6949373
> NMI: 6949488 6949475

Is there anything else I can try?


Folkert van Heusden


2006-01-14 14:05:19

by Andrew Morton

Subject: Re: [2.6.15] running tcpdump on 3c905b causes freeze (reproducable)

Folkert van Heusden <[email protected]> wrote:
>
> > > > > > Have you tried enabling the NMI watchdog? Enable CONFIG_X86_LOCAL_APIC and
> > > > > > boot with `nmi_watchdog=1' on the command line, make sure that the NMI line
> > > > > > of /proc/interrupts is incrementing.
> > > > > I'll give it a try. I've added it to the append-line in the lilo config.
> > > > > Am now compiling the kernel.
> > > > No change. Well, that is: the last message on the console now is
> > > > "setting eth1 to promiscues mode".
> > > Did you confirm that the NMI counters in /proc/interrupts are incrementing?
> > Yes:
> > root@muur:/home/folkert# for i in `seq 1 5` ; do cat /proc/interrupts | grep NMI ; sleep 1 ; done
> > NMI: 6949080 6949067
> > NMI: 6949182 6949169
> > NMI: 6949284 6949271
> > NMI: 6949386 6949373
> > NMI: 6949488 6949475
>
> Is there anything else I can try?

argh. I haven't forgotten. Hopefully after -rc1 I'll have more time...

Your report didn't mention whether that card works OK under earlier 2.6
kernels. If it does, a bit of bisection searching would really help.
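
The bisection search mentioned above can be sketched as a binary search over an ordered list of kernel versions. This is only an illustration: `first_bad` and `is_good` are hypothetical names, the version labels `v1`..`v6` and their good/bad verdicts are invented, and a real `is_good` would mean booting that kernel, running `tcpdump -i eth1`, and recording by hand whether the machine survived. The search assumes that every version before some point is good and every version from that point on is bad.

```shell
#!/bin/sh
# first_bad TESTCMD VERSION...
# Versions are ordered oldest to newest. TESTCMD VERSION must return 0
# when that version is good. Prints the oldest bad version, assuming the
# newest version in the list is bad.
first_bad() {
    test_cmd=$1
    shift
    lo=1
    hi=$#    # the answer always lies between positions lo and hi
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( ($lo + $hi) / 2 ))
        eval "candidate=\${$mid}"
        if "$test_cmd" "$candidate"; then
            lo=$(( $mid + 1 ))   # good: the first bad version is newer
        else
            hi=$mid              # bad: the first bad version is this or older
        fi
    done
    eval "echo \${$lo}"
}

# HYPOTHETICAL stand-in for a manual boot-and-test step.
is_good() {
    case $1 in
        v1|v2|v3) return 0 ;;
        *) return 1 ;;
    esac
}

first_bad is_good v1 v2 v3 v4 v5 v6
```

If even the oldest version in the list turns out bad, the function prints that oldest version, which means the search has to be extended further back before it can say anything useful.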

2006-01-14 23:37:07

by folkert

Subject: Re: [2.6.15] running tcpdump on 3c905b causes freeze (reproducable)

> > > > > > > Have you tried enabling the NMI watchdog? Enable CONFIG_X86_LOCAL_APIC and
> > > > > > > boot with `nmi_watchdog=1' on the command line, make sure that the NMI line
> > > > > > > of /proc/interrupts is incrementing.
> > > > > > I'll give it a try. I've added it to the append-line in the lilo config.
> > > > > > Am now compiling the kernel.
> > > > > No change. Well, that is: the last message on the console now is
> > > > > "setting eth1 to promiscues mode".
> > > > Did you confirm that the NMI counters in /proc/interrupts are incrementing?
> > > Yes:
> > > root@muur:/home/folkert# for i in `seq 1 5` ; do cat /proc/interrupts | grep NMI ; sleep 1 ; done
> > > NMI: 6949080 6949067
...
> > > NMI: 6949488 6949475
> >
> > Is there anything else I can try?
> argh. I haven't forgotten. Hopefully after -rc1 I'll have more time...

Sorry :-)

> Your report didn't mention whether that card work OK under earlier 2.6
> kernels. If it does, a bit of bisection searching would really help.

2.6.15 crash
2.6.14.4 crash
2.6.14 crash
2.6.12.6 crash "NMI watchdog detected LOCKUP"
2.6.6 crash "NMI watchdog detected LOCKUP on CPU1 eip c02500aa, registers:"
2.6.1 would not boot


Folkert van Heusden


2006-02-26 14:04:00

by folkert

Subject: Re: [2.6.15] running tcpdump on 3c905b causes freeze (reproducable)

> > > > > > > > Have you tried enabling the NMI watchdog? Enable CONFIG_X86_LOCAL_APIC and
> > > > > > > > boot with `nmi_watchdog=1' on the command line, make sure that the NMI line
> > > > > > > > of /proc/interrupts is incrementing.
> > > > > > > I'll give it a try. I've added it to the append-line in the lilo config.
> > > > > > > Am now compiling the kernel.
> > > > > > No change. Well, that is: the last message on the console now is
> > > > > > "setting eth1 to promiscues mode".
> > > > > Did you confirm that the NMI counters in /proc/interrupts are incrementing?
> > > > Yes:
> > > > root@muur:/home/folkert# for i in `seq 1 5` ; do cat /proc/interrupts | grep NMI ; sleep 1 ; done
> > > > NMI: 6949080 6949067
> ...
> > > > NMI: 6949488 6949475
> > >
> > > Is there anything else I can try?
> > argh. I haven't forgotten. Hopefully after -rc1 I'll have more time...
> Sorry :-)
> > Your report didn't mention whether that card work OK under earlier 2.6
> > kernels. If it does, a bit of bisection searching would really help.
> 2.6.15 crash
> 2.6.14.4 crash
> 2.6.14 crash
> 2.6.12.6 crash "NMI watchdog detected LOCKUP"
> 2.6.6 crash "NMI watchdog detected LOCKUP on CPU1 eip c02500aa, registers:"
> 2.6.1 would not boot

It is definitely a 3Com 3c905b problem: yesterday I swapped that card
for an adapter from a different brand (a gigabit one), and now I can
run tcpdump without any lockups.


Folkert van Heusden

--
Win an iPod? --> http://keetweej.vanheusden.com/redir.php?id=62
--------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com