Hi,
I had fun with a remote system (custom vanilla kernel 2.6.24.4), a
NIC sort-of died all of a sudden.
(note: eth4 is udev's idea, not mine - the box has just 2 NICs).
/var/log/messages:
Jan 19 09:25:59 A kernel: NETDEV WATCHDOG: eth4: transmit timed out
Jan 19 09:26:02 A kernel: eth4: link up, 100Mbps, full-duplex, lpa 0xFFFF
Jan 19 09:26:11 A kernel: NETDEV WATCHDOG: eth4: transmit timed out
Jan 19 09:26:14 A kernel: eth4: link up, 100Mbps, full-duplex, lpa 0xFFFF
[goes on for 8 more minutes]
/var/log/debug:
Jan 19 09:26:02 A kernel: eth4: Transmit timeout, status ff ffff ffff media ff.
Jan 19 09:26:02 A kernel: eth4: Tx queue start entry 73297348 dirty entry 73297344.
Jan 19 09:26:02 A kernel: eth4: Tx descriptor 0 is ffffffff. (queue head)
Jan 19 09:26:02 A kernel: eth4: Tx descriptor 1 is ffffffff.
Jan 19 09:26:02 A kernel: eth4: Tx descriptor 2 is ffffffff.
Jan 19 09:26:02 A kernel: eth4: Tx descriptor 3 is ffffffff.
Jan 19 09:26:14 A kernel: eth4: Transmit timeout, status ff ffff ffff media ff.
Jan 19 09:26:14 A kernel: eth4: Tx queue start entry 4 dirty entry 0.
Jan 19 09:26:14 A kernel: eth4: Tx descriptor 0 is ffffffff. (queue head)
Jan 19 09:26:14 A kernel: eth4: Tx descriptor 1 is ffffffff.
Jan 19 09:26:14 A kernel: eth4: Tx descriptor 2 is ffffffff.
Jan 19 09:26:14 A kernel: eth4: Tx descriptor 3 is ffffffff.
[goes on for 8 more minutes]
(FWIW, there also was a BUG with trace info via dmesg but I did not
save that info, nor did it show up in any logfile, doh)
System was rebooted but instead of getting rid of the problem, it
showed eth4 was not usable anymore. It did not show up in dmesg/via
ifconfig and the corresponding lspci -vvv entry states:
00:09.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
Unknown device 8119 (rev 10)
Subsystem: Realtek Semiconductor Co., Ltd. Unknown device 8119
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV-
VGASnoop-ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR+ INTx-
Latency: 32 (16000ns max)
Interrupt: pin A routed to IRQ 12
Region 0: I/O ports at a400 [size=256]
Region 1: Memory at d5800000 (32-bit, non-prefetchable)
[size=256]
[virtual] Expansion ROM at 30000000 [disabled] [size=64K]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA
PME(D0-,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
I replaced the NIC and everything works as expected:
00:09.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL-8139/8139C/8139C+ (rev 10)
Subsystem: Allied Telesyn International AT-2500TX/ACPI
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 32 (8000ns min, 16000ns max)
Interrupt: pin A routed to IRQ 12
Region 0: I/O ports at a400 [size=256]
Region 1: Memory at d5800000 (32-bit, non-prefetchable)
[size=256]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA
PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME+
Kernel driver in use: 8139too
Kernel modules: 8139too, 8139cp
Just for the fun of it, I plugged the faulty NIC into a sole win
machine, it worked. Doesn't work in the linux machine afterwards,
though. First time that I observe such strange behaviour, any ideas
what could be the cause?
--
left blank, right bald
markus reichelt <[email protected]> :
[...]
> Just for the fun of it, I plugged the faulty NIC into a sole win
> machine, it worked. Doesn't work in the linux machine afterwards,
> though.
Was the faulty card was a 8139 and is 'lspci -vx' unable to recognize
it as such anymore ?
--
Ueimor
markus reichelt wrote:
> Just for the fun of it, I plugged the faulty NIC into a sole win
> machine, it worked. Doesn't work in the linux machine afterwards,
> though. First time that I observe such strange behaviour, any ideas
> what could be the cause?
>
Did you put it in the same slot? Was there a firmware upgrade recently?
Those are the things I would check, but not with any great expectation of
finding the cause.
--
Bill Davidsen <[email protected]>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot