2001-03-12 13:21:38

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: 21041 transmit timed out

On Wed, 20 Dec 2000, Geert Uytterhoeven wrote:
> Since I switched from the de4x5 driver to the tulip driver several months ago
> (before that, tulip didn't work on PPC), I never saw 21041 transmit timed out
> messages again, until today:
>
> | Dec 20 12:13:29 callisto kernel: Linux Tulip driver version 0.9.11 (November 3, 2000)
> | Dec 20 12:13:29 callisto kernel: PCI: Enabling device 00:04.0 (0000 -> 0003)
> | Dec 20 12:13:29 callisto kernel: eth0: Digital DC21041 Tulip rev 33 at 0x1080, 21041 mode, 00:80:C8:5A:F8:5B, IRQ 29.
> | Dec 20 12:13:29 callisto kernel: eth0: 21041 Media table, default media 0800 (Autosense).
> | Dec 20 12:13:29 callisto kernel: eth0: 21041 media #0, 10baseT.
> | Dec 20 12:13:29 callisto kernel: eth0: 21041 media #4, 10baseT-FD.
> | Dec 20 12:13:29 callisto kernel: eth0: 21041 media #1, 10base2.
> | Dec 20 12:55:27 callisto kernel: NETDEV WATCHDOG: eth0: transmit timed out
> | Dec 20 12:55:27 callisto kernel: eth0: 21041 transmit timed out, status fc660000, CSR12 000001c8, CSR13 ffffef05, CSR14 ffffff3f, resetting...
> | Dec 20 12:55:27 callisto kernel: eth0: 21143 100baseTx sensed media.
> | Dec 20 12:55:35 callisto kernel: NETDEV WATCHDOG: eth0: transmit timed out
> | Dec 20 12:55:35 callisto kernel: eth0: 21041 transmit timed out, status fc660010, CSR12 000002c8, CSR13 ffffef0d, CSR14 fffff73d, resetting...
> | Dec 20 12:55:35 callisto kernel: eth0: 21143 100baseTx sensed media.

[...]

> Then I tried ifconfig down/up, which didn't work. An additional rmmod/insmod
> pair for the tulip module woke up the card again:
>
> | Dec 20 13:00:43 callisto kernel: Linux Tulip driver version 0.9.11 (November 3, 2000)
> | Dec 20 13:00:43 callisto kernel: eth0: Digital DC21041 Tulip rev 33 at 0x1080, 21041 mode, 00:80:C8:5A:F8:5B, IRQ 29.
> | Dec 20 13:00:43 callisto kernel: eth0: 21041 Media table, default media 0800 (Autosense).
> | Dec 20 13:00:43 callisto kernel: eth0: 21041 media #0, 10baseT.
> | Dec 20 13:00:43 callisto kernel: eth0: 21041 media #4, 10baseT-FD.
> | Dec 20 13:00:43 callisto kernel: eth0: 21041 media #1, 10base2.
>
> The machine is a CHRP LongTrail (PPC 604e), running some kind of 2.4.0-test11
> kernel.
>
> The card is a D-Link DE-530CT, with a 21041 on a 10baseT network (hence not
> 100baseTx, as detected during the problem phase!).

lspci output:

| 00:04.0 Ethernet controller: Digital Equipment Corporation DECchip 21041 [Tulip Pass 3] (rev 21)
| Subsystem: D-Link System Inc DE-530+
| Flags: bus master, medium devsel, latency 0, IRQ 29
| I/O ports at 1080 [size=128]
| Memory at c1080000 (32-bit, non-prefetchable) [size=128]
| Expansion ROM at c11c0000 [disabled] [size=256K]

I made a list of driver versions that showed the problem so far:

| Tulip driver version 0.9.11 (November 3, 2000)
| Tulip driver version 0.9.13 (January 2, 2001)
| Tulip driver version 0.9.13a (January 20, 2001)
| Tulip driver version 0.9.14 (February 20, 2001)

There's one version I never had problems with:

| Tulip driver version 0.9.5 (May 30, 2000) (from 2.4.0-test1-ac10)

Of course I'm not 100% sure, but I ran 2.4.0-test1-ac10 for nearly 6 months on
that machine, without a single Tulip problem. All other versions caused
problems in a much shorter timeframe.

BTW, do you want me to try 1.1.2?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


2001-03-12 17:38:05

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: 21041 transmit timed out

On Mon, 12 Mar 2001, Geert Uytterhoeven wrote:
> I made a list of driver versions that showed the problem so far:
>
> | Tulip driver version 0.9.14 (February 20, 2001)

Wow! 0.9.14 in my 2.4.3-pre2 kernel just seem to have recovered from the
problem:

| NETDEV WATCHDOG: eth0: transmit timed out
| eth0: 21041 transmit timed out, status fc660000, CSR12 000001c8, CSR13 ffffef05, CSR14 ffffff3f, resetting...
| eth0: 21143 100baseTx sensed media.
| NETDEV WATCHDOG: eth0: transmit timed out
| eth0: 21041 transmit timed out, status fc260010, CSR12 000002c8, CSR13 ffffef0d, CSR14 fffff73d, resetting...
| NETDEV WATCHDOG: eth0: transmit timed out
| eth0: 21041 transmit timed out, status fc260010, CSR12 000002c8, CSR13 ffffef0d, CSR14 fffff73d, resetting...
| eth0: Out-of-sync dirty pointer, 243506 vs. 243524.

And eth0 is back online.

The next few lines in the syslog do look suspicious, though:

| release_dev: driver.table[1] not tty for ()
| release_dev: driver.table[7] not tty for (FikXZdoGQcCWQFGjV5evDzG+mv
| Q1jmH5buYh7LWpmXr8mfjykOCoR2Ry+NmzL3sE49mLozzdT22tUJKu6zt?P???)
| Warning: dev (04:08) tty->count(2) != #fd's(1) in release_dev
| release_dev: driver.table[7] not tty for ()
| Warning: dev (04:41) tty->count(2) != #fd's(1) in tty_open
| Warning: dev (04:08) tty->count(3) != #fd's(1) in tty_open
| Warning: dev (04:41) tty->count(3) != #fd's(2) in tty_open
| Warning: dev (04:41) tty->count(3) != #fd's(2) in release_dev
| Warning: dev (04:41) tty->count(3) != #fd's(2) in tty_open

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds