2004-04-20 09:23:04

by Eamonn Hamilton

Subject: e100 NETDEV WATCHDOG transmit timeout since 2.6.4

Hi,

Since 2.6.4 my ethernet interface has been timing out after operating
successfully for 30 minutes or so. It is then reset by the watchdog and
runs for another 30 minutes or so. This behaviour only started with
2.6.4, when I believe the code was rewritten. I've tried it with ACPI
disabled and got the same results, if this helps.
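
For readers unfamiliar with the message in the subject line, here is a minimal userspace sketch of the kind of check the network watchdog makes, assuming it fires when the Tx queue has been stopped for longer than the driver's timeout. The struct `fake_netdev` and both function names are hypothetical stand-ins for illustration, not kernel code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the few device fields the check needs. */
struct fake_netdev {
    const char *name;
    bool queue_stopped;            /* driver has stopped the Tx queue */
    unsigned long trans_start;     /* tick of the last transmit */
    unsigned long watchdog_timeo;  /* ticks allowed without Tx progress */
};

/* Wrap-safe "a is later than b", the same idea as the kernel's time_after(). */
static bool tick_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Returns true (and logs) when the watchdog would declare a Tx timeout. */
static bool tx_watchdog_expired(const struct fake_netdev *dev, unsigned long now)
{
    if (!dev->queue_stopped)
        return false;   /* queue is running, nothing is stuck */
    if (!tick_after(now, dev->trans_start + dev->watchdog_timeo))
        return false;   /* still inside the allowed window */
    printf("NETDEV WATCHDOG: %s: transmit timed out\n", dev->name);
    return true;        /* the real core would now call the driver's reset hook */
}
```

In this sketch a stopped queue alone is not an error; the timeout is only declared once the last transmit is older than the allowed window, which would fit a hang that shows up only after the hardware stops completing Tx commands.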

Anybody got any ideas?

Cheers,
Eamonn
--
Eamonn Hamilton <[email protected]>


2004-04-20 16:08:13

by Mike Keehan

Subject: Re: e100 NETDEV WATCHDOG transmit timeout since 2.6.4

I am getting these too in recent kernels, but only on a
10 Mbit half-duplex network. I have to manually down
and up the interface to recover.

On a 100 Mbit switched network, the e100 is OK (for at least 12 hours).

Mike.

NETDEV WATCHDOG: eth0: transmit timed out






2004-04-20 23:11:56

by Eamonn Hamilton

Subject: Re: e100 NETDEV WATCHDOG transmit timeout since 2.6.4

Hmm, looks like we have an element of commonality: I am also on a
10 Mbit half-duplex network (an old corporate switch thing with no
auto-sensing).

Unfortunately the chance of persuading the powers that be to buy some
less prehistoric networking gear is ... slight :)

I've also tried the eepro100 driver and get the same symptoms, but I
notice that both drivers use the mii module, which appears to be the
main element of commonality.


--
Eamonn Hamilton <[email protected]>

2004-04-20 23:16:55

by Feldman, Scott

Subject: Re: e100 NETDEV WATCHDOG transmit timeout since 2.6.4



On Tue, 20 Apr 2004, Mike Keehan wrote:

> I am getting these too in recent kernels, but only on a
> 10 Mbit half-duplex network. I have to manually down
> and up the interface to recover.
>
> On a 100 Mbit switched network, the e100 is OK (for at least 12 hours).


Mike/Eamonn,

Give this patch a try. It adds a workaround that ICH requires when
running at 10 Mbit half duplex. (I'm guessing you have an ICH system.)

--- linux-2.5/drivers/net/e100.c	2004-04-20 15:52:24.000000000 -0700
+++ linux-2.5/drivers/net/e100.c.mod	2004-04-20 15:55:32.000000000 -0700
@@ -287,6 +287,7 @@ enum scb_cmd_hi {
 };

 enum scb_cmd_lo {
+	cuc_nop = 0x00,
 	ruc_start = 0x01,
 	ruc_load_base = 0x06,
 	cuc_start = 0x10,
@@ -514,10 +515,11 @@ struct nic {
 	/* End: frequently used values: keep adjacent for cache effect */

 	enum {
-		ich = (1 << 0),
-		promiscuous = (1 << 1),
-		multicast_all = (1 << 2),
-		wol_magic = (1 << 3),
+		ich                = (1 << 0),
+		promiscuous        = (1 << 1),
+		multicast_all      = (1 << 2),
+		wol_magic          = (1 << 3),
+		ich_10h_workaround = (1 << 4),
 	} flags ____cacheline_aligned;

 	enum mac mac;
@@ -1225,6 +1227,12 @@ static void e100_watchdog(unsigned long
 	/* Issue a multicast command to workaround a 557 lock up */
 	e100_set_multicast_list(nic->netdev);

+	if(nic->flags & ich && cmd.speed==SPEED_10 && cmd.duplex==DUPLEX_HALF)
+		/* Need SW workaround for ICH[x] 10Mbps/half duplex Tx hang. */
+		nic->flags |= ich_10h_workaround;
+	else
+		nic->flags &= ~ich_10h_workaround;
+
 	mod_timer(&nic->watchdog, jiffies + E100_WATCHDOG_PERIOD);
 }

@@ -1244,7 +1252,17 @@ static inline void e100_xmit_prepare(str
 static int e100_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 {
 	struct nic *nic = netdev->priv;
-	int err = e100_exec_cb(nic, skb, e100_xmit_prepare);
+	int err;
+
+	if(nic->flags & ich_10h_workaround) {
+		/* SW workaround for ICH[x] 10Mbps/half duplex Tx hang.
+		   Issue a NOP command followed by a 1us delay before
+		   issuing the Tx command. */
+		e100_exec_cmd(nic, cuc_nop, 0);
+		udelay(1);
+	}
+
+	err = e100_exec_cb(nic, skb, e100_xmit_prepare);

 	switch(err) {
 	case -ENOSPC:
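
The patch splits the workaround into two halves: the watchdog re-evaluates a flag from the current link state, and the transmit path checks that flag before every frame. The following userspace sketch models only that control flow. The flag values mirror the patch, but `struct fake_nic`, the `LINK_*` constants and both functions are hypothetical stand-ins, and a counter replaces the real NOP command and delay.

```c
#include <stdbool.h>

/* Flag values mirror the patch. */
enum nic_flags {
    ich                = (1 << 0),
    promiscuous        = (1 << 1),
    multicast_all      = (1 << 2),
    wol_magic          = (1 << 3),
    ich_10h_workaround = (1 << 4),
};

/* Hypothetical link-state constants for this sketch only. */
enum { LINK_SPEED_10 = 10, LINK_SPEED_100 = 100 };
enum { LINK_HALF = 0, LINK_FULL = 1 };

/* Hypothetical stand-in for the driver's private struct. */
struct fake_nic {
    unsigned int flags;
    unsigned int nops_issued;   /* counts simulated CU NOP commands */
    unsigned int frames_sent;
};

/* Watchdog side: set or clear the flag from the current link state. */
static void update_workaround(struct fake_nic *nic, int speed, int duplex)
{
    if ((nic->flags & ich) && speed == LINK_SPEED_10 && duplex == LINK_HALF)
        nic->flags |= ich_10h_workaround;
    else
        nic->flags &= ~ich_10h_workaround;
}

/* Transmit side: issue the NOP before each frame while the flag is set. */
static void xmit_frame(struct fake_nic *nic)
{
    if (nic->flags & ich_10h_workaround)
        nic->nops_issued++;     /* driver: e100_exec_cmd(nic, cuc_nop, 0); udelay(1); */
    nic->frames_sent++;
}
```

Because the flag is recomputed on every watchdog period, a link that renegotiates to 100 Mbit or full duplex stops paying the per-frame NOP cost, and non-ICH parts never take the extra path.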

2004-04-21 09:42:56

by Eamonn Hamilton

Subject: Re: e100 NETDEV WATCHDOG transmit timeout since 2.6.4

Hi Scott,

Thanks for the patch. It took a bit of manual fumbling to get it applied,
but it seems to work OK so far (about 46 minutes, so that's not exactly
definitive :)

Certainly it doesn't make things worse, in my case at least.

Cheers,
Eamonn

--
Eamonn Hamilton

Senior Technical Designer
Technical Projects and Design (North)

Tel : +44 (0) 1224 333833
Fax : +44 (0) 1224 840032
Email [email protected]

2004-04-21 13:20:41

by Mike Keehan

Subject: Re: e100 NETDEV WATCHDOG transmit timeout since 2.6.4

Hi Scott,

That's a lot better.

I've been running OK for about four hours on the half-duplex network.
The only time that the watchdog timed out was when I was using
KDE's Konqueror to look at files on an SMB-mounted directory
(from an NT 4 system).

Previously the network connection would fail within minutes when
using ssh or ftp to another Linux box. Now, they have worked
consistently for hours.

Cheers,

Mike.





