2001-07-02 11:55:46

by Juergen Wolf

Subject: Problem with SMC Etherpower II + kernel newer 2.4.2

Hi everybody,

currently I experience some strange problems with every kernel newer
than 2.4.2 and my SMC Etherpower II network card. While running such a
kernel, the network hangs and I get lots of errors like these listed
below:

Jul 2 13:06:59 localhost kernel: eth0: Too much work at interrupt,
IntrStatus=0x008d0004.
Jul 2 13:07:06 localhost last message repeated 5 times
Jul 2 13:07:20 localhost kernel: NETDEV WATCHDOG: eth0: transmit timed
out
Jul 2 13:07:20 localhost kernel: eth0: Transmit timeout using MII
device, Tx status 4003.
Jul 2 13:07:22 localhost kernel: eth0: Too much work at interrupt,
IntrStatus=0x008d0004.


The /proc/pci lists the following system components:

PCI devices found:
Bus 0, device 0, function 0:
Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev
3).
Master Capable. Latency=32.
Prefetchable 32 bit memory at 0xd8000000 [0xdbffffff].
Bus 0, device 1, function 0:
PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
(rev 0).
Master Capable. No bursts. Min Gnt=12.
Bus 0, device 7, function 0:
ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South]
(rev 34).
Bus 0, device 7, function 1:
IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 16).
Master Capable. Latency=32.
I/O at 0xc000 [0xc00f].
Bus 0, device 7, function 2:
USB Controller: VIA Technologies, Inc. UHCI USB (rev 16).
IRQ 9.
Master Capable. Latency=32.
I/O at 0xc400 [0xc41f].
Bus 0, device 7, function 3:
USB Controller: VIA Technologies, Inc. UHCI USB (#2) (rev 16).
IRQ 9.
Master Capable. Latency=32.
I/O at 0xc800 [0xc81f].
Bus 0, device 7, function 4:
Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI]
(rev 48).
Bus 0, device 7, function 5:
Multimedia audio controller: VIA Technologies, Inc. AC97 Audio
Controller (rev 32).
IRQ 11.
I/O at 0xcc00 [0xccff].
I/O at 0xd000 [0xd003].
I/O at 0xd400 [0xd403].
Bus 0, device 9, function 0:
Multimedia audio controller: Xilinx, Inc. RME Digi96/8 (rev 4).
IRQ 10.
Non-prefetchable 32 bit memory at 0xde000000 [0xdeffffff].
Bus 0, device 10, function 0:
Multimedia audio controller: Ensoniq 5880 AudioPCI (rev 2).
IRQ 5.
Master Capable. Latency=32. Min Gnt=12.Max Lat=128.
I/O at 0xdc00 [0xdc3f].
Bus 0, device 11, function 0:
Ethernet controller: Standard Microsystems Corp [SMC] 83C170QF (rev
9).
IRQ 11.
Master Capable. Latency=32. Min Gnt=8.Max Lat=28.
I/O at 0xe000 [0xe0ff].
Non-prefetchable 32 bit memory at 0xe0000000 [0xe0000fff].
Bus 1, device 0, function 0:
VGA compatible controller: nVidia Corporation NV11 (rev 161).
IRQ 10.
Master Capable. Latency=32. Min Gnt=5.Max Lat=1.
Non-prefetchable 32 bit memory at 0xdc000000 [0xdcffffff].
Prefetchable 32 bit memory at 0xd0000000 [0xd7ffffff].


Does anybody else get these errors or know of a solution for this?
The 2.2.x kernels and all kernel versions up to and including 2.4.2 work
fine on the same system and I did not find any entries in the changelogs
for the SMC driver code.

Thx
Juergen


2001-07-02 14:18:39

by John Jasen

Subject: Re: Problem with SMC Etherpower II + kernel newer 2.4.2

On Mon, 2 Jul 2001, Juergen Wolf wrote:

> currently I experience some strange problems with every kernel newer
> than 2.4.2 and my SMC Etherpower II network card. While running such a
> kernel, the network hangs and I get lots of errors like these listed
> below:

From the dumb-question department:

a) bad cable?
b) not negotiating speed and duplex with the switch correctly?
c) bad card?
d) IRQ sharing causing a conflict?

I'm less predisposed to blame the card in general or the driver, as I have
probably about a dozen SMC epic100 cards here, in single processor x86,
dual x86, and dual alphas that have been flawless from about 2.2.14 to
2.4.4.


2001-07-03 09:32:08

by Florian Schmitt

Subject: Re: Problem with SMC Etherpower II + kernel newer 2.4.2


> Does anybody else get these errors or know of a solution for this?

Same problem here, it won't run at all on newer kernels. But it isn't even
100% stable in 2.2.x here - on very high network traffic the card stops
working. In this case, it helps to pull the network plug for a short time,
then everything goes back to normal. I reduced speed to 10 Mbit/s, and now it is
stable at least in 2.2.x.
Any suggestions would be greatly appreciated. I even put the card into
another pci slot with exactly zero effect.
There are drivers on smc.com, but they won't help either :-(

2001-07-03 14:58:10

by Olivier Sessink

Subject: Re: Problem with SMC Etherpower II + kernel newer 2.4.2

Florian Schmitt wrote:
>
> > Does anybody else get these errors or know of a solution for this?
>
> Same problem here, it won't run at all on newer kernels. But it isn't even
> 100% stable in 2.2.x here - on very high network traffic the card stops
> working. In this case, it helps to pull the network plug for a short time,
> then everything goes back to normal. I reduced speed to 10 Mbit/s, and now it is
> stable at least in 2.2.x.

With kernels 2.4.4 and 2.4.5 I use a cron script that pings a host and runs
ifdown eth0; ifup eth0
when the ping fails. This seems to be good enough to get it up and running
again; sometimes I need to reload the module, but it's indeed very annoying.

if ! ping -c 1 -n -q 192.168.100.2 ; then
    # first attempt: bounce the interface
    ifdown eth0
    ifup eth0
    if ! ping -c 1 -n -q 192.168.100.2 ; then
        # still no reply: reload the driver module as well
        ifdown eth0
        rmmod epic100
        insmod epic100
        ifup eth0
    fi
fi

regards,
Olivier

2001-07-04 09:19:54

by Francois Romieu

Subject: Re: Problem with SMC Etherpower II + kernel newer 2.4.2

Florian Schmitt wrote:
[...]
> Same problem here, it won't run at all on newer kernels. But it isn't even
> 100% stable in 2.2.x here - on very high network traffic the card stops
> working. In this case, it helps to pull the network plug for a short time,

Could you specify what you mean by "very high network traffic" in terms
of interrupt rate and Mb/s ?
FTP of a full CD's contents or a heavy ping -f doesn't kill it under 2.4 here.
Autonegotiation sucks sometimes.

[...]
> then everything goes back to normal. I reduced speed to 10 Mbit/s, and now it is
> stable at least in 2.2.x.
> Any suggestions would be greatly appreciated. I even put the card into
> another pci slot with exactly zero effect.

Different switch/cable/*motherboard* ?

--
Ueimor

2001-07-04 09:16:24

by Juergen Wolf

Subject: Re: Problem with SMC Etherpower II + kernel newer 2.4.2

John Jasen wrote:
>
> a) bad cable?
> b) not negotiating speed and duplex with the switch correctly?
> c) bad card?
>

Hi,

The same errors show up with the network cable plugged or unplugged on
all computers with the SMC card around here. But all these computers are
equipped with nearly the same hardware (see my first posting of the
/proc/pci file). Also the 2.4.6 kernel does not solve the problem.


> d) IRQ sharing causing a conflict?

I don't think so, at least I don't get an IRQ conflict message and there
is no other device shown as using the same interrupt. If I use the 2.4.2
kernel or a version below everything works fine on the same host.

Another strange effect is that if I wait for quite some time (5-10
minutes) while trying to start up the eth0 device with "ifconfig eth0
up" I see messages like

Jul 4 09:38:58 localhost kernel: eth0: Setting full-duplex based on MII
#3 link partner capability of 41e1.
Jul 4 09:39:00 localhost kernel: NETDEV WATCHDOG: eth0: transmit timed
out
Jul 4 09:39:00 localhost kernel: eth0: Transmit timeout using MII
device, Tx status 000b.
Jul 4 09:39:02 localhost kernel: eth0: Too much work at interrupt,
IntrStatus=0x008d0004.
Jul 4 09:40:55 localhost kernel: eth0: Setting half-duplex based on MII
#3 link partner capability of 0001.

in between hundreds of "too much work at interrupt" messages. This error
also occurs regardless of whether the network cable is plugged in or not.

Regards
Juergen

2001-07-04 12:58:18

by Francois Romieu

Subject: Re: Problem with SMC Etherpower II + kernel newer 2.4.2

Juergen Wolf wrote:
[...]
Jul 2 13:06:59 localhost kernel: eth0: Too much work at interrupt, IntrStatus=0x008d0004.

Receive Status Valid
Receive Copy In Progress
Transmit Idle
Receive Queue Empty -> no more receive buffer available

It looks like the driver waits too long before processing incoming data,
but I'm curious to know where the data comes from if nothing is plugged in.
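
For anyone who wants to repeat this kind of decoding on another IntrStatus
value, here is a minimal sketch that only lists which bit positions are set
in the 32-bit word. It deliberately does not attach names to the bits: which
bit corresponds to which condition has to come from the chip documentation or
the driver source, and nothing in this sketch is taken from either. The
function names are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Collect the positions of all set bits in a 32-bit status word.
 * Returns the number of set bits; positions are stored lowest first. */
static int set_bits(uint32_t status, int out[32])
{
    int n = 0;
    for (int bit = 0; bit < 32; bit++) {
        if (status & (UINT32_C(1) << bit))
            out[n++] = bit;
    }
    return n;
}

/* Print one line per set bit together with its mask. */
static void print_set_bits(uint32_t status)
{
    int bits[32];
    int n = set_bits(status, bits);
    for (int i = 0; i < n; i++)
        printf("bit %2d (0x%08lx)\n", bits[i],
               (unsigned long)(UINT32_C(1) << bits[i]));
}
```

Calling print_set_bits(0x008d0004) from a main() lists bits 2, 16, 18, 19
and 23, so the value from the log has five bits set.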

[...]
> Bus 1, device 0, function 0:
> VGA compatible controller: nVidia Corporation NV11 (rev 161).
> IRQ 10.
> Master Capable. Latency=32. Min Gnt=5.Max Lat=1.
> Non-prefetchable 32 bit memory at 0xdc000000 [0xdcffffff].
> Prefetchable 32 bit memory at 0xd0000000 [0xd7ffffff].

Is X or something like a nvidia module enabled ?

--
Ueimor

2001-07-04 15:31:13

by Florian Schmitt

Subject: Re: Problem with SMC Etherpower II + kernel newer 2.4.2

> Could you specify what you mean by "very high network traffic" in terms
> of interrupt rate and Mb/s ?
> FTP of a full CD's contents or a heavy ping -f doesn't kill it under 2.4 here.
> Autonegotiation sucks sometimes.

That's about what I did, except that I saved the data to an NFS-mounted disk.

> Different switch/cable/*motherboard* ?

Probably not. I tried the drivers from http://www.scyld.com/network/ , and
the problem disappeared (thanks to Jeff Garzik for the suggestion).
I haven't tried 2.4.x again, but last time I did (2.4.6-pre6 or so), it
didn't even finish importing my nfs shares on startup.

In case you are interested, here is the output of the 2.2.18 drivers, when
the card hangs:

Jun 4 16:44:34 siechfried kernel: eth0: Transmit timeout using MII device,
Tx status 0005.
Jun 4 16:44:34 siechfried kernel: eth0: Restarting the EPIC chip, Rx
2026941/2026941 Tx 497569/497585.
Jun 4 16:44:34 siechfried kernel: eth0: epic_restart() done, cmd status
000a, ctl 0512 interrupt 240000.
Jun 4 16:44:39 siechfried kernel: eth0: Transmit timeout using MII device,
Tx status 0005.
Jun 4 16:44:39 siechfried kernel: eth0: Restarting the EPIC chip, Rx
2026941/2026941 Tx 497569/497585.
Jun 4 16:44:39 siechfried kernel: eth0: epic_restart() done, cmd status
000a, ctl 0512 interrupt 240000.
Jun 4 16:44:44 siechfried kernel: eth0: Transmit timeout using MII device,
Tx status 0005.
etc...

The driver from scyld.com also issued such a warning, but only once, and
everything seems to be back to normal afterwards:

Jul 4 15:13:06 siechfried kernel: eth0: Tx hung, 25721 vs. 25713.
Jul 4 15:13:06 siechfried kernel: eth0: Transmit timeout using MII device,
Tx status 0003.
Jul 4 15:13:06 siechfried kernel: eth0: Restarting the EPIC chip, Rx
24507/24507 Tx 25713/25721.
Jul 4 15:13:06 siechfried kernel: eth0: epic_restart() done, cmd status
000a, ctl 0512 interrupt 240000.
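
A rough way to read the "Tx a/b" pairs in these logs, as a sketch only: it
ASSUMES the first number is the index of the oldest transmit slot not yet
completed and the second is the next slot the driver will fill. That reading
is not confirmed anywhere in this thread, and the function name is
hypothetical.

```c
/* Number of transmit slots still outstanding, under the assumption that
 * dirty_tx is the oldest unfinished index and cur_tx the next one to use.
 * Unsigned subtraction keeps the result correct if the counters wrap. */
static unsigned int tx_outstanding(unsigned int dirty_tx, unsigned int cur_tx)
{
    return cur_tx - dirty_tx;
}
```

Under that assumption the 2.2.18 log (497569/497585) shows 16 outstanding
packets, with the same counters repeated after every restart, while the
scyld.com driver log (25713/25721) shows 8.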

I hope this helps,
Flo

2001-07-06 07:48:45

by Juergen Wolf

Subject: Re: Problem with SMC Etherpower II + kernel newer 2.4.2

Francois Romieu wrote:
>
> Is X or something like a nvidia module enabled ?
>

Hi,

the nvidia module is not loaded or enabled, but X is running sometimes.
Anyway, it also seems to happen when X is not running.
Luckily I got a very helpful hint from Hans-Christian Armingeon in
reply to my questions here on the list. The epic100.c from
http://lrcwww.epfl.ch/~boch/sw/epic100.c.txt fixes the problem in all
the affected kernel versions.

Thanx for your help guys
Juergen

2001-07-06 11:44:40

by Francois Romieu

Subject: Re: Problem with SMC Etherpower II + kernel newer 2.4.2

Juergen Wolf wrote:
[...]
> Luckily I got a very helpful hint from Hans-Christian Armingeon in
> reply to my questions here on the list. The epic100.c from
> http://lrcwww.epfl.ch/~boch/sw/epic100.c.txt fixes the problem in all
> the affected kernel versions.

Interesting.
- /* Donald: If this is for Cardbus only then define it so. It *//*HB*/
- /* breaks the SMC9432BTX Rev 09 boards *//*HB*/
-#ifdef CARDBUS /*HB*/
- outl(0x12, ioaddr + MIICfg);
-#endif /*HB*/
+ outl(dev->if_port == 1 ? 0x13 : 0x12, ioaddr + MIICfg);

Could you try 2.4.6 with just this modification: ?

--- linux-2.4.6.orig/drivers/net/epic100.c Wed Jul 4 14:42:13 2001
+++ linux-2.4.6/drivers/net/epic100.c Fri Jul 6 13:34:17 2001
@@ -681,7 +681,9 @@
required by the details of which bits are reset and the transceiver
wiring on the Ositech CardBus card.
*/
+#if 0
outl(dev->if_port == 1 ? 0x13 : 0x12, ioaddr + MIICfg);
+#endif
if (ep->chip_flags & MII_PWRDWN)
outl((inl(ioaddr + NVCTL) & ~0x003C) | 0x4800, ioaddr + NVCTL);


--
Ueimor

2001-07-06 12:30:40

by Juergen Wolf

Subject: Re: Problem with SMC Etherpower II + kernel newer 2.4.2

Francois Romieu wrote:
>
> Could you try 2.4.6 with just this modification: ?
>

Hm, looks like that's really the point. After applying your diff,
everything works fine.

Regards,
Juergen

2001-07-06 12:34:50

by Jeff Garzik

Subject: Re: Problem with SMC Etherpower II + kernel newer 2.4.2

Juergen Wolf wrote:
>
> Francois Romieu wrote:
> >
> > Could you try 2.4.6 with just this modification: ?
> >
>
> Hm, looks like that's really the point. After applying your diff,
> everything works fine.

Does it work with this line?

outl(0x12, ioaddr + MIICfg);

--
Jeff Garzik | A recent study has shown that too much soup
Building 1024 | can cause malaise in laboratory mice.
MandrakeSoft |

2001-07-06 12:52:32

by Juergen Wolf

Subject: Re: Problem with SMC Etherpower II + kernel newer 2.4.2

Jeff Garzik wrote:
>
> Does it work with this line?
>
> outl(0x12, ioaddr + MIICfg);
>

yes, works fine too

Regards,
Juergen

2001-07-06 12:59:34

by Jeff Garzik

Subject: [PATCH] Re: Problem with SMC Etherpower II + kernel newer 2.4.2

--- /spare/tmp/linux-2.4.7-pre3/drivers/net/epic100.c Mon Jul 2 21:03:04 2001
+++ linux/drivers/net/epic100.c Fri Jul 6 12:56:40 2001
@@ -48,13 +48,16 @@
* ethtool driver info support (jgarzik)

LK1.1.9:
- * MII ioctl support (jgarzik)
+ * ethtool media get/set support (jgarzik)
+
+ LK1.1.10:
+ * revert MII transceiver init change (jgarzik)

*/

#define DRV_NAME "epic100"
-#define DRV_VERSION "1.11+LK1.1.9"
-#define DRV_RELDATE "July 2, 2001"
+#define DRV_VERSION "1.11+LK1.1.10"
+#define DRV_RELDATE "July 6, 2001"


/* The user-configurable values.
@@ -448,7 +451,7 @@
outl(0x0008, ioaddr + TEST1);

/* Turn on the MII transceiver. */
- outl(dev->if_port == 1 ? 0x13 : 0x12, ioaddr + MIICfg);
+ outl(0x12, ioaddr + MIICfg);
if (chip_idx == 1)
outl((inl(ioaddr + NVCTL) & ~0x003C) | 0x4800, ioaddr + NVCTL);
outl(0x0200, ioaddr + GENCTL);
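
The functional change in the patch above is small: the value written to
MIICfg no longer depends on dev->if_port. A minimal sketch of the two
variants, outside the driver, shows that they differ in a single bit and only
when if_port is 1. The function names are hypothetical, the if_port parameter
stands in for dev->if_port, and what that bit does in hardware is not stated
in this thread.

```c
#include <stdint.h>

/* Value written to MIICfg by the line the patch removes. */
static uint32_t miicfg_conditional(int if_port)
{
    return if_port == 1 ? 0x13 : 0x12;
}

/* Value written after the patch: the same constant for every if_port. */
static uint32_t miicfg_reverted(int if_port)
{
    (void)if_port; /* unused on purpose */
    return 0x12;
}

/* Bits that differ between the two variants for a given if_port. */
static uint32_t miicfg_diff(int if_port)
{
    return miicfg_conditional(if_port) ^ miicfg_reverted(if_port);
}
```

miicfg_diff(1) is 0x01 and miicfg_diff() is 0 for every other if_port, so the
patch changes behaviour only for if_port == 1.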


Attachments:
epic100-2.4.7.3.patch (870.00 B)

2001-07-06 15:05:34

by Francois Romieu

Subject: Re: [PATCH] Re: Problem with SMC Etherpower II + kernel newer 2.4.2

Jeff Garzik wrote:
[...]
> --- /spare/tmp/linux-2.4.7-pre3/drivers/net/epic100.c Mon Jul 2 21:03:04 2001
> +++ linux/drivers/net/epic100.c Fri Jul 6 12:56:40 2001
[...]
> /* The user-configurable values.
> @@ -448,7 +451,7 @@
> outl(0x0008, ioaddr + TEST1);
>
> /* Turn on the MII transceiver. */
> - outl(dev->if_port == 1 ? 0x13 : 0x12, ioaddr + MIICfg);
> + outl(0x12, ioaddr + MIICfg);
> if (chip_idx == 1)
> outl((inl(ioaddr + NVCTL) & ~0x003C) | 0x4800, ioaddr + NVCTL);
> outl(0x0200, ioaddr + GENCTL);

The link that Juergen sent does that in epic_init_one but it removes it
from epic_open (the patch I forwarded).

Btw it plays rude games with udelay() (consequence of posted writes + optimized
loops ?).

--
Ueimor