2001-07-19 16:20:14

by Kevin P. Fleming

Subject: 2.4.7-pre7 natsemi network driver random pauses

I upgraded two machines here from 2.4.7-pre6 to 2.4.7-pre7 yesterday
afternoon.

The first machine I upgraded, my workstation, is a 1GHz Athlon on a VIA
KT133 (not A) motherboard using a NetGear FA312TX network card. This machine
has always run Linux just fine. After this upgrade, telnetting to my other
Linux machine (not yet upgraded) and doing long directory listings (or tar
tzvf linux-2.4.0.tar) exhibits random (and long) pauses in the output.
Switching back to 2.4.7-pre6 makes the problem disappear. Note that at this
time only the _client_ end of this connection had been upgraded to -pre7.

I then upgraded my server as well, which is a 700 MHz Coppermine Celeron on
an SIS 630 motherboard, also using a NetGear FA312TX network card. Now this
machine exhibits the same symptoms, even when the telnet client is on a
Windows machine.

So, it appears that one of two things happened:

a) the natsemi driver had changes merged between -pre6 and -pre7 (not listed
in the changelogs) that had negative effects on my systems

b) something else in the kernel caused irq/softirq/whatever random latency
to appear

Any ideas where I should start looking?
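
One way to put numbers on the pauses before bisecting is to timestamp each
chunk of telnet output as it arrives and count the gaps. The following is
only a minimal sketch: count_pauses() is a hypothetical helper, not part of
any existing tool, and it assumes the arrival times have already been
collected as seconds in an array.

```c
#include <stddef.h>

/*
 * Hypothetical helper: t[] holds the arrival time, in seconds, of each
 * successive chunk of output. Returns how many gaps between neighbouring
 * chunks exceed threshold, and stores the longest gap seen in *longest
 * when longest is not NULL.
 */
size_t count_pauses(const double *t, size_t n, double threshold,
                    double *longest)
{
    size_t pauses = 0;
    double max_gap = 0.0;

    for (size_t i = 1; i < n; i++) {
        double gap = t[i] - t[i - 1];

        if (gap > max_gap)
            max_gap = gap;
        if (gap > threshold)
            pauses++;
    }
    if (longest != NULL)
        *longest = max_gap;
    return pauses;
}
```

The timestamps could come from calling clock_gettime() after each read() on
the receiving side. Comparing the counts under -pre6 and -pre7 for the same
directory listing would show whether the stalls really track the kernel
version.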


2001-07-19 16:35:44

by Wilfried Weissmann

Subject: Re: 2.4.7-pre7 natsemi network driver random pauses

"Kevin P. Fleming" wrote:
>
> I upgraded two machines here from 2.4.7-pre6 to 2.4.7-pre7 yesterday
> afternoon.
>
> The first machine I upgraded, my workstation, is a 1GHz Athlon on a VIA
> KT133 (not A) motherboard using a NetGear FA312TX network card. This machine
> has always run Linux just fine. After this upgrade, telnetting to my other
> Linux machine (not yet upgraded) and doing long directory listings (or tar
> tzvf linux-2.4.0.tar) exhibits random (and long) pauses in the output.
> Switching back to 2.4.7-pre6 makes the problem disappear. Note that at this
> time only the _client_ end of this connection had been upgraded to -pre7.
>
> I then upgraded my server as well, which is a 700 MHz Coppermine Celeron on
> an SIS 630 motherboard, also using a NetGear FA312TX network card. Now this
> machine exhibits the same symptoms, even when the telnet client is on a
> Windows machine.
>
> So, it appears that one of two things happened:
>
> a) the natsemi driver had changes merged between -pre6 and -pre7 (not listed
> in the changelogs) that had negative effects on my systems
>
> b) something else in the kernel caused irq/softirq/whatever random latency
> to appear
>
> Any ideas where I should start looking?

Just out of curiosity, do you have messages like these in your logfiles:

eth0: Transmit error, Tx status register 82.
Flags; bus-master 1, dirty 20979238(6) current 20979242(10)
Transmit list 1f659290 vs. df659260.
0: @df659200 length 800005ea status 000105ea
1: @df659210 length 80000296 status 00010296
2: @df659220 length 800005ea status 000105ea

I had those with 2.4.3-pre6. They disappeared in 2.4.4. Another user
reported the same on lkml with different kernel versions.

Wilfried

2001-07-19 16:47:45

by Andrew Morton

Subject: Re: 2.4.7-pre7 natsemi network driver random pauses

Wilfried Weissmann wrote:
>
> Just out of curiosity, do you have messages like these in your logfiles:
>
> eth0: Transmit error, Tx status register 82.

That's a 3com message, not natsemi.

And it's such a common error that it is now specially detected in the
driver:

	if (tx_status == 0x82) {
		printk(KERN_ERR "Probably a duplex mismatch. See "
				"Documentation/networking/vortex.txt\n");
	}

The relevant section of that file reads:


Transmit error, Tx status register 82
-------------------------------------

This is a common error which is almost always caused by another host on
the same network being in full-duplex mode, while this host is in
half-duplex mode. You need to find that other host and make it run in
half-duplex mode or fix this host to run in full-duplex mode.

As a last resort, you can force the 3c59x driver into full-duplex mode
with

	options 3c59x full_duplex=1

but this has to be viewed as a workaround for broken network gear and
should only really be used for equipment which cannot autonegotiate.
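
The same 0x82 test can be applied in user space when going through old
logs. This is a rough sketch only: tx_status_from_log() and
looks_like_duplex_mismatch() are hypothetical names, and the code assumes
the log line has exactly the wording quoted earlier in this thread.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: pull the hex Tx status out of a log line of the
 * form "eth0: Transmit error, Tx status register 82.".
 * Returns the status value, or -1 if the line does not match.
 */
int tx_status_from_log(const char *line)
{
    static const char marker[] = "Transmit error, Tx status register ";
    const char *p = strstr(line, marker);
    unsigned int status;

    if (p == NULL)
        return -1;
    /* Skip past the marker text; the register value is printed in hex. */
    if (sscanf(p + sizeof(marker) - 1, "%x", &status) != 1)
        return -1;
    return (int)status;
}

/* Mirrors the driver's special case: 0x82 hints at a duplex mismatch. */
int looks_like_duplex_mismatch(const char *line)
{
    return tx_status_from_log(line) == 0x82;
}
```

Feeding each line of the kernel log through looks_like_duplex_mismatch()
gives a quick count of how often the condition occurred. It says nothing
about which host on the segment is misconfigured.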
