2003-07-22 05:27:44

by Matthew Hunter

Subject: 2.4.21, NFS v3, and 3com 920

Short version:
Any known gotchas for 2.4.21 NFS v3 and/or a 3com 920?

Long version:
I'm trying to run the /home directory for a small network from a
single fileserver via NFS. The server is a 4-way PPro 200 on
SCSI RAID5 using linux v2.4.21, and the client is a 2-way Athlon
(on a Tyan 2466, MPX chipset). The network is just 5 machines,
mostly idle, Fast Ethernet on a switch.

Everything was working wonderfully until I rebooted the client
server, and ran into the gnome-over-NFS-has-gconf-problems
problem. Researched, and the fix said "Use NFSv3". So I
recompiled, and used NFSv3.

Except this didn't work. At all.

I could mount the directory as before, but any attempt to read
from it takes forever and involves a large number of timeouts.
By "forever" I mean hours just to run mutt and open up a mailbox.
It might eventually succeed. Maybe.

No load is apparent on either server while this is happening.
Occasionally, I'll see the client report that the server timed
out. If I'm very patient, I'll eventually see a report that the
server is "OK".

After much futzing around trying to isolate the problem, it turns
out I can trigger it by compiling with NFSv3 on (from the client
machine; server always has v3). NFS without v3 works, except for
the gnome thing. And NFS with v3 works if you use TCP, but it's
slow.

So I'm thinking, maybe there's some kind of packet loss problem?

ifconfig doesn't show any errors or dropped packets. It does
show some overruns when running under NFS v3 TCP, but did not
when running without TCP.
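
For anyone who wants to watch these counters from a script rather than eyeballing ifconfig: the same numbers are exposed in /proc/net/dev. Here is a minimal Python sketch, assuming the usual 16-column layout of that file (8 receive fields, then 8 transmit fields). The function names and the sample text are made up for illustration, and I am assuming the "fifo" column is what ifconfig reports as overruns.

```python
# Sketch only: parse /proc/net/dev-style text into per-interface counters.
# Assumption: 8 receive fields followed by 8 transmit fields per interface.
RX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "frame",
             "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "colls",
             "carrier", "compressed"]
ERROR_FIELDS = ("errs", "drop", "fifo", "frame", "colls", "carrier")

# Made-up sample data, not taken from the machines in this thread.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5000000    4000    0    0   12     0          0         0  3000000    3500    0    0    0     0       0          0
"""

def parse_net_dev(text):
    """Return {iface: {"rx": {...}, "tx": {...}}} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # the two header lines contain no colon
        try:
            values = [int(v) for v in rest.split()]
        except ValueError:
            continue  # not a counter line
        if len(values) < 16:
            continue
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, values[:8])),
            "tx": dict(zip(TX_FIELDS, values[8:16])),
        }
    return stats

def nonzero_errors(iface_stats):
    """Return only the error-type counters that are not zero."""
    found = {}
    for direction in ("rx", "tx"):
        for field, value in iface_stats[direction].items():
            if field in ERROR_FIELDS and value:
                found[f"{direction}.{field}"] = value
    return found
```

On a live box you would feed it `open("/proc/net/dev").read()` before and after a transfer and compare the two results.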

I decide to run some large file transfers via SCP outside of the
NFS mount to see what those show. Lo and behold, the fancy new
system can only receive at very low speeds -- a minimum of about
40 KB/s, max of maybe 200, mostly 60-80 KB/s. Weird -- but that
shouldn't make NFS time out horribly by itself.

Running more tests, it turns out the speed problem is isolated to
the one machine, and only to *receiving* data. Sending goes at
8 M/s to other machines from the client machine. Sending from
any machine to the client machine is slowed down, not just from
the server.
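
A side note on method: scp adds encryption and disk I/O on both ends, so a raw TCP stream is a cleaner way to measure one direction at a time. The following is a rough Python sketch of such a test, not the procedure actually used above. The helper names, the 64 KB chunk size, and the loopback demo are all my own choices for illustration.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send/recv call; an arbitrary choice

def receive_once(server_sock):
    """Accept one connection, drain it, return (bytes_received, seconds)."""
    conn, _ = server_sock.accept()
    total = 0
    start = time.monotonic()
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:  # peer closed the connection
                break
            total += len(data)
    return total, time.monotonic() - start

def send_bytes(host, port, nbytes):
    """Send nbytes of zeros to host:port over TCP, then close."""
    payload = bytes(CHUNK)
    with socket.create_connection((host, port)) as sock:
        remaining = nbytes
        while remaining > 0:
            n = min(CHUNK, remaining)
            sock.sendall(payload[:n])
            remaining -= n

def loopback_demo(nbytes=1_000_000):
    """Run sender and receiver in one process over 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}

    def worker():
        result["bytes"], result["seconds"] = receive_once(server)

    thread = threading.Thread(target=worker)
    thread.start()
    send_bytes("127.0.0.1", port, nbytes)
    thread.join()
    server.close()
    return result
```

To test a real link, run `receive_once` on the slow machine (bound to its LAN address) and `send_bytes` from another host, then swap roles; bytes divided by seconds gives the rate in each direction.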

I can drop another ethernet card into the client machine and try
that, and in fact, that's probably the next thing I'll do when I
work on it further tomorrow evening. But for now, does any of
this sound familiar to anyone? Is this particular 3com chipset
known to be broken, or just not supported well? Any possible
software cause for the slowdown?

Motherboard information is here:
http://www.tyan.com/products/html/tigermpx.html

--
Matthew Hunter ([email protected])
Public Key: http://matthew.infodancer.org/public_key.txt
Homepage: http://matthew.infodancer.org/index.jsp
Politics: http://www.triggerfinger.org/index.jsp


2003-07-22 13:04:35

by Valdis Klētnieks

Subject: Re: 2.4.21, NFS v3, and 3com 920

On Tue, 22 Jul 2003 00:42:45 CDT, Matthew Hunter <[email protected]> said:

> Running more tests, it turns out the speed problem is isolated to
> the one machine, and only to *receiving* data. Sending goes at
> 8 M/s to other machines from the client machine. Sending from
> any machine to the client machine is slowed down, not just from
> the server.

These symptoms sound suspiciously like a 100BaseT auto-negotiation
problem. With some combinations of gear, if one end is set to auto-negotiate
and the other end is nailed to full/half duplex (sorry, can't remember which and
I've not had my caffeine yet), things go horribly wrong and many packets
disappear silently on transmission, forcing retransmit timeouts and bad
throughput. Basically, you end up with one end thinking it's full duplex,
the other end at half - and if the full duplex side ever sends a packet while
the half side is sending, the packet's lost.

Try nailing the devices on both ends of the cat-5 to the same thing (full or
half). This can of course be interesting if you have an unmanaged hub that
doesn't give you a choice...
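
To pin down which combination goes wrong, here is a toy Python model of how each end settles on a duplex setting. It assumes the standard behaviour that an auto-negotiating port facing a forced port can sense the speed but falls back to half duplex (parallel detection), and that two auto-negotiating ends both support full duplex. The function names are invented for this sketch.

```python
def link_modes(end_a, end_b):
    """Return the duplex each end settles on.

    Each argument is "auto", "full" or "half" (the latter two meaning
    the port is forced to that duplex with auto-negotiation off).
    """
    def settle(me, peer):
        if me != "auto":
            return me      # forced: uses whatever it was told
        if peer == "auto":
            return "full"  # both negotiate; assumes both support full duplex
        return "half"      # peer is forced: duplex cannot be sensed, so half
    return settle(end_a, end_b), settle(end_b, end_a)

def is_mismatch(end_a, end_b):
    """True when the two ends end up with different duplex settings."""
    a, b = link_modes(end_a, end_b)
    return a != b

if __name__ == "__main__":
    for pair in [("auto", "auto"), ("auto", "half"),
                 ("auto", "full"), ("full", "half")]:
        print(pair, link_modes(*pair),
              "MISMATCH" if is_mismatch(*pair) else "ok")
```

Under those assumptions the bad case is auto-negotiate on one end and forced full duplex on the other; auto against forced half happens to work.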





2003-07-22 17:57:26

by Samuel Flory

Subject: Re: 2.4.21, NFS v3, and 3com 920

[email protected] wrote:

>On Tue, 22 Jul 2003 00:42:45 CDT, Matthew Hunter <[email protected]> said:
>
>
>
>>Running more tests, it turns out the speed problem is isolated to
>>the one machine, and only to *receiving* data. Sending goes at
>>8 M/s to other machines from the client machine. Sending from
>>any machine to the client machine is slowed down, not just from
>>the server.
>>
>>
>
>These symptoms sound suspiciously like a 100BaseT auto-negotiation
>problem. With some combinations of gear, if one end is set to auto-negotiate
>and the other end is nailed to full/half duplex (sorry, can't remember which and
>I've not had my caffeine yet), things go horribly wrong and many packets
>disappear silently on transmission, forcing retransmit timeouts and bad
>throughput. Basically, you end up with one end thinking it's full duplex,
>the other end at half - and if the full duplex side ever sends a packet while
>the half side is sending, the packet's lost.
>
>Try nailing the devices on both ends of the cat-5 to the same thing (full or
>half). This can of course be interesting if you have an unmanaged hub that
>doesn't give you a choice...
>
>
>
>
>

You should be able to use mii-tool, or ethtool (one or both should
work) to check the state your ethernet controller thinks it is set to,
and change the settings.

[root@sflory cujo]# mii-tool -v eth0
eth0: negotiated 100baseTx-FD, link ok
  product info: vendor 00:aa:00, model 51 rev 0
  basic mode: autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
[root@sflory cujo]# ethtool eth0
Settings for eth0:
    Supported ports: [ TP MII ]
    Supported link modes: 10baseT/Half 10baseT/Full
                          100baseT/Half 100baseT/Full
    Supports auto-negotiation: Yes
    Advertised link modes: 10baseT/Half 10baseT/Full
                           100baseT/Half 100baseT/Full
    Advertised auto-negotiation: Yes
    Speed: 100Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 1
    Transceiver: internal
    Auto-negotiation: on
    Supports Wake-on: puag
    Wake-on: g
    Link detected: yes
[root@sflory cujo]#
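
If you need to compare several hosts, the `mii-tool -v` output is regular enough to parse. Below is a small Python sketch that pulls out the fields and works out the best mode both ends advertise. It assumes the output format shown above; the function names and the priority list (which ignores anything other than these four modes) are my own simplification.

```python
# Best-first order of the four modes that appear in mii-tool's output here.
MODE_PRIORITY = ["100baseTx-FD", "100baseTx-HD", "10baseT-FD", "10baseT-HD"]

# Output in the same format as above (a 3Com NIC, autonegotiating).
AUTO_SAMPLE = """\
eth0: negotiated 100baseTx-FD, link ok
  product info: vendor 00:10:5a, model 0 rev 0
  basic mode: autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
"""

def parse_mii_tool(text):
    """Parse `mii-tool -v` output for one interface into a dict."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    iface, _, summary = lines[0].partition(":")
    info = {"iface": iface.strip(), "summary": summary.strip()}
    for line in lines[1:]:
        # Split on the first colon only; values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info

def autoneg_enabled(info):
    """True when the 'basic mode' line says autonegotiation is on."""
    return "autonegotiation enabled" in info.get("basic mode", "")

def best_common_mode(info):
    """Best mode advertised by both ends, or None if it cannot be determined."""
    if not autoneg_enabled(info):
        return None
    ours = set(info.get("advertising", "").split())
    theirs = set(info.get("link partner", "").split())
    for mode in MODE_PRIORITY:
        if mode in ours and mode in theirs:
            return mode
    return None
```

If `best_common_mode` disagrees with the mode on the first line of the output, that is worth a closer look.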

--
Once you have their hardware. Never give it back.
(The First Rule of Hardware Acquisition)
Sam Flory <[email protected]>


2003-07-22 18:43:20

by Matthew Hunter

Subject: Re: 2.4.21, NFS v3, and 3com 920

On Tue, Jul 22, 2003 at 11:06:59AM -0700, Samuel Flory <[email protected]> wrote:
> >Try nailing the devices on both ends of the cat-5 to the same thing (full
> >or half). This can of course be interesting if you have an
> >unmanaged hub that doesn't give you a choice...
> You should be able to use mii-tool, or ethtool (one or both should
> work) to check the state your ethernet controller thinks it is set to,
> and change the settings.

So far I've seen several people point to this, and I just now had
the chance to test the advice. Here are the results:

image:~# mii-tool -v eth0
eth0: negotiated 100baseTx-FD, link ok
  product info: vendor 00:10:5a, model 0 rev 0
  basic mode: autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD

That's the default. OK, the hub thinks it's FD, the adapter
thinks it's FD. Should be a match.

Test with a large file transfer: 80 KB/s, about as expected (i.e.,
the problem still exists).

Let's assume the hub is smoking something interesting and
force HD. (The hub is unmanaged, so I can't force it to do
anything).

image:~# mii-tool --force=100baseTx-HD eth0
image:~# mii-tool -v eth0
eth0: 100 Mbit, half duplex, link ok
  product info: vendor 00:10:5a, model 0 rev 0
  basic mode: 100 Mbit, half duplex
  basic status: link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control

OK, adapter forced to half duplex.

Test with a large file transfer -- no change, still about 80
KB/s.

Let's try to autonegotiate for the same result...

image:~# mii-tool --reset eth0
resetting the transceiver...
image:~# mii-tool --advertise=100baseTx-HD eth0
restarting autonegotiation...
image:~# mii-tool -v eth0
eth0: negotiated 100baseTx-HD, link ok
  product info: vendor 00:10:5a, model 0 rev 0
  basic mode: autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising: 100baseTx-HD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD

OK, looks fine. Test... no change.

I predict hardware swaps in my future when I get home.

Just for giggles, I'll try 10baseT.

image:~# mii-tool --reset eth0
resetting the transceiver...
image:~# mii-tool --advertise=10baseT-FD eth0
restarting autonegotiation...
image:~# mii-tool -v eth0
eth0: negotiated 10baseT-FD, link ok
  product info: vendor 00:10:5a, model 0 rev 0
  basic mode: autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising: 10baseT-FD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD

Lo and behold, 1.1 MB/s!

Note that this is supposedly a fast ethernet hub and a fast
ethernet adapter. The other hosts on the hub all think so.
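
To put those numbers in perspective, here is a quick back-of-the-envelope calculation in Python. It uses decimal units (1 MB = 1000 KB), ignores protocol overhead, and assumes the "8 M/s" send figure from my first message means 8 MB/s; the function names are just for this sketch.

```python
def wire_rate_mb_per_s(link_mbit):
    """Raw link rate in MB/s for a link speed given in Mbit/s."""
    return link_mbit / 8.0

def utilisation(observed_mb_per_s, link_mbit):
    """Fraction of the raw link rate actually achieved."""
    return observed_mb_per_s / wire_rate_mb_per_s(link_mbit)

if __name__ == "__main__":
    cases = [
        ("receive at 100 Mbit (80 KB/s)", 0.080, 100),
        ("send at 100 Mbit (8 MB/s)", 8.0, 100),
        ("receive at 10 Mbit (1.1 MB/s)", 1.1, 10),
    ]
    for label, rate, link in cases:
        print(f"{label}: {utilisation(rate, link):.2%} of wire rate")
```

So receiving at 10baseT runs at roughly 88% of what the wire can carry, while receiving at 100baseTx manages well under 1%. The slow link is more than ten times faster than the fast one.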

I wonder if I'm plugged into a special port or something.
I'll play with that when I'm near the hardware later on tonight.

Thanks for your help, all of you. I think I have the answers
that I wanted -- namely, it's probably not a kernel problem.

I am unsure if this explains the NFS problem (i.e., NFS breaks with
v3 enabled), but since it works via tcp, I'm not of any mind to
complain. If anyone is interested, I can try without tcp but
with the ethernet controller in better shape and see if I can
still cause the same symptoms.
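
If I do rerun the UDP test, one way to quantify it would be the client's RPC retransmission count, which nfsstat reads from /proc/net/rpc/nfs. A minimal Python sketch follows, assuming the "rpc" line in that file carries the call count followed by the retransmission count (that field order is my assumption; check it against `nfsstat -rc` on your kernel). The sample text is made up.

```python
# Made-up sample in the assumed format: "rpc <calls> <retrans> <authrefresh>".
SAMPLE = """\
net 0 0 0 0
rpc 20000 150 0
"""

def rpc_counts(text):
    """Return (calls, retransmissions) from /proc/net/rpc/nfs-style text."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "rpc":
            return int(fields[1]), int(fields[2])
    raise ValueError("no rpc line found")

def retrans_ratio(text):
    """Fraction of RPC calls that had to be retransmitted."""
    calls, retrans = rpc_counts(text)
    return retrans / calls if calls else 0.0
```

Taking a reading before and after opening a mailbox over the NFS mount and comparing the two should show whether requests are still being lost.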

--
Matthew Hunter ([email protected])
Public Key: http://matthew.infodancer.org/public_key.txt
Homepage: http://matthew.infodancer.org/index.jsp
Politics: http://www.triggerfinger.org/index.jsp