How is it possible that my i486 machine (100 MHz, 32 MB RAM, running
kernel 2.4.18) with a 3Com 3c905C-TX (Tornado) occasionally miscalculates
TCP checksums for outgoing packets?
Running "tcpdump -vvv -e -s 0 -w file" on the 486, and examining the file
with ethereal, showed this for the outgoing packet:
TCP checksum 3c4b (correct)
Obviously, this was a lie, because "netstat -s" on the client shows bad
segments received, and ethereal on the client said for the received
packet:
TCP checksum 3c4b (incorrect, should be 3c47)
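For reference, here is a minimal Python sketch of how a TCP checksum over IPv4 is computed (the one's-complement sum from RFC 1071 over a pseudo-header plus the segment). This is not the kernel's code; the function names are illustrative assumptions. The last helper only restates the two values from the trace as a difference in the underlying data sums. It does not say where the difference comes from.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data (RFC 1071), before inversion."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data: bytes) -> int:
    """Inverted one's-complement sum, as stored in the checksum field."""
    return ~ones_complement_sum(data) & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum of a TCP segment (header + payload) sent over IPv4.

    To compute the value to transmit, the checksum field (bytes 16-17 of
    the TCP header) must be zero. If the field is already filled in
    correctly, the result is 0.
    """
    pseudo_header = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        0,                   # reserved zero byte
        socket.IPPROTO_TCP,  # protocol number 6
        len(segment),        # TCP length: header plus payload
    )
    return internet_checksum(pseudo_header + segment)


def data_sum_delta(sent_checksum: int, expected_checksum: int) -> int:
    """How much larger (mod 0xFFFF) the data sum seen by the receiver is
    than the data sum the transmitted checksum was computed over."""
    return (sent_checksum - expected_checksum) % 0xFFFF


# Values from the trace above: sent 3c4b, receiver expected 3c47.
print(data_sum_delta(0x3C4B, 0x3C47))
```

Read this way, the receiver's data sums to 4 more than the data the transmitted checksum covers, which is at least consistent with a small change somewhere in the segment or pseudo-header.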
I am unable to transfer anything more than a few megabytes from this
486, as the client drops the invalid packets and the connection eventually
hangs. In fact, in 1 out of 10 connections TCP recovers, as though a
mysterious force had changed some bits and corrected the checksum, but only
to hang again a second later. So much for putting an old 486 to use as an
internet gateway...
I have personally verified that the disputed packet in the example above
was identical on both sides. The same kernel was used on both sides, the
same ethereal, and the same sha1sum. Besides this, I have also tried
numerous other clients, from Win2k and XP to Solaris, and in all cases
connections hang within a few seconds.
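A comparison like the one described above can be sketched in a few lines of Python. This assumes the raw bytes of the same packet have already been extracted from the capture on each side. The helper names are hypothetical, and sha1 is used only because it is the tool mentioned above.

```python
import hashlib


def same_digest(sent: bytes, received: bytes) -> bool:
    """True if both byte strings have the same SHA-1 digest."""
    return hashlib.sha1(sent).hexdigest() == hashlib.sha1(received).hexdigest()


def diff_offsets(sent: bytes, received: bytes):
    """List (offset, sent_byte, received_byte) for each differing position.

    Only the common prefix length is compared; a length mismatch should
    be checked separately.
    """
    return [
        (offset, a, b)
        for offset, (a, b) in enumerate(zip(sent, received))
        if a != b
    ]


# Made-up example data, not taken from the real capture.
print(same_digest(b"abcd", b"abXd"))
print(diff_offsets(b"abcd", b"abXd"))
```

If the digests differ, the offsets show whether the damage is in the headers or in the payload.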
And now to the funny part.
Network card is verified and working. IP checksums never fail (0 packets
lost after two days of flood ping). TCP works with same kernel, same NIC
but on a different machine (Athlon 950) as well as with same machine,
same NIC and Windows 98. Needless to say, it works with different
machine (PIII) and different OS (Win2k).
Machine is verified, it has been working reliably for years. If, instead
of Tornado, I use a 3Com 3C509B (10Mbit EISA), the TCP works perfectly.
But if Tornado card was defective, TCP should also work with Via Rhine
(DFE-530TX) - but it DOESN'T. (However, drivers via-rhine and 3c59x I
believe were made by the same author, just in case that makes any sense)
Driver 3c59x is verified. It works with the same NIC and same kernel but in
a different machine. (BTW, weren't these cards/drivers among those with the
highest reputation?) I have also tried 2.4.7 and 2.4.20 with the same
hardware - you guessed it, it didn't help either.
Kernel is verified. In several of my other installations from the same
RedHat 8 CD set, I have not seen this problem ever occur, and some of
the machines had the very same kind of NIC. And you should probably know
that better than me! 8-)
No problems with link or wire either. I am using direct crossover cable,
cards hook up in 100 Mbit/s and full duplex instantly. It does not
matter what the link partner is. There are no I/O or IRQ conflicts.
Forcing other speeds and duplex modes does not help, nor does toying
with hdparm or kernel compilation parameters. And I've tried many, I
believe. mii-diag says everything is okay.
PCI slot is verified. It does not matter into which slot I put the card,
or how I set PCI parameters in BIOS. Other cards work in this slot; this
card doesn't work in any.
It does not matter how I set hw_checksums (3c59x.c) or
CONFIG_USE_PPRO_CHECKSUM (checksum.S). Isn't that strange? If the wrong
checksum were computed in the kernel, using hardware checksums should make
the problem go away. And vice versa.
Any clues or hints about what's going on? I would greatly appreciate
them, or at least some suggestions about what else to try. Because after
months without progress, I feel exhausted and I'm starting to run out of
ideas.
Nevertheless, I am not losing my enthusiasm for Linux. So please, nobody
mention the cost of replacing the 486!
Best regards, Matjaz
(I am not a list member)
> Network card is verified and working. IP checksums never fail (0 packets
> lost after two days of flood ping). TCP works with same kernel, same NIC
> but on a different machine (Athlon 950) as well as with same machine,
> same NIC and Windows 98. Needless to say, it works with different
> machine (PIII) and different OS (Win2k).
>
> Machine is verified, it has been working reliably for years. If, instead
> of Tornado, I use a 3Com 3C509B (10Mbit EISA), the TCP works perfectly.
> But if Tornado card was defective, TCP should also work with Via Rhine
> (DFE-530TX) - but it DOESN'T. (However, drivers via-rhine and 3c59x I
> believe were made by the same author, just in case that makes any sense)
Is the bus speed the same on the 486 and on the other machine you tested
the 3c905C-TX in? The 486 sounds like it has both EISA and PCI buses
- are you sure that the PCI bus is set to the correct clock speed?
John.