2001-11-30 14:35:46

by Jessica Blank

Subject: Slow start -- Linux vs. NT -- it's time to acknowledge the problem!

Hello esteemed kernel hackers:

As you doubtless know, NT and BSD both have a broken slow-start
implementation. As you may not know, when you try having a Linux box
co-exist on a network with a Windows box, this seems to cause the Windows
box to CROWD OUT the Linux box on the network.

There is a fix to Solaris for this-- or a "workaround", I should
say:

http://www.sun.com/sun-on-net/performance/tcp.slowstart.html

THERE IS NO FIX TO LINUX FOR THIS. At least, not as far as I could
find-- and I just got done Web-searching for a solid 15 minutes, finding
MULTIPLE references to the Solaris workaround in the process.

It is high time this problem is acknowledged and FIXED. I am
forced to share a network with a bunch of NT servers, some of which get
plenty of traffic-- enough so that they manage to crowd out my machine to
the tune of 600ish ms ping times to the Linux box versus only **70**
(!!!!!!) to the Windows box. THESE MACHINES ARE ON THE SAME NETWORK, but
the Linux box is as sluggish, latency-wise, as telnetting into a box on a
MODEM-- whereas the Windows box, where latency isn't even as important (no
one telnets into them), is nice and zippy.

I do not want to have to move to Solaris.

Please, how can this problem be solved? PLEASE CC ME ANY
SOLUTION(S) DISCUSSED!

--Jessica

=========================================
J e s s i c a L e a h B l a n k
-----------------------------------------
Programmer * Unix Sysadmin * Web Geek
[email protected] -- [email protected]
-`-,-{@ http://www.jessl.org/ @}-,-`-
=========================================




2001-11-30 15:28:34

by Alan

Subject: Re: Slow start -- Linux vs. NT -- it's time to acknowledge the problem!

> http://www.sun.com/sun-on-net/performance/tcp.slowstart.html

This URL has no bearing on the things you seem to be reporting. There
are hundreds of more likely causes: load on that Ethernet segment,
half/full duplex or 10/100Mbit settings, or even an incorrect IRQ
configuration.

How a box behaves on an overloaded LAN can also be highly card dependent.

2001-11-30 16:00:56

by Richard B. Johnson

Subject: Re: Slow start -- Linux vs. NT -- it's time to acknowledge the problem!

On Fri, 30 Nov 2001, Jessica Blank wrote:

> Hello esteemed kernel hackers:
>
> As you doubtless know, NT and BSD both have a broken slow-start
> implementation. As you may not know, when you try having a Linux box
> co-exist on a network with a Windows box, this seems to cause the Windows
> box to CROWD OUT the Linux box on the network.
>
> There is a fix to Solaris for this-- or a "workaround", I should
> say:
>
> http://www.sun.com/sun-on-net/performance/tcp.slowstart.html
>

> THERE IS NO FIX TO LINUX FOR THIS. At least, not as far as I could
> find-- and I just got done Web-searching for a solid 15 minutes, finding
> MULTIPLE references to the Solaris workaround in the process.

I seriously doubt that your problem has anything to do with Linux, but
rather that the NT machines are set up to use Netbeui which puts
NETBIOS packets into broadcast packets. This means that all the data
to/from the M$ file-servers ends up being handled by your Ethernet board
and driver, then dumped onto the floor.

A properly implemented IP network minimizes the amount of broadcast
traffic. M$ tends to maximize it. Such a typical mess looks like
this:

# tcpdump -n
10:47:03.349550 10.110.128.209.138 > 10.111.255.255.138: udp 215
10:47:03.349607 10.110.1.173.138 > 10.111.255.255.138: udp 216
10:47:03.350618 10.110.129.85.138 > 10.111.255.255.138: udp 221
10:47:03.351338 10.110.129.95.138 > 10.111.255.255.138: udp 213
10:47:03.352340 10.110.1.152.138 > 10.111.255.255.138: udp 211
10:47:03.352973 10.110.130.143.138 > 10.111.255.255.138: udp 212
10:47:03.356839 10.110.130.53.138 > 10.111.255.255.138: udp 215
10:47:03.359190 10.110.129.11.138 > 10.111.255.255.138: udp 217
10:47:03.360571 10.110.129.47.138 > 10.111.255.255.138: udp 208
10:47:03.361669 10.110.128.96.138 > 10.111.255.255.138: udp 215
10:47:03.361938 10.110.129.51.138 > 10.111.255.255.138: udp 214
10:47:03.363611 10.110.129.182.138 > 10.111.255.255.138: udp 213
^C
#

Here, data is being sent from a server 10.110.129.182, port 138
to a broadcast address, 10.111.255.255, port 138. Port 138 is
NETBIOS Datagram. So, all this data gets sent to every machine
on the LAN. It generates an interrupt, the driver gets the data
and passes it on. The IP stack looks at it and says "It ain't for
me...", and throws it away. This all takes CPU cycles that
Microsoft is stealing from you. The solution is to fire your
M$ administrator. Failing that, you need to get the fastest
Ethernet card, with a good fast driver. This allows the M$ data
to be thrown away without using too much of your CPU time.

IP filtering in your machine doesn't do any good. It just adds
CPU cycles. The broadcast packets aren't for your machine anyway
so they are being rejected without any additional filter.
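
To put a number on how much of this traffic a host is seeing, a capture like
the one above can be tallied per sender. The following is only a minimal
sketch: the function name `summarize_broadcasts`, the regular expression, and
the three sample lines (copied from the capture above) are assumptions for
illustration, and it only understands the old-style `tcpdump -n` UDP line
format shown in this message.

```python
import re
from collections import Counter

# Matches lines of the form shown in the capture above:
#   time src_ip.src_port > dst_ip.dst_port: udp length
LINE_RE = re.compile(
    r"^\S+\s+(\d+\.\d+\.\d+\.\d+)\.(\d+)\s+>\s+"
    r"(\d+\.\d+\.\d+\.\d+)\.(\d+):\s+udp\s+(\d+)"
)


def summarize_broadcasts(lines, broadcast_addr="10.111.255.255", port=138):
    """Count packets and UDP payload bytes sent to broadcast_addr:port,
    keyed by source address. Lines that do not match are skipped."""
    packets = Counter()
    payload_bytes = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        src, _sport, dst, dport, length = m.groups()
        if dst == broadcast_addr and int(dport) == port:
            packets[src] += 1
            payload_bytes[src] += int(length)
    return packets, payload_bytes


# Three lines taken from the capture above.
sample = [
    "10:47:03.349550 10.110.128.209.138 > 10.111.255.255.138: udp 215",
    "10:47:03.349607 10.110.1.173.138 > 10.111.255.255.138: udp 216",
    "10:47:03.363611 10.110.129.182.138 > 10.111.255.255.138: udp 213",
]
packets, payload_bytes = summarize_broadcasts(sample)
for src, count in packets.items():
    print(src, count, payload_bytes[src])
```

Feeding it a longer capture would show whether a handful of hosts account for
most of the broadcast load.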

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

I was going to compile a list of innovations that could be
attributed to Microsoft. Once I realized that Ctrl-Alt-Del
was handled in the BIOS, I found that there aren't any.


2001-11-30 16:03:07

by Jessica Blank

Subject: Re: Slow start -- Linux vs. NT -- it's time to acknowledge the problem!

Sooo... having the Windows-type person remove NetBEUI and Windows
filesharing (SMB) would fix this if this is indeed the cause of problems?





2001-11-30 16:17:07

by Richard B. Johnson

Subject: Re: Slow start -- Linux vs. NT -- it's time to acknowledge the problem!

On Fri, 30 Nov 2001, Jessica Blank wrote:

> Sooo... having the Windows-type person remove NetBEUI and Windows
> filesharing (SMB) would fix this if this is indeed the cause of problems?
>

Just turn OFF NetBEUI. Enable TCP/IP and NETBIOS (only). Everybody
can "share" as usual. No negative impact upon anybody.

Cheers,

Dick Johnson



2001-11-30 16:23:37

by Trever L. Adams

Subject: Re: Slow start -- Linux vs. NT -- it's time to acknowledge the problem!

On Fri, 2001-11-30 at 11:02, Jessica Blank wrote:
> Sooo... having the Windows-type person remove NetBEUI and Windows
> filesharing (SMB) would fix this if this is indeed the cause of problems?
>

Partially. SMB can be an ok netizen given that you disable NetBEUI and
possibly IPX. It won't be the best netizen, but it won't be so insanely
broken.

Trever

2001-11-30 18:43:11

by Tom Sightler

Subject: Re: Slow start -- Linux vs. NT -- it's time to acknowledge the problem!

> There is a fix to Solaris for this-- or a "workaround", I should
> say:
>
> http://www.sun.com/sun-on-net/performance/tcp.slowstart.html

This is a workaround for a TCP problem which you probably don't have based
on your comments below. Why do I think you don't have this problem? See
below.

> It is high time this problem is acknowledged and FIXED. I am
> forced to share a network with a bunch of NT servers, some of which get
> plenty of traffic-- enough so that they manage to crowd out my machine to
> the tune of 600ish ms ping times to the Linux box versus only **70**
> (!!!!!!) to the Windows box. THESE MACHINES ARE ON THE SAME NETWORK, but
> the Linux box is as sluggish, latency-wise, as telnetting into a box on a
> MODEM-- whereas the Windows box, where latency isn't even as important (no
> one telnets into them), is nice and zippy.

This fix applies to slow start on TCP connections, which usually matters
for things like streaming and other bulk transfers. You noted in your email
that ping showed high latency. Ping is ICMP, not TCP, and does not use slow
start. Also, interactive TCP sessions such as telnet/ssh are very unlikely
to show a problem with slow start.
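
To make the point concrete, slow start only governs how many segments a TCP
sender may have in flight at the beginning of a transfer. The following is a
deliberately simplified sketch, not any particular stack's implementation:
the function name, the initial window of one segment, and the threshold of 64
segments are all assumptions, and loss handling is left out entirely.

```python
def slow_start_cwnd(rtts, initial_cwnd=1, ssthresh=64):
    """Return the congestion window (in segments) at the start of each RTT.

    Simplified model: the window doubles every round trip until it reaches
    ssthresh (slow start), then grows by one segment per round trip
    (congestion avoidance). Packet loss is not modelled.
    """
    cwnd = initial_cwnd
    history = []
    for _ in range(rtts):
        history.append(cwnd)
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)
        else:
            cwnd += 1
    return history


windows = slow_start_cwnd(rtts=8)
print(windows)  # [1, 2, 4, 8, 16, 32, 64, 65]
```

A single ICMP echo, or a telnet keystroke that fits in one segment, never
needs more than the initial window, so the ramp-up above has nothing to slow
down.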

Later,
Tom


2001-11-30 20:01:12

by H. Peter Anvin

Subject: Re: Slow start -- Linux vs. NT -- it's time to acknowledge the problem!

Followup to: <1007137256.1244.0.camel@aurora>
By author: "Trever L. Adams" <[email protected]>
In newsgroup: linux.dev.kernel
>
> On Fri, 2001-11-30 at 11:02, Jessica Blank wrote:
> > Sooo... having the Windows-type person remove NetBEUI and Windows
> > filesharing (SMB) would fix this if this is indeed the cause of problems?
> >
>
> Partially. SMB can be an ok netizen given that you disable NetBEUI and
> possibly IPX. It won't be the best netizen, but it won't be so insanely
> broken.
>

Indeed. Note that even Microsoft have been recommending running SMB
over TCP/IP and disabling other protocols for many years now; starting
in Win98 this is the default configuration.

-hpa

--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2001-11-30 22:29:08

by David Miller

Subject: Re: Slow start -- Linux vs. NT -- it's time to acknowledge the problem!

> From: Jessica Blank <[email protected]>
> Date: Fri, 30 Nov 2001 08:35:35 -0600 (CST)
>
> It is high time this problem is acknowledged and FIXED. I am
> forced to share a network with a bunch of NT servers, some of which get
> plenty of traffic-- enough so that they manage to crowd out my machine to
> the tune of 600ish ms ping times to the Linux box versus only **70**
> (!!!!!!) to the Windows box.

Changes to TCP, and therefore anything having to do with
slow-start, are not going to have any effect on ping times.

To me this sounds like a problem somewhere else, perhaps a driver
issue.

2001-12-03 08:51:53

by watermodem

Subject: Re: Slow start -- Linux vs. NT -- it's time to acknowledge the problem!

"David S. Miller" wrote:
>
> From: Jessica Blank <[email protected]>
> Date: Fri, 30 Nov 2001 08:35:35 -0600 (CST)
>
> It is high time this problem is acknowledged and FIXED. I am
> forced to share a network with a bunch of NT servers, some of which get
> plenty of traffic-- enough so that they manage to crowd out my machine to
> the tune of 600ish ms ping times to the Linux box versus only **70**
> (!!!!!!) to the Windows box.

600ms looks like 10BASE-T at half duplex,
not full duplex or 100BASE-T.
Check your cable!

>
> Changes to TCP and therefore anything having to do with
> slow-start is not going to have any effect on ping times.
>
> To me this sounds like a problem somewhere else, perhaps a driver
> issue.

2001-12-04 00:20:49

by Bill Davidsen

Subject: Re: Slow start -- Linux vs. NT -- it's time to acknowledge the problem!

> From: Jessica Blank <[email protected]>
> Date: Fri, 30 Nov 2001 08:35:35 -0600 (CST)
>
> It is high time this problem is acknowledged and FIXED. I am
> forced to share a network with a bunch of NT servers, some of which get
> plenty of traffic-- enough so that they manage to crowd out my machine to
> the tune of 600ish ms ping times to the Linux box versus only **70**
> (!!!!!!) to the Windows box.

Okay, I acknowledge the problem and admit you don't understand networking.
You have confused the slowstart feature (TCP) with ping (ICMP). TCP and
ICMP are what are called protocols. Slowstart has nothing to do with ICMP.

I will add that if you do an ifconfig I suspect you will see collisions on
your interface, resulting in poor performance. This is caused by running
the interface half duplex, which should not be used unless you are connected
to a type of obsolete hardware known as a hub, instead of a switch. If you
connect to a switch you should see zero collisions.
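
One way to check this across several machines is to pull the collision
counter out of saved ifconfig output. This is only a sketch under stated
assumptions: the function name is hypothetical, the sample text is made up
for illustration (it is not output from the machines in this thread), and the
pattern assumes the net-tools style "collisions:N" or "collisions N" field.

```python
import re


def collisions_from_ifconfig(text):
    """Map interface name -> collision count from ifconfig-style text."""
    counts = {}
    iface = None
    for line in text.splitlines():
        # An interface block starts with its name in the first column;
        # the statistics lines that follow are indented.
        if line and not line[0].isspace():
            iface = line.split()[0].rstrip(":")
        m = re.search(r"collisions[: ](\d+)", line)
        if m and iface is not None:
            counts[iface] = int(m.group(1))
    return counts


# Made-up sample in the classic net-tools layout (illustrative only).
sample = """\
eth0      Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          RX packets:1000 errors:0 dropped:0 overruns:0 frame:0
          TX packets:900 errors:0 dropped:0 overruns:0 carrier:0
          collisions:42 txqueuelen:100
"""
print(collisions_from_ifconfig(sample))  # {'eth0': 42}
```

A non-zero and growing count on a port that is supposed to be full duplex
would suggest a duplex mismatch.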

Finally, even 70ms is a really poor ping time; I get better than that
between Albany NY and San Jose CA! Local ping time, even between Pentium
class utility machines on a thinnet (10Mbit half duplex technology), is
70-1400us on a somewhat loaded network.
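
As a rough sanity check on these figures, the time a frame spends on the wire
can be estimated from its size and the link rate. The sketch below is
illustrative only: the function name and the frame sizes are assumptions, and
it ignores queuing, collisions, and host processing time.

```python
def wire_time_us(frame_bytes, link_mbit):
    """Microseconds needed to clock frame_bytes onto a link of link_mbit Mbit/s.

    bits / (Mbit/s) comes out directly in microseconds.
    """
    return frame_bytes * 8 / link_mbit


for size in (64, 1500):
    for rate in (10, 100):
        print(f"{size:5d} bytes at {rate:3d} Mbit/s: {wire_time_us(size, rate):8.2f} us")
```

Even a full 1500-byte frame occupies a 10Mbit link for only about 1.2ms, so
ping times of tens or hundreds of milliseconds on a LAN point to something
other than raw link speed.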

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.