2001-11-03 18:03:02

by Roy Sigurd Karlsbakk

[permalink] [raw]
Subject: Re: [khttpd-users] khttpd vs tux

> I don't mean a cluster of PCs. Depending on the download volume, there
> are some pretty good "cache boxes" that do no load balancing but can
> cache up to 1-4 GB of data in RAM, with the web server running in
> hard-wired ASICs; I think P5, Cisco and other vendors build them.
> Because in my opinion, traffic of up to 4 Gbit is not handleable on a
> Linux box (thinking of x86 architecture).

hm...

how much do you think you can get out of a server with several 1Gb
ethernet cards, multiple 66MHz/64bit PCI busses, multiple SCSI busses or
perhaps some sort of SAN solution based on FibreChannel 2?

---
Computers are like air conditioners.
They stop working when you open Windows.


2001-11-03 19:08:37

by Thomas Lussnig

[permalink] [raw]
Subject: Re: [khttpd-users] khttpd vs tux

>how much do you think you can get out of a server with several 1Gb
>ethernet cards, multiple 66MHz/64bit PCI busses, multiple SCSI busses or
>perhaps some sort of SAN solution based on FibreChannel 2?
>
Ok,
on this hardware I think the problem is that the kernel and webserver
need to support it: each of the 1Gbit cards should be bound to its own
process, and on a multiprocessor machine each process should be pinned
to one CPU to minimize context-switch overhead. Also, I'm not familiar
with the FibreChannel 2 specification, so I think there could be some
trouble with the load. Much more important is to know how much different
data is served: since you're talking about khttpd, I assume it is
definitely static data, so the question is how much, because in the
ideal case the whole set of files is cached in RAM. With 500 users I
think only a minimal kernel patch is needed for a higher number of file
handles. So if the only choice left open is tux or khttpd, I think you
should use tux:

- more development
- more tuning/config/log options
- better code (khttpd sounds a little bit like trial and error)


2001-11-03 19:14:57

by Roy Sigurd Karlsbakk

[permalink] [raw]
Subject: Re: [khttpd-users] khttpd vs tux

>> how much do you think you can get out of a server with several 1Gb
>> ethernet cards, multiple 66MHz/64bit PCI busses, multiple SCSI busses or
>> perhaps some sort of SAN solution based on FibreChannel 2?
>
> Ok,
> on this hardware I think the problem is that the kernel and webserver
> need to support it: each of the 1Gbit cards should be bound to its own
> process, and on a multiprocessor machine each process should be pinned
> to one CPU to minimize context-switch overhead. Also, I'm not familiar
> with the FibreChannel 2 specification, so I think there could be some
> trouble with the load. Much more important is to know how much
> different data is served: since you're talking about khttpd, I assume
> it is definitely static data, so the question is how much, because in
> the ideal case the whole set of files is cached in RAM. With 500 users
> I think only a minimal kernel patch is needed for a higher number of
> file handles. So if the only choice left open is tux or khttpd, I
> think you should use tux

What's this patch thing?
Do I need to patch or rewrite parts of the kernel to support more than
1000 file handles?
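(For context, before any kernel patching the per-process file-descriptor limit can already be inspected and raised from user space via getrlimit()/setrlimit(); raising the hard limit itself needs root, and on old kernels possibly a rebuild. A minimal user-space sketch, assuming a POSIX system - this is an illustration, not the kernel patch Thomas refers to:)

```python
import resource

# Query the current per-process file-descriptor limits (soft, hard).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")

# An unprivileged process may raise its soft limit up to the hard
# limit; raising the hard limit itself requires privileges.
target = 4096 if hard == resource.RLIM_INFINITY else min(4096, hard)
if target > soft:
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
```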

---
Computers are like air conditioners.
They stop working when you open Windows.

2001-11-03 19:14:57

by Alan

[permalink] [raw]
Subject: Re: [khttpd-users] khttpd vs tux

> on this hardware I think the problem is that the kernel and webserver
> need to support it: each of the 1Gbit cards should be bound to its own
> process, and on a multiprocessor machine each process should be pinned
> to one CPU to minimize context-switch overhead. Also, I'm not familiar
> with the FibreChannel 2 specification

Each GigE card will need its own 66MHz PCI bus. Each PCI bridge will need
to be coming off a memory bus that can sustain all of these and the CPU
at once.

At that point it really doesn't look much like a PC.
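(A rough back-of-envelope check of this point, using theoretical peak figures - real PCI efficiency is well below peak:)

```python
# Theoretical peaks - not measured figures.
pci_bus_mb_s = 66 * 64 / 8        # 64-bit @ 66 MHz PCI: 528 MB/s
gige_mb_s = 1000 / 8              # 1 Gbit/s: 125 MB/s per direction
gige_duplex_mb_s = 2 * gige_mb_s  # 250 MB/s full duplex

# One full-duplex GigE card already claims nearly half of one bus's
# theoretical peak; with realistic PCI efficiency there is little
# headroom left for a second card or a SCSI controller on the same bus.
share = gige_duplex_mb_s / pci_bus_mb_s
print(f"{pci_bus_mb_s:.0f} MB/s bus, GigE share: {share:.0%}")
```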

2001-11-03 19:18:47

by Roy Sigurd Karlsbakk

[permalink] [raw]
Subject: Re: [khttpd-users] khttpd vs tux

> Each GigE card will need its own 66MHz PCI bus. Each PCI bridge will need
> to be coming off a memory bus that can sustain all of these and the CPU
> at once.
>
> At that point it really doesn't look much like a PC.

How much raw speed do you think I can manage to get out of a really cool
n-way server from Compaq? I believe we'll go for a Compaq server, as
that's what was decided some time ago.

I read something by Linus about Linux scalability, and I believe he said
that 'Linux [2.4] scales well up to 4 CPUs, but not that well further on
[to 8?]'. Can anyone fill in the holes here?

thanks

roy

---
Computers are like air conditioners.
They stop working when you open Windows.

2001-11-03 19:31:48

by Alan

[permalink] [raw]
Subject: Re: [khttpd-users] khttpd vs tux

> > At that point it really doesn't look much like a PC.
>
> How much raw speed do you think I can manage to get out of a really cool
> n-way server from Compaq? I believe we'll go for a Compaq server, as
> that's what was decided some time ago.

Take a look at the tux benchmark numbers. That's pushing the limit of
the hardware.

2001-11-03 19:32:30

by J Sloan

[permalink] [raw]
Subject: Re: [khttpd-users] khttpd vs tux

Roy Sigurd Karlsbakk wrote:

> I read something by Linus about Linux scalability, and I believe he said
> that 'Linux [2.4] scales well up to 4 CPUs, but not that well further on
> [to 8?]'. Can anyone fill in the holes here?

Nobody scales better at 1-4 CPUs, as indicated
by specweb99 - at 8 CPUs Linux is OK, but not
as dominating....

When the high end specialists from IBM etc
can send in patches that enhance high end
performance without hurting the low end case
the numbers on 8-32 CPUs should really start
to shine. (There has been progress on that
front seen on lkml)

cu

jjs




2001-11-04 00:07:34

by Erik Mouw

[permalink] [raw]
Subject: Re: [khttpd-users] khttpd vs tux

On Sat, Nov 03, 2001 at 08:18:19PM +0100, Roy Sigurd Karlsbakk wrote:
> > Each GigE card will need its own 66MHz PCI bus. Each PCI bridge will need
> > to be coming off a memory bus that can sustain all of these and the CPU
> > at once.
> >
> > At that point it really doesnt look much like a PC.
>
> How much raw speed do you think I can manage to get out of a really cool
> n-way server from Compaq? I believe we'll go for a Compaq server, as
> that's what was decided some time ago.

Not that much. Alan's point is that you're pushing the limit of the
memory bandwidth, not the number of CPUs. This is the single reason
that high traffic websites either use some serious non-PC hardware (IBM
Z-series, for example) or a large number of PCs in parallel to share
the load.

> I read something by Linus about Linux scalability, and I believe he said
> that 'Linux [2.4] scales well up to 4 CPUs, but not that well further on
> [to 8?]'. Can anyone fill in the holes here?

The number of CPUs really doesn't matter in this case. With several
GigE cards memory bandwidth and latency is your main problem.
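(To put this in numbers, a rough sketch with assumed, era-typical figures: PC133 SDRAM peak bandwidth, and each served byte crossing the memory bus several times for DMA and copies. The pass count per byte is an assumption for illustration:)

```python
# Assumed, illustrative numbers - not measurements.
mem_bw_mb_s = 133 * 64 / 8   # PC133 SDRAM, 64-bit wide: 1064 MB/s peak
gige_tx_mb_s = 125           # one GigE card, transmit direction only
passes_per_byte = 3          # e.g. DMA from disk, copy/checksum, DMA to NIC

# How many GigE cards the memory bus could feed before it saturates:
cards = mem_bw_mb_s / (gige_tx_mb_s * passes_per_byte)
print(f"~{cards:.1f} cards saturate the memory bus in theory")
```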


Erik

--
J.A.K. (Erik) Mouw, Information and Communication Theory Group, Faculty
of Information Technology and Systems, Delft University of Technology,
PO BOX 5031, 2600 GA Delft, The Netherlands Phone: +31-15-2783635
Fax: +31-15-2781843 Email: [email protected]
WWW: http://www-ict.its.tudelft.nl/~erik/

2001-11-04 01:13:19

by Alan

[permalink] [raw]
Subject: Re: [khttpd-users] khttpd vs tux

> Nobody scales better at 1-4 CPUs, as indicated
> by specweb99 - at 8 CPUs Linux is OK, but not
> as dominating....

At specweb. For some 2-processor and a large number of 4-processor
workloads our scheduler does not make good decisions.

2001-11-04 15:33:52

by John Alvord

[permalink] [raw]
Subject: Re: [khttpd-users] khttpd vs tux

On Sun, 4 Nov 2001, Erik Mouw wrote:

> On Sat, Nov 03, 2001 at 08:18:19PM +0100, Roy Sigurd Karlsbakk wrote:
> > > Each GigE card will need its own 66MHz PCI bus. Each PCI bridge will need
> > > to be coming off a memory bus that can sustain all of these and the CPU
> > > at once.
> > >
> > > At that point it really doesn't look much like a PC.
> >
> > How much raw speed do you think I can manage to get out of a really cool
> > n-way server from Compaq? I believe we'll go for a Compaq server, as
> > that's what was decided some time ago.
>
> Not that much. Alan's point is that you're pushing the limit of the
> memory bandwidth, not the number of CPUs. This is the single reason
> that high traffic websites either use some serious non-PC hardware (IBM
> Z-series, for example) or a large number of PCs in parallel to share
> the load.
>
> > I read something by Linus about Linux scalability, and I believe he said
> > that 'Linux [2.4] scales well up to 4 CPUs, but not that well further on
> > [to 8?]'. Can anyone fill in the holes here?
>
> The number of CPUs really doesn't matter in this case. With several
> GigE cards memory bandwidth and latency is your main problem.

Interesting parallel...

In the last few years there have been multiple cases where people reported
benchmarks in which a dual processor gave less throughput than a single
processor. In most cases, the single-processor benchmark had saturated the
memory bandwidth and a second processor didn't make much difference.

This was on "cheap" multiprocessors.

john alvord

2001-11-05 09:17:52

by Ingo Molnar

[permalink] [raw]
Subject: Re: [khttpd-users] khttpd vs tux


On Sat, 3 Nov 2001, J Sloan wrote:

> Nobody scales better at 1-4 CPUs, as indicated
> by specweb99 - at 8 CPUs Linux is OK, but not
> as dominating....

This is a common misinterpretation of the TUX SPECweb99 numbers.
Performance and scalability are two distinct things. Also, maximum
performance on given hardware and the true scalability of the software
running on it are two different things as well. You can have a very slow
webserver that scales very well - and you can have a fast webserver that
scales poorly, but if the fast one beats the slow one even with the
highest number of CPUs used, the 'good' scalability of the slow webserver
does not matter much, does it? Also, TUX will max out an i486 pretty
quickly, and it will scale very badly on 4-way i486 systems (yes, such
beasts do exist), simply because the hardware itself is pushed to the
maximum: more CPUs simply do not help - performance does not increase.

Ideally we want to have a very fast and very scalable webserver - TUX is
an attempt to be just that, and nothing more.

TUX maxes out the hardware on all systems tested so far - so the true
'scalability' of the Linux kernel and TUX simply cannot be measured: it's
the hardware (CPU, networking card, etc.) that is slowing TUX down, not
TUX's scalability faults. Algorithmically and SMP caching/locking-wise,
the kernel and TUX are doing the right thing already under these
read-mostly pagecache & TCP/IP loads. [Well, this is not some black art;
we simply fixed every limit that showed up on the way.]

TUX maxes out 2-way and 4-way systems as well, while IIS does not appear
to do a good job there. So we can say that it's proven that IIS does not
scale well. I still cannot say whether Linux+TUX scales well; I can only
say that it's too fast for the given hardware :-)

Why does it look as if TUX scaled well on 1, 2 and 4 CPUs? Because
hardware designers are sizing up systems with more CPUs, so the true
limits of the hardware show a scalability graph similar to the graph a
genuinely scalable webserver would produce.

Scalability of the software can only be judged on hardware where every
component (CPU, system board, cards) is faster than what TUX can push - so
it can be measured exactly how TUX (and the kernel) reacts to the addition
of more CPUs. Once a webserver pushes to the limits of the hardware, the
true scalability of the code gets distorted.

> When the high end specialists from IBM etc
> can send in patches that enhance high end
> performance without hurting the low end case
> the numbers on 8-32 CPUs should really start
> to shine. [...]

Sadly, the TUX workloads scale 'perfectly' already both within TUX and
within the kernel (to the best of my knowledge), from an algorithmic point
of view - I don't think anyone could claim to be able to improve that
significantly, even on 32-way systems. My main development box is an 8-way
ia32 box (and a fair number of other kernel hackers have such boxes as
well), so we know the 8-way limits pretty well. Note that the TUX patches
include three extra scalability patches to the stock kernel:

- the pagecache SMP-scalability patch [gets rid of pagecache_lock]
- the smptimers patch [makes timers completely per-CPU.]
- the per-CPU page allocator
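(The common thread in all three patches is replacing one globally locked structure with per-CPU instances, so the hot path never touches a shared lock. A toy user-space analogy - not the kernel code - assuming CPython, where each thread only ever writes its own slot:)

```python
import threading
from collections import defaultdict

# Per-thread counters instead of one counter behind one global lock:
# each thread updates only its own slot on the hot path, and results
# are aggregated once at the end.
per_thread = defaultdict(int)

def work(n):
    me = threading.get_ident()   # this thread's private slot
    for _ in range(n):
        per_thread[me] += 1      # no shared-lock contention here

threads = [threading.Thread(target=work, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(per_thread.values())
print(total)  # 40000
```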

There might be other areas in the kernel that could scale better under
non-TUX workloads (especially the block IO code has some scalability
problems), but none of them affects TUX in any measurable way on the
systems we measured. I'd say that TUX should scale pretty well to 16 or 32
CPUs, and SGI's tests appear to prove this in part: the pagecache
scalability patch alone helped their (non-TUX) NUMA cached-dbench
performance measurably. [on an 8-way system the pagecache scalability
patch is only a small but measurable win.] And if any kernel scalability
limit pops up on bigger boxes, we can fix it - there are few fundamental
issues left.

Ingo

2001-11-06 04:47:17

by J Sloan

[permalink] [raw]
Subject: Re: [khttpd-users] khttpd vs tux

Ingo,

Thanks for commenting on this -

Ingo Molnar wrote:

> On Sat, 3 Nov 2001, J Sloan wrote:
>
> > Nobody scales better at 1-4 CPUs, as indicated
> > by specweb99 - at 8 CPUs Linux is OK, but not
> > as dominating....
>
> This is a common misinterpretation of the TUX SPECweb99 numbers.
> Performance and scalability are two distinct things.

Absolutely correct, I spoke sloppily.
I should have said, "nobody performs better...".

But the scalability certainly _appears_
to be better than average -

> TUX maxes out 2-way and 4-way systems as well, while IIS does not appear
> to do a good job there. So we can say that it's proven that IIS does not
> scale well. I still cannot say whether Linux+TUX scales well; I can only
> say that it's too fast for the given hardware :-)

indeed...

> Why does it look as if TUX scaled well on 1, 2 and 4 CPUs? Because
> hardware designers are sizing up systems with more CPUs, so the true
> limits of the hardware show a scalability graph similar to the graph a
> genuinely scalable webserver would produce.

Excellent point, thanks for making the distinction.

Thanks as well for the other excellent insights,
it was informative to hear what you had to say.

cu

jjs