Hi all.
I've started a new project where the final goal is to overload
vger.kernel.org with silly questions...
Well... Seriously, I'm working on a project setting up a web server
serving large, multi-gigabyte media files (aka streaming). Below, I've
put together a list of questions about things I need to know to tune the
system as well as possible. Thanks in advance for any help :-)
First of all - the plan now, is to use Tux as the web server, as I've got
the impression - both by testing it, and of what's been said on the list,
that it's one of the fastest solutions I can find - at least on Linux. I
plan to serve up to around 250 concurrent connections, each given a maximum
bandwidth (in Tux) of 7 Mbps. The average bandwidth use will be somewhere
between 4 and 5 Mbps, but it may peak at 7. The server I'll set up will
only run Tux - no Apache. Tux will be set up not to log, and virtually all
system daemons will be stopped. Some management agents (from Compaq or
whoever we go with) may run, but that's it. I also plan to set it up
without any swap partition.
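As a quick sanity check on those numbers (integer shell arithmetic, so the MB/s figure is rounded down):

```shell
# 250 connections, each capped at 7 Mbit/s
echo $((250 * 7))       # aggregate peak: 1750 Mbit/s
echo $((250 * 7 / 8))   # about 218 MB/s that disks, bus and NIC must sustain
```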
Q: What would the memory requirements for such a system be? I've read that
Tux uses sendfile(), but I don't know how Tux or sendfile() caches. Does
sendfile() read the file into the page cache before sending it? I mean - I
can't keep the files in memory, so how much memory would be enough?
Q: If using the bandwidth control in Tux, how much CPU overhead may I
expect per stream (sizes as mentioned above)?
Q: Does anyone know which SCSI controllers and NICs (1 or 10 Gb Ethernet)
have low or high CPU overhead? I have the impression that there is quite
some variation between different drivers. Is this true?
Q: Tux has some way of tying IRQs directly to a specific CPU/thread (I
don't really remember the details). What should/could I use of these, or
other tuning knobs, to speed up the server?
Q: Are there ways to strip down the kernel to get rid of everything I
don't want or don't need? Would I gain speed from this, or is it just a
waste of effort?
Q: What file system would be best suited for this? Ext[23]? ReiserFS?
Xfs? FAT? :-)
Q: If I use iptables to block any unwanted traffic - that is, anything
that's not TCP/80 - how much CPU overhead could this generate? Would
tcpwrappers or similar systems work better? (I need ssh, snmp and some
management agents reachable over the network, but I don't want the end
users to be able to access them.)
Well... That's all for now...
Thanks for any help
Regards
roy
---
Computers are like air conditioners.
They stop working when you open Windows.
On Mon, 5 Nov 2001, Roy Sigurd Karlsbakk wrote:
> Hi all.
Hi
>
> First of all - the plan now, is to use Tux as the web server, as I've got
> the impression - both by testing it, and of what's been said on the list,
> that it's one of the fastest solutions I can find - at least on Linux. I
> plan to serve some <= 250 concurrent connections, each given a maximum
> bandwidth (in Tux) of 7 Mbps. The average bandwidth use will be somewhere
> between 4 and 5 Mbps, but it may peak at 7. The server I'll set up will
> only run Tux - no Apache. Tux will be set up not to log, and virtually all
> system daemons will be stopped. Some management agents (from Compaq or
> whoever we'll go for) may run, but that's it. I also plan to set it up
> without any swap partition.
>
To answer other "not asked" questions of yours, I'll point you to:
http://www.specbench.org/osg/web99/results/res2000q4/web99-20001127-00075.html
as that should help you very much :) (that /proc tweaking is pretty cool)
----------------------------
Mihai RUSU
"... and what if this is as good as it gets ?"
> To answer other "not asked" questions of yours, I'll point you to:
> http://www.specbench.org/osg/web99/results/res2000q4/web99-20001127-00075.html
>
> as that should help you very much :) (that /proc tweaking is pretty cool)
Thanks!
Just one thing...
I need redundancy, so I can't go with RAID 0. I thought I'd go with RAID
4, to avoid reading the parity info (and thereby wasting time), while
still having reasonably good redundancy.
Q: Should I use hardware RAID or software RAID here? I can see they've
been using a rather large stripe (or chunk) size on the RAID (2 MB). The
RAID controller I planned to use only supports stripes up to 512 kB. As I
said, the files I'm reading are rather large - up to 10 GB each, or at
least 1 GB. I'm reading 4-7 Mbps (500-900 kB/s) per connection, and each
connection reads only one file. Will a large stripe size help me here?
roy
On Mon, 5 Nov 2001, Roy Sigurd Karlsbakk wrote:
> > To answer other "not asked" questions of yours, I'll point you to:
> > http://www.specbench.org/osg/web99/results/res2000q4/web99-20001127-00075.html
> >
> > as that should help you very much :) (that /proc tweaking is pretty cool)
>
> Thanks!
>
no problem
> Just one thing...
>
> I need redundancy, so I can't go with RAID 0. I thought I'd go with RAID
> 4, to avoid reading the parity info (and thereby wasting time), and still
> have some quite good redundancy.
>
I see. We use RAID 5 in production here.
> Q: Should I use hardware RAID or software RAID here? I can see they've
> been using a rather large stripe (or chunk) size on the RAID (2MB). The
> RAID controller I planned to use only supports up to 512kB stripes. As I
> said, the files I'm reading are rather large - up to 10GB each, or at
> least 1GB. I'm reading 4-7Mbps (500-900kB) per connection and each
> connection reads only one file. Will a large stripe size help me here?
>
If you've got the money, I recommend Mylex AcceleRAID (http://www.mylex.com);
they are very well supported on Linux and pretty fast too :)
I'm not a RAID expert, but I found some interesting information in the
DAC960 docs (/usr/src/linux/Documentation/README.DAC960).
Quote:
For maximum performance and the most efficient E2FSCK performance, it is
recommended that EXT2 file systems be built with a 4KB block size and 16
block stride to match the DAC960 controller's 64KB default stripe size.
The command "mke2fs -b 4096 -R stride=16 <device>" is appropriate.
Unless there will be a large number of small files on the file systems, it
is also beneficial to add the "-i 16384" option to increase the bytes per
inode parameter thereby reducing the file system metadata. Finally, on
systems that will only be run with Linux 2.2 or later kernels it is
beneficial to enable sparse superblocks with the "-s 1" option.
Now, I know you will not use ext2 (for the data at least), but it's a
starting point for optimizing the RAID and FS (whatever your choice).
I recommend XFS for those large files you have there.
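Following the same arithmetic as the DAC960 quote above, the ext2 stride for a 512 kB stripe would be the stripe size divided by the block size; a sketch (device name hypothetical):

```shell
# stride = stripe size / block size = 512 kB / 4 kB
echo $((512 * 1024 / 4096))    # prints 128
# so the equivalent of the quoted command for a 512 kB stripe would be
# something like (hypothetical device name):
#   mke2fs -b 4096 -R stride=128 /dev/sdX1
```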
On Mon, 5 Nov 2001, Roy Sigurd Karlsbakk wrote:
> Just one thing...
>
> I need redundancy, so I can't go with RAID 0. I thought I'd go with RAID
> 4, to avoid reading the parity info (and thereby wasting time), and still
> have some quite good redundancy.
>
> Q: Should I use hardware RAID or software RAID here? I can see they've
> been using a rather large stripe (or chunk) size on the RAID (2MB). The
> RAID controller I planned to use only supports up to 512kB stripes. As I
> said, the files I'm reading are rather large - up to 10GB each, or at
> least 1GB. I'm reading 4-7Mbps (500-900kB) per connection and each
> connection reads only one file. Will a large stripe size help me here?
If you spec your boxes so that you have spare CPU cycles left after Tux
is done, consider software RAID. Software RAID is faster than hardware
RAID in almost every circumstance I have seen, with the caveat that it
uses slightly more CPU (RAID 5 has the worst CPU overhead because of the
parity calculations; RAID 1 is better).
--
Michael E. Brown, RHCE, MCSE+I, CNA
Dell Linux Solutions
http://www.dell.com/linux
If each of us have one object, and we exchange them,
then each of us still has one object.
If each of us have one idea, and we exchange them,
then each of us now has two ideas.
> If you spec your boxes such that you have extra CPU cycles around after
> TUX is done, consider software raid. Software raid is faster than hardware
> raid in almost every circumstance I have seen, with the caveat that it
> uses slightly more CPU resources (RAID 5 has worst CPU performance because
> of parity calculations, RAID 1 is better)
The only writing to the drives will be weekly batch jobs running at
night. I originally planned to disable all, or at least most, logging, but
I changed my mind to allow logging to an NFS volume on a separate
management segment. Thus, there's virtually no writing in the system while
it's running. Therefore - again - I'd like to run software RAID 4, as
this will give me read access to what looks like a RAID 0 (the n-1 data
drives); and although writing will be a pain in the ass, reading will be
fast, and I still have some sort of redundancy.
What do you think would be a good stripe size in these circumstances
(>= 1 GB files)?
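(For context: with the raidtools of that era, the chunk size for a software RAID set is declared in /etc/raidtab. A minimal sketch, assuming RAID 4 over four disks; the device names and disk count are invented for illustration:)

```
raiddev /dev/md0
    raid-level            4
    nr-raid-disks         4
    chunk-size            512    # kB - the stripe size under discussion
    persistent-superblock 1
    device                /dev/sda1
    raid-disk             0
    device                /dev/sdb1
    raid-disk             1
    device                /dev/sdc1
    raid-disk             2
    device                /dev/sdd1
    raid-disk             3
```

With RAID 4 the parity lives on one dedicated disk, so reads are striped across the remaining data disks.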
roy
---
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA
> ~220 MB/s. I'm not sure why you expect this to be a problem.
Because of the PCI buses, memory bandwidth, etc. - it all has to be read
and transmitted. I'll be glad if I don't need to worry.
> why? VM in general is designed to have swap. why would it hurt?
Well... I was thinking of potential dropouts... I don't know why... forget it.
> neither tux nor sendfile have anything to do with the caching,
> which is done by the kernel (VM and FS readahead, if any).
>
> > sendfile read the file into the cache buffer before sending it? I mean - I
> > can't keep the files in memory, so how much would do?
>
> VM works by scavenging not recently or frequently-used pages.
Does this mean the system could run smoothly on very little memory -
say, 128 MB of RAM?
> > Q: If using the bandwidth control in Tux, how much CPU overhead may I
> > expect per stream (sizes as mentioned above)?
>
> trivial, I expect. after all, each stream has a trivial bandwidth,
> and bandwidth-control just needs to take a few looks at some timer.
> as I recall, tux uses rdtsc in a smart way to avoid gettimeofday
> syscalls.
Good...
> > Q: Do anyone know what SCSI and NIC (1 or 10Gb-Ethernet) that have low or
> > high CPU overhead? I have the impression that there are quite some
> > variations between different drivers. Is this true?
>
> the bsd-ported scsi driver seems to have some religious friction
> with linux, not surprisingly. afaik, all the usual gigE cards are
> roughly equivalent.
Does this mean I can choose whatever gigE card I want?
And what about SCSI drivers?
> > Q: Tux has some way of linking IRQs directly to a specific CPU/thread
> > (don't really remember what this was...). What should/could be used of
> > these, or other tuning factors to speed up the server?
>
> depends on the hardware and IO rates. why are you expecting difficulty?
> 220 MB/s is pretty modest.
ok
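(For context, the IRQ-to-CPU binding mentioned in the question is done by writing a CPU bitmask into /proc; a sketch, with the IRQ number invented:)

```shell
# bind IRQ 24 (say, the NIC) to CPU 0 only; the value is a hex CPU bitmask
echo 1 > /proc/irq/24/smp_affinity    # hypothetical IRQ number; needs root
cat /proc/irq/24/smp_affinity
```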
> > Q: Are there ways to strip down the kernel, as to get rid of everything I
> > don't want or don't need? Could I gain speed on this, or is it just a
> > waste?
>
> silly.
Not silly. I'm not a kernel hacker. I ask naive questions because I find
it better to ask and learn something than to remain ignorant of some
fact. But thanks for telling me it's not a problem.
> > Q: What file system would be best suitable for this? Ext[23]? ReiserFS?
> > Xfs? FAT? :-)
>
> couldn't possibly make any difference. journalling FS's are only
> valuable because of their crash-robustness in the presence of writes.
> reiserfs is only faster if you have bizarrely skewed file and dir sizes
> (billions of <1K files, billions of dir entries). FAT would be very cool,
> in a retro kind of way, but you'd probably hit file-length limits.
How about XFS? I've heard it's better at pushing large files...
> > Q: If using iptables to block any unwanted traffic - that is - anything
> > that's not TCP/80, how much CPU overhead could this generate? Could use of
>
> how much unwanted traffic? why would there be any nontrivial amount?
I was thinking more about security. I don't want anyone exploiting
potential security holes.
>
> > tcpwrappers or similar systems work better? (I need ssh, snmp and a some
>
> security in depth: use iptables, tcpwrappers, *and* app-based controls.
>
> obviously, iptables discarding any snmp packet from !my.workstation
> is a lot faster than receiving the packet, doing the handshake,
> waking up inetd, firing up tcpd, checking the incoming IP, including
> rev DNS lookups, then passing the socket off to snmp... but it's also
> true that any large amount of bogus packets would be a problem you
> should handle elsehow (ie, DOS attack or serious misconfiguration.)
How about the CPU overhead in iptables/the kernel when it processes
packets that are allowed? How about connection tracking? Is the rule
order in iptables important when it comes to speed?
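(To make the rule-order part of the question concrete: iptables evaluates rules top-down, so putting the overwhelmingly common case first keeps the per-packet cost low. A hypothetical ruleset, with the management address invented:)

```shell
# packets belonging to already-tracked connections - the common case - first
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# new HTTP connections from anywhere
iptables -A INPUT -p tcp --dport 80 -m state --state NEW -j ACCEPT
# ssh and snmp only from the (invented) management host
iptables -A INPUT -s 10.0.0.5 -p tcp --dport 22 -j ACCEPT
iptables -A INPUT -s 10.0.0.5 -p udp --dport 161 -j ACCEPT
# default-deny everything else
iptables -P INPUT DROP
```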
I apologize if some of these questions seem a little stupid. Please
don't flame.
Regards
roy
---
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA