2001-02-02 10:05:48

by Andrew Morton

Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

"David S. Miller" wrote:
>
> ...
> Finally, please do some tests on loopback. It is usually a great
> way to get "pure software overhead" measurements of our TCP stack.

Here we are. TCP and NFS/UDP over lo.

Machine is a dual-PII. I didn't bother running CPU utilisation
testing while benchmarking loopback, although this may be of
some interest for SMP. I just looked at the throughput.

Machine is a dual 500MHz PII (again). Memory read bandwidth
is 320 meg/sec. Write b/w is 130 meg/sec. The working set
is 60 ~300k files, everything cached. We run the following
tests:

1: sendfile() to localhost, sender and receiver pinned to
separate CPUs

2: sendfile() to localhost, sender and receiver pinned to
the same CPU

3: sendfile() to localhost, no explicit pinning.

4, 5, 6: same as above, except we use send() in 8kbyte
chunks.

Repeat with and without zerocopy patch 2.4.1-2.

The receiver reads 64k hunks and throws them away. sendfile()
sends the entire file.
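
For readers who want the shape of the two sender paths without fetching the test programs, here is a minimal Python sketch. It is not the code used for the numbers in this mail. The function names are made up for illustration, and a local socket pair stands in for the TCP connection to localhost, which is an assumption made only to keep the example self-contained.

```python
import os
import socket
import threading

CHUNK = 8 * 1024    # send() chunk size described above
HUNK = 64 * 1024    # receiver read size described above


def drain(sock, counter):
    """Receiver: read 64k hunks and throw them away, counting the bytes."""
    while True:
        data = sock.recv(HUNK)
        if not data:
            break
        counter[0] += len(data)


def send_with_sendfile(sock, path):
    """Sender variant 1: hand the entire file to sendfile()."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            # sendfile() may transfer fewer bytes than requested, so loop.
            sent = os.sendfile(sock.fileno(), f.fileno(), offset, size - offset)
            if sent == 0:
                break
            offset += sent
    return offset


def send_with_send(sock, path):
    """Sender variant 2: read into user space, then send() 8 kbyte chunks."""
    total = 0
    with open(path, "rb") as f:
        while True:
            buf = f.read(CHUNK)
            if not buf:
                break
            sock.sendall(buf)
            total += len(buf)
    return total


def run(sender, path):
    """Push one file through a local socket pair; return the bytes received."""
    tx, rx = socket.socketpair()   # assumption: stand-in for TCP over lo
    counter = [0]
    receiver = threading.Thread(target=drain, args=(rx, counter))
    receiver.start()
    sender(tx, path)
    tx.close()                     # EOF lets the receiver finish
    receiver.join()
    rx.close()
    return counter[0]
```

Calling `run(send_with_sendfile, path)` and `run(send_with_send, path)` on the same cached file exercises both paths. Timing each call over the working set would give throughput figures comparable in spirit, though not in absolute value, to the ones reported here.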

Also, do an NFS mount of localhost, rsize=wsize=8192, see how
long it takes to `cp' a 100 meg file from the "server" to
/dev/null. The file is cached on the "server". Do this for
the three pinning cases as well - all the NFS kernel processes
were pinned as a group and `cp' was the other group.
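
The three pinning cases can be written down as a small table of CPU assignments. The sketch below is illustrative only: `pinning_plan` and `pin_to_cpu` are hypothetical helpers, and `os.sched_setaffinity` is the present-day Linux interface exposed by Python, not necessarily the mechanism used for these runs.

```python
import os


def pinning_plan(mode, cpus):
    """Map a test mode to (sender_cpu, receiver_cpu); None means unpinned."""
    cpus = sorted(cpus)
    if mode == "separate":
        if len(cpus) < 2:
            raise ValueError("need at least two CPUs to pin separately")
        return cpus[0], cpus[1]
    if mode == "same":
        return cpus[0], cpus[0]
    if mode == "none":
        return None, None
    raise ValueError("unknown mode: %r" % (mode,))


def pin_to_cpu(cpu, pid=0):
    """Restrict a process (0 means the caller) to one CPU; None leaves it alone."""
    if cpu is not None:
        os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)
```

On a dual-CPU box, `pinning_plan("separate", {0, 1})` gives `(0, 1)`; the sender would then call `pin_to_cpu(0)` and the receiver `pin_to_cpu(1)` before any data moves.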


                 sendfile()   send(8k)        NFS
                    kbyte/s    kbyte/s    kbyte/s

No explicit bonding
  2.4.1:              66600      70000      25600
  2.4.1-zc:          208000      69000      25000

Bond client and server to separate CPUs
  2.4.1:              66700      68000      27800
  2.4.1-zc:          213047      66000      25700

Bond client and server to same CPU
  2.4.1:              56000      57000      23300
  2.4.1-zc:          176000      55000      22100



Much the same story. Big increase in sendfile() efficiency,
small drop in send() and NFS unchanged.
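
To put rough numbers on that summary, the relative change per column can be computed directly from the table. The snippet below only redoes the arithmetic on the figures above; the dictionary layout and function names are illustrative.

```python
COLUMNS = ("sendfile()", "send(8k)", "NFS")

# Figures copied from the table above: (sendfile, send(8k), NFS).
RESULTS = {
    "no explicit bonding": {
        "2.4.1": (66600, 70000, 25600),
        "2.4.1-zc": (208000, 69000, 25000),
    },
    "separate CPUs": {
        "2.4.1": (66700, 68000, 27800),
        "2.4.1-zc": (213047, 66000, 25700),
    },
    "same CPU": {
        "2.4.1": (56000, 57000, 23300),
        "2.4.1-zc": (176000, 55000, 22100),
    },
}


def relative_change(base, patched):
    """Percentage change going from the base kernel to the patched one."""
    return (patched - base) / base * 100.0


def summarise(results):
    """Return {case: {column: percent change}} rounded to one decimal."""
    summary = {}
    for case, rows in results.items():
        base, zc = rows["2.4.1"], rows["2.4.1-zc"]
        summary[case] = {
            col: round(relative_change(b, z), 1)
            for col, b, z in zip(COLUMNS, base, zc)
        }
    return summary


if __name__ == "__main__":
    for case, cols in summarise(RESULTS).items():
        print(case, cols)
```

With no explicit bonding this works out to about +212% for sendfile(), about -1.4% for send(8k) and about -2.3% for NFS. The other two cases show the same pattern: sendfile() throughput roughly triples while the other columns move by a few percent.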

The relative increase in sendfile() efficiency is much higher
than with a real NIC, presumably because we've factored out
the constant (and large) cost of the device driver.

All the bits and pieces to reproduce this are at

http://www.uow.edu.au/~andrewm/linux/#zc



2001-02-02 17:54:42

by David Lang

Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

I have been watching this thread with interest for a while now, but am
wondering about the real-world use of this, given the performance penalty
for write().

As I see it there are two basic cases you are saying this will help in.

1. webservers

2. other fileservers

I also freely admit that I don't know a lot about sendfile() so it may
have some capability that makes my concerns meaningless; if so, please let
me know.

1a. for webservers that serve static content (and can therefore use
sendfile) I don't see this as significant because, as your tests have been
showing, even a modest machine can saturate your network (unless you are
using gigE, at which point it takes a slightly larger machine)

1b. for webservers that are not primarily serving static content, they
have to use write() for the output from cgi's, etc and therefore pay the
performance penalty without being able to use sendfile() much to get the
advantages. These machines are the ones that really need the performance
as the cgi's take a significant amount of your cpu.

2. for other fileservers sendfile() sounds like it would be useful if the
client is reading the entire file, but what about the cases where the
client is reading part of the file, or is writing to the file. In both of
these cases it seems that the fileserver is back to the write() penalty.
does anyone have stats on the types of requests that fileservers are being
asked for?

David Lang




2001-02-02 22:48:01

by David Miller

Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)


David Lang writes:
> 1a. for webservers that serve static content (and can therefore use
> sendfile) I don't see this as significant because, as your tests have been
> showing, even a modest machine can saturate your network (unless you are
> using gigE, at which point it takes a slightly larger machine)

Start using more than one interface, then it begins to become
interesting.

> 1b. for webservers that are not primarily serving static content, they
> have to use write() for the output from cgi's, etc and therefore pay the
> performance penalty without being able to use sendfile() much to get the
> advantages. These machines are the ones that really need the performance
> as the cgi's take a significant amount of your cpu.

CGI's can be cached btw if the implementation is clever (f.e. CGI
tells the web server that if the file used as input to the CGI does
not change then the output from the CGI will not change, meaning CGI
output is based solely on input, therefore CGI output can be cached
by the web server).
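
A rough sketch of what such a cache could look like on the server side, assuming the CGI has declared that its output depends only on one input file. The class name and the choice of (mtime, size) as the change detector are illustrative assumptions, not a description of any existing web server.

```python
import os


class InputKeyedCache:
    """Cache generated output for as long as the input file is unchanged."""

    def __init__(self):
        self._entries = {}   # path -> (stamp, output)
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _stamp(path):
        # Assumption: a change in mtime or size means the input changed.
        st = os.stat(path)
        return (st.st_mtime_ns, st.st_size)

    def get(self, path, generate):
        """Return cached output for `path`, regenerating it if the file changed."""
        stamp = self._stamp(path)
        entry = self._entries.get(path)
        if entry is not None and entry[0] == stamp:
            self.hits += 1
            return entry[1]
        self.misses += 1
        output = generate(path)          # run the (expensive) generator
        self._entries[path] = (stamp, output)
        return output
```

Here `generate` stands in for running the CGI. If the cached output is kept in a file rather than in memory, the server can push it out with sendfile() like any other static content.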

> 2. for other fileservers sendfile() sounds like it would be useful if the
> client is reading the entire file, but what about the cases where the
> client is reading part of the file, or is writing to the file. In both of
> these cases it seems that the fileserver is back to the write() penalty.
> does anyone have stats on the types of requests that fileservers are being
> asked for?

It helps no matter what part of the file the client reads.

sendfile() can be used on an arbitrary offset+len portion of
a file; it is not limited to just sending an entire file.
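
As an illustration of the offset+len form, here is a small Python sketch using `os.sendfile`, which wraps the Linux system call. The helper name `send_range` is made up for the example.

```python
import os
import socket


def send_range(sock, fd, offset, length):
    """Send `length` bytes of file `fd`, starting at `offset`, via sendfile().

    Returns the number of bytes actually sent, which is smaller than
    `length` if the range runs past the end of the file.
    """
    start = offset
    end = offset + length
    while offset < end:
        sent = os.sendfile(sock.fileno(), fd, offset, end - offset)
        if sent == 0:        # reached end of file
            break
        offset += sent
    return offset - start
```

A fileserver answering "read 50 bytes at offset 100" would call `send_range(sock, fd, 100, 50)`; nothing requires the transfer to start at the beginning of the file or run to its end.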

Later,
David S. Miller

2001-02-02 23:01:22

by David Lang

Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

Thanks, that info on sendfile makes sense for the fileserver situation.
for webservers we will have to see (many/most CGI's look at stuff from the
client so I still have doubts as to how much use caching will be)

David Lang


2001-02-02 23:11:04

by David Miller

Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)


David Lang writes:
> Thanks, that info on sendfile makes sense for the fileserver situation.
> for webservers we will have to see (many/most CGI's look at stuff from the
> client so I still have doubts as to how much use caching will be)

Also note that the decreased CPU utilization resulting from
zerocopy sendfile leaves more CPU available for CGI execution.

This was a point I forgot to make.

Later,
David S. Miller

2001-02-02 23:16:34

by David Lang

Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

right, assuming that there is enough sendfile() benefit to overcome the
write() penalty from the stuff that can't be cached or sent from a file.

my question was basically: are there enough places where sendfile would
actually be used to make it a net gain?

David Lang


2001-02-02 23:28:57

by Jeff Barrow

Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)


Let's see.... all the work being done for clustering would definitely
benefit... all the static images on your webserver--and static images
make up most of the bandwidth from web servers (images, activeX controls,
java apps, sound clips...)... NFS servers, Samba servers (both of which
are used more than you may think)... email servers...

Once Real Networks patches their Realserver to use sendfile (which
shouldn't be all that hard), then that would help too....

I think that sendfile can be used in a LOT of applications, and the only
ones that wouldn't benefit are mostly low-bandwidth anyway (CGI apps
almost always return either a small html file or a small image file, then
there's telnet and other interactive utilities...).

Most applications that use a lot of bandwidth (and thus a lot of CPU time
sending the packets) are capable of being patched to use sendfile.



2001-02-02 23:33:27

by David Miller

Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)


David Lang writes:
> right, assuming that there is enough sendfile() benefit to overcome the
> write() penalty from the stuff that can't be cached or sent from a file.
>
> my question was basically: are there enough places where sendfile would
> actually be used to make it a net gain?

There are non-performance issues as well (really, all of these points
have been mentioned in this thread btw). One is that since paged
SKBs use only single-order page allocations, the memory allocation
subsystem is stressed less than the current scheme where SLAB
allocates multi-order pages to satisfy allocations of linear SKB data
buffers.
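
The allocation-order arithmetic behind this point can be sketched as follows. This is only an illustration, not kernel code: it assumes 4 KiB pages and reads "single-order" as single-page (order-0) allocations, and both function names are hypothetical.

```python
PAGE_SIZE = 4096   # assumption: 4 KiB pages, as on i386


def order_for(nbytes, page_size=PAGE_SIZE):
    """Smallest order such that 2**order contiguous pages hold nbytes.

    A linear (physically contiguous) buffer of nbytes needs one
    allocation of this order.
    """
    pages = -(-nbytes // page_size)      # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


def single_pages_for(nbytes, page_size=PAGE_SIZE):
    """Number of independent order-0 pages needed to hold nbytes."""
    return -(-nbytes // page_size)
```

For a 9000-byte buffer, `order_for(9000)` is 2, that is, four contiguous pages in one allocation, whereas `single_pages_for(9000)` is 3 separate single pages. Contiguous multi-page blocks get harder to find as memory fragments, which is why avoiding them takes pressure off the allocator.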

This has consequences and benefits system wide.

Later,
David S. Miller

2001-02-03 02:28:17

by James A Sutherland

Subject: Re: sendfile+zerocopy: fairly sexy (nothing to do with ECN)

On Fri, 2 Feb 2001, David Lang wrote:

> Thanks, that info on sendfile makes sense for the fileserver situation.
> for webservers we will have to see (many/most CGI's look at stuff from the
> client so I still have doubts as to how much use caching will be)

CGI performance isn't directly affected by this - the whole point is to
reduce the "cost" of handling static requests to zero (at least, as close
as possible) leaving as much CPU as possible for the CGI to use.

So sendfile won't help your CGI directly - it will just give your CGI more
resources to work with.


James.