2009-03-12 22:00:56

by Jim Callahan

[permalink] [raw]
Subject: Best A->B large file copy performance

I'm trying to determine the optimal way to have a single NFS client
copy large numbers (100-1000) of fairly large (1-50M) files from one
location on a file server to another location on the same file server.
There seem to be several API layers which influence this:

1. Number of OS level processes performing the copy in parallel.
2. Record size used by the C-library read()/write() calls from these
processes.
3. NFS client rsize/wsize settings.
4. Ethernet MTU size.
5. Bandwidth of the ethernet network and switches.

So far we've played around with larger MTU and rsize/wsize settings
without seeing a huge difference. Since we have been using "cp" to
perform (1), we've not tweaked the record size at all at this point.
My suspicion is that we should be carefully coordinating the sizes
specified for layers 2, 3 and 4. Perhaps we should be using "dd"
instead of "cp" so we can control the record size being used. Since
the number of permutations of these three settings is large, I was
hoping that I might get some advice from this list about a range of
values we should be investigating and any unpleasant interactions
between these levels of settings we should be aware of to narrow our
search. Also, if there are other major factors outside those listed I'd
appreciate being pointed in the right direction.
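
In case it clarifies what I mean by "record size": below is a rough
sketch of the kind of copy loop I have in mind, where the buffer size
is the knob that "dd bs=" would give us over "cp". The 1 MiB value is
purely an illustrative placeholder, not a tested recommendation.

    /*
     * Minimal sketch (not our production code): copy src to dst using a
     * configurable record size. The 1 MiB buffer here is just a starting
     * point for experimentation, e.g. values from wsize up to a few MiB.
     */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define RECORD_SIZE (1024 * 1024)   /* the tunable "record size" */

    static int copy_file(const char *src, const char *dst)
    {
        char *buf = malloc(RECORD_SIZE);
        int in = open(src, O_RDONLY);
        int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (!buf || in < 0 || out < 0) { perror("setup"); return -1; }

        ssize_t n;
        while ((n = read(in, buf, RECORD_SIZE)) > 0) {
            ssize_t off = 0;
            while (off < n) {                       /* handle short writes */
                ssize_t w = write(out, buf + off, n - off);
                if (w < 0) { perror("write"); return -1; }
                off += w;
            }
        }
        if (n < 0) perror("read");

        close(in);
        close(out);
        free(buf);
        return n < 0 ? -1 : 0;
    }

    int main(int argc, char **argv)
    {
        if (argc != 3) { fprintf(stderr, "usage: %s src dst\n", argv[0]); return 1; }
        return copy_file(argv[1], argv[2]) ? 1 : 0;
    }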

---

While I'm on the subject, has there been any discussion about adding an
NFS request that would allow copying files from one location to another
on the same NFS server without requiring a round trip through the
client? It's not at all uncommon to need to move data around in this
manner, and it seems a huge waste of bandwidth to have to send all this
data from the server to the client just to have the client send the
data back unaltered to a different location. Such a COPY request would
be high level, along the lines of RENAME, and each server vendor could
optimize it for their particular hardware architecture. For our
particular application, having such a request would make a huge
difference in performance.

--
Jim Callahan - President - Temerity Software <http://www.temerity.us>


2009-03-13 19:16:59

by Trond Myklebust

[permalink] [raw]
Subject: Re: Best A->B large file copy performance

On Thu, 2009-03-12 at 17:00 -0400, Jim Callahan wrote:
> I'm trying to determine the optimal way to have a single NFS client
> copy large numbers (100-1000) of fairly large (1-50M) files from one
> location on a file server to another location on the same file server.
> There seem to be several API layers which influence this:
>
> 1. Number of OS level processes performing the copy in parallel.
> 2. Record size used by the C-library read()/write() calls from these
> processes.
> 3. NFS client rsize/wsize settings.
> 4. Ethernet MTU size.
> 5. Bandwidth of the ethernet network and switches.
>
> So far we've played around with larger MTU and rsize/wsize settings
> without seeing a huge difference. Since we have been using "cp" to
> perform (1), we've not tweaked the record size at all at this point.
> My suspicion is that we should be carefully coordinating the sizes
> specified for layers 2, 3 and 4. Perhaps we should be using "dd"
> instead of "cp" so we can control the record size being used. Since
> the number of permutations of these three settings is large, I was
> hoping that I might get some advice from this list about a range of
> values we should be investigating and any unpleasant interactions
> between these levels of settings we should be aware of to narrow our
> search. Also, if there are other major factors outside those listed I'd
> appreciate being pointed in the right direction.

MTU and rsize/wsize settings shouldn't matter much unless you're using
a UDP connection. I'd recommend just using the default r/wsize
negotiated by the client and server, and then whatever MTU is most
convenient for the other applications you may have.

Bandwidth and switch quality do matter (a lot). Particularly so if you
have many clients...

If you're just copying and not interested in using the file or its
contents afterwards, then you might consider using direct i/o instead of
ordinary cached i/o.
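
Roughly what I mean is the sketch below (untested illustration only;
O_DIRECT on NFS has its own alignment and partial-block quirks, so
treat it as a shape of the idea rather than a recipe):

    /*
     * Untested illustration of direct i/o: bypass the client page cache
     * when the copied data won't be re-read on this client. Alignment and
     * the handling of a short final block are deliberately simplified.
     */
    #define _GNU_SOURCE          /* for O_DIRECT */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define BLOCK (1024 * 1024)  /* keep this a multiple of wsize/page size */

    int main(int argc, char **argv)
    {
        if (argc != 3) { fprintf(stderr, "usage: %s src dst\n", argv[0]); return 1; }

        void *buf;
        if (posix_memalign(&buf, 4096, BLOCK)) { perror("posix_memalign"); return 1; }

        int in  = open(argv[1], O_RDONLY | O_DIRECT);
        int out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
        if (in < 0 || out < 0) { perror("open"); return 1; }

        ssize_t n;
        while ((n = read(in, buf, BLOCK)) > 0) {
            /* NB: a short final block may need special casing on some setups */
            if (write(out, buf, n) != n) { perror("write"); return 1; }
        }
        if (n < 0) { perror("read"); return 1; }

        close(in);
        close(out);
        free(buf);
        return 0;
    }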

> While I'm on the subject, has there been any discussion about adding an
> NFS request that would allow copying files from one location to another
> on the same NFS server without requiring a round trip to a client? It's
> not at all uncommon to need to move data around in this manner and it
> seems a huge waste of bandwidth to have to send all this data from the
> server to the client just to have the client send the data back
> unaltered to a different location. Such a COPY request would be high
> level along the lines of RENAME and each server vendor could optimize
> this for their particular hardware architecture. For our particular
> application, having such a request would make a huge difference in
> performance.

I don't think anyone has talked about a server-to-server protocol, but I
believe there will be a proposal for file copy at the coming IETF
meeting. If you want server-to-server, then now is the time to speak up
and make the case. You'd probably want to start a thread on
[email protected]...

Cheers
Trond


2009-03-13 22:40:00

by Jim Callahan

[permalink] [raw]
Subject: Re: Best A->B large file copy performance

Trond Myklebust wrote:
> On Thu, 2009-03-12 at 17:00 -0400, Jim Callahan wrote:
>> While I'm on the subject, has there been any discussion about adding an
>> NFS request that would allow copying files from one location to another
>> on the same NFS server without requiring a round trip to a client? It's
>> not at all uncommon to need to move data around in this manner and it
>> seems a huge waste of bandwidth to have to send all this data from the
>> server to the client just to have the client send the data back
>> unaltered to a different location. Such a COPY request would be high
>> level along the lines of RENAME and each server vendor could optimize
>> this for their particular hardware architecture. For our particular
>> application, having such a request would make a huge difference in
>> performance.
>>
>
> I don't think anyone has talked about a server-to-server protocol, but I
> believe there will be a proposal for file copy at the coming IETF
> meeting. If you want server-to-server, then now is the time to speak up
> and make the case. You'd probably want to start a thread on
> [email protected]...
>
Thanks for the responses, Trond. I wasn't actually suggesting a
server-to-server protocol, but rather an additional client-server
protocol request to tell the server to copy files internally. The idea
is that the typical usage of "cp" via NFS wastes bandwidth transmitting
the contents of the source file from the server to the client only to
have the client send it back unaltered. If this were instead performed
internally on the server itself, it seems to me that it might be
dramatically faster and would not waste valuable network bandwidth.
The calling convention would be identical to the current RENAME
request. The implementation would of course be different, in that this
new COPY request would create a new i-node for the target and then copy
all data from the source to the target file. A vendor could choose the
most efficient manner of performing this based on their
hardware/software architecture.
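
To make the shape of the request concrete, here is a rough sketch
modeled on the NFSv3 RENAME arguments. These names and types are purely
hypothetical; nothing like this exists in the protocol today.

    /* Hypothetical sketch only: a COPY request shaped like NFSv3 RENAME.
     * None of these names or types exist in the real protocol; they just
     * illustrate the calling convention I have in mind. */

    typedef struct {
        unsigned int  len;
        unsigned char data[64];      /* opaque filehandle, as in NFSv3 */
    } nfs_fh;

    struct copy_args {
        nfs_fh      from_dir;        /* directory holding the source file */
        const char *from_name;       /* source file name                  */
        nfs_fh      to_dir;          /* directory to create the target in */
        const char *to_name;         /* target file name                  */
    };

    /*
     * Server-side outline: create a new i-node for (to_dir, to_name), then
     * copy all data from the source file to it internally, using whatever
     * mechanism is cheapest for that vendor's architecture. The client
     * never sees the file data at all.
     */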

Thanks for the pointer to [email protected]. I'll bring this up there as
well...

In case you are wondering, we make an application which includes version
control features somewhat along the lines of CVS or SVN. In other
words, there is a central repository for checked-in versions and
independent scratch areas where users can have their own copies of
files. So both check-in and check-out operations frequently involve
performing a "cp" from file A to file B, both located on the same NFS
server.

--
Jim Callahan - President - Temerity Software <http://www.temerity.us>

2009-03-13 02:39:20

by Greg Banks

[permalink] [raw]
Subject: Re: Best A->B large file copy performance

Jim Callahan wrote:
> I'm trying to determine the optimal way to have a single NFS
> client copy large numbers (100-1000) of fairly large (1-50M) files [...]
I'd like to propose a new rule of thumb: to be considered "fairly
large", a file should be larger than the capacity of a USB key which
could be comfortably swallowed.

> [...] Since the number of permutations of these three settings is
> large, I was hoping that I might get some advice from this list about a
> range of values we should be investigating and any unpleasant
> interactions between these levels of settings we should be aware of to
> narrow our search. Also, if there are other major factors outside
> those listed I'd appreciate being pointed in the right direction.
Try

http://mirror.linux.org.au/pub/linux.conf.au/2008/slides/130-lca2008-nfs-tuning-secrets-d7.odp

--
Greg Banks, P.Engineer, SGI Australian Software Group.
the brightly coloured sporks of revolution.
I don't speak for SGI.