2005-10-08 04:39:57

by Xin Zhao

Subject: why is NFS performance poor when decompressing the Linux kernel

Hi,

I set up two virtual machines. One works as the NFS server and the
other as the client. They talk to each other via in-host network
communication.

I noticed that when doing a large file copy or a Linux kernel
compilation in an NFS directory, the performance is not bad compared
to a local disk filesystem such as ext2. However, if I decompress a
Linux kernel tarball in an NFS directory, the performance is much
worse than on a local disk filesystem (over 3 times slower). Does
anybody know the reason?

My guess is that NFS has to do lookup and getattr over the network,
while a local disk filesystem can do that in local memory. Is this
the major reason, or are there other reasons?

Thanks for the help!

-x


2005-10-08 05:59:52

by Lee Revell

Subject: Re: why is NFS performance poor when decompressing the Linux kernel

On Sat, 2005-10-08 at 00:39 -0400, Xin Zhao wrote:
> I noticed that when doing a large file copy or a Linux kernel
> compilation in an NFS directory, the performance is not bad compared
> to a local disk filesystem such as ext2. However, if I decompress a
> Linux kernel tarball in an NFS directory, the performance is much
> worse than on a local disk filesystem (over 3 times slower). Does
> anybody know the reason?

Because NFS requires all writes to be synchronous by default, and
uncompressing the kernel is the most write-intensive of those three
operations. Mount with the async option and the performance should be
closer to that of a local disk. Obviously this is more dangerous,
since acknowledged writes can be lost if the server crashes.
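
For example (hostname, export path and mount point below are made up):

$ mount -t nfs -o rw,async server:/export /mnt/nfs

The server-side equivalent is the async option in /etc/exports
(activated with exportfs -ra), which lets the server acknowledge
writes before they reach the disk.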

Lee

2005-10-08 07:19:43

by Willy Tarreau

Subject: Re: why is NFS performance poor when decompressing the Linux kernel

Hi,

On Sat, Oct 08, 2005 at 01:59:48AM -0400, Lee Revell wrote:
> On Sat, 2005-10-08 at 00:39 -0400, Xin Zhao wrote:
> > I noticed that when doing a large file copy or a Linux kernel
> > compilation in an NFS directory, the performance is not bad compared
> > to a local disk filesystem such as ext2. However, if I decompress a
> > Linux kernel tarball in an NFS directory, the performance is much
> > worse than on a local disk filesystem (over 3 times slower). Does
> > anybody know the reason?
>
> Because NFS requires all writes to be synchronous by default, and
> uncompressing the kernel is the most write-intensive of those three
> operations. Mount with the async option and the performance should be
> closer to that of a local disk. Obviously this is more dangerous,
> since acknowledged writes can be lost if the server crashes.

I don't agree with you, Lee. My NFS is mounted with async by default,
and what takes the most time when extracting a kernel archive is that
tar does a stat() on every file before writing it. And THAT stat()
prevents writes from being buffered. A better solution might be to
process several files in parallel (multi-process/multi-thread).
Perhaps a project for a new tar?

Just for a test, I tried extracting multiple files in parallel. The
method is completely crappy, but I could saturate my NFS server this
way:

$ tar ztf /tmp/linux-2.6.9.tar.gz >/tmp/file-list
$ sed -n '1~4p' < /tmp/file-list >/tmp/file-list1
$ sed -n '2~4p' < /tmp/file-list >/tmp/file-list2
$ sed -n '3~4p' < /tmp/file-list >/tmp/file-list3
$ sed -n '4~4p' < /tmp/file-list >/tmp/file-list4

$ tar zxf /tmp/linux-2.6.9.tar.gz -T /tmp/file-list1 & \
  tar zxf /tmp/linux-2.6.9.tar.gz -T /tmp/file-list2 & \
  tar zxf /tmp/linux-2.6.9.tar.gz -T /tmp/file-list3 & \
  tar zxf /tmp/linux-2.6.9.tar.gz -T /tmp/file-list4 & \
  wait

OK, it finally took more time, even though the server was saturated
(maybe it was crawling under seeks at the end, I did not check). This
may constitute a starting point for people with more time to research
this area.
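
A quicker way to script the same test would be GNU xargs, which can
do the splitting itself. Untested sketch, assuming no spaces in the
member names:

$ tar ztf /tmp/linux-2.6.9.tar.gz | \
  xargs -n 64 -P 4 tar zxf /tmp/linux-2.6.9.tar.gz

Each of the (up to 4) parallel workers re-reads the whole archive and
extracts only the 64 members it was handed, like the file-list
version above.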

> Lee

Cheers,
Willy

2005-10-08 14:35:13

by Xin Zhao

Subject: Re: why is NFS performance poor when decompressing the Linux kernel

I think the stat might be one reason, because when I run 'nfsstat', I
see that "getattr" and "setattr" are executed about 40000 times,
while the other operations are each executed fewer than 10000 times.
That gives me the feeling that some optimization could be done to
reduce the getattr and setattr requests.
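
If it is mostly getattr traffic, lengthening the client's attribute
cache could help; a hedged example, with the timeout picked
arbitrarily:

$ mount -t nfs -o rw,actimeo=60 server:/export /mnt/nfs

actimeo=60 keeps cached attributes for 60 seconds instead of the
default few seconds, so fewer getattr calls go over the wire. It
would not help with setattr, which tar issues anyway to restore file
times and modes.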

The async and sync options affect write performance more
significantly on large files, but decompressing the kernel involves a
lot of small files. Because NFS forces data to be synced to disk
before a file is closed, async and sync do not behave very
differently here.

Does anyone have experience with NFSv4? I don't know whether it
improves things in this area.

Xin

On 10/8/05, Willy Tarreau <[email protected]> wrote:
> Hi,
>
> On Sat, Oct 08, 2005 at 01:59:48AM -0400, Lee Revell wrote:
> > On Sat, 2005-10-08 at 00:39 -0400, Xin Zhao wrote:
> > > I noticed that when doing a large file copy or a Linux kernel
> > > compilation in an NFS directory, the performance is not bad compared
> > > to a local disk filesystem such as ext2. However, if I decompress a
> > > Linux kernel tarball in an NFS directory, the performance is much
> > > worse than on a local disk filesystem (over 3 times slower). Does
> > > anybody know the reason?
> >
> > Because NFS requires all writes to be synchronous by default, and
> > uncompressing the kernel is the most write-intensive of those three
> > operations. Mount with the async option and the performance should be
> > closer to that of a local disk. Obviously this is more dangerous,
> > since acknowledged writes can be lost if the server crashes.
>
> I don't agree with you, Lee. My NFS is mounted with async by default,
> and what takes the most time when extracting a kernel archive is that
> tar does a stat() on every file before writing it. And THAT stat()
> prevents writes from being buffered. A better solution might be to
> process several files in parallel (multi-process/multi-thread).
> Perhaps a project for a new tar?
>
> Just for a test, I tried extracting multiple files in parallel. The
> method is completely crappy, but I could saturate my NFS server this
> way:
>
> $ tar ztf /tmp/linux-2.6.9.tar.gz >/tmp/file-list
> $ sed -n '1~4p' < /tmp/file-list >/tmp/file-list1
> $ sed -n '2~4p' < /tmp/file-list >/tmp/file-list2
> $ sed -n '3~4p' < /tmp/file-list >/tmp/file-list3
> $ sed -n '4~4p' < /tmp/file-list >/tmp/file-list4
>
> $ tar zxf /tmp/linux-2.6.9.tar.gz -T /tmp/file-list1 & \
>   tar zxf /tmp/linux-2.6.9.tar.gz -T /tmp/file-list2 & \
>   tar zxf /tmp/linux-2.6.9.tar.gz -T /tmp/file-list3 & \
>   tar zxf /tmp/linux-2.6.9.tar.gz -T /tmp/file-list4 & \
>   wait
>
> OK, it finally took more time, even though the server was saturated
> (maybe it was crawling under seeks at the end, I did not check). This
> may constitute a starting point for people with more time to research
> this area.
>
> > Lee
>
> Cheers,
> Willy
>
>

2005-10-08 15:03:27

by Xin Zhao

Subject: Re: why is NFS performance poor when decompressing the Linux kernel

BTW: where did you see that stat is called before each write? Can you
point out the code or function that does this? I might want to look
into the source code to see whether we can improve it.

Thanks,

Xin

On 10/8/05, Xin Zhao <[email protected]> wrote:
> I think the stat might be one reason, because when I run 'nfsstat', I
> see that "getattr" and "setattr" are executed about 40000 times,
> while the other operations are each executed fewer than 10000 times.
> That gives me the feeling that some optimization could be done to
> reduce the getattr and setattr requests.
>
> The async and sync options affect write performance more
> significantly on large files, but decompressing the kernel involves a
> lot of small files. Because NFS forces data to be synced to disk
> before a file is closed, async and sync do not behave very
> differently here.
>
> Does anyone have experience with NFSv4? I don't know whether it
> improves things in this area.
>
> Xin
>
> On 10/8/05, Willy Tarreau <[email protected]> wrote:
> > Hi,
> >
> > On Sat, Oct 08, 2005 at 01:59:48AM -0400, Lee Revell wrote:
> > > On Sat, 2005-10-08 at 00:39 -0400, Xin Zhao wrote:
> > > > I noticed that when doing a large file copy or a Linux kernel
> > > > compilation in an NFS directory, the performance is not bad compared
> > > > to a local disk filesystem such as ext2. However, if I decompress a
> > > > Linux kernel tarball in an NFS directory, the performance is much
> > > > worse than on a local disk filesystem (over 3 times slower). Does
> > > > anybody know the reason?
> > >
> > > Because NFS requires all writes to be synchronous by default, and
> > > uncompressing the kernel is the most write-intensive of those three
> > > operations. Mount with the async option and the performance should be
> > > closer to that of a local disk. Obviously this is more dangerous,
> > > since acknowledged writes can be lost if the server crashes.
> >
> > I don't agree with you, Lee. My NFS is mounted with async by default,
> > and what takes the most time when extracting a kernel archive is that
> > tar does a stat() on every file before writing it. And THAT stat()
> > prevents writes from being buffered. A better solution might be to
> > process several files in parallel (multi-process/multi-thread).
> > Perhaps a project for a new tar?
> >
> > Just for a test, I tried extracting multiple files in parallel. The
> > method is completely crappy, but I could saturate my NFS server this
> > way:
> >
> > $ tar ztf /tmp/linux-2.6.9.tar.gz >/tmp/file-list
> > $ sed -n '1~4p' < /tmp/file-list >/tmp/file-list1
> > $ sed -n '2~4p' < /tmp/file-list >/tmp/file-list2
> > $ sed -n '3~4p' < /tmp/file-list >/tmp/file-list3
> > $ sed -n '4~4p' < /tmp/file-list >/tmp/file-list4
> >
> > $ tar zxf /tmp/linux-2.6.9.tar.gz -T /tmp/file-list1 & \
> >   tar zxf /tmp/linux-2.6.9.tar.gz -T /tmp/file-list2 & \
> >   tar zxf /tmp/linux-2.6.9.tar.gz -T /tmp/file-list3 & \
> >   tar zxf /tmp/linux-2.6.9.tar.gz -T /tmp/file-list4 & \
> >   wait
> >
> > OK, it finally took more time, even though the server was saturated
> > (maybe it was crawling under seeks at the end, I did not check). This
> > may constitute a starting point for people with more time to research
> > this area.
> >
> > > Lee
> >
> > Cheers,
> > Willy
> >
> >
>

2005-10-08 21:23:53

by Willy Tarreau

Subject: Re: why is NFS performance poor when decompressing the Linux kernel

On Sat, Oct 08, 2005 at 11:03:26AM -0400, Xin Zhao wrote:
> BTW: where did you see that stat is called before each write?

strace
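
Something like this makes it visible (trace=file selects the syscalls
that take a pathname, so the stat family shows up; -f follows tar's
gzip child):

$ strace -f -e trace=file tar zxf /tmp/linux-2.6.9.tar.gz 2>&1 | grep stat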

> can you point out the code or function that does this? I might want
> to look into the source code to see whether we can improve it.

What would be cool (if at all possible) would be a command-line option
to skip the stat() and rely on the open() return code instead.

Regards,
Willy