2005-01-14 19:00:12

by Jeff Blaine

Subject: 2.4.21 NFSv3 performance graph

Can anyone tell me what is happening in the graph at the URL
below? I can replicate it on any Linux box running 2.4.21
and changing rsize/wsize doesn't affect it at all. This
was captured from a 2.6GHz P4 client machine. Same thing
is visible with a Dual 3GHz Xeon box with 4GB memory as a
client. A Solaris 9 client does not display this "falling
on face" behavior.

http://www.kickflop.net/temp/sol9-server-linux2421-client-gigE-ag930m_10533_image001.gif

--
If it ain't broke, don't fix it.
If it ain't needed, don't adopt it.





2005-01-29 15:50:32

by jehan.procaccia

Subject: Re: 2.4.21 NFSv3 performance graph

OK, so now I run with your recommended options and I get output performance as
high as my network speed!! I am very surprised! I don't think I am
measuring NFS performance here but raw network speed :-( .
Indeed, for every file size/record length pair I get write results (see
sample below) around 11000 KB/s, so if I am right that is ~11 MB/s, or
~88 Mbit/s, roughly my 100 Mbit Ethernet throughput (minus Ethernet/IP overhead)!

here's what I did:
$mount cobra3:/p2v5f3 /mnt/cobra3/ -o async,nfsvers=3
[root@arvouin /mnt/cobra3/iozone/arvouin]
$time iozone -a -c -e -i 0 -i 1 > arvouin-cobra3-i01-a-c-e.iozone

Command line used: iozone -a -c -e -i 0 -i 1
Output is in Kbytes/sec
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
              KB  reclen    write  rewrite     read   reread
            1024       4    10529    10603   409270   408936
            1024       8    10571    10666   472558   533076
            ....
          262144      64    11146    11156    11230    11225
          262144     128    11152    11172    11228    10948

Here only read/reread changes as the file size increases. Anyway, 400-500 MB/s
reads is well over my 12.5 MB/s theoretical Ethernet throughput, so I suspect
the cache is intervening here, no? Even though I did use the -e and -c options!

Any comments or advice? What kind of results do you get for NFS writes
with iozone? As high as I get? Which options am I missing?

Thanks.

Trond Myklebust wrote:

>On Friday 21.01.2005 at 18:09 (+0100), Jehan PROCACCIA wrote:
>
>
>>more generaly, what tool do you recommand to bench NFS ?
>>I tried bonnie, bonnie++ and iozone.
>>for the latest here's the kind of command I ran (so that it doesn't
>>takes hours to run the test!):
>>/opt/iozone/bin/iozone -p -s 10k -s 100k -s 1m -s 5m -s 10m -s 100m -i
>>0 -i 1 -r 4 -r 64 -r 256 -r 512 -r 1024 -r 4096 -r8192 -r 16384 -c -U
>>/mnt/cobra3 -f /mnt/cobra3/iozone.nagiostux > iozone-result
>>
>>My problem is that my NFS server has 4Go of ram, and bench programs
>>always recommand to use filesize for tests higher than RAM size and even
>>double size of the RAM so that it is not messuring cache activities !
>>
>>
>
>For tests of reading, this is undoubtedly true. For tests of writing
>over NFS, this may be false: see the discussions of the iozone "-c" and
>"-e" flags below.
>
>Note that bonnie and bonnie++ lack the equivalent of the "-e", "-c"
>flags, and so are indeed not good for testing wire speeds unless you use
>very large files.
>
>
>
>>Can you give me a sample of the iozone arguments you used ?
>>Any other tools ?
>>
>>
>
>It depends on what I want to test 8-)
>
>
>Something like "iozone -c -a" should be fine for a basic test of the
>generic read/write code functionality.
>Note the "-c" which *is* usually necessary under NFS since any cached
>writes are going to be flushed to disk by the "close()" (or when the
>process exits). This means that close() will normally end up dominating
>your write timings for files < memory size.
>
>If you want to test mmap(), something like "iozone -e -B -a". I believe
>that "-e" should normally ensure that any writes are flushed to disk
>using the fsync() command, and that this is timed.
>Note that if you don't care about knowing how long it takes for the
>writes to be flushed to disk then you can drop the "-e": unlike ordinary
>read/write, mmap() does not guarantee that writes are flushed to disk
>after the file is closed.
>
>For direct IO, "iozone -I -a" suffices. Since direct IO is uncached, all
>write operations are synchronous, so "-c" and "-e" are unnecessary.
>
>
>Cheers,
> Trond
>
>




2005-01-30 11:33:54

by jehan.procaccia

Subject: Re: 2.4.21 NFSv3 performance graph

Iozone wrote:

> Jehan,
>
> Your results are what I would expect, given your
> configuration.
>
> 1. The -e and -c will flush the writes from the
> client to the server, and from the server to
> its disks. However, if the file size is smaller than
> the amount of ram in the server, then a copy of the
> data still exists in the server's cache. Thus, client
> reads can be satisfied from the server's cache and
> wire speeds would be expected.
> If the file size is smaller than the amount of RAM in the
> client, then reads could be satisfied from the client's
> cache. Thus, the results that are higher than wire speed.
> Note: In Trond's runs, he uses the -U option. This option
> un-mounts and re-mounts the NFS filesystem on the
> client. This defeats the client's cache, even for files
> that would fit in the client's RAM.

My problem is that I cannot manage to use the -U option :-( . After a
few rapid mounts/umounts (there is a mount/umount between every test!)
here is what happens:

Arvouin NFS client tester:
[root@arvouin /mnt]
$grep cobra3 /etc/fstab
cobra3:/p2v5f3 /mnt/cobra3 nfs defaults 1 2
[root@arvouin /mnt]
$time iozone -a -c -e -i 0 -i 1 -U /mnt/cobra3 -f
/mnt/cobra3/iozone/arvouin/arvouin-async-cobra-sync >
arvouin:async-cobra3:sync-i01-a-c-e-U-F.iozone
umount: /mnt/cobra3: not mounted
mount: cobra3:/p2v5f3 failed, reason given by server: Permission denied
creat: No such file or directory

Cobra3 NFS server logs:
Jan 30 11:32:20 cobra3 rpc.mountd: authenticated mount request from
arvouin.int-evry.fr:844 for /p2v5f3 (/p2v5f3)
Jan 30 11:32:21 cobra3 rpc.mountd: authenticated unmount request from
arvouin.int-evry.fr:848 for /p2v5f3 (/p2v5f3)
Jan 30 11:32:21 cobra3 rpc.mountd: refused mount request from
arvouin.int-evry.fr for /p2v5f3 (/p2v5f3): illegal port 34107

I thought it might be the firewall (Fedora Core 2 iptables), so I stopped it on
both sides, but no success :-( .

Jan 30 11:34:39 cobra3 rpc.mountd: authenticated unmount request from
arvouin.int-evry.fr:957 for /p2v5f3 (/p2v5f3)
Jan 30 11:34:39 cobra3 rpc.mountd: refused mount request from
arvouin.int-evry.fr for /p2v5f3 (/p2v5f3): illegal port 34111

Any idea on this ?

>
> 2. If you are using mmap, you may control the sync
> behavior with the -D and -G options. The -D causes
> msync() to occur with it happening async. The -G
> causes msync() to occur with it happening sync.
>
I don't understand the "if you are using mmap" part. Does running iozone -a
use mmap? Actually, what I want to simulate is users' daily home-directory
usage: mostly logging into GNOME sessions (locks, named pipes, Unix
sockets ...) and then using tar, gcc, emacs, mozilla! Does that count as
"using mmap"? Sorry if I seem a bit of a newbie here ...

> 3. It is not too surprising that you see 11 Mbytes/sec over
> 100 Mbit. It's not very challenging for even a single
> disk, on the server, to satisfy this flow rate. It would
> be more interesting to use Gigabit networking, as this
> would put more load on the server's disk subsystem.
>
Indeed, my AX100 Fibre Channel storage array uses 12x250 GB SATA disks
at 7200 rpm each and is rated at around 150 MB/s, well over the 11 MB/s of
Ethernet, so the network should be the bottleneck! But in that case, why
does untarring an Apache distribution (~7 MB) take more than
2 minutes?

[root@calaz /mnt/cobra3sync/mci/test/Test-sync]
$ time tar xvfz /tmp/httpd-2.0.52.tar.gz
real 2m18.141s

If I compute it right that is about 50 KB/s (roughly 7 MB in 138 s), far from
11 MB/s, so a network shortage is not the cause here. My problem is that users
don't care about iozone's 11 MB/s figures; they complain about their daily
usage! But their complaints are only verbal, and I want to back them up with
benchmark values, hence the use of iozone.
Perhaps it is also a question of checking permissions/modes/attributes
(NSS lookups through the LDAP directory etc.), but iozone
doesn't measure that?

> 4. If you need to exceed the RAM in the server (to measure
> without cache effects) then you could do so by using
> the -U option, or you could use large files, or you could
> use the -t option, and have the aggregate file data set size
> be larger than the amount of RAM in the server.
>
A large file size (here I would need more than 4 GB because I have 4 GB of RAM
on the NFS server) makes the test very long :-( and I don't think it reflects
the daily usage of users.
I tried the -t option:

Server export in sync, client mount in async
[root@arvouin /mnt/cobra3/iozone/arvouin]
$time iozone -i 1 -i 0 -t 4 -r 64 -s 128M -F ./foo1 ./foo2 ./foo3 ./foo4
Throughput test with 4 processes
Each process writes a 131072 Kbyte file in 64 Kbyte records

Children see throughput for 4 initial writers = 10994.48 KB/sec
Parent sees throughput for 4 initial writers = 8561.40 KB/sec
Min throughput per process = 2085.77 KB/sec
Max throughput per process = 3647.83 KB/sec
Avg throughput per process = 2748.62 KB/sec
Min xfer = 82944.00 KB

Here, with this sample at 8561.40 KB/sec, I'm still at the 11 MB/s network
bottleneck.

> 5. Be wary of using -I (Direct I/O) The problem is not
> in Iozone, but in the fact that there are many versions
> of Linux, and other Unixes, that do not actually honor
> the O_DIRECT, but also do not return errors when
> it is used. For example: Some systems have:
>
> Example #1: #define O_DIRECT
> Example #2: #define O_DIRECT O_SYNC
> Example #3: #define O_DIRECT O_RSYNC|O_SYNC
>
> None of the above are actually equivalent to a real
> Direct I/O method.

OK, I'll be careful, although I don't know how to check what my system
honours (Red Hat Enterprise Server 3, kernel 2.4.21-27.ELsmp). Where can
these #defines be checked?
Anyway, I just blindly tested the -I option directly on the server:
[root@cobra3 /p2v5f3/iozone/cobra3]
$ /opt/iozone/bin/iozone -a -I -i 0 -i 1
KB reclen write rewrite read reread
4096 64 38611 38196 63046 63567

Not too bad, although 38 MB/s is not 150 MB/s; then again, 150 MB/s is the
vendor specification, maybe not reality!
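
For what it's worth, one way I could check what the headers actually define
(assuming the usual glibc/kernel header locations, which vary by distribution)
would be something like:

$ grep -n O_DIRECT /usr/include/bits/fcntl.h /usr/include/asm/fcntl.h

If O_DIRECT turned out to be defined as O_SYNC (or as nothing) there, the -I
numbers would really be measuring synchronous I/O rather than true direct I/O.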

>
> 6. Getting very tricky here:
> You might try using the -W option. This enables file locking.
> Not that you wanted file locking, but you might want its
> side effect. In many Unix systems, enabling file locking over
> NFS completely disables the NFS client caching, for
> reads, and writes :-) and does so for ALL file sizes.
>
I tried that, but no significant change, still around 10 MB/s.

Thanks a lot for all your help, I hope I will finally use the iozone
tool correctly :-) .

> Enjoy,
> Don Capps
>




2005-01-14 23:59:43

by Trond Myklebust

Subject: Re: 2.4.21 NFSv3 performance graph

On Friday 14.01.2005 at 14:00 (-0500), Jeff Blaine wrote:
> Can anyone tell me what is happening in the graph at the URL
> below? I can replicate it on any Linux box running 2.4.21
> and changing rsize/wsize doesn't affect it at all. This
> was captured from a 2.6GHz P4 client machine. Same thing
> is visible with a Dual 3GHz Xeon box with 4GB memory as a
> client. A Solaris 9 client does not display this "falling
> on face" behavior.
>
> http://www.kickflop.net/temp/sol9-server-linux2421-client-gigE-ag930m_10533_image001.gif

Firstly, you should always use the '-c' flag when measuring NFS client
performance. Otherwise, you are basically just measuring the speed with
which your machine can write to the local page cache (which may indeed
explain your "knee" at 1MB here - that would be where the client starts
to force a flush to disk).

Secondly, have you actually read the NFS FAQ and NFS HOWTO entries on
how to tune for performance? Particularly the entries on TCP vs. UDP,
and how Solaris clients default to the former.
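
For example, assuming your 2.4.21 client kernel was built with NFS-over-TCP
support (the server name, export and mount point below are only placeholders),
a TCP mount would look something like:

mount -o nfsvers=3,tcp,rsize=32768,wsize=32768 server:/export /mnt/test

whereas most Linux clients of that era default to UDP.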

Trond
--
Trond Myklebust <[email protected]>




2005-01-18 14:55:08

by Jeff Blaine

Subject: Re: 2.4.21 NFSv3 performance graph

Trond Myklebust wrote:
> On Friday 14.01.2005 at 14:00 (-0500), Jeff Blaine wrote:
>
>>Can anyone tell me what is happening in the graph at the URL
>>below? I can replicate it on any Linux box running 2.4.21
>>and changing rsize/wsize doesn't affect it at all. This
>>was captured from a 2.6GHz P4 client machine. Same thing
>>is visible with a Dual 3GHz Xeon box with 4GB memory as a
>>client. A Solaris 9 client does not display this "falling
>>on face" behavior.
>>
>>http://www.kickflop.net/temp/sol9-server-linux2421-client-gigE-ag930m_10533_image001.gif
>
> Firstly, you should always use the '-c' flag when measuring NFS client
> performance. Otherwise, you are basically just measuring the speed with
> which your machine can write to the local page cache (which may indeed
> explain your "knee" at 1MB here - that would be where the client starts
> to force a flush to disk).

Firstly,

Iozone manual:

"If you use a file size that is larger than the amount of
memory in the client then the =91c=92 flag is not needed."

I used a file size that is larger than the amount of memory in
the client.

> Secondly, have you actually read the NFS FAQ and NFS HOWTO entries on
> how to tune for performance? Particularly the entries on TCP vs. UDP,
> and how Solaris clients default to the former.

Secondly,

Yes, I have *actually* read the NFS FAQ and NFS HOWTO entries
on how to tune for performance.

It was a pretty simple question. I don't know what all the
attitude is about. Do you have an answer you can share with
me or not?

Anyone else?




2005-01-18 17:02:38

by Vincent Roqueta

Subject: Re: 2.4.21 NFSv3 performance graph

On Tuesday 18 January 2005 at 15:54, Jeff Blaine wrote:
> Trond Myklebust wrote:
> > On Friday 14.01.2005 at 14:00 (-0500), Jeff Blaine wrote:
> >>Can anyone tell me what is happening in the graph at the URL
> >>below? I can replicate it on any Linux box running 2.4.21
> >>and changing rsize/wsize doesn't affect it at all. This
> >>was captured from a 2.6GHz P4 client machine. Same thing
> >>is visible with a Dual 3GHz Xeon box with 4GB memory as a
> >>client. A Solaris 9 client does not display this "falling
> >>on face" behavior.
> >>
> >>http://www.kickflop.net/temp/sol9-server-linux2421-client-gigE-ag930m_10533_image001.gif


Those are read figures? Over a 1 Gb/s network? With a synchronous mount? (yes)
Are your disks capable of 540 MB/s throughput? (No.)
So you are working in the cache...

http://nfsv4.bullopensource.org/tools/tests/NFSv4_tests.html


Vincent



2005-01-21 17:09:43

by jehan.procaccia

Subject: Re: 2.4.21 NFSv3 performance graph

Trond Myklebust wrote:

>On Friday 14.01.2005 at 14:00 (-0500), Jeff Blaine wrote:
>
>
>>Can anyone tell me what is happening in the graph at the URL
>>below? I can replicate it on any Linux box running 2.4.21
>>and changing rsize/wsize doesn't affect it at all. This
>>was captured from a 2.6GHz P4 client machine. Same thing
>>is visible with a Dual 3GHz Xeon box with 4GB memory as a
>>client. A Solaris 9 client does not display this "falling
>>on face" behavior.
>>
>>http://www.kickflop.net/temp/sol9-server-linux2421-client-gigE-ag930m_10533_image001.gif
>>
>>
>
>Firstly, you should always use the '-c' flag when measuring NFS client
>performance. Otherwise, you are basically just measuring the speed with
>which your machine can write to the local page cache (which may indeed
>explain your "knee" at 1MB here - that would be where the client starts
>to force a flush to disk).
>
>Secondly, have you actually read the NFS FAQ and NFS HOWTO entries on
>how to tune for performance? Particularly the entries on TCP vs. UDP,
>and how Solaris clients default to the former.
>
> Trond
>
>
Hello,

More generally, what tool do you recommend for benchmarking NFS?
I tried bonnie, bonnie++ and iozone.
For the latter, here's the kind of command I ran (so that the test doesn't
take hours to run!):
/opt/iozone/bin/iozone -p -s 10k -s 100k -s 1m -s 5m -s 10m -s 100m -i
0 -i 1 -r 4 -r 64 -r 256 -r 512 -r 1024 -r 4096 -r8192 -r 16384 -c -U
/mnt/cobra3 -f /mnt/cobra3/iozone.nagiostux > iozone-result

My problem is that my NFS server has 4 GB of RAM, and benchmark programs
always recommend using a test file size larger than the RAM size, or even
twice the RAM size, so that they are not measuring cache activity!

Can you give me a sample of the iozone arguments you used ?
Any other tools ?

thanks.



2005-01-21 17:45:59

by Trond Myklebust

Subject: Re: 2.4.21 NFSv3 performance graph

On Friday 21.01.2005 at 18:09 (+0100), Jehan PROCACCIA wrote:

> more generaly, what tool do you recommand to bench NFS ?
> I tried bonnie, bonnie++ and iozone.
> for the latest here's the kind of command I ran (so that it doesn't
> takes hours to run the test!):
> /opt/iozone/bin/iozone -p -s 10k -s 100k -s 1m -s 5m -s 10m -s 100m -i
> 0 -i 1 -r 4 -r 64 -r 256 -r 512 -r 1024 -r 4096 -r8192 -r 16384 -c -U
> /mnt/cobra3 -f /mnt/cobra3/iozone.nagiostux > iozone-result
>
> My problem is that my NFS server has 4Go of ram, and bench programs
> always recommand to use filesize for tests higher than RAM size and even
> double size of the RAM so that it is not messuring cache activities !

For tests of reading, this is undoubtedly true. For tests of writing
over NFS, this may be false: see the discussions of the iozone "-c" and
"-e" flags below.

Note that bonnie and bonnie++ lack the equivalent of the "-e", "-c"
flags, and so are indeed not good for testing wire speeds unless you use
very large files.

> Can you give me a sample of the iozone arguments you used ?
> Any other tools ?

It depends on what I want to test 8-)


Something like "iozone -c -a" should be fine for a basic test of the
generic read/write code functionality.
Note the "-c" which *is* usually necessary under NFS since any cached
writes are going to be flushed to disk by the "close()" (or when the
process exits). This means that close() will normally end up dominating
your write timings for files < memory size.

If you want to test mmap(), something like "iozone -e -B -a". I believe
that "-e" should normally ensure that any writes are flushed to disk
using the fsync() command, and that this is timed.
Note that if you don't care about knowing how long it takes for the
writes to be flushed to disk then you can drop the "-e": unlike ordinary
read/write, mmap() does not guarantee that writes are flushed to disk
after the file is closed.

For direct IO, "iozone -I -a" suffices. Since direct IO is uncached, all
write operations are synchronous, so "-c" and "-e" are unnecessary.
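
As a concrete sketch (the file size, record length and target path here are
only placeholders to adapt to your setup), a basic NFS write/read run could
look like:

iozone -c -e -i 0 -i 1 -r 64 -s 512m -f /mnt/nfs/iozone.tmp

with "-s" bumped above the client's RAM size if you also want the read phase
to be uncached.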


Cheers,
Trond
--
Trond Myklebust <[email protected]>




2005-02-17 13:54:22

by jehan.procaccia

Subject: Re: 2.4.21 NFSv3 performance graph

Hello, I didn't receive any answer to my earlier post ... anyway, I
finally published my iozone/bonnie++/tar benchmarks here:
http://www.int-evry.fr/s2ia/user/procacci/Doc/NFS/nfs.html#htoc48

Unfortunately most of them are saturated by the 12.5 MB/s (100 Mbit) Ethernet
bottleneck, and for suppressing cache effects I couldn't find a way
to run iozone correctly with the -U option (which mounts/umounts the FS
between every test) because of this error on the NFS server:
rpc.mountd: refused mount request from arvouin.int-evry.fr for /p2v5f3
(/p2v5f3): illegal port 34111

I must admit that in the logs I occasionally get this "illegal port"
error, even for regular NFS clients (not only the iozone testers!).
Is this a bug? A misconfiguration?
This is on RH ES 3 (Taroon Update 4), 2.4.21-27.ELsmp, nfs-utils-1.0.6-33EL.

Thanks.





2005-02-17 17:27:51

by jehan.procaccia

Subject: Re: 2.4.21 NFSv3 performance graph

Iozone wrote:

> Jehan,
>
> Yes, the "Illegal port" is a bug in the NFS server, or client.
> It's not coming from Iozone.

OK, for this I'll try the "insecure" export option as advised by Bruce
Fields ...

>
> Note: For the Iozone automatic tests, In the 3D graphs
> the right hand edge of the graph is where the
> file size is larger than the cache and represents the
> physical I/O activity. This is where you are bottlenecking
> on the 100Mbit interconnect.
>
> I noticed on your web page that you boiled the
> 3D surface data down to a single value, and picked
> the highest value for this number. Ok... that's
> funky and pretty much meaningless.


Yes, I did that on purpose, to have a single value for comparison with the
bonnie++ tests; see the "beware" paragraph in section 13.1.4. But
indeed I must admit it is pretty much meaningless!

> If you must reduce the
> data down to a single value, you'll have to pick what you
> want to represent.
>
> (Client side cache performance, or NFS server performance)
>
> and pick the value from the appropriate region on the
> surface of the plot. Left side is the client cache, right side
> is NFS server.

The thing is that the slope of the surface (not sure of my English here
...) is rather climbing from left to right in the write tests, and dropping
from left to right in the read tests!
Example:
http://www.int-evry.fr/s2ia/user/procacci/Doc/NFS/Aa-C2s/write/write.png
So when the client cache overflows I get better write performance?
Surprising!?

http://www.int-evry.fr/s2ia/user/procacci/Doc/NFS/Aa-C2s/read/read.png
Here, in reads, when the cache overflows I get lower read performance,
with a clear break at the 128 MB file size! That one I find logical.

> The cross over is where the file size
> no longer fits in the client side cache. Also, you'll need
> to document which thing you are measuring.

Actually, I want to measure overall NFS performance for daily user usage,
from client request to server response. Isn't that what I did?

Thanks.

>
> Enjoy,
> Don Capps
>




2005-02-17 16:00:40

by J. Bruce Fields

Subject: Re: 2.4.21 NFSv3 performance graph

On Thu, Feb 17, 2005 at 02:54:08PM +0100, jehan.procaccia wrote:
> Hello, I didn't received any answer to the post below ... anyway , I
> finally published my iozone/bonnie++/tar bench here:
> http://www.int-evry.fr/s2ia/user/procacci/Doc/NFS/nfs.html#htoc48
>
> unfortunatly most of them are saturated by the 12.5MB (100Mb) ethernet
> bottle neck, and for the iozone cache suppression I couldn't find a way
> to run corretly with the -U option (which mount/umount FS at every
> tests) because of this error on the nfs server:
> rpc.mountd: refused mount request from arvouin.int-evry.fr for /p2v5f3
> (/p2v5f3): illegal port 34111
>
> I must admit that in the logs I occationnaly get this "illegal port"
> error , even for regular NFS clients (not only iozone testers !) .
> Is this a bug ? a mis configuration ?

It sounds like you just need to add the "insecure" export option to the
relevant export in /etc/exports. See "man exports".
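
For example, the relevant line in /etc/exports might become something like
(the rw/sync options are just an illustration; keep whatever you already have
and add "insecure"):

/p2v5f3    arvouin.int-evry.fr(rw,sync,insecure)

followed by running "exportfs -ra" on the server to re-export.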

--Bruce Fields



2005-02-17 15:44:46

by Iozone

Subject: Re: 2.4.21 NFSv3 performance graph

Jehan,

Yes, the "Illegal port" is a bug in the NFS server, or client.
It's not coming from Iozone.

Note: For the Iozone automatic tests, In the 3D graphs
the right hand edge of the graph is where the
file size is larger than the cache and represents the
physical I/O activity. This is where you are bottlenecking
on the 100Mbit interconnect.

I noticed on your web page that you boiled the
3D surface data down to a single value, and picked
the highest value for this number. Ok... that's
funky and pretty much meaningless. If you must reduce the
data down to a single value, you'll have to pick what you
want to represent.

(Client side cache performance, or NFS server performance)

and pick the value from the appropriate region on the
surface of the plot. Left side is the client cache, right side
is NFS server. The cross over is where the file size
no longer fits in the client side cache. Also, you'll need
to document which thing you are measuring.

Enjoy,
Don Capps

----- Original Message -----
From: "jehan.procaccia" <[email protected]>
To: "jehan.procaccia" <[email protected]>
Cc: "Iozone" <[email protected]>; "Trond Myklebust"
<[email protected]>; "Jeff Blaine" <[email protected]>;
<[email protected]>
Sent: Thursday, February 17, 2005 7:54 AM
Subject: Re: [NFS] 2.4.21 NFSv3 performance graph


> Hello, I didn't receive any answer to the post below ... anyway, I
> finally published my iozone/bonnie++/tar benchmarks here:
> http://www.int-evry.fr/s2ia/user/procacci/Doc/NFS/nfs.html#htoc48
>
> Unfortunately most of them are saturated by the 12.5 MB/s (100 Mbit) ethernet
> bottleneck, and for the iozone cache suppression I couldn't find a way to
> run correctly with the -U option (which mounts/umounts the FS between every
> test) because of this error on the nfs server:
> rpc.mountd: refused mount request from arvouin.int-evry.fr for /p2v5f3
> (/p2v5f3): illegal port 34111
>
> I must admit that in the logs I occasionally get this "illegal port"
> error, even for regular NFS clients (not only iozone testers!).
> Is this a bug? A misconfiguration?
> This is on RH ES 3 (Taroon Update 4), kernel 2.4.21-27.ELsmp,
> nfs-utils-1.0.6-33EL
>
> Thanks.
>
> jehan.procaccia wrote:
>
>> Iozone wrote:
>>
>>> Jehan,
>>>
>>> Your results are what I would expect, given your
>>> configuration.
>>>
>>> 1. The -e and -c will flush the writes from the
>>> client to the server, and from the server to
>>> its disks. However, if the file size is smaller than
>>> the amount of ram in the server, then a copy of the
>>> data still exists in the server's cache. Thus, client
>>> reads can be satisfied from the server's cache and
>>> wire speeds would be expected.
>>> If the file size is smaller than the amount of RAM in the
>>> client, then reads could be satisfied from the client's
>>> cache. Thus, the results that are higher than wire speed.
>>> Note: In Trond's runs, he uses the -U option. This option
>>> un-mounts and re-mounts the NFS filesystem on the
>>> client. This defeats the client's cache, even for files
>>> that would fit in the client's RAM.
>>
>>
>> My problem is that I cannot manage to use the -U option :-( . After a
>> few rapid mount/umounts (there are mount/umount operations between every
>> test!) here is what happens:
>>
>> Arvouin NFS client tester:
>> [root@arvouin /mnt]
>> $grep cobra3 /etc/fstab
>> cobra3:/p2v5f3 /mnt/cobra3 nfs defaults 1 2
>> [root@arvouin /mnt]
>> $time iozone -a -c -e -i 0 -i 1 -U /mnt/cobra3 -f
>> /mnt/cobra3/iozone/arvouin/arvouin-async-cobra-sync >
>> arvouin:async-cobra3:sync-i01-a-c-e-U-F.iozone
>> umount: /mnt/cobra3: not mounted
>> mount: cobra3:/p2v5f3 failed, reason given by server: Permission denied
>> creat: No such file or directory
>>
>> Cobra3 NFS server logs:
>> Jan 30 11:32:20 cobra3 rpc.mountd: authenticated mount request from
>> arvouin.int-evry.fr:844 for /p2v5f3 (/p2v5f3)
>> Jan 30 11:32:21 cobra3 rpc.mountd: authenticated unmount request from
>> arvouin.int-evry.fr:848 for /p2v5f3 (/p2v5f3)
>> Jan 30 11:32:21 cobra3 rpc.mountd: refused mount request from
>> arvouin.int-evry.fr for /p2v5f3 (/p2v5f3): illegal port 34107
>>
>> I thought about the firewall (Fedora Core 2 iptables), so I stopped it on
>> both sides, but no success :-( .
>>
>> Jan 30 11:34:39 cobra3 rpc.mountd: authenticated unmount request from
>> arvouin.int-evry.fr:957 for /p2v5f3 (/p2v5f3)
>> Jan 30 11:34:39 cobra3 rpc.mountd: refused mount request from
>> arvouin.int-evry.fr for /p2v5f3 (/p2v5f3): illegal port 34111
>>
>> Any idea on this ?
>>
>>>
>>> 2. If you are using mmap, you may control the sync
>>> behavior with the -D and -G options. The -D causes
>>> msync() to occur with it happening async. The -G
>>> causes msync() to occur with it happening sync.
>>>
>> I don't understand the "if you are using mmap" part. Does running
>> iozone -a use mmap? Actually, what I want to simulate is daily usage of
>> users' home directories -> mostly connecting to GNOME sessions (locks,
>> named pipes, unix sockets ...) then use of tar, gcc, emacs, mozilla!
>> Does that mean "using mmap"? Sorry if I seem a bit of a newbie here ...
>>
>>> 3. It is not too surprising that you see 11 Mbytes/sec over
>>> 100 Mbit. It's not very challenging for even a single
>>> disk, on the server, to satisfy this flow rate. It would
>>> be more interesting to use Gigabit networking, as this
>>> would put more load on the server's disk subsystem.
>>>
>> Indeed my AX100 fibre channel network storage uses 12 x 250GB SATA disks
>> at 7200 rpm each; it is specified to deliver around 150 MB/s -> well
>> over ethernet's 11 MB/s, so the network should be the bottleneck! But in
>> that case, why does the untar of an apache distribution (~7MB) take more
>> than 2 minutes to complete?
>>
>> [root@calaz /mnt/cobra3sync/mci/test/Test-sync]
>> $ time tar xvfz /tmp/httpd-2.0.52.tar.gz
>> real 2m18.141s
>>
>> If I compute right that is about 50KB/s, far away from 11MB/s, so network
>> bandwidth is not the limiting factor here. My problem is that users don't
>> care about iozone's 11MB/s figures; they complain about their daily
>> usage! But their complaints are only verbal, and I want to back them up
>> with benchmark values -> hence the use of iozone!
>> Perhaps it's also a question of permission/mode/attribute checking ->
>> NSS lookups through the LDAP directory etc ..., but iozone doesn't
>> measure that?
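>> As a rough sanity check (the figures below just restate the timing above,
>> assuming ~7MB of extracted source):
>>
>> $ echo "scale=1; 7*1024/138" | bc
>> 51.9        # ~52 KB/s over 2m18s, nowhere near the 11MB/s wire limit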
>>
>>> 4. If you need to exceed the RAM in the server (to measure
>>> without cache effects) then you could do so by using
>>> the -U option, or you could use large files, or you could
>>> use the -t option, and have the aggregate file data set size
>>> be larger than the amount of RAM in the server.
>>>
>> A large file size (here I would need more than 4GB because I have 4GB of
>> RAM on the NFS server) makes the test very long :-( and I don't think it
>> reflects the daily usage of users.
>> I tried the -t option:
>>
>> Server export in sync, client mount in async
>> [root@arvouin /mnt/cobra3/iozone/arvouin]
>> $time iozone -i 1 -i 0 -t 4 -r 64 -s 128M -F ./foo1 ./foo2 ./foo3 ./foo4
>> Throughput test with 4 processes
>> Each process writes a 131072 Kbyte file in 64 Kbyte records
>>
>> Children see throughput for 4 initial writers = 10994.48 KB/sec
>> Parent sees throughput for 4 initial writers  =  8561.40 KB/sec
>> Min throughput per process                    =  2085.77 KB/sec
>> Max throughput per process                    =  3647.83 KB/sec
>> Avg throughput per process                    =  2748.62 KB/sec
>> Min xfer                                      = 82944.00 KB
>>
>> Here, with this sample at 8561.40 KB/sec, I am still at the network's
>> 11MB/s bottleneck.
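>> To push the aggregate data set past the server's 4GB of RAM, something
>> like the following could be used (file count and size are only an
>> example, 4 x 2GB = 8GB total):
>>
>> $ time iozone -i 0 -i 1 -c -e -t 4 -r 64 -s 2g -F ./big1 ./big2 ./big3 ./big4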
>>
>>> 5. Be wary of using -I (Direct I/O). The problem is not
>>> in Iozone, but in the fact that there are many versions
>>> of Linux, and other Unixes, that do not actually honor
>>> the O_DIRECT, but also do not return errors when
>>> it is used. For example: Some systems have:
>>>
>>> Example #1: #define O_DIRECT
>>> Example #2: #define O_DIRECT O_SYNC
>>> Example #3: #define O_DIRECT O_RSYNC|O_SYNC
>>>
>>> None of the above are actually equivalent to a real
>>> Direct I/O method.
>>
>>
>> OK, I'll be careful, although I don't know how to check what my system
>> honors (Red Hat Enterprise Server 3, kernel 2.4.21-27.ELsmp). Where can
>> these #defines be checked?
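>> One way to check (assuming the usual glibc/kernel header locations) is
>> simply to grep for the definition:
>>
>> $ grep -rn O_DIRECT /usr/include/bits/fcntl.h /usr/include/asm*/fcntl.h
>> # a real direct I/O implementation defines O_DIRECT as its own flag,
>> # not as an alias for O_SYNC or O_RSYNC|O_SYNC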
>> Anyway, I just blindly tested the -I option directly on the server:
>> [root@cobra3 /p2v5f3/iozone/cobra3]
>> $ /opt/iozone/bin/iozone -a -I -i 0 -i 1
>>           KB  reclen   write  rewrite    read   reread
>>         4096      64   38611    38196   63046    63567
>>
>> Not too bad, although 38MB/s is not 150 MB/s; but that 150MB/s is the
>> commercial specification, maybe not reality!
>>
>>>
>>> 6. Getting very tricky here:
>>> You might try using the -W option. This enables file locking.
>>> Not that you wanted file locking, but you might want its
>>> side effect. In many Unix systems, enabling file locking over
>>> NFS completely disables the NFS client caching, for
>>> reads, and writes :-) and does so for ALL file sizes.
>>>
>> I tried that, but no significant change, still around 10MB/s.
>>
>> Thanks a lot for all your help, I hope I will finally use that iozone
>> tool correctly :-) .
>>
>>> Enjoy,
>>> Don Capps
>>>





2005-02-17 17:57:26

by Iozone

[permalink] [raw]
Subject: Re: 2.4.21 NFSv3 performance graph

Jehan,

Comments below:

----- Original Message -----
From: "jehan.procaccia" <[email protected]>
To: "Iozone" <[email protected]>
Cc: "Trond Myklebust" <[email protected]>; "Jeff Blaine"
<[email protected]>; <[email protected]>
Sent: Thursday, February 17, 2005 11:27 AM
Subject: Re: [NFS] 2.4.21 NFSv3 performance graph


>>
>> I noticed on your web page that you boiled the
>> 3D surface data down to a single value, and picked
>> the highest value for this number. Ok... that's
>> funky and pretty much meaningless.
>
>
> Yes, I did that on purpose to have a single value for comparison with
> the bonnie++ tests; see the "beware" paragraph in section 13.1.4. But
> indeed I must admit it is pretty much meaningless!

If you want to compare a single value from Iozone with Bonnie then
you need to run Iozone and Bonnie with the same file size, and
since you seem to want physical I/O, use a file size that is
larger than the amount of RAM in the NFS client.
With that done, you should get 11 Mbytes/sec on both
tests, as you are 100Mbit limited. Again, not terribly revealing
but at least both tests were run under similar conditions.
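For instance (sizes here are purely illustrative, assuming the client has
1GB of RAM or less), a matched pair of runs on the same mount with a 2GB
file might look like:

$ iozone -c -e -r 64 -s 2g -i 0 -i 1 -f /mnt/cobra3/iozone/bigfile
$ bonnie++ -d /mnt/cobra3/bonnie -s 2048
(bonnie++ needs -u <user> if it is run as root)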

>
>> If you must reduce the
>> data down to a single value, you'll have to pick what you
>> want to represent.
>>
>> (Client side cache performance, or NFS server performance)
>>
>> and pick the value from the appropriate region on the
>> surface of the plot. Left side is the client cache, right side
>> is NFS server.
>
> the thing is that the slope of the surface (not sure of my english here
> ...)

Your English is far better than my French :-)

> is rather climbing from left to right in the write tests, and dropping
> from left to right in the read tests!
> Example:
> http://www.int-evry.fr/s2ia/user/procacci/Doc/NFS/Aa-C2s/write/write.png
> So when the client cache is overfull I get better write performance ..?
> Surprising!?
>
> http://www.int-evry.fr/s2ia/user/procacci/Doc/NFS/Aa-C2s/read/read.png
> Here in reads, when the cache is overfull I get lower read performance,
> with a significant break at the 128MB file size! Here I find this to be
> logical.

Look at the Y axis scale. The Writer is scaled to 12 Mbytes/sec. The
Reader is scaled to 700 Mbytes/sec. For the writer, the larger
files and larger transfers do better as there is less system call
overhead
and more work being done on each NFS request. This continues
until you become network limited.

For the Reader, the client side cache dominates, until the file size
exceeds the cache. At that point you become network limited
and obtain basically the same result as the writer.
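If the goal is to see only the network-limited region for reads as well,
one option (sizes are illustrative, assuming the client has less than 1GB
of RAM) is to restrict iozone's automatic mode to file sizes beyond the
client cache:

$ iozone -a -c -e -i 0 -i 1 -n 1g -g 4g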

>
>> The cross over is where the file size
>> no longer fits in the client side cache. Also, you'll need
>> to document which thing you are measuring.
>
> Actually I want to measure the whole NFS performance, for daily users
> usage, from client to server reponses, isn't it why I did ?

By using a range of file sizes and transfer sizes and plotting
the 3D graph, you indeed represent the complete NFS performance
of the NFS client and server. So... single values (Bonnie, or Iozone)
do not fully represent the entire picture. Thus, why Iozone does
ranges instead of points :-)

Hint: Although one can use Gnuplot, the Excel graphs show
more detail and are much easier to interpret.
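For what it is worth, iozone can also emit the spreadsheet data directly;
something along the lines of the following (the output filename is
arbitrary) writes a file that Excel can turn into those 3D surface plots:

$ iozone -a -c -e -i 0 -i 1 -R -b nfs-arvouin.xls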

Enjoy,
Don Capps





2005-01-29 17:25:31

by Iozone

[permalink] [raw]
Subject: Re: 2.4.21 NFSv3 performance graph

Jehan,

Your results are what I would expect, given your
configuration.

1. The -e and -c will flush the writes from the
client to the server, and from the server to
its disks. However, if the file size is smaller than
the amount of ram in the server, then a copy of the
data still exists in the server's cache. Thus, client
reads can be satisfied from the server's cache and
wire speeds would be expected.
If the file size is smaller than the amount of RAM in the
client, then reads could be satisfied from the client's
cache. Thus, the results that are higher than wire speed.
Note: In Trond's runs, he uses the -U option. This option
un-mounts and re-mounts the NFS filesystem on the
client. This defeats the client's cache, even for files
that would fit in the client's RAM.

2. If you are using mmap, you may control the sync
behavior with the -D and -G options. The -D causes
msync() to occur with it happening async. The -G
causes msync() to occur with it happening sync.

3. It is not too surprising that you see 11 Mbytes/sec over
100 Mbit. It's not very challenging for even a single
disk, on the server, to satisfy this flow rate. It would
be more interesting to use Gigabit networking, as this
would put more load on the server's disk subsystem.

4. If you need to exceed the RAM in the server (to measure
without cache effects) then you could do so by using
the -U option, or you could use large files, or you could
use the -t option, and have the aggregate file data set size
be larger than the amount of RAM in the server.

5. Be wary of using -I (Direct I/O). The problem is not
in Iozone, but in the fact that there are many versions
of Linux, and other Unixes, that do not actually honor
the O_DIRECT, but also do not return errors when
it is used. For example: Some systems have:

Example #1: #define O_DIRECT
Example #2: #define O_DIRECT O_SYNC
Example #3: #define O_DIRECT O_RSYNC|O_SYNC

None of the above are actually equivalent to a real
Direct I/O method.

6. Getting very tricky here:
You might try using the -W option. This enables file locking.
Not that you wanted file locking, but you might want its
side effect. In many Unix systems, enabling file locking over
NFS completely disables the NFS client caching, for
reads, and writes :-) and does so for ALL file sizes.
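A couple of illustrative invocations for the options mentioned above
(test selection and record sizes are only examples):

# point 2: mmap tests, with msync() done async (-D) or sync (-G)
$ iozone -a -B -D -i 0 -i 1
$ iozone -a -B -G -i 0 -i 1

# point 6: file locking, which as a side effect defeats NFS client caching
$ iozone -a -c -e -W -i 0 -i 1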

Enjoy,
Don Capps

----- Original Message -----
From: "jehan.procaccia" <[email protected]>
To: "Trond Myklebust" <[email protected]>
Cc: "Jeff Blaine" <[email protected]>; <[email protected]>;
<[email protected]>
Sent: Saturday, January 29, 2005 4:48 AM
Subject: Re: [NFS] 2.4.21 NFSv3 performance graph


> OK so now I run with your recommended options and I get output perfs as
> high as my network speed !! I am very surprised ! I don't think I am
> measuring NFS perfs here but network speed :-( .
> Indeed for any filesize/record-length pair I get write results (see
> sample below) around 11000 Kbytes/sec -> so if I am right -> 11MB/s -> or
> 88Mbits/s ~= my 100Mbits ethernet throughput ! (less ethernet/ip
> overhead !)
>
> here's what I did:
> $mount cobra3:/p2v5f3 /mnt/cobra3/ -o async,nfsvers=3
> [root@arvouin /mnt/cobra3/iozone/arvouin]
> $time iozone -a -c -e -i 0 -i 1 > arvouin-cobra3-i01-a-c-e.iozone
>
> Command line used: iozone -a -c -e -i 0 -i 1
> Output is in Kbytes/sec
> Processor cache size set to 1024 Kbytes.
> Processor cache line size set to 32 bytes.
> File stride size set to 17 * record size.
>                                                           random  random    bkwd  record  stride
>           KB  reclen   write rewrite    read  reread        read   write    read rewrite    read  fwrite frewrite   fread freread
>         1024       4   10529   10603  409270  408936
>         1024       8   10571   10666  472558  533076
> ....
>       262144      64   11146   11156   11230   11225
>       262144     128   11152   11172   11228   10948
>
> Here only read/reread changes as the filesize increases; anyway 400/500MB/s
> reads is well over my 12.5MB/s theoretical ethernet throughput, so I
> suspect cache intervention here, no ? although I did put the -e -c options !
>
> Any comments or advice ? What kind of results do you get for NFS writes
> with iozone ? As high as I get ? Which options am I missing ?
>
> Thanks.
> Trond Myklebust wrote:
>
>>fr den 21.01.2005 Klokka 18:09 (+0100) skreiv Jehan PROCACCIA:
>>
>>>More generally, what tool do you recommend to bench NFS ?
>>>I tried bonnie, bonnie++ and iozone.
>>>For the latter, here's the kind of command I ran (so that it doesn't take
>>>hours to run the test!):
>>>/opt/iozone/bin/iozone -p -s 10k -s 100k -s 1m -s 5m -s 10m -s 100m -i
>>>0 -i 1 -r 4 -r 64 -r 256 -r 512 -r 1024 -r 4096 -r8192 -r 16384 -c -U
>>>/mnt/cobra3 -f /mnt/cobra3/iozone.nagiostux > iozone-result
>>>
>>>My problem is that my NFS server has 4GB of RAM, and bench programs
>>>always recommend using a test filesize larger than the RAM size, or even
>>>double the RAM size, so that it is not measuring cache activity !
>>>
>>
>>For tests of reading, this is undoubtedly true. For tests of writing
>>over NFS, this may be false: see the discussions of the iozone "-c" and
>>"-e" flags below.
>>
>>Note that bonnie and bonnie++ lack the equivalent of the "-e", "-c"
>>flags, and so are indeed not good for testing wire speeds unless you use
>>very large files.
>>
>>
>>>Can you give me a sample of the iozone arguments you used ?
>>>Any other tools ?
>>>
>>
>>It depends on what I want to test 8-)
>>
>>
>>Something like "iozone -c -a" should be fine for a basic test of the
>>generic read/write code functionality.
>>Note the "-c" which *is* usually necessary under NFS since any cached
>>writes are going to be flushed to disk by the "close()" (or when the
>>process exits). This means that close() will normally end up dominating
>>your write timings for files < memory size.
>>
>>If you want to test mmap(), something like "iozone -e -B -a". I believe
>>that "-e" should normally ensure that any writes are flushed to disk
>>using the fsync() command, and that this is timed.
>>Note that if you don't care about knowing how long it takes for the
>>writes to be flushed to disk then you can drop the "-e": unlike ordinary
>>read/write, mmap() does not guarantee that writes are flushed to disk
>>after the file is closed.
>>
>>For direct IO, "iozone -I -a" suffices. Since direct IO is uncached, all
>>write operations are synchronous, so "-c" and "-e" are unnecessary.
>>
>>
>>Cheers,
>> Trond
>>
>
>





2005-01-29 17:31:45

by Iozone

[permalink] [raw]
Subject: Re: 2.4.21 NFSv3 performance graph

Jehan,

In my previous email I mentioned the use of the -U option.
This option will un-mount and re-mount the filesystem
on the client. This will cause the client to flush and invalidate
its cache. It will not cause the server to flush or invalidate
its cache. If you wish to measure the physical I/O subsystem
on the server, then you will need to either use a file size
that is larger than the amount of RAM on the server, (see: -s option)
or an aggregate data set size that is larger than the amount
of RAM on the server. ( see: -s and -t options)
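As a minimal sketch, reusing the mount point from earlier in the thread,
once the server accepts the mount/umount cycle (e.g. after the "insecure"
export fix) a -U run would look something like:

# /etc/fstab entry on the client
cobra3:/p2v5f3   /mnt/cobra3   nfs   defaults   1 2

$ iozone -a -c -e -i 0 -i 1 -U /mnt/cobra3 -f /mnt/cobra3/iozone/testfile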

Enjoy,
Don Capps




