2006-05-06 18:09:05

by Dave Pitts

Subject: How can I boost block I/O performance

Hello all:

I've been trying some hacks to boost disk I/O performance, mostly by
changing values under /proc/sys/vm. A vmstat display shows bursty
block-out counts with fairly consistent interrupt counts:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b swpd   free  buff   cache  si  so  bi    bo    in    cs us sy id wa
 4  0  720  80252  1820 7077456   0   0   9   852     5    11  1 14 84  0
 1  0  720  80444  1820 7077456   0   0   0     0 15923 77712  4 20 76  0
 2  0  720  80196  1820 7077524   0   0   0     0 14705 87207  4 18 79  0
 2  0  720  79732  1828 7077856   0   0   8   116 16235 84459  4 20 76  0
 4  0  720 104172  1812 7051964   0   0  24 62568 20447 73499  4 27 69  0
 2  0  720 105172  1812 7051964   0   0   0 90740 16960 80149  1 21 78  0
 2  0  720 104108  1812 7051964   0   0   0     0 14162 72632  3 13 85  0
 4  0  720 103980  1812 7052032   0   0   0     0 13495 68133  4 16 80  0
 1  0  720 103868  1820 7052704   0   0   0   128 15417 59969  4 17 79  0
 0  0  720 104340  1828 7052696   0   0   0   280 19504 74281  0  8 92  0
 0  0  720 104532  1828 7052696   0   0   0     0 14736 70017  0  5 95  0
 1  0  720 104596  1828 7052696   0   0   0     0 16006 73173  0  6 94  0
 2  0  720  92844  1828 7064256   0   0  12     0 16508 80601  0  9 91  0
 2  0  720  91916  1836 7064248   0   0   4   104 20787 74676  0  7 92  0
 0  0  720  92580  1844 7064240   0   0   0 14640 17789 71545  0 10 90  0
 1  0  720  92900  1844 7064240   0   0   0     0 15460 74760  0  8 92  0
 0  0  720  92668  1844 7065260   0   0   0     0 18585 77435  0  7 93  0
 0  0  720  92604  1844 7065260   0   0   0     0 19187 86426  0  9 91  0
 2  0  720  91964  1860 7065244   0   0   0   140 23659 87962  0  8 92  0
 5  0  720  90364  1860 7067080   0   0  40 66956 17995 95384  0 17 82  0

This test is running several NFS clients to a RAID disk storage array.
I also see the same behavior when running SFTP transfers. What I'd
like is more even block-out behavior (even at the expense of other
apps, as this is a file server, not an app server). The values that
I've been hacking are dirty_writeback_centisecs,
dirty_background_ratio, etc. Am I barking up the wrong tree?
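
For reference, here's roughly what I've been doing; the particular
values below are illustrative, not recommendations:

    # Inspect the current writeback tunables.
    grep . /proc/sys/vm/dirty_background_ratio /proc/sys/vm/dirty_ratio \
           /proc/sys/vm/dirty_writeback_centisecs /proc/sys/vm/dirty_expire_centisecs

    # Wake pdflush every second instead of every five, and start
    # background writeback once 5% of memory is dirty.
    echo 100 > /proc/sys/vm/dirty_writeback_centisecs
    echo 5 > /proc/sys/vm/dirty_background_ratio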

Thanks in advance.

--
Dave Pitts PULLMAN: Travel and sleep in safety and comfort.
[email protected] My other RV IS a Pullman (Colorado Pine).
http://www.cozx.com/~dpitts


2006-05-06 19:31:57

by [email protected]

Subject: Re: How can I boost block I/O performance

Does the adaptive readahead patch help in your case? Other people in
similar situations are saying that it helps a lot.

Wu Fengguang
Subject: [PATCH 00/23] Adaptive read-ahead V11
http://lkml.org/lkml/2006/3/18/235

I'm not sure of the status of this patch. I looked in the mm tree and
didn't see it.

--
Jon Smirl
[email protected]

2006-05-06 19:33:07

by Avi Kivity

Subject: Re: How can I boost block I/O performance

Dave Pitts wrote:
> Hello all:
>
> I've been trying some hacks to boost disk I/O performance, mostly by
> changing values under /proc/sys/vm. A vmstat display shows bursty
> block-out counts with fairly consistent interrupt counts:
>
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b swpd  free  buff   cache  si  so  bi    bo    in    cs us sy id wa
>  4  0  720 80252  1820 7077456   0   0   9   852     5    11  1 14 84  0
[...]
>  5  0  720 90364  1860 7067080   0   0  40 66956 17995 95384  0 17 82  0
>
> This test is running several NFS clients to a RAID disk storage array.
> I also see the same behavior when running SFTP transfers. What I'd
> like is more even block-out behavior (even at the expense of other
> apps, as this is a file server, not an app server). The values that
> I've been hacking are dirty_writeback_centisecs,
> dirty_background_ratio, etc. Am I barking up the wrong tree?
No iowait time, plenty of idle time: looks like you are network bound.
What type of network are you running?

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

2006-05-06 19:35:14

by Avi Kivity

Subject: Re: How can I boost block I/O performance

Jon Smirl wrote:
> Does the adaptive readahead patch help in your case? Other people in
> similar situations are saying that it helps a lot.
>
> Wu Fengguang
> Subject [PATCH 00/23] Adaptive read-ahead V11
> http://lkml.org/lkml/2006/3/18/235
>
Adaptive readahead is not going to help his write performance...

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

2006-05-06 21:14:04

by Dave Pitts

Subject: Re: How can I boost block I/O performance

Avi Kivity wrote:

> Dave Pitts wrote:
>
>> Hello all:
>>
>> I've been trying some hacks to boost disk I/O performance, mostly by
>> changing values under /proc/sys/vm. A vmstat display shows bursty
>> block-out counts with fairly consistent interrupt counts:
>>
>> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>>  r  b swpd  free  buff   cache  si  so  bi    bo    in    cs us sy id wa
>>  4  0  720 80252  1820 7077456   0   0   9   852     5    11  1 14 84  0
>
> [...]
>
>>  5  0  720 90364  1860 7067080   0   0  40 66956 17995 95384  0 17 82  0
>>
>> This test is running several NFS clients to a RAID disk storage array.
>> I also see the same behavior when running SFTP transfers. What I'd
>> like is more even block-out behavior (even at the expense of other
>> apps, as this is a file server, not an app server). The values that
>> I've been hacking are dirty_writeback_centisecs,
>> dirty_background_ratio, etc. Am I barking up the wrong tree?
>
> No iowait time, plenty of idle time: looks like you are network
> bound. What type of network are you running?
>
Well, it's an 8-CPU system. Does the idle time reflect the idle time
of all CPUs? The network is a Gigabit Ethernet.


--
Dave Pitts PULLMAN: Travel and sleep in safety and comfort.
[email protected] My other RV IS a Pullman (Colorado Pine).
http://www.cozx.com/~dpitts

2006-05-06 23:55:14

by Jesper Juhl

Subject: Re: How can I boost block I/O performance

On 5/6/06, Dave Pitts <[email protected]> wrote:
> Hello all:
>
> I've been trying some hacks to boost disk I/O performance
[snip]
>
> This test is running several NFS clients to a RAID disk storage array. I
[snip]

For improving performance of NFS servers I've often had good success
with increasing the 'rsize' and 'wsize' options. The default values
are 4096; I personally set them to 16384, which usually helps NFS
performance quite a bit. At least that's my experience.
Simply add rsize=16384,wsize=16384 to the NFS mount options in
/etc/fstab and see if that improves performance for you (values like
8192 and 32768 may also be worth testing, but personally I've found,
at least with my setups, that 16384 seems to be the magic value).
('man 8 mount' and 'man 5 exports' also have more interesting options
you may want to experiment with, but rsize & wsize on their own
should be a boost.)
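
For example, a client-side fstab entry might look like this (server
name and mount point are placeholders):

    # NFS mount with larger read/write transfer sizes.
    fileserver:/export/data  /mnt/data  nfs  rsize=16384,wsize=16384,hard,intr  0 0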

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-05-07 05:13:37

by Avi Kivity

Subject: Re: How can I boost block I/O performance

Dave Pitts wrote:
> Avi Kivity wrote:
>
>> Dave Pitts wrote:
>>>
>>> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>>>  r  b swpd  free  buff   cache  si  so  bi    bo    in    cs us sy id wa
>>>  4  0  720 80252  1820 7077456   0   0   9   852     5    11  1 14 84  0
>>
>> [...]
>>
>>>  5  0  720 90364  1860 7067080   0   0  40 66956 17995 95384  0 17 82  0
>>>
>>> This test is running several NFS clients to a RAID disk storage array.
>>> I also see the same behavior when running SFTP transfers. What I'd
>>> like is more even block-out behavior (even at the expense of other
>>> apps, as this is a file server, not an app server). The values that
>>> I've been hacking are dirty_writeback_centisecs,
>>> dirty_background_ratio, etc. Am I barking up the wrong tree?
>>
>> No iowait time, plenty of idle time: looks like you are network
>> bound. What type of network are you running?
>>
> Well, it's an 8-CPU system. Does the idle time reflect the idle time
> of all CPUs?

It's an average across all CPUs. But the numbers are low even for a
single-CPU system.
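
If you want to rule out a single pegged CPU hiding behind the average,
something like this shows a per-CPU breakdown (mpstat comes with the
sysstat package):

    # One-second samples per CPU; look for one CPU sitting near 0% idle.
    mpstat -P ALL 1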

> The network is a Gigabit Ethernet.
>

I'd make sure the NICs know that by running ethtool (on the clients as
well as on the server).
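
Something along these lines, on each box (eth0 is a placeholder for
your interface):

    # Verify the negotiated speed/duplex; 100Mb/s or half duplex here
    # would explain a lot.
    ethtool eth0

    # If it negotiated low, re-run autonegotiation advertising gigabit.
    ethtool -s eth0 autoneg on speed 1000 duplex full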



--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

2006-05-08 11:18:29

by Erik Mouw

Subject: Re: How can I boost block I/O performance

On Sun, May 07, 2006 at 01:55:12AM +0200, Jesper Juhl wrote:
> On 5/6/06, Dave Pitts <[email protected]> wrote:
> >Hello all:
> >
> >I've been trying some hacks to boost disk I/O performance
> [snip]
> >
> >This test is running several NFS clients to a RAID disk storage array. I
> [snip]
>
> For improving performance of NFS servers I've often had good success
> with increasing the 'rsize' and 'wsize' options. The default values
> are 4096; I personally set them to 16384, which usually helps NFS
> performance quite a bit. At least that's my experience.
> Simply add rsize=16384,wsize=16384 to the NFS mount options in
> /etc/fstab and see if that improves performance for you (values like
> 8192 and 32768 may also be worth testing, but personally I've found,
> at least with my setups, that 16384 seems to be the magic value).
> ('man 8 mount' and 'man 5 exports' also have more interesting options
> you may want to experiment with, but rsize & wsize on their own
> should be a boost.)

Another way is to increase the number of nfsd threads on the server.
See rpc.nfsd(8). I usually increase it from the default 8 to 32 on busy
machines.
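
For example:

    # Raise the running server to 32 threads (see rpc.nfsd(8)).
    rpc.nfsd 32

    # The "th" line reports thread usage; if the histogram shows all
    # threads busy much of the time, add more.
    grep ^th /proc/net/rpc/nfsd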


Erik

--
+-- Erik Mouw -- http://www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands

2006-05-08 15:38:44

by Dave Pitts

Subject: Re: How can I boost block I/O performance

Avi Kivity wrote:

> Dave Pitts wrote:
>
>> Avi Kivity wrote:
>>
>>> Dave Pitts wrote:
>>>>
>>>> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>>>>  r  b swpd  free  buff   cache  si  so  bi    bo    in    cs us sy id wa
>>>>  4  0  720 80252  1820 7077456   0   0   9   852     5    11  1 14 84  0
>>>
>>> [...]
>>>
>>>>  5  0  720 90364  1860 7067080   0   0  40 66956 17995 95384  0 17 82  0
>>>>
>>>> This test is running several NFS clients to a RAID disk storage array.
>>>> I also see the same behavior when running SFTP transfers. What I'd
>>>> like is more even block-out behavior (even at the expense of other
>>>> apps, as this is a file server, not an app server). The values that
>>>> I've been hacking are dirty_writeback_centisecs,
>>>> dirty_background_ratio, etc. Am I barking up the wrong tree?
>>>
>>> No iowait time, plenty of idle time: looks like you are network
>>> bound. What type of network are you running?
>>>
>> Well, it's an 8-CPU system. Does the idle time reflect the idle time
>> of all CPUs?
>
> It's an average across all CPUs. But the numbers are low even for a
> single-CPU system.

OK, I ran some tests, bypassing NFS, and got the following vmstat display:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b swpd   free  buff   cache  si  so  bi     bo    in    cs us sy id wa
 4  1  696  80692   844 7225424   0   0  92 263924 14380  6176  2 40 46 12
 6  0  696 108668   784 7200936   0   0 456   6444 30515 20641  3 57 36  3
 5  1  696 100916   768 7211220   0   0 508  97276 31761 19916  3 65 27  4
 7  0  700  91188   764 7217340   0   0 428 121576 28704 21957  3 69 24  3
 3  3  700 103356   780 7204472   0   0 408 121748 29513 22603  3 66 23  8
 7  0  700  92836   780 7216168   0   0 360  43508 28784 21410  3 54 33 11
 7  0  700  88364   768 7224272   0   0 296 158236 26530 17570  3 66 21 10
10  0  700  91068   776 7219096   0   0 444 141456 30306 16053  3 74 16  7
 5  2  700 102212   752 7206676   0   0 340 170076 29249 14872  2 69 19 10
11  0  700  87884   768 7222096   0   0 392 143312 29743 19808  1 65 23 11
 8  0  700 104692   744 7204644   0   0 240 159624 25814 18747  3 58 20 19
 9  0  700 107196   736 7205196   0   0 344 148500 28191 18113  3 70 21  6
16  1  700  99300   768 7211364   0  12 464 164348 25671 18326  4 64 18 15
 5  4  700 107412  1052 7204824   0   0 936 170904 28994 17062  3 73 11 14
 8  1  700  98892  1284 7217512   0   0 596 182708 31520 18424  1 76 13 10

This is with 6 concurrent data streams. In addition to the
/proc/sys/vm/dirty* values, I also adjusted values in
mm/page-writeback.c, e.g. MAX_WRITEBACK_PAGES to 8192. Our goal is to
blast out as much as possible per pdflush invocation.
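
For anyone who wants to compare, the stock value is much smaller (the
value shown below is from a 2.6 tree; check your own source):

    # Confirm the limit before patching it.
    grep -n 'MAX_WRITEBACK_PAGES' mm/page-writeback.c
    # => #define MAX_WRITEBACK_PAGES   1024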

>
>> The network is a Gigabit Ethernet.
>>
>
> I'd make sure the NICs know that by running ethtool (on the clients as
> well as on the server).


--
Dave Pitts PULLMAN: Travel and sleep in safety and comfort.
[email protected] My other RV IS a Pullman (Colorado Pine).
http://www.cozx.com/~dpitts

2006-05-08 15:50:35

by Avi Kivity

Subject: Re: How can I boost block I/O performance

Dave Pitts wrote:
>
> OK, I ran some tests, bypassing NFS, and got the following vmstat display:
>
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b swpd   free  buff   cache  si  so  bi     bo    in    cs us sy id wa
>  4  1  696  80692   844 7225424   0   0  92 263924 14380  6176  2 40 46 12
>  6  0  696 108668   784 7200936   0   0 456   6444 30515 20641  3 57 36  3
>  5  1  696 100916   768 7211220   0   0 508  97276 31761 19916  3 65 27  4
>  7  0  700  91188   764 7217340   0   0 428 121576 28704 21957  3 69 24  3
>  3  3  700 103356   780 7204472   0   0 408 121748 29513 22603  3 66 23  8
>  7  0  700  92836   780 7216168   0   0 360  43508 28784 21410  3 54 33 11
>  7  0  700  88364   768 7224272   0   0 296 158236 26530 17570  3 66 21 10
> 10  0  700  91068   776 7219096   0   0 444 141456 30306 16053  3 74 16  7
>  5  2  700 102212   752 7206676   0   0 340 170076 29249 14872  2 69 19 10
> 11  0  700  87884   768 7222096   0   0 392 143312 29743 19808  1 65 23 11
>  8  0  700 104692   744 7204644   0   0 240 159624 25814 18747  3 58 20 19
>  9  0  700 107196   736 7205196   0   0 344 148500 28191 18113  3 70 21  6
> 16  1  700  99300   768 7211364   0  12 464 164348 25671 18326  4 64 18 15
>  5  4  700 107412  1052 7204824   0   0 936 170904 28994 17062  3 73 11 14
>  8  1  700  98892  1284 7217512   0   0 596 182708 31520 18424  1 76 13 10
>
> This is with 6 concurrent data streams. In addition to the
> /proc/sys/vm/dirty* values, I also adjusted values in
> mm/page-writeback.c, e.g. MAX_WRITEBACK_PAGES to 8192. Our goal is to
> blast out as much as possible per pdflush invocation.
>
>

These are healthier numbers. How many disks are on that machine? What
RAID level? Does it have a battery-backed cache?
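
If it's Linux md rather than a hardware array, the layout is easy to
check (hardware arrays need the vendor's tool instead):

    # Software-RAID level, member disks, and rebuild state, if applicable.
    cat /proc/mdstat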

Anyway, you need to check your network, since it looks like block
throughput is not the issue.
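
A quick way to sanity-check the raw network path, independent of NFS
and the disks (iperf is just one option; netperf works too):

    # On the server:
    iperf -s
    # On a client ("fileserver" is a placeholder); a healthy gigabit
    # link should report somewhere near 900 Mbit/s.
    iperf -c fileserver -t 30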

--
error compiling committee.c: too many arguments to function