2009-06-30 10:54:54

by Vladislav Bolkhovitin

Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev

Wu Fengguang, on 06/30/2009 05:04 AM wrote:
> On Mon, Jun 29, 2009 at 11:37:41PM +0800, Vladislav Bolkhovitin wrote:
>> Wu Fengguang, on 06/29/2009 07:01 PM wrote:
>>> On Mon, Jun 29, 2009 at 10:21:24PM +0800, Wu Fengguang wrote:
>>>> On Mon, Jun 29, 2009 at 10:00:20PM +0800, Ronald Moesbergen wrote:
>>>>> ... tests ...
>>>>>
>>>>>> We started with 2.6.29, so why not finish with it (to save Ronald
>>>>>> the additional effort of moving to 2.6.30)?
>>>>>>
>>>>>>>> 2. Default vanilla 2.6.29 kernel, 512 KB read-ahead, the rest is default
>>>>>>> How about 2MB RAID readahead size? That transforms into about 512KB
>>>>>>> per-disk readahead size.
>>>>>> OK. Ronald, can you run 4 more test cases, please:
>>>>>>
>>>>>> 7. Default vanilla 2.6.29 kernel, 2MB read-ahead, the rest is default
>>>>>>
>>>>>> 8. Default vanilla 2.6.29 kernel, 2MB read-ahead, 64 KB
>>>>>> max_sectors_kb, the rest is default
>>>>>>
>>>>>> 9. Vanilla 2.6.29 kernel patched with Fengguang's patch, 2MB
>>>>>> read-ahead, the rest is default
>>>>>>
>>>>>> 10. Vanilla 2.6.29 kernel patched with Fengguang's patch, 2MB
>>>>>> read-ahead, 64 KB max_sectors_kb, the rest is default
>>>>> The results:
>>>> I made a blind average:
>>>>
>>>> N MB/s IOPS case
>>>>
>>>> 0 114.859 984.148 Unpatched, 128KB readahead, 512 max_sectors_kb
>>>> 1 122.960 981.213 Unpatched, 512KB readahead, 512 max_sectors_kb
>>>> 2 120.709 985.111 Unpatched, 2MB readahead, 512 max_sectors_kb
>>>> 3 158.732 1004.714 Unpatched, 512KB readahead, 64 max_sectors_kb
>>>> 4 159.237 979.659 Unpatched, 2MB readahead, 64 max_sectors_kb
>>>>
>>>> 5 114.583 982.998 Patched, 128KB readahead, 512 max_sectors_kb
>>>> 6 124.902 987.523 Patched, 512KB readahead, 512 max_sectors_kb
>>>> 7 127.373 984.848 Patched, 2MB readahead, 512 max_sectors_kb
>>>> 8 161.218 986.698 Patched, 512KB readahead, 64 max_sectors_kb
>>>> 9 163.908 574.651 Patched, 2MB readahead, 64 max_sectors_kb
>>>>
>>>> So before/after patch:
>>>>
>>>> avg throughput 135.299 => 138.397 by +2.3%
>>>> avg IOPS 986.969 => 903.344 by -8.5%
>>>>
>>>> The IOPS is a bit weird.
>>>>
>>>> Summaries:
>>>> - this patch improves RAID throughput by +2.3% on average
>>>> - after this patch, 2MB readahead performs slightly better
>>>> (by 1-2%) than 512KB readahead
>>> and the most important one:
>>> - 64 max_sectors_kb performs much better than 512 max_sectors_kb, by ~30%!
>> Yes, I was just about to point that out ;)
>
> OK, now I tend to agree on decreasing max_sectors_kb and increasing
> read_ahead_kb. But before actually trying to push that idea I'd like
> to
> - do more benchmarks
> - figure out why context readahead didn't help SCST performance
> (previous traces show that context readahead is submitting perfect
> large io requests, so I wonder if it's some io scheduler bug)

Because, as we found out, without your
http://lkml.org/lkml/2009/5/21/319 patch read-ahead was nearly disabled,
hence there was no difference in which algorithm was used?

Ronald, can you run the following tests, please? This time with 2 hosts,
initiator (client) and target (server) connected using 1 Gbps iSCSI. It
would be best if vanilla 2.6.29 were run on the client, but any other
kernel will be fine as well; just specify which. Blockdev-perftest should
be run as before in buffered mode, i.e. with the "-a" switch.

1. All defaults on the client, on the server vanilla 2.6.29 with
Fengguang's http://lkml.org/lkml/2009/5/21/319 patch with all default
settings.

2. All defaults on the client, on the server vanilla 2.6.29 with
Fengguang's http://lkml.org/lkml/2009/5/21/319 patch with default RA
size and 64KB max_sectors_kb.

3. All defaults on the client, on the server vanilla 2.6.29 with
Fengguang's http://lkml.org/lkml/2009/5/21/319 patch with 2MB RA size
and default max_sectors_kb.

4. All defaults on the client, on the server vanilla 2.6.29 with
Fengguang's http://lkml.org/lkml/2009/5/21/319 patch with 2MB RA size
and 64KB max_sectors_kb.

5. All defaults on the client, on the server vanilla 2.6.29 with
Fengguang's http://lkml.org/lkml/2009/5/21/319 patch and with the context
RA patch. RA size and max_sectors_kb are default. For your convenience I
committed the backported context RA patches into the SCST SVN repository.

6. All defaults on the client, on the server vanilla 2.6.29 with
Fengguang's http://lkml.org/lkml/2009/5/21/319 and context RA patches
with default RA size and 64KB max_sectors_kb.

7. All defaults on the client, on the server vanilla 2.6.29 with
Fengguang's http://lkml.org/lkml/2009/5/21/319 and context RA patches
with 2MB RA size and default max_sectors_kb.

8. All defaults on the client, on the server vanilla 2.6.29 with
Fengguang's http://lkml.org/lkml/2009/5/21/319 and context RA patches
with 2MB RA size and 64KB max_sectors_kb.

9. On the client default RA size and 64KB max_sectors_kb. On the server
vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
context RA patches with 2MB RA size and 64KB max_sectors_kb.

10. On the client 2MB RA size and default max_sectors_kb. On the server
vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
context RA patches with 2MB RA size and 64KB max_sectors_kb.

11. On the client 2MB RA size and 64KB max_sectors_kb. On the server
vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
context RA patches with 2MB RA size and 64KB max_sectors_kb.
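
For completeness, a sketch of how the two knobs above would be set for a
given test case (sdX is a placeholder for the device under test; 2MB of
RA is 2048 in read_ahead_kb, which blockdev reports as 4096 512-byte
sectors):

echo 2048 > /sys/block/sdX/queue/read_ahead_kb
echo 64 > /sys/block/sdX/queue/max_sectors_kb
blockdev --getra /dev/sdX    # should report 4096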

(I guess the results will be interesting not only to us, so I restored
linux-kernel@.)

Thanks,
Vlad


2009-07-01 13:07:39

by Ronald Moesbergen

Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev

2009/6/30 Vladislav Bolkhovitin <[email protected]>:
> ... earlier discussion and results ...
>
> Ronald, can you run the following tests, please? This time with 2 hosts,
> initiator (client) and target (server) connected using 1 Gbps iSCSI. It
> would be best if vanilla 2.6.29 were run on the client, but any other
> kernel will be fine as well; just specify which. Blockdev-perftest should
> be run as before in buffered mode, i.e. with the "-a" switch.

I could, but only the first 'dd' run of blockdev-perftest will have
any value, since all the others will be served from the target's cache.
Won't that make the results pretty much useless? Are you sure this
is what you want me to test?
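
The effect is easy to see on the target itself; a quick sketch (the
device path is a placeholder for the exported volume):

dd if=/dev/sdX of=/dev/null bs=1M count=1024   # cold run: disk-limited
dd if=/dev/sdX of=/dev/null bs=1M count=1024   # warm run: served from page cache
sync; echo 3 > /proc/sys/vm/drop_caches        # back to cold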

Ronald.

2009-07-01 18:12:19

by Vladislav Bolkhovitin

Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev


Ronald Moesbergen, on 07/01/2009 05:07 PM wrote:
> ... earlier discussion and results ...
>
> I could, but only the first 'dd' run of blockdev-perftest will have
> any value, since all the others will be served from the target's cache.
> Won't that make the results pretty much useless? Are you sure this
> is what you want me to test?

Hmm, I forgot about this. Can you set up passwordless ssh from the client
to the server and modify the drop_caches() function in blockdev-perftest
on the client so that instead of

sync
echo 3 > /proc/sys/vm/drop_caches

it does

ssh root@target "sync; echo 3 > /proc/sys/vm/drop_caches"
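
A minimal sketch of the modified function, assuming blockdev-perftest
keeps its cache dropping in that single shell function ("root@target" is
a placeholder for the server's address):

drop_caches() {
    # drop the page cache on the target, where the reads are served from
    ssh root@target "sync; echo 3 > /proc/sys/vm/drop_caches"
}

Dropping the caches on the client as well would simply keep the two
original lines in the function too.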

Thanks,
Vlad

2009-07-03 09:14:20

by Ronald Moesbergen

Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev

2009/6/30 Vladislav Bolkhovitin <[email protected]>:
> ... earlier discussion and results ...
>
> Ronald, can you run the following tests, please? This time with 2 hosts,
> initiator (client) and target (server) connected using 1 Gbps iSCSI. It
> would be best if vanilla 2.6.29 were run on the client, but any other
> kernel will be fine as well; just specify which. Blockdev-perftest should
> be run as before in buffered mode, i.e. with the "-a" switch.
>
> 1. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
> http://lkml.org/lkml/2009/5/21/319 patch with all default settings.
>
> 2. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
> http://lkml.org/lkml/2009/5/21/319 patch with default RA size and 64KB
> max_sectors_kb.
>
> 3. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
> http://lkml.org/lkml/2009/5/21/319 patch with 2MB RA size and default
> max_sectors_kb.
>
> 4. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
> http://lkml.org/lkml/2009/5/21/319 patch with 2MB RA size and 64KB
> max_sectors_kb.
>
> 5. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
> http://lkml.org/lkml/2009/5/21/319 patch and with the context RA patch. RA
> size and max_sectors_kb are default. For your convenience I committed the
> backported context RA patches into the SCST SVN repository.
>
> 6. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
> http://lkml.org/lkml/2009/5/21/319 and context RA patches with default RA
> size and 64KB max_sectors_kb.
>
> 7. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
> http://lkml.org/lkml/2009/5/21/319 and context RA patches with 2MB RA size
> and default max_sectors_kb.
>
> 8. All defaults on the client, on the server vanilla 2.6.29 with Fengguang's
> http://lkml.org/lkml/2009/5/21/319 and context RA patches with 2MB RA size
> and 64KB max_sectors_kb.
>
> 9. On the client default RA size and 64KB max_sectors_kb. On the server
> vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
> context RA patches with 2MB RA size and 64KB max_sectors_kb.
>
> 10. On the client 2MB RA size and default max_sectors_kb. On the server
> vanilla 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and
> context RA patches with 2MB RA size and 64KB max_sectors_kb.
>
> 11. On the client 2MB RA size and 64KB max_sectors_kb. On the server vanilla
> 2.6.29 with Fengguang's http://lkml.org/lkml/2009/5/21/319 and context RA
> patches with 2MB RA size and 64KB max_sectors_kb.

Ok, done. Performance is pretty bad overall :(

The kernels I used:
client kernel: 2.6.26-15lenny3 (debian)
server kernel: 2.6.29.5 with the blk_run_backing_dev patch

And I adjusted the blockdev-perftest script to drop caches on both the
server (via ssh) and the client.

The results:

1) client: default, server: default
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 19.808 20.078 20.180 51.147 0.402 0.799
33554432 19.162 19.952 20.375 51.673 1.322 1.615
16777216 19.714 20.331 19.948 51.214 0.649 3.201
8388608 18.572 20.126 20.345 52.116 2.149 6.515
4194304 18.711 19.663 19.811 52.831 1.350 13.208
2097152 19.112 19.927 19.130 52.832 1.022 26.416
1048576 19.771 19.686 20.010 51.661 0.356 51.661
524288 19.585 19.940 19.483 52.065 0.515 104.131
262144 19.168 20.794 19.605 51.634 1.757 206.535
131072 19.077 20.776 20.271 51.160 1.849 409.282
65536 19.643 21.230 19.144 51.284 2.227 820.549
32768 19.702 20.869 19.686 51.020 1.380 1632.635
16384 21.218 20.222 20.221 49.846 1.121 3190.174

2) client: default, server: 64 max_sectors_kb
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 20.881 20.102 21.689 49.065 1.522 0.767
33554432 20.329 19.938 20.522 50.543 0.609 1.579
16777216 20.247 19.744 20.912 50.468 1.185 3.154
8388608 19.739 20.184 21.032 50.433 1.318 6.304
4194304 19.968 18.748 20.230 52.174 1.750 13.043
2097152 19.633 20.068 19.858 51.584 0.462 25.792
1048576 20.552 20.618 20.974 49.437 0.440 49.437
524288 21.595 20.830 20.454 48.881 1.098 97.762
262144 21.720 20.602 20.176 49.201 1.515 196.805
131072 20.976 19.089 20.712 50.634 2.144 405.072
65536 20.661 19.952 19.312 51.303 1.414 820.854
32768 21.155 18.464 20.640 51.159 3.081 1637.090
16384 22.023 19.944 20.629 49.159 2.008 3146.205

3) client: default, server: default max_sectors_kb, RA 2MB
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 21.709 19.315 18.319 52.028 3.631 0.813
33554432 20.745 19.209 19.048 52.142 1.976 1.629
16777216 19.762 19.175 19.485 52.591 0.649 3.287
8388608 19.812 19.142 19.574 52.498 0.749 6.562
4194304 19.931 19.786 19.505 51.877 0.466 12.969
2097152 19.473 19.208 19.438 52.859 0.322 26.430
1048576 19.524 19.033 19.477 52.941 0.610 52.941
524288 20.115 20.402 19.542 51.166 0.920 102.333
262144 19.291 19.715 21.016 51.249 1.844 204.996
131072 18.782 19.130 20.334 52.802 1.775 422.419
65536 19.030 19.233 20.328 52.475 1.504 839.599
32768 19.147 19.326 19.411 53.074 0.303 1698.357
16384 19.573 19.596 20.417 51.575 1.005 3300.788

4) client: default, server: 64 max_sectors_kb, RA 2MB
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 22.604 21.707 20.721 47.298 1.683 0.739
33554432 21.654 20.812 21.162 48.293 0.784 1.509
16777216 20.461 19.782 21.160 50.068 1.377 3.129
8388608 20.886 20.434 21.512 48.914 1.028 6.114
4194304 22.154 20.512 21.433 47.974 1.517 11.993
2097152 22.258 20.971 20.738 48.071 1.478 24.035
1048576 19.953 21.294 19.662 50.497 1.731 50.497
524288 21.577 20.884 20.883 48.509 0.743 97.019
262144 20.959 20.749 20.256 49.587 0.712 198.347
131072 19.926 21.542 19.634 50.360 2.022 402.877
65536 20.973 22.546 20.840 47.793 1.685 764.690
32768 20.695 21.031 21.182 48.837 0.476 1562.791
16384 20.163 21.112 20.037 50.133 1.159 3208.481

5) Server RA-context Patched, client: default, server: default
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 19.756 23.647 18.852 49.818 4.717 0.778
33554432 18.892 19.727 18.857 53.472 1.106 1.671
16777216 18.943 19.255 18.949 53.760 0.409 3.360
8388608 18.766 19.105 18.847 54.165 0.413 6.771
4194304 19.177 19.609 20.191 52.111 1.097 13.028
2097152 18.968 19.517 18.862 53.581 0.797 26.790
1048576 18.833 19.912 18.626 53.592 1.551 53.592
524288 19.128 19.379 19.134 53.298 0.324 106.596
262144 18.955 19.328 18.879 53.748 0.550 214.992
131072 18.401 19.642 18.928 53.961 1.439 431.691
65536 19.366 19.822 18.615 53.182 1.384 850.908
32768 19.252 19.229 18.752 53.683 0.653 1717.857
16384 21.373 19.507 19.162 51.282 2.415 3282.039

6) Server RA-context Patched, client: default, server: 64
max_sectors_kb, RA default
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 22.753 21.071 20.532 47.825 2.061 0.747
33554432 20.404 19.239 20.722 50.943 1.644 1.592
16777216 20.914 20.114 21.854 48.910 1.655 3.057
8388608 19.524 21.932 21.465 48.949 2.510 6.119
4194304 20.306 20.809 20.000 50.279 0.820 12.570
2097152 20.133 20.194 20.181 50.770 0.066 25.385
1048576 19.515 21.593 20.052 50.321 2.128 50.321
524288 20.231 20.502 20.299 50.335 0.284 100.670
262144 19.620 19.737 19.911 51.834 0.313 207.336
131072 20.486 21.138 22.339 48.089 1.711 384.714
65536 20.113 18.322 22.247 50.943 4.025 815.088
32768 23.341 23.328 20.809 45.659 2.511 1461.089
16384 20.962 21.839 23.405 46.496 2.100 2975.773

7) Server RA-context Patched, client: default, server: default
max_sectors_kb, RA 2MB
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 19.565 19.028 19.164 53.196 0.627 0.831
33554432 19.048 18.401 18.940 54.491 0.828 1.703
16777216 18.728 19.330 19.076 53.778 0.699 3.361
8388608 19.174 18.710 19.922 53.179 1.368 6.647
4194304 19.133 18.514 19.672 53.628 1.331 13.407
2097152 18.903 18.547 20.070 53.468 1.782 26.734
1048576 19.210 19.204 18.994 53.513 0.282 53.513
524288 18.978 18.723 20.839 52.596 2.464 105.192
262144 18.912 18.590 18.635 54.726 0.415 218.905
131072 18.732 18.578 19.797 53.837 1.505 430.694
65536 19.046 18.872 19.318 53.678 0.516 858.852
32768 18.490 18.582 20.374 53.583 2.353 1714.661
16384 19.138 19.215 20.602 52.167 1.744 3338.700

8) Server RA-context Patched, client: default, server: 64
max_sectors_kb, RA 2MB
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 21.029 21.654 21.093 48.177 0.630 0.753
33554432 21.174 19.759 20.659 49.918 1.435 1.560
16777216 20.385 20.235 22.145 49.026 1.976 3.064
8388608 19.053 20.162 20.158 51.778 1.391 6.472
4194304 20.123 23.173 20.073 48.696 3.188 12.174
2097152 19.401 20.824 20.326 50.778 1.500 25.389
1048576 21.821 21.401 21.026 47.825 0.724 47.825
524288 21.478 20.742 21.355 48.332 0.742 96.664
262144 20.290 20.183 20.980 50.004 0.853 200.015
131072 20.299 21.501 20.766 49.127 1.158 393.020
65536 21.087 19.340 20.867 50.193 1.959 803.092
32768 21.597 21.223 23.504 46.410 2.039 1485.132
16384 21.681 21.709 22.944 46.343 1.212 2965.967

9) Server RA-context Patched, client: 64 max_sectors_kb, default RA.
server: 64 max_sectors_kb, RA 2MB
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 42.767 40.615 41.188 24.672 0.535 0.386
33554432 41.204 42.294 40.514 24.780 0.437 0.774
16777216 39.774 42.809 41.804 24.720 0.762 1.545
8388608 42.292 41.799 40.386 24.689 0.486 3.086
4194304 41.784 39.037 41.830 25.073 0.819 6.268
2097152 41.983 41.145 44.115 24.164 0.703 12.082
1048576 41.468 43.495 41.640 24.276 0.520 24.276
524288 42.631 42.724 41.267 24.267 0.387 48.535
262144 41.930 41.954 41.975 24.408 0.011 97.634
131072 42.511 41.266 42.835 24.269 0.393 194.154
65536 41.307 41.544 40.746 24.857 0.203 397.704
32768 42.270 42.728 40.822 24.425 0.478 781.607
16384 39.307 40.044 40.259 25.686 0.264 1643.908
8192 41.258 40.879 40.969 24.955 0.098 3194.183

10) Server RA-context Patched, client: default max_sectors_kb, 2MB RA.
server: 64 max_sectors_kb, RA 2MB
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 26.160 26.878 25.790 38.982 0.666 0.609
33554432 25.832 25.362 25.695 39.956 0.309 1.249
16777216 26.119 24.769 25.526 40.221 0.876 2.514
8388608 25.660 26.257 25.106 39.898 0.730 4.987
4194304 26.603 25.404 25.271 39.773 0.910 9.943
2097152 26.012 24.815 26.064 39.973 0.914 19.986
1048576 25.256 27.073 25.153 39.693 1.323 39.693
524288 29.452 28.883 29.146 35.118 0.280 70.236
262144 26.559 27.315 26.837 38.067 0.440 152.268
131072 25.259 25.794 25.992 39.879 0.483 319.030
65536 26.417 25.205 26.177 39.503 0.808 632.047
32768 26.453 26.401 25.759 39.083 0.474 1250.669
16384 24.701 24.609 25.143 41.265 0.385 2640.945

11) Server RA-context Patched, client: 64 max_sectors_kb, RA 2MB.
server: 64 max_sectors_kb, RA 2MB
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 29.629 31.703 30.407 33.513 0.930 0.524
33554432 29.768 29.598 30.717 34.111 0.553 1.066
16777216 30.054 30.640 30.102 33.837 0.295 2.115
8388608 29.906 29.744 31.394 33.762 0.813 4.220
4194304 30.708 30.797 30.418 33.420 0.177 8.355
2097152 31.364 29.646 30.712 33.511 0.781 16.755
1048576 30.757 30.600 30.470 33.455 0.128 33.455
524288 29.715 31.176 29.977 33.822 0.701 67.644
262144 30.533 30.218 30.259 33.755 0.155 135.021
131072 30.403 32.609 30.651 32.831 1.016 262.645
65536 30.846 30.208 32.116 32.993 0.835 527.889
32768 30.526 29.794 30.556 33.809 0.397 1081.878
16384 31.560 31.532 30.938 32.673 0.301 2091.092


Ronald.

2009-07-03 10:56:17

by Vladislav Bolkhovitin

Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev


Ronald Moesbergen, on 07/03/2009 01:14 PM wrote:
>> ... earlier discussion and test descriptions ...
>
> Ok, done. Performance is pretty bad overall :(
>
> The kernels I used:
> client kernel: 2.6.26-15lenny3 (debian)
> server kernel: 2.6.29.5 with the blk_run_backing_dev patch
>
> And I adjusted the blockdev-perftest script to drop caches on both the
> server (via ssh) and the client.
>
> The results:
>
> ... results ...

Those are on the server without the io_context-2.6.29 and
readahead-2.6.29 patches applied, and with the CFQ scheduler, correct?

Then we see how the reordering of requests caused by many I/O threads
submitting I/O in separate I/O contexts badly affects performance, and
that no RA setting, especially not the default 128KB RA size, can solve
it. A smaller max_sectors_kb on the client => more requests sent at once
=> more reordering on the server => worse throughput. Although, Fengguang,
in theory context RA with a 2MB RA size should help it considerably, no?
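
For the record, the scheduler and the two queue settings in question can
be inspected on either host like this (sdX is a placeholder):

cat /sys/block/sdX/queue/scheduler       # e.g. "noop anticipatory deadline [cfq]"
cat /sys/block/sdX/queue/read_ahead_kb
cat /sys/block/sdX/queue/max_sectors_kb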

Ronald, can you perform those tests again with both the io_context-2.6.29
and readahead-2.6.29 patches applied on the server, please?

Thanks,
Vlad

2009-07-03 12:41:40

by Ronald Moesbergen

Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev

2009/7/3 Vladislav Bolkhovitin <[email protected]>:
>
> ... earlier test descriptions and results ...

> Those are on the server without the io_context-2.6.29 and
> readahead-2.6.29 patches applied, and with the CFQ scheduler, correct?

No. It was done with the readahead patch
(http://lkml.org/lkml/2009/5/21/319) and, from test 5 onward, the
context RA patch, as you requested.

> Then we see how the reordering of requests caused by many I/O threads
> submitting I/O in separate I/O contexts badly affects performance, and
> that no RA setting, especially not the default 128KB RA size, can solve
> it. A smaller max_sectors_kb on the client => more requests sent at once
> => more reordering on the server => worse throughput. Although, Fengguang,
> in theory context RA with a 2MB RA size should help it considerably, no?

Wouldn't setting scst_threads to 1 also help in this case?
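
(If I remember correctly, scst_threads is a load-time module parameter,
so that would be something like

modprobe scst scst_threads=1

on the server, assuming the parameter name is right.)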

> Ronald, can you perform those tests again with both the io_context-2.6.29
> and readahead-2.6.29 patches applied on the server, please?

Ok. I only have access to the test systems during the week, so results
might not be ready before Monday. Are there tests that we can exclude
to speed things up?

Ronald.

2009-07-03 12:47:19

by Vladislav Bolkhovitin

Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev


Ronald Moesbergen, on 07/03/2009 04:41 PM wrote:
> 2009/7/3 Vladislav Bolkhovitin <[email protected]>:
>> ... earlier test descriptions and results ...
>
>> Those are on the server without the io_context-2.6.29 and
>> readahead-2.6.29 patches applied, and with the CFQ scheduler, correct?
>
> No. It was done with the readahead patch
> (http://lkml.org/lkml/2009/5/21/319) and the context RA patch
> (starting at test 5) as you requested.

OK, just wanted to be sure.

>> Then we see how the reordering of requests caused by many I/O threads
>> submitting I/O in separate I/O contexts badly affects performance, and
>> that no RA setting, especially not the default 128KB RA size, can solve
>> it. A smaller max_sectors_kb on the client => more requests sent at once
>> => more reordering on the server => worse throughput. Although, Fengguang,
>> in theory context RA with a 2MB RA size should help it considerably, no?
>
> Wouldn't setting scst_threads to 1 also help in this case?

Let's check that another time.

>> Ronald, can you perform those tests again with both the io_context-2.6.29
>> and readahead-2.6.29 patches applied on the server, please?
>
> Ok. I only have access to the test systems during the week, so results
> might not be ready before Monday. Are there tests that we can exclude
> to speed things up?

Unfortunately, no. But this isn't urgent at all, so next week is OK.

Thanks,
Vlad

2009-07-04 15:19:20

by Ronald Moesbergen

Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev

2009/7/3 Vladislav Bolkhovitin <[email protected]>:
>
> ... earlier test descriptions and previous results ...

> Those are on the server without the io_context-2.6.29 and
> readahead-2.6.29 patches applied, and with the CFQ scheduler, correct?
>
> Then we see how the reordering of requests caused by many I/O threads
> submitting I/O in separate I/O contexts badly affects performance, and
> that no RA setting, especially not the default 128KB RA size, can solve
> it. A smaller max_sectors_kb on the client => more requests sent at once
> => more reordering on the server => worse throughput. Although, Fengguang,
> in theory context RA with a 2MB RA size should help it considerably, no?
>
> Ronald, can you perform those tests again with both io_context-2.6.29 and
> readahead-2.6.29 patches applied on the server, please?
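
To put rough numbers on the reordering argument (an illustration, not a
measurement from this thread): with the default 512KB max_sectors_kb, a 2MB
RA window is submitted as 2048/512 = 4 requests, while with 64KB
max_sectors_kb the same window becomes 2048/64 = 32 requests, i.e. eight
times as many in-flight requests for the server-side scheduler to merge
back into order across the separate I/O contexts.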

Hi Vlad,

I have retested with the patches you requested (I only got access to the
systems today :) ). The results are better, but still not great.

client kernel: 2.6.26-15lenny3 (debian)
server kernel: 2.6.29.5 with io_context and readahead patch
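
(A note on reading the tables below, reconstructed from the numbers rather
than stated anywhere in the thread: each test appears to read 1 GiB three
times with the given block size. The three R(s) columns are the per-run
times, R(avg, MB/s) is the mean of the per-run rates, and R(IOPS) is that
rate divided by the block size. For the 64 MiB row of test 5, for example:
(1024/18.303 + 1024/19.867 + 1024/18.481) / 3 = 54.3 MB/s, and
54.299 / 64 = 0.848 IOPS.)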

5) client: default, server: default
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 18.303 19.867 18.481 54.299 1.961 0.848
33554432 18.321 17.681 18.708 56.181 1.314 1.756
16777216 17.816 17.406 19.257 56.494 2.410 3.531
8388608 18.077 17.727 19.338 55.789 2.056 6.974
4194304 17.918 16.601 18.287 58.276 2.454 14.569
2097152 17.426 17.334 17.610 58.661 0.384 29.331
1048576 19.358 18.764 17.253 55.607 2.734 55.607
524288 17.951 18.163 17.440 57.379 0.983 114.757
262144 18.196 17.724 17.520 57.499 0.907 229.995
131072 18.342 18.259 17.551 56.751 1.131 454.010
65536 17.733 18.572 17.134 57.548 1.893 920.766
32768 19.081 19.321 17.364 55.213 2.673 1766.818
16384 17.181 18.729 17.731 57.343 2.033 3669.932

6) client: default, server: 64 max_sectors_kb, RA default
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 21.790 20.062 19.534 50.153 2.304 0.784
33554432 20.212 19.744 19.564 51.623 0.706 1.613
16777216 20.404 19.329 19.738 51.680 1.148 3.230
8388608 20.170 20.772 19.509 50.852 1.304 6.356
4194304 19.334 18.742 18.522 54.296 0.978 13.574
2097152 19.413 18.858 18.884 53.758 0.715 26.879
1048576 20.472 18.755 18.476 53.347 2.377 53.347
524288 19.120 20.104 18.404 53.378 1.925 106.756
262144 20.337 19.213 18.636 52.866 1.901 211.464
131072 19.199 18.312 19.970 53.510 1.900 428.083
65536 19.855 20.114 19.592 51.584 0.555 825.342
32768 20.586 18.724 20.340 51.592 2.204 1650.941
16384 21.119 19.834 19.594 50.792 1.651 3250.669

7) client: default, server: default max_sectors_kb, RA 2MB
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 17.767 16.489 16.949 60.050 1.842 0.938
33554432 16.777 17.034 17.102 60.341 0.500 1.886
16777216 18.509 16.784 16.971 58.891 2.537 3.681
8388608 18.058 17.949 17.599 57.313 0.632 7.164
4194304 18.286 17.648 17.026 58.055 1.692 14.514
2097152 17.387 18.451 17.875 57.226 1.388 28.613
1048576 18.270 17.698 17.570 57.397 0.969 57.397
524288 16.708 17.900 17.233 59.306 1.668 118.611
262144 18.041 17.381 18.035 57.484 1.011 229.934
131072 17.994 17.777 18.146 56.981 0.481 455.844
65536 17.097 18.597 17.737 57.563 1.975 921.011
32768 17.167 17.035 19.693 57.254 3.721 1832.127
16384 17.144 16.664 17.623 59.762 1.367 3824.774

8) client: default, server: 64 max_sectors_kb, RA 2MB
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 20.003 21.133 19.308 50.894 1.881 0.795
33554432 19.448 20.015 18.908 52.657 1.222 1.646
16777216 19.964 19.350 19.106 52.603 0.967 3.288
8388608 18.961 19.213 19.318 53.437 0.419 6.680
4194304 18.135 19.508 19.361 53.948 1.788 13.487
2097152 18.753 19.471 18.367 54.315 1.306 27.158
1048576 19.189 18.586 18.867 54.244 0.707 54.244
524288 18.985 19.199 18.840 53.874 0.417 107.749
262144 19.064 21.143 19.674 51.398 2.204 205.592
131072 18.691 18.664 19.116 54.406 0.594 435.245
65536 18.468 20.673 18.554 53.389 2.729 854.229
32768 20.401 21.156 19.552 50.323 1.623 1610.331
16384 19.532 20.028 20.466 51.196 0.977 3276.567

9) client: 64 max_sectors_kb, default RA. server: 64 max_sectors_kb, RA 2MB
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 16.458 16.649 17.346 60.919 1.364 0.952
33554432 16.479 16.744 17.069 61.096 0.878 1.909
16777216 17.128 16.585 17.112 60.456 0.910 3.778
8388608 17.322 16.780 16.885 60.262 0.824 7.533
4194304 17.530 16.725 16.756 60.250 1.299 15.063
2097152 16.580 17.875 16.619 60.221 2.076 30.110
1048576 17.550 17.406 17.075 59.049 0.681 59.049
524288 16.492 18.211 16.832 59.718 2.519 119.436
262144 17.241 17.115 17.365 59.397 0.352 237.588
131072 17.430 16.902 17.511 59.271 0.936 474.167
65536 16.726 16.894 17.246 60.404 0.768 966.461
32768 16.662 17.517 17.052 59.989 1.224 1919.658
16384 17.429 16.793 16.753 60.285 1.085 3858.268

10) client: default max_sectors_kb, 2MB RA. server: 64 max_sectors_kb, RA 2MB
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 17.601 18.334 17.379 57.650 1.307 0.901
33554432 18.281 18.128 17.169 57.381 1.610 1.793
16777216 17.660 17.875 17.356 58.091 0.703 3.631
8388608 17.724 17.810 18.383 56.992 0.918 7.124
4194304 17.475 17.770 19.003 56.704 2.031 14.176
2097152 17.287 17.674 18.492 57.516 1.604 28.758
1048576 17.972 17.460 18.777 56.721 1.689 56.721
524288 18.680 18.952 19.445 53.837 0.890 107.673
262144 18.070 18.337 18.639 55.817 0.707 223.270
131072 16.990 16.651 16.862 60.832 0.507 486.657
65536 17.707 16.972 17.520 58.870 1.066 941.924
32768 17.767 17.208 17.205 58.887 0.885 1884.399
16384 18.258 17.252 18.035 57.407 1.407 3674.059

11) client: 64 max_sectors_kb, 2MB RA. server: 64 max_sectors_kb, RA 2MB
blocksize R R R R(avg, R(std R
(bytes) (s) (s) (s) MB/s) ,MB/s) (IOPS)
67108864 17.993 18.307 18.718 55.850 0.902 0.873
33554432 19.554 18.485 17.902 54.988 1.993 1.718
16777216 18.829 18.236 18.748 55.052 0.785 3.441
8388608 21.152 19.065 18.738 52.257 2.745 6.532
4194304 19.131 19.703 17.850 54.288 2.268 13.572
2097152 19.093 19.152 19.509 53.196 0.504 26.598
1048576 19.371 18.775 18.804 53.953 0.772 53.953
524288 20.003 17.911 18.602 54.470 2.476 108.940
262144 19.182 19.460 18.476 53.809 1.183 215.236
131072 19.403 19.192 18.907 53.429 0.567 427.435
65536 19.502 19.656 18.599 53.219 1.309 851.509
32768 18.746 18.747 18.250 55.119 0.701 1763.817
16384 20.977 19.437 18.840 51.951 2.319 3324.862

Ronald.

2009-07-06 11:13:15

by Vladislav Bolkhovitin

[permalink] [raw]
Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev

(Restored the original list of recipients in this thread as I was asked.)

Hi Ronald,

Ronald Moesbergen, on 07/04/2009 07:19 PM wrote:
> 2009/7/3 Vladislav Bolkhovitin <[email protected]>:
>> Ronald Moesbergen, on 07/03/2009 01:14 PM wrote:
>>> ... test requests, discussion, and retest results 5-11, quoted in full above ...

These results are inconsistent with what you reported previously (89.7
MB/s). How do you explain that?

Most likely there was some confusion between the tested and patched
versions of the kernel, or you forgot to apply the io_context patch.
Please recheck.

> Ronald.

2009-07-06 14:37:17

by Ronald Moesbergen

[permalink] [raw]
Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev

2009/7/6 Vladislav Bolkhovitin <[email protected]>:
> (Restored the original list of recipients in this thread as I was asked.)
>
> Hi Ronald,
>
> Ronald Moesbergen, on 07/04/2009 07:19 PM wrote:
>>
>> 2009/7/3 Vladislav Bolkhovitin <[email protected]>:
>>>
>>> Ronald Moesbergen, on 07/03/2009 01:14 PM wrote:
>>>> ... test requests, discussion, and retest results 5-11, quoted in full above ...
>
> These results are inconsistent with what you reported previously (89.7
> MB/s). How do you explain that?

I had more patches applied during that test (scst_exec_req_fifo-2.6.29,
put_page_callback-2.6.29), and I used a different dd command:

dd if=/dev/sdc of=/dev/zero bs=512K count=2000

But all that said, I can't reproduce speeds that high now. I must have
made a mistake back then (maybe I forgot to clear the pagecache).
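
For reference, a cache-clean version of that timing run (a sketch; the
server hostname and device name are placeholders, and of=/dev/null is the
conventional data sink) drops the page cache on both ends first, which is
what blockdev-perftest was adjusted to do earlier in this thread:

    sync                                  # flush dirty pages first
    echo 3 > /proc/sys/vm/drop_caches     # drop page cache, dentries, inodes
    ssh server 'sync; echo 3 > /proc/sys/vm/drop_caches'
    dd if=/dev/sdc of=/dev/null bs=512K count=2000   # ~1000 MiB uncached read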

> Most likely there was some confusion between the tested and patched
> versions of the kernel, or you forgot to apply the io_context patch.
> Please recheck.

The tests above were definitely done right. I just rechecked the
patches, and I do see an average increase of about 10 MB/s over an
unpatched kernel. But overall the performance is still pretty bad.

Ronald.

2009-07-06 17:49:20

by Vladislav Bolkhovitin

[permalink] [raw]
Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev

Ronald Moesbergen, on 07/06/2009 06:37 PM wrote:
> 2009/7/6 Vladislav Bolkhovitin <[email protected]>:
>> (Restored the original list of recipients in this thread as I was asked.)
>>
>> Hi Ronald,
>>
>> Ronald Moesbergen, on 07/04/2009 07:19 PM wrote:
>>> 2009/7/3 Vladislav Bolkhovitin <[email protected]>:
>>>> Ronald Moesbergen, on 07/03/2009 01:14 PM wrote:
>>>>> ... test requests, discussion, and retest results 5-11, quoted in full above ...
>> These results are inconsistent with what you reported previously (89.7 MB/s).
>> How do you explain that?
>
> I had more patches applied during that test (scst_exec_req_fifo-2.6.29,
> put_page_callback-2.6.29), and I used a different dd command:
>
> dd if=/dev/sdc of=/dev/zero bs=512K count=2000
>
> But all that said, I can't reproduce speeds that high now. I must have
> made a mistake back then (maybe I forgot to clear the pagecache).

If you had forgotten to clear the cache, you would have seen the wire
throughput (110 MB/s) or more.
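
(1 Gbps is 125 MB/s of raw signalling; after Ethernet, IP, TCP and iSCSI
framing overhead, roughly 110-117 MB/s of payload remains, which is where
the ~110 MB/s wire figure comes from.)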

>> Most likely there was some confusion between the tested and patched
>> versions of the kernel, or you forgot to apply the io_context patch.
>> Please recheck.
>
> The tests above were definitely done right. I just rechecked the
> patches, and I do see an average increase of about 10 MB/s over an
> unpatched kernel. But overall the performance is still pretty bad.

Have you rebuilt and reinstalled SCST after patching the kernel?

> Ronald.

2009-07-07 06:49:33

by Ronald Moesbergen

[permalink] [raw]
Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev

2009/7/6 Vladislav Bolkhovitin <[email protected]>:
> Ronald Moesbergen, on 07/06/2009 06:37 PM wrote:
>>
>> 2009/7/6 Vladislav Bolkhovitin <[email protected]>:
>>>
>>> (Restored the original list of recipients in this thread as I was asked.)
>>>
>>> Hi Ronald,
>>>
>>> Ronald Moesbergen, on 07/04/2009 07:19 PM wrote:
>>>>
>>>> 2009/7/3 Vladislav Bolkhovitin <[email protected]>:
>>>>>
>>>>> Ronald Moesbergen, on 07/03/2009 01:14 PM wrote:
>>>>>> ... test requests, discussion, and retest results 5-11, quoted in full above ...
>>>
>>> These results are inconsistent with what you reported previously (89.7 MB/s).
>>> How do you explain that?
>>
>> I had more patches applied during that test (scst_exec_req_fifo-2.6.29,
>> put_page_callback-2.6.29), and I used a different dd command:
>>
>> dd if=/dev/sdc of=/dev/zero bs=512K count=2000
>>
>> But all that said, I can't reproduce speeds that high now. I must have
>> made a mistake back then (maybe I forgot to clear the pagecache).
>
> If you had forgotten to clear the cache, you would have seen the wire
> throughput (110 MB/s) or more.

Maybe. Or maybe just part of what I was transferring was in the cache. I
had done some tests on the filesystem on that same block device too.

>>> Most likely there was some confusion between the tested and patched
>>> versions of the kernel, or you forgot to apply the io_context patch.
>>> Please recheck.
>>
>> The tests above were definitely done right. I just rechecked the
>> patches, and I do see an average increase of about 10 MB/s over an
>> unpatched kernel. But overall the performance is still pretty bad.
>
> Have you rebuilt and reinstalled SCST after patching the kernel?

Yes, I have. And the warning about missing io_context patches wasn't
there during compilation.

Ronald.