2014-10-01 20:27:54

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH RFC v2 0/5] Multi-queue support for xen-blkfront and xen-blkback

> Any comments or suggestions are more than welcome.

Hey Arianna,

Thank you for posting this patchset. I've gone over each patch and the design
looks quite sound. There are just some required changes mentioned by Christopher,
David, and me regarding the different uses of APIs and such.

I should also be able to review the next revision much more quickly now that the
giant nexus of _everything_ (Xen 4.5 feature freeze, LinuxCon, tons of reviews,
internal bugs, etc.) has slowed down.

Again, thank you for doing this work and posting the patchset!


2015-04-28 07:46:47

by Arianna Avanzini

Subject: Re: [PATCH RFC v2 0/5] Multi-queue support for xen-blkfront and xen-blkback

Hello Christoph,

On 28/04/2015 09:36, Christoph Hellwig wrote:
> What happened to this patchset?
>

It was passed on to Bob Liu, who published a follow-up patchset here:
https://lkml.org/lkml/2015/2/15/46

Thanks,
Arianna

2015-05-13 10:31:05

by Bob Liu

Subject: Re: [PATCH RFC v2 0/5] Multi-queue support for xen-blkfront and xen-blkback


On 04/28/2015 03:46 PM, Arianna Avanzini wrote:
> Hello Christoph,
>
> On 28/04/2015 09:36, Christoph Hellwig wrote:
>> What happened to this patchset?
>>
>
> It was passed on to Bob Liu, who published a follow-up patchset here: https://lkml.org/lkml/2015/2/15/46
>

Right, and then I was interrupted by another xen-block feature: the 'multi-page' ring.
I'll be back on this patchset soon. Thank you!

-Bob

2015-06-30 14:21:46

by Marcus Granado

Subject: Re: [Xen-devel] [PATCH RFC v2 0/5] Multi-queue support for xen-blkfront and xen-blkback

On 13/05/15 11:29, Bob Liu wrote:
>
> On 04/28/2015 03:46 PM, Arianna Avanzini wrote:
>> Hello Christoph,
>>
>> On 28/04/2015 09:36, Christoph Hellwig wrote:
>>> What happened to this patchset?
>>>
>>
>> It was passed on to Bob Liu, who published a follow-up patchset here: https://lkml.org/lkml/2015/2/15/46
>>
>
> Right, and then I was interrupted by another xen-block feature: the 'multi-page' ring.
> I'll be back on this patchset soon. Thank you!
>
> -Bob
>

Hi,

Our measurements for the multiqueue patch indicate a clear improvement
in iops when more queues are used.

The measurements were obtained under the following conditions:

- using blkback as the dom0 backend, with the multiqueue patch applied to
a 4.0 dom0 kernel running on 8 vcpus.

- using a recent Ubuntu 15.04 kernel 3.19 with the multiqueue frontend patch
applied, used as the guest on 4 vcpus.

- using a Micron RealSSD P320h as the underlying local storage on a Dell
PowerEdge R720 with 2 Xeon E5-2643 v2 cpus.

- fio 2.2.7-22-g36870 as the generator of synthetic loads in the guest. We
used direct I/O to skip caching in the guest and ran fio for 60s, reading
with block sizes ranging from 512 bytes to 4 MiB. A queue depth of 32 per
queue was used to saturate individual vcpus in the guest (a sketch of an
equivalent fio invocation is shown below).

We were interested in observing storage iops for different block sizes. Our
expectation was that iops would improve as the number of queues increased,
because both the guest and dom0 would be able to make use of more vcpus to
handle these requests.

These are the results (as aggregate iops for all the fio threads) that
we got for the conditions above with sequential reads:

fio_threads  io_depth  block_size  1-queue_iops  8-queue_iops
          8        32         512          158K          264K
          8        32          1K          157K          260K
          8        32          2K          157K          258K
          8        32          4K          148K          257K
          8        32          8K          124K          207K
          8        32         16K           84K          105K
          8        32         32K           50K           54K
          8        32         64K           24K           27K
          8        32        128K           11K           13K

8-queue iops were better than single-queue iops for all block sizes. There
were also very good improvements for sequential writes with a 4K block size
(from 80K iops with a single queue to 230K iops with 8 queues), and no
regressions were visible in any measurement performed.

Marcus

2015-07-01 00:04:51

by Bob Liu

Subject: Re: [Xen-devel] [PATCH RFC v2 0/5] Multi-queue support for xen-blkfront and xen-blkback


On 06/30/2015 10:21 PM, Marcus Granado wrote:
> On 13/05/15 11:29, Bob Liu wrote:
>>
>> On 04/28/2015 03:46 PM, Arianna Avanzini wrote:
>>> Hello Christoph,
>>>
>>> On 28/04/2015 09:36, Christoph Hellwig wrote:
>>>> What happened to this patchset?
>>>>
>>>
>>> It was passed on to Bob Liu, who published a follow-up patchset here: https://lkml.org/lkml/2015/2/15/46
>>>
>>
>> Right, and then I was interrupted by another xen-block feature: the 'multi-page' ring.
>> I'll be back on this patchset soon. Thank you!
>>
>> -Bob
>>
>
> Hi,
>
> Our measurements for the multiqueue patch indicate a clear improvement in iops when more queues are used.
>
> The measurements were obtained under the following conditions:
>
> - using blkback as the dom0 backend with the multiqueue patch applied to a dom0 kernel 4.0 on 8 vcpus.
>
> - using a recent Ubuntu 15.04 kernel 3.19 with multiqueue frontend applied to be used as a guest on 4 vcpus
>
> - using a micron RealSSD P320h as the underlying local storage on a Dell PowerEdge R720 with 2 Xeon E5-2643 v2 cpus.
>
> - fio 2.2.7-22-g36870 as the generator of synthetic loads in the guest. We used direct_io to skip caching in the guest and ran fio for 60s reading a number of block sizes ranging from 512 bytes to 4MiB. Queue depth of 32 for each queue was used to saturate individual vcpus in the guest.
>
> We were interested in observing storage iops for different values of block sizes. Our expectation was that iops would improve when increasing the number of queues, because both the guest and dom0 would be able to make use of more vcpus to handle these requests.
>
> These are the results (as aggregate iops for all the fio threads) that we got for the conditions above with sequential reads:
>
> fio_threads  io_depth  block_size  1-queue_iops  8-queue_iops
>           8        32         512          158K          264K
>           8        32          1K          157K          260K
>           8        32          2K          157K          258K
>           8        32          4K          148K          257K
>           8        32          8K          124K          207K
>           8        32         16K           84K          105K
>           8        32         32K           50K           54K
>           8        32         64K           24K           27K
>           8        32        128K           11K           13K
>
> 8-queue iops was better than single queue iops for all the block sizes. There were very good improvements as well for sequential writes with block size 4K (from 80K iops with single queue to 230K iops with 8 queues), and no regressions were visible in any measurement performed.
>

Great! Thank you very much for the test.

I'm trying to rebase these patches to the latest kernel version (v4.1) and will
send them out in the following days.

--
Regards,
-Bob

2015-07-01 03:03:32

by Jens Axboe

Subject: Re: [Xen-devel] [PATCH RFC v2 0/5] Multi-queue support for xen-blkfront and xen-blkback

On 06/30/2015 08:21 AM, Marcus Granado wrote:
> On 13/05/15 11:29, Bob Liu wrote:
>>
>> On 04/28/2015 03:46 PM, Arianna Avanzini wrote:
>>> Hello Christoph,
>>>
>>> On 28/04/2015 09:36, Christoph Hellwig wrote:
>>>> What happened to this patchset?
>>>>
>>>
>>> It was passed on to Bob Liu, who published a follow-up patchset here:
>>> https://lkml.org/lkml/2015/2/15/46
>>>
>>
>> Right, and then I was interrupted by another xen-block feature: the
>> 'multi-page' ring.
>> I'll be back on this patchset soon. Thank you!
>>
>> -Bob
>>
>
> Hi,
>
> Our measurements for the multiqueue patch indicate a clear improvement
> in iops when more queues are used.
>
> The measurements were obtained under the following conditions:
>
> - using blkback as the dom0 backend with the multiqueue patch applied to
> a dom0 kernel 4.0 on 8 vcpus.
>
> - using a recent Ubuntu 15.04 kernel 3.19 with multiqueue frontend
> applied to be used as a guest on 4 vcpus
>
> - using a micron RealSSD P320h as the underlying local storage on a Dell
> PowerEdge R720 with 2 Xeon E5-2643 v2 cpus.
>
> - fio 2.2.7-22-g36870 as the generator of synthetic loads in the guest.
> We used direct_io to skip caching in the guest and ran fio for 60s
> reading a number of block sizes ranging from 512 bytes to 4MiB. Queue
> depth of 32 for each queue was used to saturate individual vcpus in the
> guest.
>
> We were interested in observing storage iops for different values of
> block sizes. Our expectation was that iops would improve when increasing
> the number of queues, because both the guest and dom0 would be able to
> make use of more vcpus to handle these requests.
>
> These are the results (as aggregate iops for all the fio threads) that
> we got for the conditions above with sequential reads:
>
> fio_threads  io_depth  block_size  1-queue_iops  8-queue_iops
>           8        32         512          158K          264K
>           8        32          1K          157K          260K
>           8        32          2K          157K          258K
>           8        32          4K          148K          257K
>           8        32          8K          124K          207K
>           8        32         16K           84K          105K
>           8        32         32K           50K           54K
>           8        32         64K           24K           27K
>           8        32        128K           11K           13K
>
> 8-queue iops was better than single queue iops for all the block sizes.
> There were very good improvements as well for sequential writes with
> block size 4K (from 80K iops with single queue to 230K iops with 8
> queues), and no regressions were visible in any measurement performed.

Great results! And I don't know why this code has lingered for so long,
so thanks for helping get some attention to this again.

Personally I'd be really interested in the results for the same set of
tests, but without the blk-mq patches. Do you have them, or could you
potentially run them?

--
Jens Axboe