2017-08-04 21:00:50

by Richard W.M. Jones

Subject: Increased memory usage with scsi-mq

https://bugzilla.redhat.com/show_bug.cgi?id=1478201

We have a libguestfs test which adds 256 virtio-scsi disks to a qemu
virtual machine. The VM has 500 MB of RAM, 1 vCPU and no swap.

This test has been failing for a little while. It runs out of memory
during SCSI enumeration in early boot.

Tonight I bisected the cause to:

5c279bd9e40624f4ab6e688671026d6005b066fa is the first bad commit
commit 5c279bd9e40624f4ab6e688671026d6005b066fa
Author: Christoph Hellwig <[email protected]>
Date: Fri Jun 16 10:27:55 2017 +0200

scsi: default to scsi-mq

Remove the SCSI_MQ_DEFAULT config option and default to the blk-mq I/O
path now that we had plenty of testing, and have I/O schedulers for
blk-mq. The module option to disable the blk-mq path is kept around for
now.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>

:040000 040000 57ec7d5d2ba76592a695f533a69f747700c31966 c79f6ecb070acc4fadf6fc05ca9ba32bc9c0c665 M	drivers

I also wrote a small test to see the maximum number of virtio-scsi
disks I could add to the above VM. The results were very surprising
(to me anyhow):

With scsi-mq enabled: 175 disks
With scsi-mq disabled: 1755 disks

I don't know why the ratio is almost exactly 10 times.

I read your slides about scsi-mq and it seems like a significant
benefit to large machines, but could the out of the box defaults be
made more friendly for small memory machines?

Thanks,

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org


2017-08-05 08:44:39

by Christoph Hellwig

Subject: Re: Increased memory usage with scsi-mq

On Fri, Aug 04, 2017 at 10:00:47PM +0100, Richard W.M. Jones wrote:
> I read your slides about scsi-mq and it seems like a significant
> benefit to large machines, but could the out of the box defaults be
> made more friendly for small memory machines?

The default number of queues and queue depth, and thus memory usage,
is set by the LLDD.

Try to reduce the can_queue value in virtio_scsi and/or make sure
you use the single queue variant in your VM (which should be tunable
in qemu).
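
For example, something like this on the qemu command line should force
the single queue variant (untested sketch; num_queues=1 appears to be
the default anyway, so this mainly guards against tooling that raises it):

  -device virtio-scsi-pci,num_queues=1,id=scsi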

2017-08-05 09:27:08

by Richard W.M. Jones

Subject: Re: Increased memory usage with scsi-mq

On Sat, Aug 05, 2017 at 10:44:36AM +0200, Christoph Hellwig wrote:
> On Fri, Aug 04, 2017 at 10:00:47PM +0100, Richard W.M. Jones wrote:
> > I read your slides about scsi-mq and it seems like a significant
> > benefit to large machines, but could the out of the box defaults be
> > made more friendly for small memory machines?
>
> The default number of queues and queue depth, and thus memory usage,
> is set by the LLDD.
>
> Try to reduce the can_queue value in virtio_scsi and/or make sure
> you use the single queue variant in your VM (which should be tunable
> in qemu).

Thanks, this is interesting.

Virtio-scsi seems to have a few settable parameters that might be
related to this:

DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, 1),
DEFINE_PROP_UINT32("max_sectors", VirtIOSCSI, parent_obj.conf.max_sectors,
0xFFFF),
DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSI, parent_obj.conf.cmd_per_lun,
128),

Unfortunately (assuming I'm setting them right - see below), none of
them have any effect on the number of disks that I can add to the VM.

I am testing them by placing them in the ‘-device virtio-scsi-pci’
parameter, ie. as a property of the controller, not a property of the
LUN, eg:

-device virtio-scsi-pci,cmd_per_lun=32,id=scsi \
-drive file=/home/rjones/d/libguestfs/tmp/libguestfshXImTv/scratch.1,cache=unsafe,format=raw,id=hd0,if=none \
-device scsi-hd,drive=hd0 \

The debugging output is a bit too large to attach to this email, but I
have placed it at the link below. It contains (if you scroll down a
bit) the full qemu command line and full kernel output.

http://oirase.annexia.org/tmp/bz1478201-log.txt

I can add some extra debugging into the kernel if you like. Just
point me to the right place.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

2017-08-05 13:39:57

by Christoph Hellwig

Subject: Re: Increased memory usage with scsi-mq

For now can you apply this testing patch to the guest kernel?

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 9be211d68b15..0cbe2c882e1c 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -818,7 +818,7 @@ static struct scsi_host_template virtscsi_host_template_single = {
.eh_timed_out = virtscsi_eh_timed_out,
.slave_alloc = virtscsi_device_alloc,

- .can_queue = 1024,
+ .can_queue = 64,
.dma_boundary = UINT_MAX,
.use_clustering = ENABLE_CLUSTERING,
.target_alloc = virtscsi_target_alloc,
@@ -839,7 +839,7 @@ static struct scsi_host_template virtscsi_host_template_multi = {
.eh_timed_out = virtscsi_eh_timed_out,
.slave_alloc = virtscsi_device_alloc,

- .can_queue = 1024,
+ .can_queue = 64,
.dma_boundary = UINT_MAX,
.use_clustering = ENABLE_CLUSTERING,
.target_alloc = virtscsi_target_alloc,
@@ -983,7 +983,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
shost->max_id = num_targets;
shost->max_channel = 0;
shost->max_cmd_len = VIRTIO_SCSI_CDB_SIZE;
- shost->nr_hw_queues = num_queues;
+ shost->nr_hw_queues = 1;

#ifdef CONFIG_BLK_DEV_INTEGRITY
if (virtio_has_feature(vdev, VIRTIO_SCSI_F_T10_PI)) {

2017-08-05 15:51:07

by Richard W.M. Jones

Subject: Re: Increased memory usage with scsi-mq

On Sat, Aug 05, 2017 at 03:39:54PM +0200, Christoph Hellwig wrote:
> For now can you apply this testing patch to the guest kernel?
>
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 9be211d68b15..0cbe2c882e1c 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -818,7 +818,7 @@ static struct scsi_host_template virtscsi_host_template_single = {
> .eh_timed_out = virtscsi_eh_timed_out,
> .slave_alloc = virtscsi_device_alloc,
>
> - .can_queue = 1024,
> + .can_queue = 64,
> .dma_boundary = UINT_MAX,
> .use_clustering = ENABLE_CLUSTERING,
> .target_alloc = virtscsi_target_alloc,
> @@ -839,7 +839,7 @@ static struct scsi_host_template virtscsi_host_template_multi = {
> .eh_timed_out = virtscsi_eh_timed_out,
> .slave_alloc = virtscsi_device_alloc,
>
> - .can_queue = 1024,
> + .can_queue = 64,
> .dma_boundary = UINT_MAX,
> .use_clustering = ENABLE_CLUSTERING,
> .target_alloc = virtscsi_target_alloc,
> @@ -983,7 +983,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
> shost->max_id = num_targets;
> shost->max_channel = 0;
> shost->max_cmd_len = VIRTIO_SCSI_CDB_SIZE;
> - shost->nr_hw_queues = num_queues;
> + shost->nr_hw_queues = 1;
>
> #ifdef CONFIG_BLK_DEV_INTEGRITY
> if (virtio_has_feature(vdev, VIRTIO_SCSI_F_T10_PI)) {

Yes, that's an improvement, although it's still a little way off the
density possible the old way:

With scsi-mq enabled: 175 disks
* With this patch: 319 disks *
With scsi-mq disabled: 1755 disks

Also only the first two hunks are necessary. The kernel behaves
exactly the same way with or without the third hunk (ie. num_queues
must already be 1).

Can I infer from this that qemu needs a way to specify the can_queue
setting to the virtio-scsi driver in the guest kernel?

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org

2017-08-07 12:11:44

by Paolo Bonzini

Subject: Re: Increased memory usage with scsi-mq

On 05/08/2017 17:51, Richard W.M. Jones wrote:
> On Sat, Aug 05, 2017 at 03:39:54PM +0200, Christoph Hellwig wrote:
>> For now can you apply this testing patch to the guest kernel?
>>
>> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
>> index 9be211d68b15..0cbe2c882e1c 100644
>> --- a/drivers/scsi/virtio_scsi.c
>> +++ b/drivers/scsi/virtio_scsi.c
>> @@ -818,7 +818,7 @@ static struct scsi_host_template virtscsi_host_template_single = {
>> .eh_timed_out = virtscsi_eh_timed_out,
>> .slave_alloc = virtscsi_device_alloc,
>>
>> - .can_queue = 1024,
>> + .can_queue = 64,
>> .dma_boundary = UINT_MAX,
>> .use_clustering = ENABLE_CLUSTERING,
>> .target_alloc = virtscsi_target_alloc,
>> @@ -839,7 +839,7 @@ static struct scsi_host_template virtscsi_host_template_multi = {
>> .eh_timed_out = virtscsi_eh_timed_out,
>> .slave_alloc = virtscsi_device_alloc,
>>
>> - .can_queue = 1024,
>> + .can_queue = 64,
>> .dma_boundary = UINT_MAX,
>> .use_clustering = ENABLE_CLUSTERING,
>> .target_alloc = virtscsi_target_alloc,
>> @@ -983,7 +983,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
>> shost->max_id = num_targets;
>> shost->max_channel = 0;
>> shost->max_cmd_len = VIRTIO_SCSI_CDB_SIZE;
>> - shost->nr_hw_queues = num_queues;
>> + shost->nr_hw_queues = 1;
>>
>> #ifdef CONFIG_BLK_DEV_INTEGRITY
>> if (virtio_has_feature(vdev, VIRTIO_SCSI_F_T10_PI)) {
>
> Yes, that's an improvement, although it's still a little way off the
> density possible the old way:
>
> With scsi-mq enabled: 175 disks
> * With this patch: 319 disks *
> With scsi-mq disabled: 1755 disks
>
> Also only the first two hunks are necessary. The kernel behaves
> exactly the same way with or without the third hunk (ie. num_queues
> must already be 1).
>
> Can I infer from this that qemu needs a way to specify the can_queue
> setting to the virtio-scsi driver in the guest kernel?

You could also add a module parameter to the driver, and set it to 64 on
the kernel command line (there is an example in
drivers/scsi/vmw_pvscsi.c of how to do it).
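
Something along these lines, for example (untested sketch; the
parameter name and default here are made up, see vmw_pvscsi.c for the
real pattern):

static unsigned int can_queue = 1024;
module_param(can_queue, uint, 0444);
MODULE_PARM_DESC(can_queue, "Maximum number of outstanding commands per host");

and then in virtscsi_probe(), instead of the host template constant:

	shost->can_queue = can_queue;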

Paolo

2017-08-07 12:27:06

by Richard W.M. Jones

Subject: Re: Increased memory usage with scsi-mq

On Mon, Aug 07, 2017 at 02:11:39PM +0200, Paolo Bonzini wrote:
> You could also add a module parameter to the driver, and set it to 64 on
> the kernel command line (there is an example in
> drivers/scsi/vmw_pvscsi.c of how to do it).

[Proviso: I've not tested the performance of different values, nor do
I have any particular knowledge in this area]

can_queue is documented as:

* This determines if we will use a non-interrupt driven
* or an interrupt driven scheme. It is set to the maximum number
* of simultaneous commands a given host adapter will accept.

Wouldn't it be better to make it default to k * number of CPUs for
some small integer k?
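
For example, something like (untested, and k = 4 chosen arbitrarily):

	shost->can_queue = min_t(u32, 4 * num_online_cpus(), 1024);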

I looked at the other scsi drivers and they set it to all kinds of
values: 1, small integers, large integers, configurable values.

Also I noticed this code in virtio_scsi.c:

cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);

but setting cmd_per_lun (as a qemu virtio-scsi-pci parameter) didn't
seem to make any difference to memory usage.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/

2017-08-07 13:07:58

by Paolo Bonzini

Subject: Re: Increased memory usage with scsi-mq

On 07/08/2017 14:27, Richard W.M. Jones wrote:
> Also I noticed this code in virtio_scsi.c:
>
> cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
> shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
>
> but setting cmd_per_lun (as a qemu virtio-scsi-pci parameter) didn't
> seem to make any difference to memory usage.

That's because memory depends on shost->can_queue, not
shost->cmd_per_lun (see scsi_mq_setup_tags).
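
(Roughly, if I remember the code correctly, scsi_mq_setup_tags sizes
the blk-mq tag set as:

	shost->tag_set.queue_depth = shost->can_queue;
	shost->tag_set.nr_hw_queues = shost->nr_hw_queues ? : 1;

so the per-command allocations scale with can_queue times the number
of hardware queues.)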

can_queue should depend on the virtqueue size, which unfortunately can
vary for each virtio-scsi device in theory. The virtqueue size is
retrieved by drivers/virtio prior to calling vring_create_virtqueue, and
in QEMU it is the second argument to virtio_add_queue.

For virtio-scsi this is VIRTIO_SCSI_VQ_SIZE, which is 128, so changing
can_queue to 128 should be a no-brainer.

Thanks,

Paolo

2017-08-09 16:01:16

by Christoph Hellwig

Subject: Re: Increased memory usage with scsi-mq

On Mon, Aug 07, 2017 at 03:07:48PM +0200, Paolo Bonzini wrote:
> can_queue should depend on the virtqueue size, which unfortunately can
> vary for each virtio-scsi device in theory. The virtqueue size is
> retrieved by drivers/virtio prior to calling vring_create_virtqueue, and
> in QEMU it is the second argument to virtio_add_queue.

Why is that unfortunate? We don't even have to set can_queue in
the host template, we can dynamically set it on a per-host basis.

2017-08-09 16:50:16

by Paolo Bonzini

Subject: Re: Increased memory usage with scsi-mq

On 09/08/2017 18:01, Christoph Hellwig wrote:
> On Mon, Aug 07, 2017 at 03:07:48PM +0200, Paolo Bonzini wrote:
>> can_queue should depend on the virtqueue size, which unfortunately can
>> vary for each virtio-scsi device in theory. The virtqueue size is
>> retrieved by drivers/virtio prior to calling vring_create_virtqueue, and
>> in QEMU it is the second argument to virtio_add_queue.
>
> Why is that unfortunate? We don't even have to set can_queue in
> the host template, we can dynamically set it on a per-host basis.

Ah, cool, I thought allocations based on can_queue happened already in
scsi_host_alloc, but they happen at scsi_add_host time.

Paolo

2017-08-10 12:22:38

by Richard W.M. Jones

Subject: Re: Increased memory usage with scsi-mq

On Wed, Aug 09, 2017 at 06:50:10PM +0200, Paolo Bonzini wrote:
> On 09/08/2017 18:01, Christoph Hellwig wrote:
> > On Mon, Aug 07, 2017 at 03:07:48PM +0200, Paolo Bonzini wrote:
> >> can_queue should depend on the virtqueue size, which unfortunately can
> >> vary for each virtio-scsi device in theory. The virtqueue size is
> >> retrieved by drivers/virtio prior to calling vring_create_virtqueue, and
> >> in QEMU it is the second argument to virtio_add_queue.
> >
> > Why is that unfortunate? We don't even have to set can_queue in
> > the host template, we can dynamically set it on a per-host basis.
>
> Ah, cool, I thought allocations based on can_queue happened already in
> scsi_host_alloc, but they happen at scsi_add_host time.

I think I've decoded all that information into the patch below.

I tested it, and it appears to work: when I set cmd_per_lun on the
qemu command line, I see that the guest can add more disks:

With scsi-mq enabled: 175 disks
cmd_per_lun not set: 177 disks *
cmd_per_lun=16: 776 disks *
cmd_per_lun=4: 1160 disks *
With scsi-mq disabled: 1755 disks
* = new result

From my point of view, this is a good result, but you should be warned
that I don't fully understand what's going on here and I may have made
obvious or not-so-obvious mistakes.

I tested the performance impact and it's not noticeable in the
libguestfs case even with very small cmd_per_lun settings, but
libguestfs is largely serial and so this result won't be applicable to
guests in general.

Also, should the guest kernel validate cmd_per_lun to make sure it's
not too small or large? And if so, what would the limits be?

Rich.

From e923e49846189b2f55f3f02b70a290d4be237ed5 Mon Sep 17 00:00:00 2001
From: "Richard W.M. Jones" <[email protected]>
Date: Thu, 10 Aug 2017 12:21:47 +0100
Subject: [PATCH] scsi: virtio_scsi: Set can_queue based on cmd_per_lun passed
by hypervisor.

Signed-off-by: Richard W.M. Jones <[email protected]>
---
drivers/scsi/virtio_scsi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 9be211d68b15..b22591e9b16b 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -973,7 +973,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
goto virtscsi_init_failed;

cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
- shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
+ shost->cmd_per_lun = shost->can_queue = cmd_per_lun;
shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0xFFFF;

/* LUNs > 256 are reported with format 1, so they go in the range
--
2.13.1


--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW

2017-08-10 12:54:36

by Paolo Bonzini

Subject: Re: Increased memory usage with scsi-mq

On 10/08/2017 14:22, Richard W.M. Jones wrote:
> On Wed, Aug 09, 2017 at 06:50:10PM +0200, Paolo Bonzini wrote:
>> On 09/08/2017 18:01, Christoph Hellwig wrote:
>>> On Mon, Aug 07, 2017 at 03:07:48PM +0200, Paolo Bonzini wrote:
>>>> can_queue should depend on the virtqueue size, which unfortunately can
>>>> vary for each virtio-scsi device in theory. The virtqueue size is
>>>> retrieved by drivers/virtio prior to calling vring_create_virtqueue, and
>>>> in QEMU it is the second argument to virtio_add_queue.
>>>
>>> Why is that unfortunate? We don't even have to set can_queue in
>>> the host template, we can dynamically set it on a per-host basis.
>>
>> Ah, cool, I thought allocations based on can_queue happened already in
>> scsi_host_alloc, but they happen at scsi_add_host time.
>
> I think I've decoded all that information into the patch below.
>
> I tested it, and it appears to work: when I set cmd_per_lun on the
> qemu command line, I see that the guest can add more disks:
>
> With scsi-mq enabled: 175 disks
> cmd_per_lun not set: 177 disks *
> cmd_per_lun=16: 776 disks *
> cmd_per_lun=4: 1160 disks *
> With scsi-mq disabled: 1755 disks
> * = new result
>
> From my point of view, this is a good result, but you should be warned
> that I don't fully understand what's going on here and I may have made
> obvious or not-so-obvious mistakes.

can_queue and cmd_per_lun are different. can_queue should be set to the
value of vq->vring.num where vq is the command virtqueue (the first one
is okay if there's >1).

If you want to change it, you'll have to do so in QEMU.

Paolo

> I tested the performance impact and it's not noticeable in the
> libguestfs case even with very small cmd_per_lun settings, but
> libguestfs is largely serial and so this result won't be applicable to
> guests in general.
>
> Also, should the guest kernel validate cmd_per_lun to make sure it's
> not too small or large? And if so, what would the limits be?
>
> Rich.
>
> From e923e49846189b2f55f3f02b70a290d4be237ed5 Mon Sep 17 00:00:00 2001
> From: "Richard W.M. Jones" <[email protected]>
> Date: Thu, 10 Aug 2017 12:21:47 +0100
> Subject: [PATCH] scsi: virtio_scsi: Set can_queue based on cmd_per_lun passed
> by hypervisor.
>
> Signed-off-by: Richard W.M. Jones <[email protected]>
> ---
> drivers/scsi/virtio_scsi.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 9be211d68b15..b22591e9b16b 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -973,7 +973,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
> goto virtscsi_init_failed;
>
> cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
> - shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
> + shost->cmd_per_lun = shost->can_queue = cmd_per_lun;
> shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0xFFFF;
>
> /* LUNs > 256 are reported with format 1, so they go in the range
>

2017-08-10 14:16:40

by Richard W.M. Jones

Subject: Re: Increased memory usage with scsi-mq

On Thu, Aug 10, 2017 at 02:53:58PM +0200, Paolo Bonzini wrote:
> can_queue and cmd_per_lun are different. can_queue should be set to the
> value of vq->vring.num where vq is the command virtqueue (the first one
> is okay if there's >1).
>
> If you want to change it, you'll have to do so in QEMU.

Here are a couple more patches I came up with, the first to Linux,
the second to qemu.

There are a few problems ...

(1) Setting virtqueue_size to < 128 causes a BUG_ON failure in
virtio_ring.c:virtqueue_add at:

BUG_ON(total_sg > vq->vring.num);

I initially thought that I should also set cmd_per_lun to the same
value, but that doesn't help. Is there some relationship between
virtqueue_size and other settings?

(2) Can/should the ctrl, event and cmd queue sizes be set to the same
values? Or should ctrl & event be left at 128?

(3) It seems like it might be a problem that virtqueue_size is not
passed through the virtio_scsi_conf struct (which is ABI between the
kernel and qemu AFAICT and so not easily modifiable). However I think
this is not a problem because virtqueue size is stored in the
Virtqueue Descriptor table, which is how the kernel gets the value.
Is that right?

Rich.


--- kernel patch ---

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 9be211d68b15..d6b4ff634c0d 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -972,6 +972,8 @@ static int virtscsi_probe(struct virtio_device *vdev)
if (err)
goto virtscsi_init_failed;

+ shost->can_queue = virtqueue_get_vring_size(vscsi->req_vqs[0].vq);
+
cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0xFFFF;


--- qemu patch ---

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index eb639442d1..aadd99aad1 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -867,10 +867,10 @@ void virtio_scsi_common_realize(DeviceState *dev,
s->sense_size = VIRTIO_SCSI_SENSE_DEFAULT_SIZE;
s->cdb_size = VIRTIO_SCSI_CDB_DEFAULT_SIZE;

- s->ctrl_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, ctrl);
- s->event_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, evt);
+ s->ctrl_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, ctrl);
+ s->event_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, evt);
for (i = 0; i < s->conf.num_queues; i++) {
- s->cmd_vqs[i] = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, cmd);
+ s->cmd_vqs[i] = virtio_add_queue(vdev, s->conf.virtqueue_size, cmd);
}
}

@@ -917,6 +917,7 @@ static void virtio_scsi_device_unrealize(DeviceState *dev, Error **errp)

static Property virtio_scsi_properties[] = {
DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, 1),
+ DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSI, parent_obj.conf.virtqueue_size, 128),
DEFINE_PROP_UINT32("max_sectors", VirtIOSCSI, parent_obj.conf.max_sectors,
0xFFFF),
DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSI, parent_obj.conf.cmd_per_lun,
diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index de6ae5a9f6..e30a92d3e7 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -48,6 +48,7 @@ typedef struct virtio_scsi_config VirtIOSCSIConfig;

struct VirtIOSCSIConf {
uint32_t num_queues;
+ uint32_t virtqueue_size;
uint32_t max_sectors;
uint32_t cmd_per_lun;
#ifdef CONFIG_VHOST_SCSI


--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

2017-08-10 14:30:58

by Paolo Bonzini

Subject: Re: Increased memory usage with scsi-mq

On 10/08/2017 16:16, Richard W.M. Jones wrote:
> On Thu, Aug 10, 2017 at 02:53:58PM +0200, Paolo Bonzini wrote:
>> can_queue and cmd_per_lun are different. can_queue should be set to the
>> value of vq->vring.num where vq is the command virtqueue (the first one
>> is okay if there's >1).
>>
>> If you want to change it, you'll have to do so in QEMU.
>
> Here are a couple more patches I came up with, the first to Linux,
> the second to qemu.
>
> There are a few problems ...
>
> (1) Setting virtqueue_size to < 128 causes a BUG_ON failure in
> virtio_ring.c:virtqueue_add at:
>
> BUG_ON(total_sg > vq->vring.num);

That BUG_ON is bogus. You can change it to

WARN_ON_ONCE(total_sg > vq->vring.num && !vq->indirect);

and move it in the "else" here:

if (vq->indirect && total_sg > 1 && vq->vq.num_free)
desc = alloc_indirect(_vq, total_sg, gfp);
else
desc = NULL;

You just get a -ENOSPC after the WARN, so no need to crash the kernel!

> (2) Can/should the ctrl, event and cmd queue sizes be set to the same
> values? Or should ctrl & event be left at 128?

It's okay if they're all the same.

> (3) It seems like it might be a problem that virtqueue_size is not
> passed through the virtio_scsi_conf struct (which is ABI between the
> kernel and qemu AFAICT and so not easily modifiable). However I think
> this is not a problem because virtqueue size is stored in the
> Virtqueue Descriptor table, which is how the kernel gets the value.
> Is that right?

Yes.

Paolo

> Rich.
>
>
> --- kernel patch ---
>
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 9be211d68b15..d6b4ff634c0d 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -972,6 +972,8 @@ static int virtscsi_probe(struct virtio_device *vdev)
> if (err)
> goto virtscsi_init_failed;
>
> + shost->can_queue = virtqueue_get_vring_size(vscsi->req_vqs[0].vq);
> +
> cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
> shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
> shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0xFFFF;
>
>
> --- qemu patch ---
>
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index eb639442d1..aadd99aad1 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -867,10 +867,10 @@ void virtio_scsi_common_realize(DeviceState *dev,
> s->sense_size = VIRTIO_SCSI_SENSE_DEFAULT_SIZE;
> s->cdb_size = VIRTIO_SCSI_CDB_DEFAULT_SIZE;
>
> - s->ctrl_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, ctrl);
> - s->event_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, evt);
> + s->ctrl_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, ctrl);
> + s->event_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, evt);
> for (i = 0; i < s->conf.num_queues; i++) {
> - s->cmd_vqs[i] = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, cmd);
> + s->cmd_vqs[i] = virtio_add_queue(vdev, s->conf.virtqueue_size, cmd);
> }
> }
>
> @@ -917,6 +917,7 @@ static void virtio_scsi_device_unrealize(DeviceState *dev, Error **errp)
>
> static Property virtio_scsi_properties[] = {
> DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, 1),
> + DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSI, parent_obj.conf.virtqueue_size, 128),
> DEFINE_PROP_UINT32("max_sectors", VirtIOSCSI, parent_obj.conf.max_sectors,
> 0xFFFF),
> DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSI, parent_obj.conf.cmd_per_lun,
> diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
> index de6ae5a9f6..e30a92d3e7 100644
> --- a/include/hw/virtio/virtio-scsi.h
> +++ b/include/hw/virtio/virtio-scsi.h
> @@ -48,6 +48,7 @@ typedef struct virtio_scsi_config VirtIOSCSIConfig;
>
> struct VirtIOSCSIConf {
> uint32_t num_queues;
> + uint32_t virtqueue_size;
> uint32_t max_sectors;
> uint32_t cmd_per_lun;
> #ifdef CONFIG_VHOST_SCSI
>
>

2017-08-10 15:40:31

by Richard W.M. Jones

Subject: Re: Increased memory usage with scsi-mq

OK this is looking a bit better now.

With scsi-mq enabled: 175 disks
virtqueue_size=64: 318 disks *
virtqueue_size=16: 775 disks *
With scsi-mq disabled: 1755 disks
* = new results
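
(For these runs I set the new controller property on the qemu command
line, e.g.:

  -device virtio-scsi-pci,virtqueue_size=16,id=scsi

with the rest of the command line unchanged.)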

I also ran the whole libguestfs test suite with virtqueue_size=16
(with no failures shown). As this tests many different disk I/O
operations, it gives me some confidence that things generally work.

Do you have any other comments about the patches? I'm not sure I know
enough to write an intelligent commit message for the kernel patch.

Rich.


--- kernel patch ---

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 9be211d68b15..d6b4ff634c0d 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -972,6 +972,8 @@ static int virtscsi_probe(struct virtio_device *vdev)
if (err)
goto virtscsi_init_failed;

+ shost->can_queue = virtqueue_get_vring_size(vscsi->req_vqs[0].vq);
+
cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0xFFFF;
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 5e1b548828e6..2d7509da9f39 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -296,7 +296,6 @@ static inline int virtqueue_add(struct virtqueue *_vq,
}
#endif

- BUG_ON(total_sg > vq->vring.num);
BUG_ON(total_sg == 0);

head = vq->free_head;
@@ -305,8 +304,10 @@ static inline int virtqueue_add(struct virtqueue *_vq,
* buffers, then go indirect. FIXME: tune this threshold */
if (vq->indirect && total_sg > 1 && vq->vq.num_free)
desc = alloc_indirect(_vq, total_sg, gfp);
- else
+ else {
desc = NULL;
+ WARN_ON_ONCE(total_sg > vq->vring.num && !vq->indirect);
+ }

if (desc) {
/* Use a single buffer which doesn't continue */

--- qemu patch ---

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index eb639442d1..aadd99aad1 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -867,10 +867,10 @@ void virtio_scsi_common_realize(DeviceState *dev,
s->sense_size = VIRTIO_SCSI_SENSE_DEFAULT_SIZE;
s->cdb_size = VIRTIO_SCSI_CDB_DEFAULT_SIZE;

- s->ctrl_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, ctrl);
- s->event_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, evt);
+ s->ctrl_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, ctrl);
+ s->event_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, evt);
for (i = 0; i < s->conf.num_queues; i++) {
- s->cmd_vqs[i] = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, cmd);
+ s->cmd_vqs[i] = virtio_add_queue(vdev, s->conf.virtqueue_size, cmd);
}
}

@@ -917,6 +917,7 @@ static void virtio_scsi_device_unrealize(DeviceState *dev, Error **errp)

static Property virtio_scsi_properties[] = {
DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, 1),
+ DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSI, parent_obj.conf.virtqueue_size, 128),
DEFINE_PROP_UINT32("max_sectors", VirtIOSCSI, parent_obj.conf.max_sectors,
0xFFFF),
DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSI, parent_obj.conf.cmd_per_lun,
diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index de6ae5a9f6..e30a92d3e7 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -48,6 +48,7 @@ typedef struct virtio_scsi_config VirtIOSCSIConfig;

struct VirtIOSCSIConf {
uint32_t num_queues;
+ uint32_t virtqueue_size;
uint32_t max_sectors;
uint32_t cmd_per_lun;
#ifdef CONFIG_VHOST_SCSI



--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

2017-08-10 16:04:30

by Paolo Bonzini

Subject: Re: Increased memory usage with scsi-mq

On 10/08/2017 17:40, Richard W.M. Jones wrote:
> OK this is looking a bit better now.
>
> With scsi-mq enabled: 175 disks
> virtqueue_size=64: 318 disks *
> virtqueue_size=16: 775 disks *
> With scsi-mq disabled: 1755 disks
> * = new results
>
> I also ran the whole libguestfs test suite with virtqueue_size=16
> (with no failures shown). As this tests many different disk I/O
> operations, it gives me some confidence that things generally work.

Yes, it looks good. I'll grab you on IRC to discuss the commit message.

Paolo

> Do you have any other comments about the patches? I'm not sure I know
> enough to write an intelligent commit message for the kernel patch.
>
> Rich.

>
> --- kernel patch ---
>
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 9be211d68b15..d6b4ff634c0d 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -972,6 +972,8 @@ static int virtscsi_probe(struct virtio_device *vdev)
> if (err)
> goto virtscsi_init_failed;
>
> + shost->can_queue = virtqueue_get_vring_size(vscsi->req_vqs[0].vq);
> +
> cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1;
> shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue);
> shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0xFFFF;
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 5e1b548828e6..2d7509da9f39 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -296,7 +296,6 @@ static inline int virtqueue_add(struct virtqueue *_vq,
> }
> #endif
>
> - BUG_ON(total_sg > vq->vring.num);
> BUG_ON(total_sg == 0);
>
> head = vq->free_head;
> @@ -305,8 +304,10 @@ static inline int virtqueue_add(struct virtqueue *_vq,
> * buffers, then go indirect. FIXME: tune this threshold */
> if (vq->indirect && total_sg > 1 && vq->vq.num_free)
> desc = alloc_indirect(_vq, total_sg, gfp);
> - else
> + else {
> desc = NULL;
> + WARN_ON_ONCE(total_sg > vq->vring.num && !vq->indirect);
> + }
>
> if (desc) {
> /* Use a single buffer which doesn't continue */
>
> --- qemu patch ---
>
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index eb639442d1..aadd99aad1 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -867,10 +867,10 @@ void virtio_scsi_common_realize(DeviceState *dev,
> s->sense_size = VIRTIO_SCSI_SENSE_DEFAULT_SIZE;
> s->cdb_size = VIRTIO_SCSI_CDB_DEFAULT_SIZE;
>
> - s->ctrl_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, ctrl);
> - s->event_vq = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, evt);
> + s->ctrl_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, ctrl);
> + s->event_vq = virtio_add_queue(vdev, s->conf.virtqueue_size, evt);
> for (i = 0; i < s->conf.num_queues; i++) {
> - s->cmd_vqs[i] = virtio_add_queue(vdev, VIRTIO_SCSI_VQ_SIZE, cmd);
> + s->cmd_vqs[i] = virtio_add_queue(vdev, s->conf.virtqueue_size, cmd);
> }
> }
>
> @@ -917,6 +917,7 @@ static void virtio_scsi_device_unrealize(DeviceState *dev, Error **errp)
>
> static Property virtio_scsi_properties[] = {
> DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, 1),
> + DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSI, parent_obj.conf.virtqueue_size, 128),
> DEFINE_PROP_UINT32("max_sectors", VirtIOSCSI, parent_obj.conf.max_sectors,
> 0xFFFF),
> DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSI, parent_obj.conf.cmd_per_lun,
> diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
> index de6ae5a9f6..e30a92d3e7 100644
> --- a/include/hw/virtio/virtio-scsi.h
> +++ b/include/hw/virtio/virtio-scsi.h
> @@ -48,6 +48,7 @@ typedef struct virtio_scsi_config VirtIOSCSIConfig;
>
> struct VirtIOSCSIConf {
> uint32_t num_queues;
> + uint32_t virtqueue_size;
> uint32_t max_sectors;
> uint32_t cmd_per_lun;
> #ifdef CONFIG_VHOST_SCSI
>
>
>