2018-03-15 23:54:58

by Long Li

[permalink] [raw]
Subject: [PATCH] storvsc: Set up correct queue depth values for IDE devices

From: Long Li <[email protected]>

Unlike SCSI and FC, we don't use multiple channels for IDE. So set queue depth
correctly for IDE.

Also set the correct cmd_per_lun for all devices.

Signed-off-by: Long Li <[email protected]>
---
drivers/scsi/storvsc_drv.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 8c51d628b52e..fba170640e9c 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1722,15 +1722,19 @@ static int storvsc_probe(struct hv_device *device,
max_targets = STORVSC_MAX_TARGETS;
max_channels = STORVSC_MAX_CHANNELS;
/*
- * On Windows8 and above, we support sub-channels for storage.
+ * On Windows8 and above, we support sub-channels for storage
+ * on SCSI and FC controllers.
* The number of sub-channels offerred is based on the number of
* VCPUs in the guest.
*/
- max_sub_channels = (num_cpus / storvsc_vcpus_per_sub_channel);
+ if (!dev_is_ide)
+ max_sub_channels =
+ num_cpus / storvsc_vcpus_per_sub_channel;
}

scsi_driver.can_queue = (max_outstanding_req_per_channel *
(max_sub_channels + 1));
+ scsi_driver.cmd_per_lun = scsi_driver.can_queue;

host = scsi_host_alloc(&scsi_driver,
sizeof(struct hv_host_device));
--
2.14.1



2018-03-16 01:55:08

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [PATCH] storvsc: Set up correct queue depth values for IDE devices

On Thu, 15 Mar 2018 16:52:23 -0700
Long Li <[email protected]> wrote:

> From: Long Li <[email protected]>
>
> Unlike SCSI and FC, we don't use multiple channels for IDE. So set queue depth
> correctly for IDE.
>
> Also set the correct cmd_per_lun for all devices.
>
> Signed-off-by: Long Li <[email protected]>

Looks correct.
Acked-by: Stephen Hemminger <[email protected]>

2018-03-16 15:56:03

by Michael Kelley (EOSG)

[permalink] [raw]
Subject: RE: [PATCH] storvsc: Set up correct queue depth values for IDE devices

> -----Original Message-----
> From: [email protected] <[email protected]> On Behalf
> Of Long Li
> Sent: Thursday, March 15, 2018 4:52 PM
> To: KY Srinivasan <[email protected]>; Haiyang Zhang <[email protected]>; Stephen
> Hemminger <[email protected]>; James E . J . Bottomley <[email protected]>;
> Martin K . Petersen <[email protected]>; [email protected]; linux-
> [email protected]; [email protected]
> Cc: Long Li <[email protected]>
> Subject: [PATCH] storvsc: Set up correct queue depth values for IDE devices
>
> From: Long Li <[email protected]>
>
> Unlike SCSI and FC, we don't use multiple channels for IDE. So set queue depth
> correctly for IDE.
>
> Also set the correct cmd_per_lun for all devices.
>
> Signed-off-by: Long Li <[email protected]>
> ---
> drivers/scsi/storvsc_drv.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> index 8c51d628b52e..fba170640e9c 100644
> --- a/drivers/scsi/storvsc_drv.c
> +++ b/drivers/scsi/storvsc_drv.c
> @@ -1722,15 +1722,19 @@ static int storvsc_probe(struct hv_device *device,
> max_targets = STORVSC_MAX_TARGETS;
> max_channels = STORVSC_MAX_CHANNELS;
> /*
> - * On Windows8 and above, we support sub-channels for storage.
> + * On Windows8 and above, we support sub-channels for storage
> + * on SCSI and FC controllers.
> * The number of sub-channels offerred is based on the number of
> * VCPUs in the guest.
> */
> - max_sub_channels = (num_cpus / storvsc_vcpus_per_sub_channel);
> + if (!dev_is_ide)
> + max_sub_channels =
> + num_cpus / storvsc_vcpus_per_sub_channel;

This calculation of the # of sub-channels doesn't get the right answer (and it
didn't before this patch either). storvsc_vcpus_per_sub_channel defaults to 4.
If num_cpus is 8, max_sub_channels will be 2, but it should be 1. The
sub-channel count should not include the main channel since we add 1 to the
sub-channel count below when calculating can_queue.

Furthermore, this is calculation is just a guess, in the sense that we're
replicating the algorithm we think Hyper-V is using to determine the number
of sub-channels to offer. It turns out Hyper-V is changing that algorithm for
particular devices in an upcoming new Azure VM size. But the only use of
max_sub_channels is in the calculation of can_queue below, so the impact
is minimal.

> }
>
> scsi_driver.can_queue = (max_outstanding_req_per_channel *
> (max_sub_channels + 1));
> + scsi_driver.cmd_per_lun = scsi_driver.can_queue;

can_queue is defined as "int", while cmd_per_lun is defined as "short".
The calculated value of can_queue could easily be over 32,767 with
15 sub-channels and max_outstanding_req_per_channel being 3036
for the default 1 Mbyte ring buffer. So the assignment to cmd_per_lun
could produce truncation and even a negative number.

More broadly, since max_outstanding_req_per_channel is based on
the ring buffer size, these calculations imply that Hyper-V storvsp's
queuing capacity is based on the ring buffer size. I don't think that's
the case. From conversations with the storvsp folks, I think Hyper-V
always removes entries from the guest->host ring buffer and then
lets storvsp queue them separately. So we don't want to be linking
cmd_per_lun (or even can_queue, for that matter) to the ring buffer
size. The current default ring buffer size of 1 Mbyte is probably 10x
bigger than needed, and we want to be able to adjust that without
ending up with can_queue and cmd_per_lun values that are too
small.

We would probably do better to set can_queue to a constant, and
leave cmd_per_lun at its current value of 2048. The can_queue
value is already capped at 10240 in the blk-mq layer, so maybe
that's a reasonable constant to use.

Thoughts?

>
> host = scsi_host_alloc(&scsi_driver,
> sizeof(struct hv_host_device));
> --
> 2.14.1


2018-03-16 18:29:55

by Long Li

[permalink] [raw]
Subject: RE: [PATCH] storvsc: Set up correct queue depth values for IDE devices

> > Subject: [PATCH] storvsc: Set up correct queue depth values for IDE
> > devices
> >
> > From: Long Li <[email protected]>
> >
> > Unlike SCSI and FC, we don't use multiple channels for IDE. So set
> > queue depth correctly for IDE.
> >
> > Also set the correct cmd_per_lun for all devices.
> >
> > Signed-off-by: Long Li <[email protected]>
> > ---
> > drivers/scsi/storvsc_drv.c | 8 ++++++--
> > 1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> > index 8c51d628b52e..fba170640e9c 100644
> > --- a/drivers/scsi/storvsc_drv.c
> > +++ b/drivers/scsi/storvsc_drv.c
> > @@ -1722,15 +1722,19 @@ static int storvsc_probe(struct hv_device
> *device,
> > max_targets = STORVSC_MAX_TARGETS;
> > max_channels = STORVSC_MAX_CHANNELS;
> > /*
> > - * On Windows8 and above, we support sub-channels for
> storage.
> > + * On Windows8 and above, we support sub-channels for
> storage
> > + * on SCSI and FC controllers.
> > * The number of sub-channels offerred is based on the
> number of
> > * VCPUs in the guest.
> > */
> > - max_sub_channels = (num_cpus /
> storvsc_vcpus_per_sub_channel);
> > + if (!dev_is_ide)
> > + max_sub_channels =
> > + num_cpus / storvsc_vcpus_per_sub_channel;
>
> This calculation of the # of sub-channels doesn't get the right answer (and it
> didn't before this patch either). storvsc_vcpus_per_sub_channel defaults to
> 4.
> If num_cpus is 8, max_sub_channels will be 2, but it should be 1. The sub-
> channel count should not include the main channel since we add 1 to the
> sub-channel count below when calculating can_queue.

This is a good point. I will fix the code calculating can_queue.

>
> Furthermore, this is calculation is just a guess, in the sense that we're
> replicating the algorithm we think Hyper-V is using to determine the number
> of sub-channels to offer. It turns out Hyper-V is changing that algorithm for
> particular devices in an upcoming new Azure VM size. But the only use of
> max_sub_channels is in the calculation of can_queue below, so the impact is
> minimal.
>
> > }
> >
> > scsi_driver.can_queue = (max_outstanding_req_per_channel *
> > (max_sub_channels + 1));
> > + scsi_driver.cmd_per_lun = scsi_driver.can_queue;
>
> can_queue is defined as "int", while cmd_per_lun is defined as "short".
> The calculated value of can_queue could easily be over 32,767 with
> 15 sub-channels and max_outstanding_req_per_channel being 3036 for the
> default 1 Mbyte ring buffer. So the assignment to cmd_per_lun could
> produce truncation and even a negative number.

This is a good catch. I think I should try calling blk_set_queue_depth() and pass the correct value.

>
> More broadly, since max_outstanding_req_per_channel is based on the ring
> buffer size, these calculations imply that Hyper-V storvsp's queuing capacity
> is based on the ring buffer size. I don't think that's the case. From
> conversations with the storvsp folks, I think Hyper-V always removes entries
> from the guest->host ring buffer and then
> lets storvsp queue them separately. So we don't want to be linking
> cmd_per_lun (or even can_queue, for that matter) to the ring buffer size.
> The current default ring buffer size of 1 Mbyte is probably 10x bigger than
> needed, and we want to be able to adjust that without ending up with
> can_queue and cmd_per_lun values that are too small.

cmd_per_lun needs to reflect the device capacity. What value do you propose? It's not a good idea to leave them constant. Setting those values are also important because we don't' want to return BUSY on writing to ring buffer on full, that will slow down the SCSI stack.

Historically we use ring buffer size to calculate device properties (e.g. can_queue for SCSI host).

I agree this doesn't need to be based on the exact queuing capacity of ring buffer, maybe we can do 2X of that value (e.g. look at how block uses nr_request in MQ). Setting those values smaller is more conservative and I don't see an ill effect.

>
> We would probably do better to set can_queue to a constant, and
> leave cmd_per_lun at its current value of 2048. The can_queue
> value is already capped at 10240 in the blk-mq layer, so maybe that's a
> reasonable constant to use.

Actually this is not a good idea for smaller ring buffers. You'll see the problem when setting both ring buffer sizes to 10 pages.

>
> Thoughts?
>
> >
> > host = scsi_host_alloc(&scsi_driver,
> > sizeof(struct hv_host_device));
> > --
> > 2.14.1