Current code is based on the idea that the max number of SGL entries
also determines the max size of an I/O request. While this idea was
true in older versions of the storvsc driver when SGL entry length
was limited to 4 Kbytes, commit 3d9c3dcc58e9 ("scsi: storvsc: Enable
scatterlist entry lengths > 4Kbytes") removed that limitation. It's
now theoretically possible for the block layer to send requests that
exceed the maximum size supported by Hyper-V. This problem doesn't
currently happen in practice because the block layer defaults to a
512 Kbyte maximum, while Hyper-V in Azure supports 2 Mbyte I/O sizes.
But some future configuration of Hyper-V could have a smaller max I/O
size, and the block layer could exceed that max.
Fix this by correctly setting max_sectors as well as sg_tablesize to
reflect the maximum I/O size that Hyper-V reports. While allowing
I/O sizes larger than the block layer default of 512 Kbytes doesn’t
provide any noticeable performance benefit in the tests we ran, it's
still appropriate to report the correct underlying Hyper-V capabilities
to the Linux block layer.
Also tweak the virt_boundary_mask to reflect that the required
alignment derives from Hyper-V communication using a 4 Kbyte page size,
and not on the guest page size, which might be bigger (eg. ARM64).
Fixes: '3d9c3dcc58e9 ("scsi: storvsc: Enable scatter list entry lengths > 4Kbytes")'
Signed-off-by: Saurabh Sengar <[email protected]>
---
V2
- More descriptive commit subject and message
- Better logic by considering max_transfer_bytes aligning to HV_HYP_PAGE_SIZE
drivers/scsi/storvsc_drv.c | 26 +++++++++++++++++++++-----
1 file changed, 21 insertions(+), 5 deletions(-)
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index ca3530982e52..99d3be1b6089 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1844,7 +1844,7 @@ static struct scsi_host_template scsi_driver = {
.cmd_per_lun = 2048,
.this_id = -1,
/* Ensure there are no gaps in presented sgls */
- .virt_boundary_mask = PAGE_SIZE-1,
+ .virt_boundary_mask = HV_HYP_PAGE_SIZE - 1,
.no_write_same = 1,
.track_queue_depth = 1,
.change_queue_depth = storvsc_change_queue_depth,
@@ -1895,6 +1895,7 @@ static int storvsc_probe(struct hv_device *device,
int target = 0;
struct storvsc_device *stor_device;
int max_sub_channels = 0;
+ u32 max_tx_bytes;
/*
* We support sub-channels for storage on SCSI and FC controllers.
@@ -1968,12 +1969,27 @@ static int storvsc_probe(struct hv_device *device,
}
/* max cmd length */
host->max_cmd_len = STORVSC_MAX_CMD_LEN;
-
+ /* Any reasonable Hyper-V configuration should provide
+ * max_transfer_bytes value aligning to HV_HYP_PAGE_SIZE,
+ * protecting it from any weird value.
+ */
+ max_tx_bytes = round_down(stor_device->max_transfer_bytes, HV_HYP_PAGE_SIZE);
+ /* max_hw_sectors_kb */
+ host->max_sectors = max_tx_bytes >> 9;
/*
- * set the table size based on the info we got
- * from the host.
+ * There are 2 requirements for Hyper-V storvsc sgl segments,
+ * based on which the below calculation for max segments is
+ * done:
+ *
+ * 1. Except for the first and last sgl segment, all sgl segments
+ * should be align to HV_HYP_PAGE_SIZE, that also means the
+ * maximum number of segments in a sgl can be calculated by
+ * dividing the total max transfer length by HV_HYP_PAGE_SIZE.
+ *
+ * 2. Except for the first and last, each entry in the SGL must
+ * have an offset that is a multiple of HV_HYP_PAGE_SIZE.
*/
- host->sg_tablesize = (stor_device->max_transfer_bytes >> PAGE_SHIFT);
+ host->sg_tablesize = (max_tx_bytes >> HV_HYP_PAGE_SHIFT) + 1;
/*
* For non-IDE disks, the host supports multiple channels.
* Set the number of HW queues we are supporting.
--
2.25.1
From: Saurabh Sengar <[email protected]> Sent: Monday, June 13, 2022 10:42 AM
>
> Current code is based on the idea that the max number of SGL entries
> also determines the max size of an I/O request. While this idea was
> true in older versions of the storvsc driver when SGL entry length
> was limited to 4 Kbytes, commit 3d9c3dcc58e9 ("scsi: storvsc: Enable
> scatterlist entry lengths > 4Kbytes") removed that limitation. It's
> now theoretically possible for the block layer to send requests that
> exceed the maximum size supported by Hyper-V. This problem doesn't
> currently happen in practice because the block layer defaults to a
> 512 Kbyte maximum, while Hyper-V in Azure supports 2 Mbyte I/O sizes.
> But some future configuration of Hyper-V could have a smaller max I/O
> size, and the block layer could exceed that max.
>
> Fix this by correctly setting max_sectors as well as sg_tablesize to
> reflect the maximum I/O size that Hyper-V reports. While allowing
> I/O sizes larger than the block layer default of 512 Kbytes doesn’t
> provide any noticeable performance benefit in the tests we ran, it's
> still appropriate to report the correct underlying Hyper-V capabilities
> to the Linux block layer.
>
> Also tweak the virt_boundary_mask to reflect that the required
> alignment derives from Hyper-V communication using a 4 Kbyte page size,
> and not on the guest page size, which might be bigger (eg. ARM64).
>
> Fixes: '3d9c3dcc58e9 ("scsi: storvsc: Enable scatter list entry lengths > 4Kbytes")'
I don't think the Fixes: tag should have the surrounding single quotes.
At least the Documentation/process/submitting-patches.rst file
doesn't show it that way.
> Signed-off-by: Saurabh Sengar <[email protected]>
> ---
> V2
> - More descriptive commit subject and message
> - Better logic by considering max_transfer_bytes aligning to HV_HYP_PAGE_SIZE
>
> drivers/scsi/storvsc_drv.c | 26 +++++++++++++++++++++-----
> 1 file changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> index ca3530982e52..99d3be1b6089 100644
> --- a/drivers/scsi/storvsc_drv.c
> +++ b/drivers/scsi/storvsc_drv.c
> @@ -1844,7 +1844,7 @@ static struct scsi_host_template scsi_driver = {
> .cmd_per_lun = 2048,
> .this_id = -1,
> /* Ensure there are no gaps in presented sgls */
> - .virt_boundary_mask = PAGE_SIZE-1,
> + .virt_boundary_mask = HV_HYP_PAGE_SIZE - 1,
> .no_write_same = 1,
> .track_queue_depth = 1,
> .change_queue_depth = storvsc_change_queue_depth,
> @@ -1895,6 +1895,7 @@ static int storvsc_probe(struct hv_device *device,
> int target = 0;
> struct storvsc_device *stor_device;
> int max_sub_channels = 0;
> + u32 max_tx_bytes;
I don't normally nitpick local variables names, but this one follows the
wrong convention. In all the cases I've seen, "tx" means "transmit",
and "rx" means "receive", usually for network code. My brain
instinctively parses the above as "max transmit bytes", which
seems weird in this context. "xfer" is usually the short form of
"transfer", so "max_xfer_bytes".
>
> /*
> * We support sub-channels for storage on SCSI and FC controllers.
> @@ -1968,12 +1969,27 @@ static int storvsc_probe(struct hv_device *device,
> }
> /* max cmd length */
> host->max_cmd_len = STORVSC_MAX_CMD_LEN;
> -
> + /* Any reasonable Hyper-V configuration should provide
> + * max_transfer_bytes value aligning to HV_HYP_PAGE_SIZE,
> + * protecting it from any weird value.
> + */
Outside of the "net" area, multi-line comments should start with a
blank "/*" line, like is done below.
> + max_tx_bytes = round_down(stor_device->max_transfer_bytes, HV_HYP_PAGE_SIZE);
> + /* max_hw_sectors_kb */
> + host->max_sectors = max_tx_bytes >> 9;
> /*
> - * set the table size based on the info we got
> - * from the host.
> + * There are 2 requirements for Hyper-V storvsc sgl segments,
> + * based on which the below calculation for max segments is
> + * done:
> + *
> + * 1. Except for the first and last sgl segment, all sgl segments
> + * should be align to HV_HYP_PAGE_SIZE, that also means the
> + * maximum number of segments in a sgl can be calculated by
> + * dividing the total max transfer length by HV_HYP_PAGE_SIZE.
> + *
> + * 2. Except for the first and last, each entry in the SGL must
> + * have an offset that is a multiple of HV_HYP_PAGE_SIZE.
> */
> - host->sg_tablesize = (stor_device->max_transfer_bytes >> PAGE_SHIFT);
> + host->sg_tablesize = (max_tx_bytes >> HV_HYP_PAGE_SHIFT) + 1;
> /*
> * For non-IDE disks, the host supports multiple channels.
> * Set the number of HW queues we are supporting.
> --
> 2.25.1