After commit ea2f0f77538c, a 416-CPU VM running on Hyper-V hangs during
boot because scsi_add_host_with_dma() sets shost->cmd_per_lun to a
negative number (the below numbers may differ in different kernel versions):
in drivers/scsi/storvsc_drv.c, storvsc_drv_init() sets
'max_outstanding_req_per_channel' to 352, and storvsc_probe() sets
'max_sub_channels' to (416 - 1) / 4 = 103 and sets scsi_driver.can_queue to
352 * (103 + 1) * (100 - 10) / 100 = 32947, which exceeds SHRT_MAX.
Use min_t(int, ...) to fix the issue.
Fixes: ea2f0f77538c ("scsi: core: Cap scsi_host cmd_per_lun at can_queue")
Cc: [email protected]
Signed-off-by: Dexuan Cui <[email protected]>
---
v1 tried to fix the issue by changing the storvsc driver:
https://lwn.net/ml/linux-kernel/BYAPR21MB1270BBC14D5F1AE69FC31A16BFB09@BYAPR21MB1270.namprd21.prod.outlook.com/
v2 directly fixes the scsi core change instead as Michael Kelley and
John Garry suggested (refer to the above link).
drivers/scsi/hosts.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 3f6f14f0cafb..24b72ee4246f 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -220,7 +220,8 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
goto fail;
}
- shost->cmd_per_lun = min_t(short, shost->cmd_per_lun,
+ /* Use min_t(int, ...) in case shost->can_queue exceeds SHRT_MAX */
+ shost->cmd_per_lun = min_t(int, shost->cmd_per_lun,
shost->can_queue);
error = scsi_init_sense_cache(shost);
--
2.25.1
> -----Original Message-----
> From: Dexuan Cui <[email protected]>
> Sent: Thursday, October 7, 2021 1:50 PM
> To: KY Srinivasan <[email protected]>; Stephen Hemminger
> <[email protected]>; [email protected]; [email protected];
> [email protected]; Haiyang Zhang <[email protected]>;
> [email protected]; [email protected]; [email protected]; linux-
> [email protected]; [email protected]; Long Li
> <[email protected]>; Michael Kelley <[email protected]>
> Cc: [email protected]; Dexuan Cui <[email protected]>;
> [email protected]
> Subject: [PATCH v2] scsi: core: Fix shost->cmd_per_lun calculation in
> scsi_add_host_with_dma()
>
> After commit ea2f0f77538c, a 416-CPU VM running on Hyper-V hangs during
> boot because scsi_add_host_with_dma() sets shost->cmd_per_lun to a
> negative number (the below numbers may differ in different kernel
> versions):
> in drivers/scsi/storvsc_drv.c, storvsc_drv_init() sets
> 'max_outstanding_req_per_channel' to 352, and storvsc_probe() sets
> 'max_sub_channels' to (416 - 1) / 4 = 103 and sets scsi_driver.can_queue
> to
> 352 * (103 + 1) * (100 - 10) / 100 = 32947, which exceeds SHRT_MAX.
>
> Use min_t(int, ...) to fix the issue.
>
> Fixes: ea2f0f77538c ("scsi: core: Cap scsi_host cmd_per_lun at
> can_queue")
> Cc: [email protected]
> Signed-off-by: Dexuan Cui <[email protected]>
> ---
>
> v1 tried to fix the issue by changing the storvsc driver:
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Flwn.ne
> t%2Fml%2Flinux-
> kernel%2FBYAPR21MB1270BBC14D5F1AE69FC31A16BFB09%40BYAPR21MB1270.namprd21
> .prod.outlook.com%2F&data=04%7C01%7Chaiyangz%40microsoft.com%7C366e6
> d0bf755492c631c08d989baf4b9%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7
> C637692258384408217%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoi
> V2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=RbNgx1aBBBzfHC3p
> EdKyBZWaaQIQXS3U%2FItEQUe4NfQ%3D&reserved=0
>
> v2 directly fixes the scsi core change instead as Michael Kelley and
> John Garry suggested (refer to the above link).
>
> drivers/scsi/hosts.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 3f6f14f0cafb..24b72ee4246f 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -220,7 +220,8 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost,
> struct device *dev,
> goto fail;
> }
>
> - shost->cmd_per_lun = min_t(short, shost->cmd_per_lun,
> + /* Use min_t(int, ...) in case shost->can_queue exceeds SHRT_MAX */
> + shost->cmd_per_lun = min_t(int, shost->cmd_per_lun,
> shost->can_queue);
Since shost->can_queue is int, the min_t type should also be int (the
longer type of the two vars).
Reviewed-by: Haiyang Zhang <[email protected]>
On 07/10/2021 18:49, Dexuan Cui wrote:
> After commit ea2f0f77538c, a 416-CPU VM running on Hyper-V hangs during
> boot because scsi_add_host_with_dma() sets shost->cmd_per_lun to a
> negative number (the below numbers may differ in different kernel versions):
> in drivers/scsi/storvsc_drv.c, storvsc_drv_init() sets
> 'max_outstanding_req_per_channel' to 352, and storvsc_probe() sets
> 'max_sub_channels' to (416 - 1) / 4 = 103 and sets scsi_driver.can_queue to
> 352 * (103 + 1) * (100 - 10) / 100 = 32947, which exceeds SHRT_MAX.
I think that you just need to mention that if can_queue exceeds
SHRT_MAX, then there is a data truncation issue.
>
> Use min_t(int, ...) to fix the issue.
>
> Fixes: ea2f0f77538c ("scsi: core: Cap scsi_host cmd_per_lun at can_queue")
> Cc: [email protected]
> Signed-off-by: Dexuan Cui <[email protected]>
It looks ok, I'd just like to test it a bit more.
Thanks,
John
> ---
>
> v1 tried to fix the issue by changing the storvsc driver:
> https://lwn.net/ml/linux-kernel/BYAPR21MB1270BBC14D5F1AE69FC31A16BFB09@BYAPR21MB1270.namprd21.prod.outlook.com/
>
> v2 directly fixes the scsi core change instead as Michael Kelley and
> John Garry suggested (refer to the above link).
To be fair, it was Michael's suggestion
>
> drivers/scsi/hosts.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 3f6f14f0cafb..24b72ee4246f 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -220,7 +220,8 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
> goto fail;
> }
>
> - shost->cmd_per_lun = min_t(short, shost->cmd_per_lun,
> + /* Use min_t(int, ...) in case shost->can_queue exceeds SHRT_MAX */
> + shost->cmd_per_lun = min_t(int, shost->cmd_per_lun,
> shost->can_queue);
>
> error = scsi_init_sense_cache(shost);
>
> From: John Garry <[email protected]>
> Sent: Thursday, October 7, 2021 2:42 PM
>
> On 07/10/2021 18:49, Dexuan Cui wrote:
> > After commit ea2f0f77538c, a 416-CPU VM running on Hyper-V hangs during
> > boot because scsi_add_host_with_dma() sets shost->cmd_per_lun to a
> > negative number (the below numbers may differ in different kernel versions):
> > in drivers/scsi/storvsc_drv.c, storvsc_drv_init() sets
> > 'max_outstanding_req_per_channel' to 352, and storvsc_probe() sets
> > 'max_sub_channels' to (416 - 1) / 4 = 103 and sets scsi_driver.can_queue to
> > 352 * (103 + 1) * (100 - 10) / 100 = 32947, which exceeds SHRT_MAX.
>
> I think that you just need to mention that if can_queue exceeds
> SHRT_MAX, then there is a data truncation issue.
I just hoped the explanation how the too big 'can_queue' value is generated is helpful.
OK, I think I can change the commit log to:
After commit ea2f0f77538c, a 416-CPU VM running on Hyper-V hangs during
boot because the hv_storvsc driver sets scsi_driver.can_queue to an "int"
value that exceeds SHRT_MAX, and hence scsi_add_host_with_dma() sets
shost->cmd_per_lun to a negative "short" number.
Use min_t(int, ...) to fix the issue.
> It looks ok, I'd just like to test it a bit more.
>
> Thanks,
> John
Thanks! I'll post v3 with the above commit log, and I look forward to your review/test.
> > v2 directly fixes the scsi core change instead as Michael Kelley and
> > John Garry suggested (refer to the above link).
>
> To be fair, it was Michael's suggestion
Yeah. Michael always gives good suggstions when reviewing patches. :-)
On Thu, Oct 07, 2021 at 10:49:57AM -0700, Dexuan Cui wrote:
> After commit ea2f0f77538c, a 416-CPU VM running on Hyper-V hangs during
> boot because scsi_add_host_with_dma() sets shost->cmd_per_lun to a
> negative number (the below numbers may differ in different kernel versions):
> in drivers/scsi/storvsc_drv.c, storvsc_drv_init() sets
> 'max_outstanding_req_per_channel' to 352, and storvsc_probe() sets
> 'max_sub_channels' to (416 - 1) / 4 = 103 and sets scsi_driver.can_queue to
> 352 * (103 + 1) * (100 - 10) / 100 = 32947, which exceeds SHRT_MAX.
>
> Use min_t(int, ...) to fix the issue.
>
> Fixes: ea2f0f77538c ("scsi: core: Cap scsi_host cmd_per_lun at can_queue")
> Cc: [email protected]
> Signed-off-by: Dexuan Cui <[email protected]>
> ---
>
> v1 tried to fix the issue by changing the storvsc driver:
> https://lwn.net/ml/linux-kernel/BYAPR21MB1270BBC14D5F1AE69FC31A16BFB09@BYAPR21MB1270.namprd21.prod.outlook.com/
>
> v2 directly fixes the scsi core change instead as Michael Kelley and
> John Garry suggested (refer to the above link).
>
> drivers/scsi/hosts.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 3f6f14f0cafb..24b72ee4246f 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -220,7 +220,8 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
> goto fail;
> }
>
> - shost->cmd_per_lun = min_t(short, shost->cmd_per_lun,
> + /* Use min_t(int, ...) in case shost->can_queue exceeds SHRT_MAX */
> + shost->cmd_per_lun = min_t(int, shost->cmd_per_lun,
> shost->can_queue);
Looks fine:
Reviewed-by: Ming Lei <[email protected]>
--
Ming