2022-10-21 10:44:19

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v2 0/2] Fix scsi device's iodone_cnt mismatch with iorequest_cnt

Following scenario would make scsi_device's iodone_cnt mismatch with
iorequest_cnt even if there is no request on this device any more.

1. request timeout happened. If we do not retry the timeouted command,
this command would be finished in scsi_finish_command() which would
not increase the iodone_cnt; if the timeouted command is retried,
another increasement for iorequest_cnt would be performed, the
command might add iorequest_cnt for multiple times but iodone_cnt
only once. Increase iodone_cnt in scsi_timeout() can handle this
scenario.

2. scsi_dispatch_cmd() failed, while the iorequest_cnt has already been
increased. If scsi_dispatch_cmd() failed, the request would be
requeued, then another iorequest_cnt would be added. So we should not
increase iorequest_cnt if dispatch command failed

V2:
- Add description about why we can add iodone_cnt in scsi_timeout()
- Do not increase iorequest_cnt if dispatch command failed

Wenchao Hao (2):
scsi: increase scsi device's iodone_cnt in scsi_timeout()
scsi: donot increase scsi_device's iorequest_cnt if dispatch failed

drivers/scsi/scsi_error.c | 1 +
drivers/scsi/scsi_lib.c | 3 +--
2 files changed, 2 insertions(+), 2 deletions(-)

--
2.35.3


2022-10-21 10:44:38

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v2 2/2] scsi: donot increase scsi_device's iorequest_cnt if dispatch failed

If scsi_dispatch_cmd() failed, the scsi command did not send to disks,
so it would never done from LLDs.

scsi scsi_queue_rq() would return BLK_STS_RESOURCE if
scsi_dispatch_cmd() failed, the related request would be requeued, and
the timeout of this request would not fired any more, so no one
would increase iodone_cnt which matches with this increase of
iorequest_cnt.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_lib.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 8b89fab7c420..71edb9ffbe16 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1469,8 +1469,6 @@ static int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
struct Scsi_Host *host = cmd->device->host;
int rtn = 0;

- atomic_inc(&cmd->device->iorequest_cnt);
-
/* check if the device is still usable */
if (unlikely(cmd->device->sdev_state == SDEV_DEL)) {
/* in SDEV_DEL we error all commands. DID_NO_CONNECT
@@ -1764,6 +1762,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
goto out_dec_host_busy;
}

+ atomic_inc(&cmd->device->iorequest_cnt);
return BLK_STS_OK;

out_dec_host_busy:
--
2.35.3

2022-10-21 11:35:28

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v2 1/2] scsi: increase scsi device's iodone_cnt in scsi_timeout()

If an scsi command time out and going to be aborted, we should
increase the iodone_cnt of the related scsi device, or the
iodone_cnt would be less than iorequest_cnt

Increase iodone_cnt in scsi_timeout() would not cause double
accounting issue, briefly analysed as following:

- we add the iodone_cnt when BLK_EH_DONE would be returned in
scsi_timeout(), so the related scsi command's timeout event
would not happened

- if the abort succeed and do not retry, the command would be done
with scsi_finish_command() which would not increase iodone_cnt;

- if the abort succeed and retry the command, it would be requeue,
a scsi_dispatch_cmd() would be called and iorequest_cnt would be
increased again

- if the abort failed, the error handler successfully recover the
device, do not retry this command, the command would be done
with scsi_finish_command() which would not increase iodone_cnt;

- if the abort failed, the error handler successfully recover the
device, and retry this command, the iorequest_cnt would be
increased again

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_error.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 6995c8979230..052b00f57b56 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -356,6 +356,7 @@ enum blk_eh_timer_return scsi_timeout(struct request *req)
*/
if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
return BLK_EH_RESET_TIMER;
+ atomic_inc(&scmd->device->iodone_cnt);
if (scsi_abort_command(scmd) != SUCCESS) {
set_host_byte(scmd, DID_TIME_OUT);
scsi_eh_scmd_add(scmd);
--
2.35.3

2022-11-08 03:47:31

by Wenchao Hao

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] Fix scsi device's iodone_cnt mismatch with iorequest_cnt


On 2022/10/22 7:56, Wenchao Hao wrote:
> Following scenario would make scsi_device's iodone_cnt mismatch with
> iorequest_cnt even if there is no request on this device any more.
>
> 1. request timeout happened. If we do not retry the timeouted command,
> this command would be finished in scsi_finish_command() which would
> not increase the iodone_cnt; if the timeouted command is retried,
> another increasement for iorequest_cnt would be performed, the
> command might add iorequest_cnt for multiple times but iodone_cnt
> only once. Increase iodone_cnt in scsi_timeout() can handle this
> scenario.
>
> 2. scsi_dispatch_cmd() failed, while the iorequest_cnt has already been
> increased. If scsi_dispatch_cmd() failed, the request would be
> requeued, then another iorequest_cnt would be added. So we should not
> increase iorequest_cnt if dispatch command failed
>
> V2:
> - Add description about why we can add iodone_cnt in scsi_timeout()
> - Do not increase iorequest_cnt if dispatch command failed
>
> Wenchao Hao (2):
> scsi: increase scsi device's iodone_cnt in scsi_timeout()
> scsi: donot increase scsi_device's iorequest_cnt if dispatch failed
>
> drivers/scsi/scsi_error.c | 1 +
> drivers/scsi/scsi_lib.c | 3 +--
> 2 files changed, 2 insertions(+), 2 deletions(-)
>

Friendly ping...

2022-11-21 16:13:07

by Wenchao Hao

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] Fix scsi device's iodone_cnt mismatch with iorequest_cnt

On 2022/10/22 7:56, Wenchao Hao wrote:
> Following scenario would make scsi_device's iodone_cnt mismatch with
> iorequest_cnt even if there is no request on this device any more.
>
> 1. request timeout happened. If we do not retry the timeouted command,
> this command would be finished in scsi_finish_command() which would
> not increase the iodone_cnt; if the timeouted command is retried,
> another increasement for iorequest_cnt would be performed, the
> command might add iorequest_cnt for multiple times but iodone_cnt
> only once. Increase iodone_cnt in scsi_timeout() can handle this
> scenario.
>
> 2. scsi_dispatch_cmd() failed, while the iorequest_cnt has already been
> increased. If scsi_dispatch_cmd() failed, the request would be
> requeued, then another iorequest_cnt would be added. So we should not
> increase iorequest_cnt if dispatch command failed
>
> V2:
> - Add description about why we can add iodone_cnt in scsi_timeout()
> - Do not increase iorequest_cnt if dispatch command failed
>
> Wenchao Hao (2):
> scsi: increase scsi device's iodone_cnt in scsi_timeout()
> scsi: donot increase scsi_device's iorequest_cnt if dispatch failed
>
> drivers/scsi/scsi_error.c | 1 +
> drivers/scsi/scsi_lib.c | 3 +--
> 2 files changed, 2 insertions(+), 2 deletions(-)
>

Hi Martin, these two count is useful for us, would like take a look?


2022-11-22 16:33:52

by Mike Christie

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] scsi: increase scsi device's iodone_cnt in scsi_timeout()

On 10/21/22 6:56 PM, Wenchao Hao wrote:
> If an scsi command time out and going to be aborted, we should
> increase the iodone_cnt of the related scsi device, or the
> iodone_cnt would be less than iorequest_cnt
>
> Increase iodone_cnt in scsi_timeout() would not cause double
> accounting issue, briefly analysed as following:
>
> - we add the iodone_cnt when BLK_EH_DONE would be returned in
> scsi_timeout(), so the related scsi command's timeout event
> would not happened
>
> - if the abort succeed and do not retry, the command would be done
> with scsi_finish_command() which would not increase iodone_cnt;
>
> - if the abort succeed and retry the command, it would be requeue,
> a scsi_dispatch_cmd() would be called and iorequest_cnt would be
> increased again
>
> - if the abort failed, the error handler successfully recover the
> device, do not retry this command, the command would be done
> with scsi_finish_command() which would not increase iodone_cnt;
>
> - if the abort failed, the error handler successfully recover the
> device, and retry this command, the iorequest_cnt would be
> increased again
>
> Signed-off-by: Wenchao Hao <[email protected]>
> ---
> drivers/scsi/scsi_error.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 6995c8979230..052b00f57b56 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -356,6 +356,7 @@ enum blk_eh_timer_return scsi_timeout(struct request *req)
> */
> if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
> return BLK_EH_RESET_TIMER;

You will need to rebase this patch because the above line is different now
so it doesn't apply.

It looks ok to me though.

Reviewed-by: Mike Christie <[email protected]>

> + atomic_inc(&scmd->device->iodone_cnt);
> if (scsi_abort_command(scmd) != SUCCESS) {
> set_host_byte(scmd, DID_TIME_OUT);
> scsi_eh_scmd_add(scmd);

2022-11-22 16:46:38

by Mike Christie

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] scsi: donot increase scsi_device's iorequest_cnt if dispatch failed

On 10/21/22 6:56 PM, Wenchao Hao wrote:
> If scsi_dispatch_cmd() failed, the scsi command did not send to disks,
> so it would never done from LLDs.
>
> scsi scsi_queue_rq() would return BLK_STS_RESOURCE if
> scsi_dispatch_cmd() failed, the related request would be requeued, and
> the timeout of this request would not fired any more, so no one
> would increase iodone_cnt which matches with this increase of
> iorequest_cnt.
>
> Signed-off-by: Wenchao Hao <[email protected]>
> ---
> drivers/scsi/scsi_lib.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 8b89fab7c420..71edb9ffbe16 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1469,8 +1469,6 @@ static int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
> struct Scsi_Host *host = cmd->device->host;
> int rtn = 0;
>
> - atomic_inc(&cmd->device->iorequest_cnt);
> -> /* check if the device is still usable */
> if (unlikely(cmd->device->sdev_state == SDEV_DEL)) {
> /* in SDEV_DEL we error all commands. DID_NO_CONNECT
> @@ -1764,6 +1762,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
> goto out_dec_host_busy;
> }
>
> + atomic_inc(&cmd->device->iorequest_cnt);
> return BLK_STS_OK;
>
> out_dec_host_busy:

Reviewed-by: Mike Christie <[email protected]>