2022-11-23 12:26:41

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v3 0/2] Fix scsi device's iodone_cnt mismatch with iorequest_cnt

Following scenario would make scsi_device's iodone_cnt mismatch with
iorequest_cnt even if there is no request on this device any more.

1. request timeout happened. If we do not retry the timeouted command,
this command would be finished in scsi_finish_command() which would
not increase the iodone_cnt; if the timeouted command is retried,
another increasement for iorequest_cnt would be performed, the
command might add iorequest_cnt for multiple times but iodone_cnt
only once. Increase iodone_cnt in scsi_timeout() can handle this
scenario.

2. scsi_dispatch_cmd() failed, while the iorequest_cnt has already been
increased. If scsi_dispatch_cmd() failed, the request would be
requeued, then another iorequest_cnt would be added. So we should not
increase iorequest_cnt if dispatch command failed

V3:
- Rebase to solve conflicts caused by context when apply patch

V2:
- Add description about why we can add iodone_cnt in scsi_timeout()
- Do not increase iorequest_cnt if dispatch command failed

Wenchao Hao (2):
scsi: increase scsi device's iodone_cnt in scsi_timeout()
scsi: donot increase scsi_device's iorequest_cnt if dispatch failed

drivers/scsi/scsi_error.c | 1 +
drivers/scsi/scsi_lib.c | 3 +--
2 files changed, 2 insertions(+), 2 deletions(-)

--
2.32.0


2022-11-23 12:26:42

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v3 1/2] scsi: increase scsi device's iodone_cnt in scsi_timeout()

If an scsi command time out and going to be aborted, we should
increase the iodone_cnt of the related scsi device, or the
iodone_cnt would be less than iorequest_cnt

Increase iodone_cnt in scsi_timeout() would not cause double
accounting issue, briefly analysed as following:

- we add the iodone_cnt when BLK_EH_DONE would be returned in
scsi_timeout(), so the related scsi command's timeout event
would not happened

- if the abort succeed and do not retry, the command would be done
with scsi_finish_command() which would not increase iodone_cnt;

- if the abort succeed and retry the command, it would be requeue,
a scsi_dispatch_cmd() would be called and iorequest_cnt would be
increased again

- if the abort failed, the error handler successfully recover the
device, do not retry this command, the command would be done
with scsi_finish_command() which would not increase iodone_cnt;

- if the abort failed, the error handler successfully recover the
device, and retry this command, the iorequest_cnt would be
increased again

Signed-off-by: Wenchao Hao <[email protected]>
Reviewed-by: Mike Christie <[email protected]>
---
drivers/scsi/scsi_error.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index be2a70c5ac6d..613d5aeb1e3c 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -354,6 +354,7 @@ enum blk_eh_timer_return scsi_timeout(struct request *req)
*/
if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
return BLK_EH_DONE;
+ atomic_inc(&scmd->device->iodone_cnt);
if (scsi_abort_command(scmd) != SUCCESS) {
set_host_byte(scmd, DID_TIME_OUT);
scsi_eh_scmd_add(scmd);
--
2.32.0

2022-11-23 12:26:51

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v3 2/2] scsi: donot increase scsi_device's iorequest_cnt if dispatch failed

If scsi_dispatch_cmd() failed, the scsi command did not send to disks,
so it would never done from LLDs.

scsi scsi_queue_rq() would return BLK_STS_RESOURCE if
scsi_dispatch_cmd() failed, the related request would be requeued, and
the timeout of this request would not fired any more, so no one
would increase iodone_cnt which matches with this increase of
iorequest_cnt.

Signed-off-by: Wenchao Hao <[email protected]>
Reviewed-by: Mike Christie <[email protected]>
---
drivers/scsi/scsi_lib.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ec890865abae..a29d87e57430 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1464,8 +1464,6 @@ static int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
struct Scsi_Host *host = cmd->device->host;
int rtn = 0;

- atomic_inc(&cmd->device->iorequest_cnt);
-
/* check if the device is still usable */
if (unlikely(cmd->device->sdev_state == SDEV_DEL)) {
/* in SDEV_DEL we error all commands. DID_NO_CONNECT
@@ -1764,6 +1762,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
goto out_dec_host_busy;
}

+ atomic_inc(&cmd->device->iorequest_cnt);
return BLK_STS_OK;

out_dec_host_busy:
--
2.32.0

2022-11-24 03:47:10

by Martin K. Petersen

[permalink] [raw]
Subject: Re: [PATCH v3 0/2] Fix scsi device's iodone_cnt mismatch with iorequest_cnt


Wenchao,

> Following scenario would make scsi_device's iodone_cnt mismatch with
> iorequest_cnt even if there is no request on this device any more.

Applied to 6.2/scsi-staging, thanks!

--
Martin K. Petersen Oracle Linux Engineering

2022-12-01 03:49:47

by Martin K. Petersen

[permalink] [raw]
Subject: Re: [PATCH v3 0/2] Fix scsi device's iodone_cnt mismatch with iorequest_cnt

On Wed, 23 Nov 2022 20:21:35 +0800, Wenchao Hao wrote:

> Following scenario would make scsi_device's iodone_cnt mismatch with
> iorequest_cnt even if there is no request on this device any more.
>
> 1. request timeout happened. If we do not retry the timeouted command,
> this command would be finished in scsi_finish_command() which would
> not increase the iodone_cnt; if the timeouted command is retried,
> another increasement for iorequest_cnt would be performed, the
> command might add iorequest_cnt for multiple times but iodone_cnt
> only once. Increase iodone_cnt in scsi_timeout() can handle this
> scenario.
>
> [...]

Applied to 6.2/scsi-queue, thanks!

[1/2] scsi: increase scsi device's iodone_cnt in scsi_timeout()
https://git.kernel.org/mkp/scsi/c/ec9780e48c77
[2/2] scsi: donot increase scsi_device's iorequest_cnt if dispatch failed
https://git.kernel.org/mkp/scsi/c/cfee29ffb45b

--
Martin K. Petersen Oracle Linux Engineering