2023-09-28 07:53:11

by Wenchao Hao

Subject: [PATCH v2 0/4] SCSI: Fix issues between removing device and error handle

I am testing SCSI error handling with my previous scsi_debug error
injection patches, and found some issues when device removal and
error handling happen at the same time.

These issues are triggered because devices that are being removed are
skipped by shost_for_each_device().
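
The skipping behavior can be sketched in userspace C. This is only a
model under stated assumptions: the state names mirror the kernel's
SDEV_CANCEL and SDEV_DEL, but every type and function below
(model_sdev, model_device_get, model_next_device,
model_count_failed_skipping) is hypothetical and is not kernel code.
It assumes the iterator takes a reference on each device and that the
reference is refused for a device in a removal state.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for device states; a userspace model only. */
enum model_sdev_state {
    MODEL_SDEV_RUNNING,
    MODEL_SDEV_CANCEL,   /* removal in progress */
    MODEL_SDEV_DEL       /* removed */
};

struct model_sdev {
    int id;
    enum model_sdev_state state;
    int failed_cmds;     /* commands waiting for error handling */
};

/* Model of taking a reference: refused for devices being removed. */
static int model_device_get(const struct model_sdev *sdev)
{
    assert(sdev != NULL);
    return (sdev->state == MODEL_SDEV_CANCEL ||
            sdev->state == MODEL_SDEV_DEL) ? -1 : 0;
}

/* Index of the next device at or after 'from' that can be referenced,
 * or n when there is none. Devices being removed are silently skipped. */
static size_t model_next_device(const struct model_sdev *devs, size_t n,
                                size_t from)
{
    while (from < n && model_device_get(&devs[from]) != 0)
        from++;
    return from;
}

/* Sum failed commands as a skipping iterator would see them. */
static int model_count_failed_skipping(const struct model_sdev *devs,
                                       size_t n)
{
    int total = 0;
    size_t i;

    for (i = model_next_device(devs, n, 0); i < n;
         i = model_next_device(devs, n, i + 1))
        total += devs[i].failed_cmds;
    return total;
}
```

With one running device holding 1 failed command and one device in
removal holding 2, the model reports 1 instead of 3, which is the kind
of undercount described for the error handler statistics.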

Three issues were found:
1. The statistics printed at the beginning of scsi_error_handler() are wrong
2. Device reset is not triggered
3. I/O requeued to the request_queue hangs after error handling

V2:
- Fix the I/O hang by running all devices' queues after the error handler
- Do not modify shost_for_each_device() directly; instead add a new
  helper that iterates over devices without skipping those being removed
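
The effect of the two V2 changes can be sketched with a small userspace
model. Nothing here is taken from the patches: model_sdev,
model_run_host_queues and the include_removing flag are hypothetical
names chosen for illustration, and running a queue is reduced to moving
a counter. The sketch only contrasts an iteration that skips devices in
removal with one that visits every device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_sdev_state {
    MODEL_SDEV_RUNNING,
    MODEL_SDEV_CANCEL,   /* removal in progress */
    MODEL_SDEV_DEL       /* removed */
};

struct model_sdev {
    enum model_sdev_state state;
    int requeued;     /* I/O put back on the queue during error handling */
    int dispatched;   /* I/O dispatched once the queue was run */
};

static bool model_being_removed(const struct model_sdev *sdev)
{
    return sdev->state == MODEL_SDEV_CANCEL ||
           sdev->state == MODEL_SDEV_DEL;
}

/* Run one device's queue: everything requeued gets dispatched. */
static void model_run_queue(struct model_sdev *sdev)
{
    sdev->dispatched += sdev->requeued;
    sdev->requeued = 0;
}

/* Restart queues after error handling.
 * include_removing == false models the skipping iteration;
 * include_removing == true models a helper that visits every device. */
static void model_run_host_queues(struct model_sdev *devs, size_t n,
                                  bool include_removing)
{
    size_t i;

    assert(devs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!include_removing && model_being_removed(&devs[i]))
            continue;
        model_run_queue(&devs[i]);
    }
}

/* Number of requests still sitting on queues that were never run. */
static int model_stuck_requests(const struct model_sdev *devs, size_t n)
{
    int stuck = 0;
    size_t i;

    for (i = 0; i < n; i++)
        stuck += devs[i].requeued;
    return stuck;
}
```

In this model, a device in removal with 3 requeued requests keeps them
stuck when queues are run with the skipping iteration, and they are
dispatched once every device's queue is run.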

Wenchao Hao (4):
  scsi: core: Add new helper to iterate all devices of host
  scsi: scsi_error: Fix wrong statistic when print error info
  scsi: scsi_error: Fix device reset is not triggered
  scsi: scsi_core: Fix IO hang when device removing

 drivers/scsi/scsi.c        | 43 +++++++++++++++++++++++++-------------
 drivers/scsi/scsi_error.c  |  4 ++--
 drivers/scsi/scsi_lib.c    |  2 +-
 include/scsi/scsi_device.h | 25 +++++++++++++++++++---
 4 files changed, 53 insertions(+), 21 deletions(-)

--
2.32.0


2023-10-07 09:46:25

by Wenchao Hao

Subject: Re: [PATCH v2 0/4] SCSI: Fix issues between removing device and error handle

On 2023/9/28 15:35, Wenchao Hao wrote:
> I am testing SCSI error handling with my previous scsi_debug error
> injection patches, and found some issues when device removal and
> error handling happen at the same time.
>
> These issues are triggered because devices that are being removed are
> skipped by shost_for_each_device().
>

ping...

> Three issues were found:
> 1. The statistics printed at the beginning of scsi_error_handler() are wrong
> 2. Device reset is not triggered
> 3. I/O requeued to the request_queue hangs after error handling
>
> V2:
> - Fix the I/O hang by running all devices' queues after the error handler
> - Do not modify shost_for_each_device() directly; instead add a new
>   helper that iterates over devices without skipping those being removed
>
> Wenchao Hao (4):
>   scsi: core: Add new helper to iterate all devices of host
>   scsi: scsi_error: Fix wrong statistic when print error info
>   scsi: scsi_error: Fix device reset is not triggered
>   scsi: scsi_core: Fix IO hang when device removing
>
>  drivers/scsi/scsi.c        | 43 +++++++++++++++++++++++++-------------
>  drivers/scsi/scsi_error.c  |  4 ++--
>  drivers/scsi/scsi_lib.c    |  2 +-
>  include/scsi/scsi_device.h | 25 +++++++++++++++++++---
>  4 files changed, 53 insertions(+), 21 deletions(-)
>

2023-10-09 06:59:31

by Wenchao Hao

Subject: Re: [PATCH v2 0/4] SCSI: Fix issues between removing device and error handle

On 2023/9/28 15:35, Wenchao Hao wrote:
> I am testing SCSI error handling with my previous scsi_debug error
> injection patches, and found some issues when device removal and
> error handling happen at the same time.
>
> These issues are triggered because devices that are being removed are
> skipped by shost_for_each_device().
>
> Three issues were found:
> 1. The statistics printed at the beginning of scsi_error_handler() are wrong
> 2. Device reset is not triggered
> 3. I/O requeued to the request_queue hangs after error handling
>

These patches fix bugs that are easy to reproduce when device removal
and error handling happen at the same time, so a friendly ping again...

> V2:
> - Fix the I/O hang by running all devices' queues after the error handler
> - Do not modify shost_for_each_device() directly; instead add a new
>   helper that iterates over devices without skipping those being removed
>
> Wenchao Hao (4):
>   scsi: core: Add new helper to iterate all devices of host
>   scsi: scsi_error: Fix wrong statistic when print error info
>   scsi: scsi_error: Fix device reset is not triggered
>   scsi: scsi_core: Fix IO hang when device removing
>
>  drivers/scsi/scsi.c        | 43 +++++++++++++++++++++++++-------------
>  drivers/scsi/scsi_error.c  |  4 ++--
>  drivers/scsi/scsi_lib.c    |  2 +-
>  include/scsi/scsi_device.h | 25 +++++++++++++++++++---
>  4 files changed, 53 insertions(+), 21 deletions(-)
>

2023-10-10 02:16:03

by Wenchao Hao

Subject: Re: [PATCH v2 0/4] SCSI: Fix issues between removing device and error handle

On 2023/9/28 15:35, Wenchao Hao wrote:
> I am testing SCSI error handling with my previous scsi_debug error
> injection patches, and found some issues when device removal and
> error handling happen at the same time.
>
> These issues are triggered because devices that are being removed are
> skipped by shost_for_each_device().
>
> Three issues were found:
> 1. The statistics printed at the beginning of scsi_error_handler() are wrong
> 2. Device reset is not triggered
> 3. I/O requeued to the request_queue hangs after error handling
>

Hi Martin, would you help review these patches?

> V2:
> - Fix the I/O hang by running all devices' queues after the error handler
> - Do not modify shost_for_each_device() directly; instead add a new
>   helper that iterates over devices without skipping those being removed
>
> Wenchao Hao (4):
>   scsi: core: Add new helper to iterate all devices of host
>   scsi: scsi_error: Fix wrong statistic when print error info
>   scsi: scsi_error: Fix device reset is not triggered
>   scsi: scsi_core: Fix IO hang when device removing
>
>  drivers/scsi/scsi.c        | 43 +++++++++++++++++++++++++-------------
>  drivers/scsi/scsi_error.c  |  4 ++--
>  drivers/scsi/scsi_lib.c    |  2 +-
>  include/scsi/scsi_device.h | 25 +++++++++++++++++++---
>  4 files changed, 53 insertions(+), 21 deletions(-)
>