2023-09-01 14:19:32

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

It's unbearable for systems with large scale scsi devices share HBAs to
block all devices' IOs when handle error commands, we need a new error
handle mechanism to address this issue.

I consulted about this issue a year ago, the discuss link can be found in
refenence. Hannes replied about why we have to block the SCSI host
then perform error recovery kindly. I think it's unnecessary to block
SCSI host for all drivers and can try a small level recovery(LUN based for
example) first to avoid block the SCSI host.

The new error handle mechanism introduced in this patchset has been
developed and tested with out self developed hardware since one year
ago, now we want this mechanism can be used by more drivers.

Drivers can decide if using the new error handle mechanism and how to
handle error commands when scsi_device are scanned,the new mechanism
makes SCSI error handle more flexible.

SCSI error recovery strategy after blocking host's IO is mainly
following steps:

- LUN reset
- Target reset
- Bus reset
- Host reset

Some drivers did not implement callbacks for host reset, it's unnecessary
to block host's IO for these drivers. For example, smartpqi only registered
device reset, if device reset failed, it's meaningless to fallback to target
reset, bus reset or host reset any more, because these steps would also
failed.

Here are some drivers we concerned:(there are too many kinds of drivers
to figure out, so here I just list some drivers I am familiar with)

+-------------+--------------+--------------+-----------+------------+
| drivers | device_reset | target_reset | bus_reset | host_reset |
+-------------+--------------+--------------+-----------+------------+
| mpt3sas | Y | Y | N | Y |
+-------------+--------------+--------------+-----------+------------+
| smartpqi | Y | N | N | N |
+-------------+--------------+--------------+-----------+------------+
| megaraidsas | N | Y | N | Y |
+-------------+--------------+--------------+-----------+------------+
| virtioscsi | Y | N | N | N |
+-------------+--------------+--------------+-----------+------------+
| iscsi_tcp | Y | Y | N | N |
+-------------+--------------+--------------+-----------+------------+
| hisisas | Y | Y | N | N |
+-------------+--------------+--------------+-----------+------------+

For LUN based error handle, when scsi command is classified as error,
we would block the scsi device's IO and try to recover this scsi
device, if still can not recover all error commands, it might
fallback to target or host level recovery.

It's same for target based error handle, but target based error handle
would block the scsi target's IO then try to recover the error commands
of this target.

The first patch defines basic framework to support LUN/target based error
handle mechanism, three key operations are abstracted which are:
- add error command
- wake up error handle
- block IOs when error command is added and recoverying.

Drivers can implement these three function callbacks and setup to SCSI
middle level; I also add a general LUN/target based error handle strategy
which can be called directly from drivers to implement LUN/tartget based
error handle.

The changes of SCSI middle level's error handle are tested with scsi_debug
which support single LUN error injection, the scsi_debug patches can be
found in reference, following scenarios are tested.

Scenario1: LUN based error handle is enabled:
+-----------+---------+-------------------------------------------------------+
| lun reset | TUR | Desired result |
+ --------- + ------- + ------------------------------------------------------+
| success | success | retry or finish with EIO(may offline disk) |
+ --------- + ------- + ------------------------------------------------------+
| success | fail | fallback to host recovery, retry or finish with |
| | | EIO(may offline disk) |
+ --------- + ------- + ------------------------------------------------------+
| fail | NA | fallback to host recovery, retry or finish with |
| | | EIO(may offline disk) |
+ --------- + ------- + ------------------------------------------------------+

Scenario2: target based error handle is enabled:
+-----------+---------+--------------+---------+------------------------------+
| lun reset | TUR | target reset | TUR | Desired result |
+-----------+---------+--------------+---------+------------------------------+
| success | success | NA | NA | retry or finish with |
| | | | | EIO(may offline disk) |
+-----------+---------+--------------+---------+------------------------------+
| success | fail | success | success | retry or finish with |
| | | | | EIO(may offline disk) |
+-----------+---------+--------------+---------+------------------------------+
| fail | NA | success | success | retry or finish with |
| | | | | EIO(may offline disk) |
+-----------+---------+--------------+---------+------------------------------+
| fail | NA | success | fail | fallback to host recovery, |
| | | | | retry or finish with EIO(may |
| | | | | offline disk) |
+-----------+---------+--------------+---------+------------------------------+
| fail | NA | fail | NA | fallback to host recovery, |
| | | | | retry or finish with EIO(may |
| | | | | offline disk) |
+-----------+---------+--------------+---------+------------------------------+

Scenario3: both LUN and target based error handle are enabled:
+-----------+---------+--------------+---------+------------------------------+
| lun reset | TUR | target reset | TUR | Desired result |
+-----------+---------+--------------+---------+------------------------------+
| success | success | NA | NA | retry or finish with |
| | | | | EIO(may offline disk) |
+-----------+---------+--------------+---------+------------------------------+
| success | fail | success | success | lun recovery fallback to |
| | | | | target recovery, retry or |
| | | | | finish with EIO(may offline |
| | | | | disk |
+-----------+---------+--------------+---------+------------------------------+
| fail | NA | success | success | lun recovery fallback to |
| | | | | target recovery, retry or |
| | | | | finish with EIO(may offline |
| | | | | disk |
+-----------+---------+--------------+---------+------------------------------+
| fail | NA | success | fail | lun recovery fallback to |
| | | | | target recovery, then fall |
| | | | | back to host recovery, retry |
| | | | | or fhinsi with EIO(may |
| | | | | offline disk) |
+-----------+---------+--------------+---------+------------------------------+
| fail | NA | fail | NA | lun recovery fallback to |
| | | | | target recovery, then fall |
| | | | | back to host recovery, retry |
| | | | | or fhinsi with EIO(may |
| | | | | offline disk) |
+-----------+---------+--------------+---------+------------------------------+

References: https://lore.kernel.org/linux-scsi/[email protected]/
References: https://lore.kernel.org/linux-scsi/[email protected]/

Wenchao Hao (19):
scsi: scsi_error: Define framework for LUN/target based error handle
scsi: scsi_error: Move complete variable eh_action from shost to sdevice
scsi: scsi_error: Check if to do reset in scsi_try_xxx_reset
scsi: scsi_error: Add helper scsi_eh_sdev_stu to do START_UNIT
scsi: scsi_error: Add helper scsi_eh_sdev_reset to do lun reset
scsi: scsi_error: Add flags to mark error handle steps has done
scsi: scsi_error: Add helper to handle scsi device's error command list
scsi: scsi_error: Add a general LUN based error handler
scsi: core: increase/decrease target_busy without check can_queue
scsi: scsi_error: Add helper to handle scsi target's error command list
scsi: scsi_error: Add a general target based error handler
scsi: scsi_debug: Add param to control LUN bassed error handler
scsi: scsi_debug: Add param to control target based error handle
scsi: mpt3sas: Add param to control LUN based error handle
scsi: mpt3sas: Add param to control target based error handle
scsi: smartpqi: Add param to control LUN based error handle
scsi: megaraid_sas: Add param to control target based error handle
scsi: virtio_scsi: Add param to control LUN based error handle
scsi: iscsi_tcp: Add param to control LUN based error handle

drivers/scsi/iscsi_tcp.c | 20 +
drivers/scsi/megaraid/megaraid_sas_base.c | 20 +
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 28 +
drivers/scsi/scsi_debug.c | 24 +
drivers/scsi/scsi_error.c | 756 ++++++++++++++++++++--
drivers/scsi/scsi_lib.c | 23 +-
drivers/scsi/scsi_priv.h | 18 +
drivers/scsi/smartpqi/smartpqi_init.c | 14 +
drivers/scsi/virtio_scsi.c | 16 +-
include/scsi/scsi_device.h | 97 +++
include/scsi/scsi_eh.h | 8 +
include/scsi/scsi_host.h | 2 -
12 files changed, 963 insertions(+), 63 deletions(-)

--
2.35.3



2023-09-01 14:21:21

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 02/19] scsi: scsi_error: Move complete variable eh_action from shost to sdevice

eh_action is used to wait for error handle command's completion if scsi
command is send in error handle. Now the error handler might based on
scsi_device, so move it to scsi_device.

This is preparation for a genernal LUN/target based error handle
strategy.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_error.c | 6 +++---
include/scsi/scsi_device.h | 2 ++
include/scsi/scsi_host.h | 2 --
3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 1d1d97b94613..879fdd7c165b 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -914,7 +914,7 @@ void scsi_eh_done(struct scsi_cmnd *scmd)
SCSI_LOG_ERROR_RECOVERY(3, scmd_printk(KERN_INFO, scmd,
"%s result: %x\n", __func__, scmd->result));

- eh_action = scmd->device->host->eh_action;
+ eh_action = scmd->device->eh_action;
if (eh_action)
complete(eh_action);
}
@@ -1203,7 +1203,7 @@ static enum scsi_disposition scsi_send_eh_cmnd(struct scsi_cmnd *scmd,

retry:
scsi_eh_prep_cmnd(scmd, &ses, cmnd, cmnd_size, sense_bytes);
- shost->eh_action = &done;
+ sdev->eh_action = &done;

scsi_log_send(scmd);
scmd->submitter = SUBMITTED_BY_SCSI_ERROR_HANDLER;
@@ -1246,7 +1246,7 @@ static enum scsi_disposition scsi_send_eh_cmnd(struct scsi_cmnd *scmd,
rtn = SUCCESS;
}

- shost->eh_action = NULL;
+ sdev->eh_action = NULL;

scsi_log_completion(scmd, rtn);

diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 08ed9a03015d..df3f1b8d1390 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -324,6 +324,8 @@ struct scsi_device {
enum scsi_device_state sdev_state;
struct task_struct *quiesced_by;
struct scsi_device_eh *eh;
+ struct completion *eh_action; /* Wait for specific actions */
+ /* on the device. */
unsigned long sdev_data[];
} __attribute__((aligned(sizeof(unsigned long))));

diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 70b7475dcf56..def0d99e9b36 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -554,8 +554,6 @@ struct Scsi_Host {
struct list_head eh_abort_list;
struct list_head eh_cmd_q;
struct task_struct * ehandler; /* Error recovery thread. */
- struct completion * eh_action; /* Wait for specific actions on the
- host. */
wait_queue_head_t host_wait;
const struct scsi_host_template *hostt;
struct scsi_transport_template *transportt;
--
2.35.3


2023-09-01 14:34:42

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 15/19] scsi: mpt3sas: Add param to control target based error handle

Add new module param target_eh to control if enable target based
error handle, since mpt3sas defined callback eh_host_reset, so make
it fallback to further recover when target based recovery can not
recover all error commands.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 7a48e89c3e5d..6170d8a772d4 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -178,6 +178,10 @@ static bool lun_eh;
module_param(lun_eh, bool, 0444);
MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)");

+static bool target_eh;
+module_param(target_eh, bool, 0444);
+MODULE_PARM_DESC(target_eh, "target based error handle (def=0)");
+
/* raid transport support */
static struct raid_template *mpt3sas_raid_template;
static struct raid_template *mpt2sas_raid_template;
@@ -1879,6 +1883,13 @@ scsih_target_alloc(struct scsi_target *starget)
struct _pcie_device *pcie_device;
unsigned long flags;
struct sas_rphy *rphy;
+ int ret = 0;
+
+ if (target_eh) {
+ ret = scsi_target_setup_eh(starget, 1);
+ if (ret)
+ return ret;
+ }

sas_target_priv_data = kzalloc(sizeof(*sas_target_priv_data),
GFP_KERNEL);
@@ -1969,6 +1980,9 @@ scsih_target_destroy(struct scsi_target *starget)
struct _pcie_device *pcie_device;
unsigned long flags;

+ if (target_eh)
+ scsi_target_clear_eh(starget);
+
sas_target_priv_data = starget->hostdata;
if (!sas_target_priv_data)
return;
--
2.35.3


2023-09-01 15:38:41

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 17/19] scsi: megaraid_sas: Add param to control target based error handle

Add new param target_eh to control if enable target based error
handler, since megaraid_sas did not define callback eh_device_reset,
so only target based error handler is enabled; and megaraid_sas
defined eh_host_reset, so make it fallback to further recover
when target based recovery can not recover all error commands.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/megaraid/megaraid_sas_base.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 050eed8e2684..cc00cd5b213d 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -45,6 +45,7 @@
#include <scsi/scsi_host.h>
#include <scsi/scsi_tcq.h>
#include <scsi/scsi_dbg.h>
+#include <scsi/scsi_eh.h>
#include "megaraid_sas_fusion.h"
#include "megaraid_sas.h"

@@ -127,6 +128,10 @@ int host_tagset_enable = 1;
module_param(host_tagset_enable, int, 0444);
MODULE_PARM_DESC(host_tagset_enable, "Shared host tagset enable/disable Default: enable(1)");

+static bool target_eh;
+module_param(target_eh, bool, 0444);
+MODULE_PARM_DESC(target_eh, "target based error handle (def=0)");
+
MODULE_LICENSE("GPL");
MODULE_VERSION(MEGASAS_VERSION);
MODULE_AUTHOR("[email protected]");
@@ -2174,6 +2179,19 @@ static void megasas_slave_destroy(struct scsi_device *sdev)
sdev->hostdata = NULL;
}

+static int megasas_target_alloc(struct scsi_target *starget)
+{
+ if (target_eh)
+ return scsi_target_setup_eh(starget, 1);
+ return 0;
+}
+
+static void megasas_target_destroy(struct scsi_target *starget)
+{
+ if (target_eh)
+ scsi_target_clear_eh(starget);
+}
+
/*
* megasas_complete_outstanding_ioctls - Complete outstanding ioctls after a
* kill adapter
@@ -3525,6 +3543,8 @@ static const struct scsi_host_template megasas_template = {
.change_queue_depth = scsi_change_queue_depth,
.max_segment_size = 0xffffffff,
.cmd_size = sizeof(struct megasas_cmd_priv),
+ .target_alloc = megasas_target_alloc,
+ .target_destroy = megasas_target_destroy,
};

/**
--
2.35.3


2023-09-01 17:06:05

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 18/19] scsi: virtio_scsi: Add param to control LUN based error handle

Add new param lun_eh to control if enable LUN based error handler, since
virtio_scsi did not define other further reset callbacks, it is not
necessary to fallback to further recover any more, so set the LUN
error handler with fallback set to 0.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/virtio_scsi.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index bd5633667d01..7bf4a34cdd20 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -28,6 +28,7 @@
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_tcq.h>
#include <scsi/scsi_devinfo.h>
+#include <scsi/scsi_eh.h>
#include <linux/seqlock.h>
#include <linux/blk-mq-virtio.h>

@@ -37,6 +38,10 @@
#define VIRTIO_SCSI_EVENT_LEN 8
#define VIRTIO_SCSI_VQ_BASE 2

+static bool lun_eh;
+module_param(lun_eh, bool, 0444);
+MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)");
+
/* Command queue element */
struct virtio_scsi_cmd {
struct scsi_cmnd *sc;
@@ -679,9 +684,18 @@ static int virtscsi_device_alloc(struct scsi_device *sdevice)
*/
sdevice->sdev_bflags = BLIST_TRY_VPD_PAGES;

+ if (lun_eh)
+ return scsi_device_setup_eh(sdevice, 0);
+
return 0;
}

+static void virtscsi_device_destroy(struct scsi_device *sdevice)
+{
+ if (lun_eh)
+ return scsi_device_clear_eh(sdevice);
+}
+

/**
* virtscsi_change_queue_depth() - Change a virtscsi target's queue depth
@@ -757,7 +771,7 @@ static const struct scsi_host_template virtscsi_host_template = {
.eh_device_reset_handler = virtscsi_device_reset,
.eh_timed_out = virtscsi_eh_timed_out,
.slave_alloc = virtscsi_device_alloc,
-
+ .slave_destroy = virtscsi_device_destroy,
.dma_boundary = UINT_MAX,
.map_queues = virtscsi_map_queues,
.track_queue_depth = 1,
--
2.35.3


2023-09-01 17:33:17

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 10/19] scsi: scsi_error: Add helper to handle scsi target's error command list

Add helper scsi_starget_eh() to handle scsi target's error command list,
it would perform some steps which can be done with target's IO blocked,
including check sense, start unit, reset lun and reset target.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_error.c | 129 ++++++++++++++++++++++++++++++++++++++
include/scsi/scsi_eh.h | 2 +
2 files changed, 131 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index b17bf1dea799..50cd8104175d 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -2547,6 +2547,135 @@ int scsi_sdev_eh(struct scsi_device *sdev,
}
EXPORT_SYMBOL_GPL(scsi_sdev_eh);

+static int starget_eh_stu(struct scsi_target *starget,
+ struct list_head *work_q,
+ struct list_head *done_q)
+{
+ struct scsi_device *sdev;
+ struct scsi_cmnd *scmd, *stu_scmd;
+
+ list_for_each_entry(sdev, &starget->devices, same_target_siblings) {
+ if (sdev_stu_done(sdev))
+ continue;
+
+ stu_scmd = NULL;
+ list_for_each_entry(scmd, work_q, eh_entry)
+ if (scmd->device == sdev && SCSI_SENSE_VALID(scmd) &&
+ scsi_check_sense(scmd) == FAILED) {
+ stu_scmd = scmd;
+ break;
+ }
+ if (!stu_scmd)
+ continue;
+
+ if (scsi_eh_sdev_stu(stu_scmd, work_q, done_q))
+ return 1;
+ }
+
+ return 0;
+}
+
+static int starget_eh_reset_lun(struct scsi_target *starget,
+ struct list_head *work_q,
+ struct list_head *done_q)
+{
+ struct scsi_device *sdev;
+ struct scsi_cmnd *scmd, *bdr_scmd;
+
+ list_for_each_entry(sdev, &starget->devices, same_target_siblings) {
+ if (sdev_reset_done(sdev))
+ continue;
+
+ bdr_scmd = NULL;
+ list_for_each_entry(scmd, work_q, eh_entry)
+ if (scmd->device) {
+ bdr_scmd = scmd;
+ break;
+ }
+ if (!bdr_scmd)
+ continue;
+
+ if (scsi_eh_sdev_reset(bdr_scmd, work_q, done_q))
+ return 1;
+ }
+
+ return 0;
+}
+
+static int starget_eh_reset_target(struct scsi_target *starget,
+ struct list_head *work_q,
+ struct list_head *done_q)
+{
+ enum scsi_disposition rtn;
+ struct scsi_cmnd *scmd, *next;
+ LIST_HEAD(check_list);
+
+ scmd = list_first_entry(work_q, struct scsi_cmnd, eh_entry);
+
+ SCSI_LOG_ERROR_RECOVERY(3, starget_printk(KERN_INFO, starget,
+ "%s: Sending target reset\n", current->comm));
+
+ rtn = scsi_try_target_reset(scmd);
+ if (rtn != SUCCESS && rtn != FAST_IO_FAIL) {
+ SCSI_LOG_ERROR_RECOVERY(3, starget_printk(KERN_INFO, starget,
+ "%s: Target reset failed\n",
+ current->comm));
+ return 0;
+ }
+
+ SCSI_LOG_ERROR_RECOVERY(3, starget_printk(KERN_INFO, starget,
+ "%s: Target reset success\n", current->comm));
+
+ list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
+ if (rtn == SUCCESS)
+ list_move_tail(&scmd->eh_entry, &check_list);
+ else if (rtn == FAST_IO_FAIL)
+ scsi_eh_finish_cmd(scmd, done_q);
+ }
+
+ return scsi_eh_test_devices(&check_list, work_q, done_q, 0);
+}
+
+/*
+ * Target based error handle
+ *
+ * @work_q: list of scsi commands need to recovery
+ * @done_q: list of scsi commands handled
+ *
+ * return: return 1 if all commands in work_q is recoveryed, else 0 is returned
+ */
+int scsi_starget_eh(struct scsi_target *starget,
+ struct list_head *work_q,
+ struct list_head *done_q)
+{
+ int ret = 0;
+
+ SCSI_LOG_ERROR_RECOVERY(2, starget_printk(KERN_INFO, starget,
+ "%s:targeteh: checking sense\n", current->comm));
+ ret = scsi_eh_get_sense(work_q, done_q);
+ if (ret)
+ return ret;
+
+ SCSI_LOG_ERROR_RECOVERY(2, starget_printk(KERN_INFO, starget,
+ "%s:targeteh: start unit\n", current->comm));
+ ret = starget_eh_stu(starget, work_q, done_q);
+ if (ret)
+ return ret;
+
+ SCSI_LOG_ERROR_RECOVERY(2, starget_printk(KERN_INFO, starget,
+ "%s:targeteh reset LUN\n", current->comm));
+ ret = starget_eh_reset_lun(starget, work_q, done_q);
+ if (ret)
+ return ret;
+
+ SCSI_LOG_ERROR_RECOVERY(2, starget_printk(KERN_INFO, starget,
+ "%s:targeteh reset target\n", current->comm));
+ ret = starget_eh_reset_target(starget, work_q, done_q);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(scsi_starget_eh);
+
/*
* Function: scsi_report_bus_reset()
*
diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h
index 89b471aa484f..80e2f130e884 100644
--- a/include/scsi/scsi_eh.h
+++ b/include/scsi/scsi_eh.h
@@ -20,6 +20,8 @@ extern bool scsi_command_normalize_sense(const struct scsi_cmnd *cmd,
extern enum scsi_disposition scsi_check_sense(struct scsi_cmnd *);
extern int scsi_sdev_eh(struct scsi_device *sdev, struct list_head *workq,
struct list_head *doneq);
+extern int scsi_starget_eh(struct scsi_target *starget,
+ struct list_head *workq, struct list_head *doneq);
extern int scsi_device_setup_eh(struct scsi_device *sdev, int fallback);
extern void scsi_device_clear_eh(struct scsi_device *sdev);

--
2.35.3


2023-09-01 22:41:46

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 09/19] scsi: core: increase/decrease target_busy without check can_queue

This is preparation for a genernal target based error handle strategy
to check if to wake up actual error handler.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_lib.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index db0a42fe49c0..4a7fb48aa60f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -293,8 +293,7 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd)

scsi_dec_host_busy(shost, cmd);

- if (starget->can_queue > 0)
- atomic_dec(&starget->target_busy);
+ atomic_dec(&starget->target_busy);

sbitmap_put(&sdev->budget_map, cmd->budget_token);
cmd->budget_token = -1;
@@ -1311,10 +1310,10 @@ static inline int scsi_target_queue_ready(struct Scsi_Host *shost,
spin_unlock_irq(shost->host_lock);
}

+ busy = atomic_inc_return(&starget->target_busy) - 1;
if (starget->can_queue <= 0)
return 1;

- busy = atomic_inc_return(&starget->target_busy) - 1;
if (atomic_read(&starget->target_blocked) > 0) {
if (busy)
goto starved;
@@ -1339,8 +1338,7 @@ static inline int scsi_target_queue_ready(struct Scsi_Host *shost,
list_move_tail(&sdev->starved_entry, &shost->starved_list);
spin_unlock_irq(shost->host_lock);
out_dec:
- if (starget->can_queue > 0)
- atomic_dec(&starget->target_busy);
+ atomic_dec(&starget->target_busy);
return 0;
}

@@ -1784,8 +1782,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
out_dec_host_busy:
scsi_dec_host_busy(shost, cmd);
out_dec_target_busy:
- if (scsi_target(sdev)->can_queue > 0)
- atomic_dec(&scsi_target(sdev)->target_busy);
+ atomic_dec(&scsi_target(sdev)->target_busy);
out_put_budget:
scsi_mq_put_budget(q, cmd->budget_token);
cmd->budget_token = -1;
--
2.35.3


2023-09-01 23:19:57

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 19/19] scsi: iscsi_tcp: Add param to control LUN based error handle

Add new param lun_eh to control if enable LUN based error handler,
since iscsi_tcp defined callback eh_target_reset, so make it
fallback to further recover when LUN based recovery can not recover
all error commands.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/iscsi_tcp.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 9ab8555180a3..83474dc0ecd5 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -35,6 +35,7 @@
#include <scsi/scsi_host.h>
#include <scsi/scsi.h>
#include <scsi/scsi_transport_iscsi.h>
+#include <scsi/scsi_eh.h>
#include <trace/events/iscsi.h>
#include <trace/events/sock.h>

@@ -63,6 +64,10 @@ module_param_named(debug_iscsi_tcp, iscsi_sw_tcp_dbg, int,
MODULE_PARM_DESC(debug_iscsi_tcp, "Turn on debugging for iscsi_tcp module "
"Set to 1 to turn on, and zero to turn off. Default is off.");

+static bool iscsi_sw_tcp_lun_eh;
+module_param_named(lun_eh, iscsi_sw_tcp_lun_eh, bool, 0444);
+MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)");
+
#define ISCSI_SW_TCP_DBG(_conn, dbg_fmt, arg...) \
do { \
if (iscsi_sw_tcp_dbg) \
@@ -1065,6 +1070,19 @@ static int iscsi_sw_tcp_slave_configure(struct scsi_device *sdev)
return 0;
}

+static int iscsi_sw_tcp_slave_alloc(struct scsi_device *sdev)
+{
+ if (iscsi_sw_tcp_lun_eh)
+ return scsi_device_setup_eh(sdev, 1);
+ return 0;
+}
+
+static void iscsi_sw_tcp_slave_destroy(struct scsi_device *sdev)
+{
+ if (iscsi_sw_tcp_lun_eh)
+ return scsi_device_clear_eh(sdev);
+}
+
static const struct scsi_host_template iscsi_sw_tcp_sht = {
.module = THIS_MODULE,
.name = "iSCSI Initiator over TCP/IP",
@@ -1080,6 +1098,8 @@ static const struct scsi_host_template iscsi_sw_tcp_sht = {
.eh_target_reset_handler = iscsi_eh_recover_target,
.dma_boundary = PAGE_SIZE - 1,
.slave_configure = iscsi_sw_tcp_slave_configure,
+ .slave_alloc = iscsi_sw_tcp_slave_alloc,
+ .slave_destroy = iscsi_sw_tcp_slave_destroy,
.proc_name = "iscsi_tcp",
.this_id = -1,
.track_queue_depth = 1,
--
2.35.3

2023-09-02 08:23:10

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 08/19] scsi: scsi_error: Add a general LUN based error handler

Add a general LUN based error handler which can be used by drivers
directly. This error handler implements an scsi_device_eh, when handling
error commands, it would call helper function scsi_sdev_eh() added before
to try recover error commands.

The behavior if scsi_sdev_eh() can not recover all error commands
depends on fallback flag, which is initialized when scsi_device is
allocated. If fallback is set, it would fallback to further error
recover strategy like old host based error handle; else it would
mark this scsi device offline and flush all error commands.

To using this error handler, drivers should call scsi_device_setup_eh()
in its slave_alloc() to setup it's LUN based error handler;
call scsi_device_clear_eh() in its slave_destroy() to clear LUN based
error handler.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_error.c | 170 ++++++++++++++++++++++++++++++++++++++
include/scsi/scsi_eh.h | 2 +
2 files changed, 172 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index f24f081fc637..b17bf1dea799 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -2759,3 +2759,173 @@ bool scsi_get_sense_info_fld(const u8 *sense_buffer, int sb_len,
}
}
EXPORT_SYMBOL(scsi_get_sense_info_fld);
+
+struct scsi_lun_eh {
+ spinlock_t eh_lock;
+ unsigned int eh_num;
+ struct list_head eh_cmd_q;
+ struct scsi_device *sdev;
+ struct work_struct eh_handle_work;
+ unsigned int fallback:1; /* If fallback to further */
+ /* recovery on failure */
+};
+
+/*
+ * error handle strategy based on LUN, following steps
+ * is applied to recovery error commands in list:
+ * check sense data
+ * send start unit
+ * reset lun
+ * if there are still error commands, it would fallback to
+ * target based or host based error handle for further recovery.
+ */
+static void sdev_eh_work(struct work_struct *work)
+{
+ unsigned long flags;
+ struct scsi_lun_eh *luneh =
+ container_of(work, struct scsi_lun_eh, eh_handle_work);
+ struct scsi_device *sdev = luneh->sdev;
+ struct scsi_device_eh *eh = sdev->eh;
+ struct Scsi_Host *shost = sdev->host;
+ struct scsi_cmnd *scmd, *next;
+ LIST_HEAD(eh_work_q);
+ LIST_HEAD(eh_done_q);
+
+ spin_lock_irqsave(&luneh->eh_lock, flags);
+ list_splice_init(&luneh->eh_cmd_q, &eh_work_q);
+ spin_unlock_irqrestore(&luneh->eh_lock, flags);
+
+ if (scsi_sdev_eh(sdev, &eh_work_q, &eh_done_q))
+ goto out_flush_done;
+
+ if (!luneh->fallback) {
+ list_for_each_entry_safe(scmd, next, &eh_work_q, eh_entry)
+ scsi_eh_finish_cmd(scmd, &eh_done_q);
+
+ sdev_printk(KERN_INFO, sdev, "%s:luneh: Device offlined - "
+ "not ready after error recovery\n", current->comm);
+
+ mutex_lock(&sdev->state_mutex);
+ scsi_device_set_state(sdev, SDEV_OFFLINE);
+ mutex_unlock(&sdev->state_mutex);
+
+ goto out_flush_done;
+ }
+
+ /*
+ * fallback to target or host based error handle
+ */
+ SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev,
+ "%s:luneh fallback to further recovery\n", current->comm));
+ list_for_each_entry_safe(scmd, next, &eh_work_q, eh_entry) {
+ list_del_init(&scmd->eh_entry);
+
+ if (scsi_host_in_recovery(shost) ||
+ __scsi_eh_scmd_add_starget(scmd))
+ __scsi_eh_scmd_add(scmd);
+ }
+
+ eh->get_sense_done = 1;
+ eh->stu_done = 1;
+ eh->reset_done = 1;
+
+out_flush_done:
+ scsi_eh_flush_done_q(&eh_done_q);
+ spin_lock_irqsave(&luneh->eh_lock, flags);
+ luneh->eh_num = 0;
+ spin_unlock_irqrestore(&luneh->eh_lock, flags);
+}
+static void sdev_eh_add_cmnd(struct scsi_cmnd *scmd)
+{
+ unsigned long flags;
+ struct scsi_lun_eh *luneh;
+ struct scsi_device *sdev = scmd->device;
+
+ luneh = (struct scsi_lun_eh *)sdev->eh->driver_data;
+
+ spin_lock_irqsave(&luneh->eh_lock, flags);
+ list_add_tail(&scmd->eh_entry, &luneh->eh_cmd_q);
+ luneh->eh_num++;
+ spin_unlock_irqrestore(&luneh->eh_lock, flags);
+}
+static int sdev_eh_is_busy(struct scsi_device *sdev)
+{
+ int ret = 0;
+ unsigned long flags;
+ struct scsi_lun_eh *luneh;
+
+ if (!sdev->eh)
+ return 0;
+
+ luneh = (struct scsi_lun_eh *)sdev->eh->driver_data;
+
+ spin_lock_irqsave(&luneh->eh_lock, flags);
+ ret = luneh->eh_num;
+ spin_unlock_irqrestore(&luneh->eh_lock, flags);
+
+ return ret;
+}
+static int sdev_eh_wakeup(struct scsi_device *sdev)
+{
+ unsigned long flags;
+ unsigned int nr_error;
+ unsigned int nr_busy;
+ struct scsi_lun_eh *luneh;
+
+ luneh = (struct scsi_lun_eh *)sdev->eh->driver_data;
+
+ spin_lock_irqsave(&luneh->eh_lock, flags);
+ nr_error = luneh->eh_num;
+ spin_unlock_irqrestore(&luneh->eh_lock, flags);
+
+ nr_busy = scsi_device_busy(sdev);
+
+ if (!nr_error || nr_busy != nr_error) {
+ SCSI_LOG_ERROR_RECOVERY(5, sdev_printk(KERN_INFO, sdev,
+ "%s:luneh: do not wake up, busy/error: %d/%d\n",
+ current->comm, nr_busy, nr_error));
+ return 0;
+ }
+
+ SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev,
+ "%s:luneh: waking up, busy/error: %d/%d\n",
+ current->comm, nr_busy, nr_error));
+
+ return schedule_work(&luneh->eh_handle_work);
+}
+
+int scsi_device_setup_eh(struct scsi_device *sdev, int fallback)
+{
+ struct scsi_device_eh *eh;
+ struct scsi_lun_eh *luneh;
+
+ eh = kzalloc(sizeof(struct scsi_device_eh) + sizeof(struct scsi_lun_eh),
+ GFP_KERNEL);
+ if (!eh) {
+ sdev_printk(KERN_ERR, sdev, "failed to setup error handle\n");
+ return -ENOMEM;
+ }
+ luneh = (struct scsi_lun_eh *)eh->driver_data;
+
+ eh->add_cmnd = sdev_eh_add_cmnd;
+ eh->is_busy = sdev_eh_is_busy;
+ eh->wakeup = sdev_eh_wakeup;
+
+ luneh->fallback = fallback;
+ luneh->sdev = sdev;
+ spin_lock_init(&luneh->eh_lock);
+ INIT_LIST_HEAD(&luneh->eh_cmd_q);
+ INIT_WORK(&luneh->eh_handle_work, sdev_eh_work);
+
+ sdev->eh = eh;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(scsi_device_setup_eh);
+
+void scsi_device_clear_eh(struct scsi_device *sdev)
+{
+ kfree(sdev->eh);
+ sdev->eh = NULL;
+}
+EXPORT_SYMBOL_GPL(scsi_device_clear_eh);
diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h
index 5ce791063baf..89b471aa484f 100644
--- a/include/scsi/scsi_eh.h
+++ b/include/scsi/scsi_eh.h
@@ -20,6 +20,8 @@ extern bool scsi_command_normalize_sense(const struct scsi_cmnd *cmd,
extern enum scsi_disposition scsi_check_sense(struct scsi_cmnd *);
extern int scsi_sdev_eh(struct scsi_device *sdev, struct list_head *workq,
struct list_head *doneq);
+extern int scsi_device_setup_eh(struct scsi_device *sdev, int fallback);
+extern void scsi_device_clear_eh(struct scsi_device *sdev);

static inline bool scsi_sense_is_deferred(const struct scsi_sense_hdr *sshdr)
{
--
2.35.3

2023-09-02 08:49:26

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 12/19] scsi: scsi_debug: Add param to control LUN bassed error handler

Add new module param lun_eh to control if enable LUN based
error handle, and param lun_eh_fallback to control if fallback
to further recover when LUN recovery can not recover all
error commands. This is used to test the LUN based error handle.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_debug.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index f30399a75ec0..af3d43c9db6f 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -841,6 +841,8 @@ static bool have_dif_prot;
static bool write_since_sync;
static bool sdebug_statistics = DEF_STATISTICS;
static bool sdebug_wp;
+static bool sdebug_lun_eh;
+static bool sdebug_lun_eh_fallback;
/* Following enum: 0: no zbc, def; 1: host aware; 2: host managed */
static enum blk_zoned_model sdeb_zbc_model = BLK_ZONED_NONE;
static char *sdeb_zbc_model_s;
@@ -5437,6 +5439,9 @@ static int scsi_debug_slave_alloc(struct scsi_device *sdp)
pr_info("slave_alloc <%u %u %u %llu>\n",
sdp->host->host_no, sdp->channel, sdp->id, sdp->lun);

+ if (sdebug_lun_eh)
+ return scsi_device_setup_eh(sdp, sdebug_lun_eh_fallback);
+
return 0;
}

@@ -5491,6 +5496,9 @@ static void scsi_debug_slave_destroy(struct scsi_device *sdp)
/* make this slot available for re-use */
devip->used = false;
sdp->hostdata = NULL;
+
+ if (sdebug_lun_eh)
+ scsi_device_clear_eh(sdp);
}

/* Returns true if we require the queued memory to be freed by the caller. */
@@ -6167,6 +6175,8 @@ module_param_named(zone_cap_mb, sdeb_zbc_zone_cap_mb, int, S_IRUGO);
module_param_named(zone_max_open, sdeb_zbc_max_open, int, S_IRUGO);
module_param_named(zone_nr_conv, sdeb_zbc_nr_conv, int, S_IRUGO);
module_param_named(zone_size_mb, sdeb_zbc_zone_size_mb, int, S_IRUGO);
+module_param_named(lun_eh, sdebug_lun_eh, bool, S_IRUGO);
+module_param_named(lun_eh_fallback, sdebug_lun_eh_fallback, bool, S_IRUGO);

MODULE_AUTHOR("Eric Youngdale + Douglas Gilbert");
MODULE_DESCRIPTION("SCSI debug adapter driver");
@@ -6239,6 +6249,8 @@ MODULE_PARM_DESC(zone_cap_mb, "Zone capacity in MiB (def=zone size)");
MODULE_PARM_DESC(zone_max_open, "Maximum number of open zones; [0] for no limit (def=auto)");
MODULE_PARM_DESC(zone_nr_conv, "Number of conventional zones (def=1)");
MODULE_PARM_DESC(zone_size_mb, "Zone size in MiB (def=auto)");
+MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)");
+MODULE_PARM_DESC(lun_eh_fallback, "Fallback to further recovery if LUN recovery failed (def=0)");

#define SDEBUG_INFO_LEN 256
static char sdebug_info[SDEBUG_INFO_LEN];
--
2.35.3

2023-09-03 13:45:47

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 13/19] scsi: scsi_debug: Add param to control target based error handle

Add new module param target_eh to control if enable target based error
handler, and param target_eh_fallback to control if fallback to further
recover when target recovery can not recover all error commands.
This is used to test the target based error handle.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_debug.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index af3d43c9db6f..9a8aa48fb8a4 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -843,6 +843,8 @@ static bool sdebug_statistics = DEF_STATISTICS;
static bool sdebug_wp;
static bool sdebug_lun_eh;
static bool sdebug_lun_eh_fallback;
+static bool sdebug_target_eh;
+static bool sdebug_target_eh_fallback;
/* Following enum: 0: no zbc, def; 1: host aware; 2: host managed */
static enum blk_zoned_model sdeb_zbc_model = BLK_ZONED_NONE;
static char *sdeb_zbc_model_s;
@@ -1137,6 +1139,9 @@ static int sdebug_target_alloc(struct scsi_target *starget)

starget->hostdata = targetip;

+ if (sdebug_target_eh)
+ return scsi_target_setup_eh(starget, sdebug_target_eh_fallback);
+
return 0;
}

@@ -1152,6 +1157,9 @@ static void sdebug_target_destroy(struct scsi_target *starget)
{
struct sdebug_target_info *targetip;

+ if (sdebug_target_eh)
+ scsi_target_clear_eh(starget);
+
targetip = (struct sdebug_target_info *)starget->hostdata;
if (targetip) {
starget->hostdata = NULL;
@@ -6177,6 +6185,8 @@ module_param_named(zone_nr_conv, sdeb_zbc_nr_conv, int, S_IRUGO);
module_param_named(zone_size_mb, sdeb_zbc_zone_size_mb, int, S_IRUGO);
module_param_named(lun_eh, sdebug_lun_eh, bool, S_IRUGO);
module_param_named(lun_eh_fallback, sdebug_lun_eh_fallback, bool, S_IRUGO);
+module_param_named(target_eh, sdebug_target_eh, bool, S_IRUGO);
+module_param_named(target_eh_fallback, sdebug_target_eh_fallback, bool, S_IRUGO);

MODULE_AUTHOR("Eric Youngdale + Douglas Gilbert");
MODULE_DESCRIPTION("SCSI debug adapter driver");
@@ -6251,6 +6261,8 @@ MODULE_PARM_DESC(zone_nr_conv, "Number of conventional zones (def=1)");
MODULE_PARM_DESC(zone_size_mb, "Zone size in MiB (def=auto)");
MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)");
MODULE_PARM_DESC(lun_eh_fallback, "Fallback to further recovery if LUN recovery failed (def=0)");
+MODULE_PARM_DESC(target_eh, "target based error handle (def=0)");
+MODULE_PARM_DESC(target_eh_fallback, "Fallback to further recovery if target recovery failed (def=0)");

#define SDEBUG_INFO_LEN 256
static char sdebug_info[SDEBUG_INFO_LEN];
--
2.35.3

2023-09-04 07:04:44

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 05/19] scsi: scsi_error: Add helper scsi_eh_sdev_reset to do lun reset

Add helper function scsi_eh_sdev_reset() to perform lun reset and check
if to finish some error commands.

This is preparation for a genernal LUN/target based error handle
strategy and did not change original logic.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_error.c | 54 +++++++++++++++++++++++----------------
1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 64eb616261ec..16888540b663 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1635,6 +1635,34 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
return list_empty(work_q);
}

+static int scsi_eh_sdev_reset(struct scsi_cmnd *scmd,
+ struct list_head *work_q,
+ struct list_head *done_q)
+{
+ struct scsi_cmnd *next;
+ struct scsi_device *sdev = scmd->device;
+ enum scsi_disposition rtn;
+
+ SCSI_LOG_ERROR_RECOVERY(3, sdev_printk(KERN_INFO, sdev,
+ "%s: Sending BDR\n", current->comm));
+
+ rtn = scsi_try_bus_device_reset(scmd);
+ if (rtn != SUCCESS && rtn != FAST_IO_FAIL) {
+ SCSI_LOG_ERROR_RECOVERY(3,
+ sdev_printk(KERN_INFO, sdev,
+ "%s: BDR failed\n", current->comm));
+ return 0;
+ }
+
+ if (!scsi_device_online(sdev) || rtn == FAST_IO_FAIL ||
+ !scsi_eh_tur(scmd))
+ list_for_each_entry_safe(scmd, next, work_q, eh_entry)
+ if (scmd->device == sdev &&
+ scsi_eh_action(scmd, rtn) != FAILED)
+ scsi_eh_finish_cmd(scmd, done_q);
+
+ return list_empty(work_q);
+}

/**
* scsi_eh_bus_device_reset - send bdr if needed
@@ -1652,9 +1680,8 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost,
struct list_head *work_q,
struct list_head *done_q)
{
- struct scsi_cmnd *scmd, *bdr_scmd, *next;
+ struct scsi_cmnd *scmd, *bdr_scmd;
struct scsi_device *sdev;
- enum scsi_disposition rtn;

shost_for_each_device(sdev, shost) {
if (scsi_host_eh_past_deadline(shost)) {
@@ -1675,26 +1702,9 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost,
if (!bdr_scmd)
continue;

- SCSI_LOG_ERROR_RECOVERY(3,
- sdev_printk(KERN_INFO, sdev,
- "%s: Sending BDR\n", current->comm));
- rtn = scsi_try_bus_device_reset(bdr_scmd);
- if (rtn == SUCCESS || rtn == FAST_IO_FAIL) {
- if (!scsi_device_online(sdev) ||
- rtn == FAST_IO_FAIL ||
- !scsi_eh_tur(bdr_scmd)) {
- list_for_each_entry_safe(scmd, next,
- work_q, eh_entry) {
- if (scmd->device == sdev &&
- scsi_eh_action(scmd, rtn) != FAILED)
- scsi_eh_finish_cmd(scmd,
- done_q);
- }
- }
- } else {
- SCSI_LOG_ERROR_RECOVERY(3,
- sdev_printk(KERN_INFO, sdev,
- "%s: BDR failed\n", current->comm));
+ if (scsi_eh_sdev_reset(bdr_scmd, work_q, done_q)) {
+ scsi_device_put(sdev);
+ break;
}
}

--
2.35.3

2023-09-04 18:47:56

by Wenchao Hao

[permalink] [raw]
Subject: [RFC PATCH v2 01/19] scsi: scsi_error: Define framework for LUN/target based error handle

The old scsi error handle logic is based on host, once a scsi command
in one LUN of this host is classfied as failed, SCSI mid-level would
set the whole host to recovery state, and no IO can be submitted to
all LUNs of this host any more before recovery finished, while the
recovery process might take a long time to finish.
It's unreasonable when there are a lot of LUNs in one host.

This change introduce a way for driver to implement its own
error handle logic which can be based on scsi LUN or scsi target
as minimum unit.

scsi_device_eh is defined for error handle based on scsi LUN, and
pointer struct scsi_device_eh "eh" is added in scsi_device, which
is NULL by default.
LLDs can initialize the sdev->eh in hostt->slave_alloc to implement an
scsi LUN based error handle. If this member is not NULL, SCSI mid-level
would branch to drivers' error handler rather than the old one which
block whole host's IO.

scsi_target_eh is defined for error handle based on scsi target, and
pointer struct scsi_target_eh "eh" is added in scsi_target, which is NULL
by default.
LLDs can initialize the starget->eh in hostt->target_alloc to implement
an scsi target based error handle. If this member is not NULL, SCSI
mid-level would branch to drivers' error handler rather than the
old one which block whole host's IO.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_error.c | 57 +++++++++++++++++++++++++++++++-
drivers/scsi/scsi_lib.c | 12 +++++++
drivers/scsi/scsi_priv.h | 18 ++++++++++
include/scsi/scsi_device.h | 67 ++++++++++++++++++++++++++++++++++++++
4 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c67cdcdc3ba8..1d1d97b94613 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -290,11 +290,48 @@ static void scsi_eh_inc_host_failed(struct rcu_head *head)
spin_unlock_irqrestore(shost->host_lock, flags);
}

+#define SCSI_EH_NO_HANDLER 1
+
+static int __scsi_eh_scmd_add_sdev(struct scsi_cmnd *scmd)
+{
+ struct scsi_device *sdev = scmd->device;
+ struct scsi_device_eh *eh = sdev->eh;
+
+ if (!eh || !eh->add_cmnd)
+ return SCSI_EH_NO_HANDLER;
+
+ scsi_eh_reset(scmd);
+ eh->add_cmnd(scmd);
+
+ if (eh->wakeup)
+ eh->wakeup(sdev);
+
+ return 0;
+}
+
+static int __scsi_eh_scmd_add_starget(struct scsi_cmnd *scmd)
+{
+ struct scsi_device *sdev = scmd->device;
+ struct scsi_target *starget = scsi_target(sdev);
+ struct scsi_target_eh *eh = starget->eh;
+
+ if (!eh || !eh->add_cmnd)
+ return SCSI_EH_NO_HANDLER;
+
+ scsi_eh_reset(scmd);
+ eh->add_cmnd(scmd);
+
+ if (eh->wakeup)
+ eh->wakeup(starget);
+
+ return 0;
+}
+
/**
* scsi_eh_scmd_add - add scsi cmd to error handling.
* @scmd: scmd to run eh on.
*/
-void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
+static void __scsi_eh_scmd_add(struct scsi_cmnd *scmd)
{
struct Scsi_Host *shost = scmd->device->host;
unsigned long flags;
@@ -320,6 +357,24 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
call_rcu_hurry(&scmd->rcu, scsi_eh_inc_host_failed);
}

+void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
+{
+ struct scsi_device *sdev = scmd->device;
+ struct scsi_target *starget = scsi_target(sdev);
+ struct Scsi_Host *shost = sdev->host;
+
+ if (unlikely(scsi_host_in_recovery(shost)))
+ __scsi_eh_scmd_add(scmd);
+
+ if (unlikely(scsi_target_in_recovery(starget)))
+ if (__scsi_eh_scmd_add_starget(scmd))
+ __scsi_eh_scmd_add(scmd);
+
+ if (__scsi_eh_scmd_add_sdev(scmd))
+ if (__scsi_eh_scmd_add_starget(scmd))
+ __scsi_eh_scmd_add(scmd);
+}
+
/**
* scsi_timeout - Timeout function for normal scsi commands.
* @req: request that is timing out.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ad9afae49544..db0a42fe49c0 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -298,6 +298,12 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd)

sbitmap_put(&sdev->budget_map, cmd->budget_token);
cmd->budget_token = -1;
+
+ if (sdev->eh && sdev->eh->wakeup)
+ sdev->eh->wakeup(sdev);
+
+ if (starget->eh && starget->eh->wakeup)
+ starget->eh->wakeup(starget);
}

static void scsi_kick_queue(struct request_queue *q)
@@ -1253,6 +1259,9 @@ static inline int scsi_dev_queue_ready(struct request_queue *q,
{
int token;

+ if (scsi_device_in_recovery(sdev))
+ return -1;
+
token = sbitmap_get(&sdev->budget_map);
if (atomic_read(&sdev->device_blocked)) {
if (token < 0)
@@ -1288,6 +1297,9 @@ static inline int scsi_target_queue_ready(struct Scsi_Host *shost,
struct scsi_target *starget = scsi_target(sdev);
unsigned int busy;

+ if (scsi_target_in_recovery(starget))
+ return 0;
+
if (starget->single_lun) {
spin_lock_irq(shost->host_lock);
if (starget->starget_sdev_user &&
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index f42388ecb024..f7605c2a1ed1 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -196,6 +196,24 @@ static inline void scsi_dh_add_device(struct scsi_device *sdev) { }
static inline void scsi_dh_release_device(struct scsi_device *sdev) { }
#endif

+static inline int scsi_device_in_recovery(struct scsi_device *sdev)
+{
+ struct scsi_device_eh *eh = sdev->eh;
+
+ if (eh && eh->is_busy)
+ return eh->is_busy(sdev);
+ return 0;
+}
+
+static inline int scsi_target_in_recovery(struct scsi_target *starget)
+{
+ struct scsi_target_eh *eh = starget->eh;
+
+ if (eh && eh->is_busy)
+ return eh->is_busy(starget);
+ return 0;
+}
+
struct bsg_device *scsi_bsg_register_queue(struct scsi_device *sdev);

extern int scsi_device_max_queue_depth(struct scsi_device *sdev);
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 75b2235b99e2..08ed9a03015d 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -104,6 +104,71 @@ enum scsi_vpd_parameters {
SCSI_VPD_HEADER_SIZE = 4,
};

+struct scsi_device;
+struct scsi_target;
+
+struct scsi_device_eh {
+ /*
+ * add scsi command to error handler so it would be handuled by
+ * driver's error handle strategy
+ */
+ void (*add_cmnd)(struct scsi_cmnd *scmd);
+
+ /*
+ * to judge if the device is busy handling errors, called before
+ * dispatch scsi cmnd
+ *
+ * return 0 if it's ready to accepy scsi cmnd
+ * return 0 if it's in error handle, command's would not be dispatched
+ */
+ int (*is_busy)(struct scsi_device *sdev);
+
+ /*
+ * wakeup device's error handle
+ *
+ * usually the error handler strategy would not run at once when
+ * error command is added. This function would be called when any
+ * scsi cmnd is finished or when scsi cmnd is added.
+ */
+ int (*wakeup)(struct scsi_device *sdev);
+
+ /*
+ * data entity for device specific error handler
+ */
+ unsigned long driver_data[];
+};
+
+struct scsi_target_eh {
+ /*
+ * add scsi command to error handler so it would be handuled by
+ * driver's error handle strategy
+ */
+ void (*add_cmnd)(struct scsi_cmnd *scmd);
+
+ /*
+ * to judge if the device is busy handling errors, called before
+ * dispatch scsi cmnd
+ *
+ * return 0 if it's ready to accepy scsi cmnd
+ * return 0 if it's in error handle, command's would not be dispatched
+ */
+ int (*is_busy)(struct scsi_target *starget);
+
+ /*
+ * wakeup device's error handle
+ *
+ * usually the error handler strategy would not run at once when
+ * error command is added. This function would be called when any
+ * scsi cmnd is finished or when scsi cmnd is added.
+ */
+ int (*wakeup)(struct scsi_target *starget);
+
+ /*
+ * data entity for device specific error handler
+ */
+ unsigned long driver_data[];
+};
+
struct scsi_device {
struct Scsi_Host *host;
struct request_queue *request_queue;
@@ -258,6 +323,7 @@ struct scsi_device {
struct mutex state_mutex;
enum scsi_device_state sdev_state;
struct task_struct *quiesced_by;
+ struct scsi_device_eh *eh;
unsigned long sdev_data[];
} __attribute__((aligned(sizeof(unsigned long))));

@@ -344,6 +410,7 @@ struct scsi_target {
char scsi_level;
enum scsi_target_state state;
void *hostdata; /* available to low-level driver */
+ struct scsi_target_eh *eh;
unsigned long starget_data[]; /* for the transport */
/* starget_data must be the last element!!!! */
} __attribute__((aligned(sizeof(unsigned long))));
--
2.35.3

2023-09-05 20:15:06

by Wenchao Hao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 2023/9/1 17:41, Wenchao Hao wrote:
> It's unbearable for systems with large scale scsi devices share HBAs to
> block all devices' IOs when handle error commands, we need a new error
> handle mechanism to address this issue.
>
> I consulted about this issue a year ago, the discuss link can be found in
> refenence. Hannes replied about why we have to block the SCSI host
> then perform error recovery kindly. I think it's unnecessary to block
> SCSI host for all drivers and can try a small level recovery(LUN based for
> example) first to avoid block the SCSI host.
>
> The new error handle mechanism introduced in this patchset has been
> developed and tested with out self developed hardware since one year
> ago, now we want this mechanism can be used by more drivers.
>
> Drivers can decide if using the new error handle mechanism and how to
> handle error commands when scsi_device are scanned,the new mechanism
> makes SCSI error handle more flexible.

Hi, Hannes and Mike, would you help take a look at these changes?

>
> SCSI error recovery strategy after blocking host's IO is mainly
> following steps:
>
> - LUN reset
> - Target reset
> - Bus reset
> - Host reset
>
> Some drivers did not implement callbacks for host reset, it's unnecessary
> to block host's IO for these drivers. For example, smartpqi only registered
> device reset, if device reset failed, it's meaningless to fallback to target
> reset, bus reset or host reset any more, because these steps would also
> failed.
>
> Here are some drivers we concerned:(there are too many kinds of drivers
> to figure out, so here I just list some drivers I am familiar with)
>
> +-------------+--------------+--------------+-----------+------------+
> | drivers | device_reset | target_reset | bus_reset | host_reset |
> +-------------+--------------+--------------+-----------+------------+
> | mpt3sas | Y | Y | N | Y |
> +-------------+--------------+--------------+-----------+------------+
> | smartpqi | Y | N | N | N |
> +-------------+--------------+--------------+-----------+------------+
> | megaraidsas | N | Y | N | Y |
> +-------------+--------------+--------------+-----------+------------+
> | virtioscsi | Y | N | N | N |
> +-------------+--------------+--------------+-----------+------------+
> | iscsi_tcp | Y | Y | N | N |
> +-------------+--------------+--------------+-----------+------------+
> | hisisas | Y | Y | N | N |
> +-------------+--------------+--------------+-----------+------------+
>
> For LUN based error handle, when scsi command is classified as error,
> we would block the scsi device's IO and try to recover this scsi
> device, if still can not recover all error commands, it might
> fallback to target or host level recovery.
>
> It's same for target based error handle, but target based error handle
> would block the scsi target's IO then try to recover the error commands
> of this target.
>
> The first patch defines basic framework to support LUN/target based error
> handle mechanism, three key operations are abstracted which are:
> - add error command
> - wake up error handle
> - block IOs when error command is added and recoverying.
>
> Drivers can implement these three function callbacks and setup to SCSI
> middle level; I also add a general LUN/target based error handle strategy
> which can be called directly from drivers to implement LUN/tartget based
> error handle.
>
> The changes of SCSI middle level's error handle are tested with scsi_debug
> which support single LUN error injection, the scsi_debug patches can be
> found in reference, following scenarios are tested.
>
> Scenario1: LUN based error handle is enabled:
> +-----------+---------+-------------------------------------------------------+
> | lun reset | TUR | Desired result |
> + --------- + ------- + ------------------------------------------------------+
> | success | success | retry or finish with EIO(may offline disk) |
> + --------- + ------- + ------------------------------------------------------+
> | success | fail | fallback to host recovery, retry or finish with |
> | | | EIO(may offline disk) |
> + --------- + ------- + ------------------------------------------------------+
> | fail | NA | fallback to host recovery, retry or finish with |
> | | | EIO(may offline disk) |
> + --------- + ------- + ------------------------------------------------------+
>
> Scenario2: target based error handle is enabled:
> +-----------+---------+--------------+---------+------------------------------+
> | lun reset | TUR | target reset | TUR | Desired result |
> +-----------+---------+--------------+---------+------------------------------+
> | success | success | NA | NA | retry or finish with |
> | | | | | EIO(may offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
> | success | fail | success | success | retry or finish with |
> | | | | | EIO(may offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
> | fail | NA | success | success | retry or finish with |
> | | | | | EIO(may offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
> | fail | NA | success | fail | fallback to host recovery, |
> | | | | | retry or finish with EIO(may |
> | | | | | offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
> | fail | NA | fail | NA | fallback to host recovery, |
> | | | | | retry or finish with EIO(may |
> | | | | | offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
>
> Scenario3: both LUN and target based error handle are enabled:
> +-----------+---------+--------------+---------+------------------------------+
> | lun reset | TUR | target reset | TUR | Desired result |
> +-----------+---------+--------------+---------+------------------------------+
> | success | success | NA | NA | retry or finish with |
> | | | | | EIO(may offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
> | success | fail | success | success | lun recovery fallback to |
> | | | | | target recovery, retry or |
> | | | | | finish with EIO(may offline |
> | | | | | disk |
> +-----------+---------+--------------+---------+------------------------------+
> | fail | NA | success | success | lun recovery fallback to |
> | | | | | target recovery, retry or |
> | | | | | finish with EIO(may offline |
> | | | | | disk |
> +-----------+---------+--------------+---------+------------------------------+
> | fail | NA | success | fail | lun recovery fallback to |
> | | | | | target recovery, then fall |
> | | | | | back to host recovery, retry |
> | | | | | or fhinsi with EIO(may |
> | | | | | offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
> | fail | NA | fail | NA | lun recovery fallback to |
> | | | | | target recovery, then fall |
> | | | | | back to host recovery, retry |
> | | | | | or fhinsi with EIO(may |
> | | | | | offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
>
> References: https://lore.kernel.org/linux-scsi/[email protected]/
> References: https://lore.kernel.org/linux-scsi/[email protected]/
>
> Wenchao Hao (19):
> scsi: scsi_error: Define framework for LUN/target based error handle
> scsi: scsi_error: Move complete variable eh_action from shost to sdevice
> scsi: scsi_error: Check if to do reset in scsi_try_xxx_reset
> scsi: scsi_error: Add helper scsi_eh_sdev_stu to do START_UNIT
> scsi: scsi_error: Add helper scsi_eh_sdev_reset to do lun reset
> scsi: scsi_error: Add flags to mark error handle steps has done
> scsi: scsi_error: Add helper to handle scsi device's error command list
> scsi: scsi_error: Add a general LUN based error handler
> scsi: core: increase/decrease target_busy without check can_queue
> scsi: scsi_error: Add helper to handle scsi target's error command list
> scsi: scsi_error: Add a general target based error handler
> scsi: scsi_debug: Add param to control LUN bassed error handler
> scsi: scsi_debug: Add param to control target based error handle
> scsi: mpt3sas: Add param to control LUN based error handle
> scsi: mpt3sas: Add param to control target based error handle
> scsi: smartpqi: Add param to control LUN based error handle
> scsi: megaraid_sas: Add param to control target based error handle
> scsi: virtio_scsi: Add param to control LUN based error handle
> scsi: iscsi_tcp: Add param to control LUN based error handle
>
> drivers/scsi/iscsi_tcp.c | 20 +
> drivers/scsi/megaraid/megaraid_sas_base.c | 20 +
> drivers/scsi/mpt3sas/mpt3sas_scsih.c | 28 +
> drivers/scsi/scsi_debug.c | 24 +
> drivers/scsi/scsi_error.c | 756 ++++++++++++++++++++--
> drivers/scsi/scsi_lib.c | 23 +-
> drivers/scsi/scsi_priv.h | 18 +
> drivers/scsi/smartpqi/smartpqi_init.c | 14 +
> drivers/scsi/virtio_scsi.c | 16 +-
> include/scsi/scsi_device.h | 97 +++
> include/scsi/scsi_eh.h | 8 +
> include/scsi/scsi_host.h | 2 -
> 12 files changed, 963 insertions(+), 63 deletions(-)
>

2023-09-06 12:40:52

by Wenchao Hao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 2023/9/6 8:22, Mike Christie wrote:
> On 9/1/23 4:41 AM, Wenchao Hao wrote:
>> It's unbearable for systems with large scale scsi devices share HBAs to
>> block all devices' IOs when handle error commands, we need a new error
>> handle mechanism to address this issue.
>>
>> I consulted about this issue a year ago, the discuss link can be found in
>> refenence. Hannes replied about why we have to block the SCSI host
>> then perform error recovery kindly. I think it's unnecessary to block
>> SCSI host for all drivers and can try a small level recovery(LUN based for
>> example) first to avoid block the SCSI host.
>>
>> The new error handle mechanism introduced in this patchset has been
>> developed and tested with out self developed hardware since one year
>> ago, now we want this mechanism can be used by more drivers.
>>
>> Drivers can decide if using the new error handle mechanism and how to
>> handle error commands when scsi_device are scanned,the new mechanism
>> makes SCSI error handle more flexible.
>>
>> SCSI error recovery strategy after blocking host's IO is mainly
>> following steps:
>>
>> - LUN reset
>> - Target reset
>> - Bus reset
>> - Host reset
>>
>> Some drivers did not implement callbacks for host reset, it's unnecessary
>> to block host's IO for these drivers. For example, smartpqi only registered
>> device reset, if device reset failed, it's meaningless to fallback to target
>> reset, bus reset or host reset any more, because these steps would also
>> failed.
>>
>> Here are some drivers we concerned:(there are too many kinds of drivers
>> to figure out, so here I just list some drivers I am familiar with)
>>
>> +-------------+--------------+--------------+-----------+------------+
>> | drivers | device_reset | target_reset | bus_reset | host_reset |
>> +-------------+--------------+--------------+-----------+------------+
>> | mpt3sas | Y | Y | N | Y |
>> +-------------+--------------+--------------+-----------+------------+
>> | smartpqi | Y | N | N | N |
>> +-------------+--------------+--------------+-----------+------------+
>> | megaraidsas | N | Y | N | Y |
>> +-------------+--------------+--------------+-----------+------------+
>> | virtioscsi | Y | N | N | N |
>> +-------------+--------------+--------------+-----------+------------+
>> | iscsi_tcp | Y | Y | N | N |
>> +-------------+--------------+--------------+-----------+------------+
>> | hisisas | Y | Y | N | N |
>> +-------------+--------------+--------------+-----------+------------+
>>
>> For LUN based error handle, when scsi command is classified as error,
>> we would block the scsi device's IO and try to recover this scsi
>> device, if still can not recover all error commands, it might
>> fallback to target or host level recovery.
>>
>> It's same for target based error handle, but target based error handle
>> would block the scsi target's IO then try to recover the error commands
>> of this target.
>>
>> The first patch defines basic framework to support LUN/target based error
>> handle mechanism, three key operations are abstracted which are:
>> - add error command
>> - wake up error handle
>> - block IOs when error command is added and recoverying.
>>
>> Drivers can implement these three function callbacks and setup to SCSI
>> middle level; I also add a general LUN/target based error handle strategy
>> which can be called directly from drivers to implement LUN/tartget based
>> error handle.
>>
>> The changes of SCSI middle level's error handle are tested with scsi_debug
>> which support single LUN error injection, the scsi_debug patches can be
>> found in reference, following scenarios are tested.
>>
>> Scenario1: LUN based error handle is enabled:
>> +-----------+---------+-------------------------------------------------------+
>> | lun reset | TUR | Desired result |
>> + --------- + ------- + ------------------------------------------------------+
>> | success | success | retry or finish with EIO(may offline disk) |
>> + --------- + ------- + ------------------------------------------------------+
>> | success | fail | fallback to host recovery, retry or finish with |
>> | | | EIO(may offline disk) |
>
>
> I didn't get this part about when we offline the disk now. For this LUN case, is
> it if the driver has called scsi_device_setup_eh with the fallback arg with false?
> If so why doesn't it just try target reset -> bus reset -> host reset like before?
>
>

There are two scenarios to offline disk when LUN reset succeed and TUR failed:

1. The driver has called scsi_device_setup_eh with the fallback arg with false.
Drivers who did not implement eh_xxx_reset for target reset, bus reset and
host reset should set fallback to false because these steps would failed,
so it's meaningless to try target reset -> bus reset -> host reset any more.

2. The disk would also be offlined by sd_eh_action() defined in sd.c when error
recovery succeed but command times out contiuously.

int sd_eh_action(struct scsi_cmnd *scmd, int eh_disp) {
...
/*
* If the device keeps failing read/write commands but TEST UNIT
* READY always completes successfully we assume that medium
* access is no longer possible and take the device offline.
*/
if (sdkp->medium_access_timed_out >= sdkp->max_medium_access_timeouts) {
scmd_printk(KERN_ERR, scmd,
"Medium access timeout failure. Offlining disk!\n");
mutex_lock(&sdev->state_mutex);
scsi_device_set_state(sdev, SDEV_OFFLINE);
mutex_unlock(&sdev->state_mutex);

return SUCCESS;
}
...
}

>>From a high level I have the following questions/comments:
>
> 1. The modparam and drivers calling scsi_device_setup_eh/scsi_device_clear_eh
> seem uneeded.
>
> If the driver supports performing multiple TMFs/resets in parallel then why
> not always enable it?
>

Not all hardware/drivers support performing multiple TMFs/resets in parallel,
so I think it is necessary to call scsi_device_setup_eh/scsi_device_clear_eh
in specific drivers.
As to modparam, my original intention is let administrators determine whether
to enable LUN/target based error handle since it's a feature for host with
large scale of scsi devices.
If most users think the modparam is unnecessary, I would remove it.

> 2. You probably need to work more closely for some of the drivers. For the iscsi
> patch, we would want to allocate a tmf per device and per target, or if a cmd
> timesout on a second lun, that will always fail since the iscsi host has only
> one tmf preallocated and it would be in use by the first reset.
>

Sorry I did not check iscsi's device/target reset in detail.

> For the other drivers, did you review what they support? If so, how many drivers
> can you just turn this on for? Or what drivers did you test so far?
>
I reviewed the drivers changed in this patchset briefly, and it seems ok for
these drivers to turn this on. The tests are in progress but not finished now.
I would post the results in future.

My intention of posting this patchset is want people discuss about this new
error recovery mechanism. As to drivers, we would test drivers one by one, and
we are welcome others join us to analyze and test more drivers with these

2023-09-06 18:20:10

by Wenchao Hao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 09/19] scsi: core: increase/decrease target_busy without check can_queue

On 2023/9/6 7:55, Mike Christie wrote:
> On 9/1/23 4:41 AM, Wenchao Hao wrote:
>> This is preparation for a genernal target based error handle strategy
>> to check if to wake up actual error handler.
>>
>> Signed-off-by: Wenchao Hao <[email protected]>
>> ---
>> drivers/scsi/scsi_lib.c | 11 ++++-------
>> 1 file changed, 4 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index db0a42fe49c0..4a7fb48aa60f 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -293,8 +293,7 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd)
>>
>> scsi_dec_host_busy(shost, cmd);
>>
>> - if (starget->can_queue > 0)
>> - atomic_dec(&starget->target_busy);
>> + atomic_dec(&starget->target_busy);
>>
>
> Ming had found that removing the atomics improves perf.
> Since most drivers didn't care about the target level counters
> it was moved to the can_queue check so only drivers using the
> feature suffer the perf hit.
>
> Your patch should do the same.
>
Would update in next version

2023-09-14 07:56:05

by Wenchao Hao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 2023/9/1 17:41, Wenchao Hao wrote:
> It's unbearable for systems with large scale scsi devices share HBAs to
> block all devices' IOs when handle error commands, we need a new error
> handle mechanism to address this issue.
>
> I consulted about this issue a year ago, the discuss link can be found in
> refenence. Hannes replied about why we have to block the SCSI host
> then perform error recovery kindly. I think it's unnecessary to block
> SCSI host for all drivers and can try a small level recovery(LUN based for
> example) first to avoid block the SCSI host.
>
> The new error handle mechanism introduced in this patchset has been
> developed and tested with out self developed hardware since one year
> ago, now we want this mechanism can be used by more drivers.
>
> Drivers can decide if using the new error handle mechanism and how to
> handle error commands when scsi_device are scanned,the new mechanism
> makes SCSI error handle more flexible.
>
> SCSI error recovery strategy after blocking host's IO is mainly
> following steps:
>
> - LUN reset
> - Target reset
> - Bus reset
> - Host reset
>

Mike gave some suggestions and I found a bug in fallback logic, I would
address these and resend in next few days.

> Some drivers did not implement callbacks for host reset, it's unnecessary
> to block host's IO for these drivers. For example, smartpqi only registered
> device reset, if device reset failed, it's meaningless to fallback to target
> reset, bus reset or host reset any more, because these steps would also
> failed.
>
> Here are some drivers we concerned:(there are too many kinds of drivers
> to figure out, so here I just list some drivers I am familiar with)
>
> +-------------+--------------+--------------+-----------+------------+
> | drivers | device_reset | target_reset | bus_reset | host_reset |
> +-------------+--------------+--------------+-----------+------------+
> | mpt3sas | Y | Y | N | Y |
> +-------------+--------------+--------------+-----------+------------+
> | smartpqi | Y | N | N | N |
> +-------------+--------------+--------------+-----------+------------+
> | megaraidsas | N | Y | N | Y |
> +-------------+--------------+--------------+-----------+------------+
> | virtioscsi | Y | N | N | N |
> +-------------+--------------+--------------+-----------+------------+
> | iscsi_tcp | Y | Y | N | N |
> +-------------+--------------+--------------+-----------+------------+
> | hisisas | Y | Y | N | N |
> +-------------+--------------+--------------+-----------+------------+
>
> For LUN based error handle, when scsi command is classified as error,
> we would block the scsi device's IO and try to recover this scsi
> device, if still can not recover all error commands, it might
> fallback to target or host level recovery.
>
> It's same for target based error handle, but target based error handle
> would block the scsi target's IO then try to recover the error commands
> of this target.
>
> The first patch defines basic framework to support LUN/target based error
> handle mechanism, three key operations are abstracted which are:
> - add error command
> - wake up error handle
> - block IOs when error command is added and recoverying.
>
> Drivers can implement these three function callbacks and setup to SCSI
> middle level; I also add a general LUN/target based error handle strategy
> which can be called directly from drivers to implement LUN/tartget based
> error handle.
>
> The changes of SCSI middle level's error handle are tested with scsi_debug
> which support single LUN error injection, the scsi_debug patches can be
> found in reference, following scenarios are tested.
>
> Scenario1: LUN based error handle is enabled:
> +-----------+---------+-------------------------------------------------------+
> | lun reset | TUR | Desired result |
> + --------- + ------- + ------------------------------------------------------+
> | success | success | retry or finish with EIO(may offline disk) |
> + --------- + ------- + ------------------------------------------------------+
> | success | fail | fallback to host recovery, retry or finish with |
> | | | EIO(may offline disk) |
> + --------- + ------- + ------------------------------------------------------+
> | fail | NA | fallback to host recovery, retry or finish with |
> | | | EIO(may offline disk) |
> + --------- + ------- + ------------------------------------------------------+
>
> Scenario2: target based error handle is enabled:
> +-----------+---------+--------------+---------+------------------------------+
> | lun reset | TUR | target reset | TUR | Desired result |
> +-----------+---------+--------------+---------+------------------------------+
> | success | success | NA | NA | retry or finish with |
> | | | | | EIO(may offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
> | success | fail | success | success | retry or finish with |
> | | | | | EIO(may offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
> | fail | NA | success | success | retry or finish with |
> | | | | | EIO(may offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
> | fail | NA | success | fail | fallback to host recovery, |
> | | | | | retry or finish with EIO(may |
> | | | | | offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
> | fail | NA | fail | NA | fallback to host recovery, |
> | | | | | retry or finish with EIO(may |
> | | | | | offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
>
> Scenario3: both LUN and target based error handle are enabled:
> +-----------+---------+--------------+---------+------------------------------+
> | lun reset | TUR | target reset | TUR | Desired result |
> +-----------+---------+--------------+---------+------------------------------+
> | success | success | NA | NA | retry or finish with |
> | | | | | EIO(may offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
> | success | fail | success | success | lun recovery fallback to |
> | | | | | target recovery, retry or |
> | | | | | finish with EIO(may offline |
> | | | | | disk |
> +-----------+---------+--------------+---------+------------------------------+
> | fail | NA | success | success | lun recovery fallback to |
> | | | | | target recovery, retry or |
> | | | | | finish with EIO(may offline |
> | | | | | disk |
> +-----------+---------+--------------+---------+------------------------------+
> | fail | NA | success | fail | lun recovery fallback to |
> | | | | | target recovery, then fall |
> | | | | | back to host recovery, retry |
> | | | | | or fhinsi with EIO(may |
> | | | | | offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
> | fail | NA | fail | NA | lun recovery fallback to |
> | | | | | target recovery, then fall |
> | | | | | back to host recovery, retry |
> | | | | | or fhinsi with EIO(may |
> | | | | | offline disk) |
> +-----------+---------+--------------+---------+------------------------------+
>
> References: https://lore.kernel.org/linux-scsi/[email protected]/
> References: https://lore.kernel.org/linux-scsi/[email protected]/
>
> Wenchao Hao (19):
> scsi: scsi_error: Define framework for LUN/target based error handle
> scsi: scsi_error: Move complete variable eh_action from shost to sdevice
> scsi: scsi_error: Check if to do reset in scsi_try_xxx_reset
> scsi: scsi_error: Add helper scsi_eh_sdev_stu to do START_UNIT
> scsi: scsi_error: Add helper scsi_eh_sdev_reset to do lun reset
> scsi: scsi_error: Add flags to mark error handle steps has done
> scsi: scsi_error: Add helper to handle scsi device's error command list
> scsi: scsi_error: Add a general LUN based error handler
> scsi: core: increase/decrease target_busy without check can_queue
> scsi: scsi_error: Add helper to handle scsi target's error command list
> scsi: scsi_error: Add a general target based error handler
> scsi: scsi_debug: Add param to control LUN bassed error handler
> scsi: scsi_debug: Add param to control target based error handle
> scsi: mpt3sas: Add param to control LUN based error handle
> scsi: mpt3sas: Add param to control target based error handle
> scsi: smartpqi: Add param to control LUN based error handle
> scsi: megaraid_sas: Add param to control target based error handle
> scsi: virtio_scsi: Add param to control LUN based error handle
> scsi: iscsi_tcp: Add param to control LUN based error handle
>
> drivers/scsi/iscsi_tcp.c | 20 +
> drivers/scsi/megaraid/megaraid_sas_base.c | 20 +
> drivers/scsi/mpt3sas/mpt3sas_scsih.c | 28 +
> drivers/scsi/scsi_debug.c | 24 +
> drivers/scsi/scsi_error.c | 756 ++++++++++++++++++++--
> drivers/scsi/scsi_lib.c | 23 +-
> drivers/scsi/scsi_priv.h | 18 +
> drivers/scsi/smartpqi/smartpqi_init.c | 14 +
> drivers/scsi/virtio_scsi.c | 16 +-
> include/scsi/scsi_device.h | 97 +++
> include/scsi/scsi_eh.h | 8 +
> include/scsi/scsi_host.h | 2 -
> 12 files changed, 963 insertions(+), 63 deletions(-)
>

2023-09-25 15:15:50

by Wenchao Hao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 2023/9/25 22:55, Christoph Hellwig wrote:
> Before we add another new error handling mechanism we need to fix the
> old one first. Hannes' work on not passing the scsi_cmnd to the various
> reset handlers hasn't made a lot of progress in the last five years and
> we'll need to urgently fix that first before adding even more
> complexity.
>
I observed Hannes's patches posted about one year ago, it has not been
applied yet. I don't know if he is still working on it.

My patches do not depend much on that work, I think the conflict can be
solved fast between two changes.

2023-09-25 20:12:04

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

Before we add another new error handling mechanism we need to fix the
old one first. Hannes' work on not passing the scsi_cmnd to the various
reset handlers hasn't made a lot of progress in the last five years and
we'll need to urgently fix that first before adding even more
complexity.

2023-09-25 20:26:01

by Mike Christie

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 9/14/23 1:20 AM, Wenchao Hao wrote:
> On 2023/9/1 17:41, Wenchao Hao wrote:
>> It's unbearable for systems with large scale scsi devices share HBAs to
>> block all devices' IOs when handle error commands, we need a new error
>> handle mechanism to address this issue.
>>
>> I consulted about this issue a year ago, the discuss link can be found in
>> refenence. Hannes replied about why we have to block the SCSI host
>> then perform error recovery kindly. I think it's unnecessary to block
>> SCSI host for all drivers and can try a small level recovery(LUN based for
>> example) first to avoid block the SCSI host.
>>
>> The new error handle mechanism introduced in this patchset has been
>> developed and tested with out self developed hardware since one year
>> ago, now we want this mechanism can be used by more drivers.
>>
>> Drivers can decide if using the new error handle mechanism and how to
>> handle error commands when scsi_device are scanned,the new mechanism
>> makes SCSI error handle more flexible.
>>
>> SCSI error recovery strategy after blocking host's IO is mainly
>> following steps:
>>
>> - LUN reset
>> - Target reset
>> - Bus reset
>> - Host reset
>>
>
> Mike gave some suggestions and I found a bug in fallback logic, I would
> address these and resend in next few days.

Please wait to resend. I'm still reviewing the patches. When I commented
last time I just did a quick look over to get an idea for the design and
what your goals were.

2023-09-25 22:05:24

by Mike Christie

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 9/25/23 10:07 AM, Wenchao Hao wrote:
> On 2023/9/25 22:55, Christoph Hellwig wrote:
>> Before we add another new error handling mechanism we need to fix the
>> old one first.  Hannes' work on not passing the scsi_cmnd to the various
>> reset handlers hasn't made a lot of progress in the last five years and
>> we'll need to urgently fix that first before adding even more
>> complexity.
>>
> I observed Hannes's patches posted about one year ago, it has not been
> applied yet. I don't know if he is still working on it.
>
> My patches do not depend much on that work, I think the conflict can be
> solved fast between two changes.

I think we want to figure out Hannes's patches first.

For a new EH design we will want to be able to do multiple TMFs in parallel
on the same host/target right?

The problem is that we need to be able to make forward progress in the EH
path and not fail just because we can't allocate memory for a TMF related
struct. To accomplish this now, drivers will use mempools, preallocate TMF
related structs/mem/tags with their scsi_cmnd related structs, preallocate
per host/target/device related structs or ignore what I wrote above and just
fail.

Hannes's patches fix up the eh callouts so they don't pass in a scsi_cmnd
when it's not needed. That seems nice because after that, then for your new
EH we can begin to standardize on how to handle preallocation of drivers
resources needed to perform TMFs for your new EH. It could be a per
device/target/host callout to allow drivers to preallocate, then scsi-ml calls
into the drivers with that data. It doesn't have to be exactly like that or
anything close. It would be nice for drivers to not have to think about this
type of thing and scsi-ml just to handle the resource management for us when
there are multiple TMFs in progress.

2023-09-26 11:48:11

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On Mon, Sep 25, 2023 at 12:54:48PM -0500, Mike Christie wrote:
> I think we want to figure out Hannes's patches first.

Yes.

> For a new EH design we will want to be able to do multiple TMFs in parallel
> on the same host/target right?
>
> The problem is that we need to be able to make forward progress in the EH
> path and not fail just because we can't allocate memory for a TMF related
> struct. To accomplish this now, drivers will use mempools, preallocate TMF
> related structs/mem/tags with their scsi_cmnd related structs, preallocate
> per host/target/device related structs or ignore what I wrote above and just
> fail.
>
> Hannes's patches fix up the eh callouts so they don't pass in a scsi_cmnd
> when it's not needed. That seems nice because after that, then for your new
> EH we can begin to standardize on how to handle preallocation of drivers
> resources needed to perform TMFs for your new EH. It could be a per
> device/target/host callout to allow drivers to preallocate, then scsi-ml calls
> into the drivers with that data. It doesn't have to be exactly like that or
> anything close. It would be nice for drivers to not have to think about this
> type of thing and scsi-ml just to handle the resource management for us when
> there are multiple TMFs in progress.

Exactly!

2023-09-26 13:09:45

by Wenchao Hao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 2023/9/26 1:54, Mike Christie wrote:
> On 9/25/23 10:07 AM, Wenchao Hao wrote:
>> On 2023/9/25 22:55, Christoph Hellwig wrote:
>>> Before we add another new error handling mechanism we need to fix the
>>> old one first.  Hannes' work on not passing the scsi_cmnd to the various
>>> reset handlers hasn't made a lot of progress in the last five years and
>>> we'll need to urgently fix that first before adding even more
>>> complexity.
>>>
>> I observed Hannes's patches posted about one year ago, it has not been
>> applied yet. I don't know if he is still working on it.
>>
>> My patches do not depend much on that work, I think the conflict can be
>> solved fast between two changes.
>
> I think we want to figure out Hannes's patches first.
>
> For a new EH design we will want to be able to do multiple TMFs in parallel
> on the same host/target right?
>

It's not necessary to do multiple TMFs in parallel, it's ok to make sure
each TMFs do not affect each other.

For example, we have two devices: 0:0:0:0 and 0:0:0:1

Both of them request device reset, they do not happened in parallel, but
would in serial. If 0:0:0:0 is performing device reset in progress, 0:0:0:1
just wait 0:0:0:0 to finish.

> The problem is that we need to be able to make forward progress in the EH
> path and not fail just because we can't allocate memory for a TMF related
> struct. To accomplish this now, drivers will use mempools, preallocate TMF
> related structs/mem/tags with their scsi_cmnd related structs, preallocate
> per host/target/device related structs or ignore what I wrote above and just
> fail.
>
> Hannes's patches fix up the eh callouts so they don't pass in a scsi_cmnd
> when it's not needed. That seems nice because after that, then for your new
> EH we can begin to standardize on how to handle preallocation of drivers
> resources needed to perform TMFs for your new EH. It could be a per
> device/target/host callout to allow drivers to preallocate, then scsi-ml calls
> into the drivers with that data. It doesn't have to be exactly like that or
> anything close. It would be nice for drivers to not have to think about this
> type of thing and scsi-ml just to handle the resource management for us when
> there are multiple TMFs in progress.
>

2023-09-26 22:55:33

by Mike Christie

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 9/26/23 7:57 AM, Wenchao Hao wrote:
> On 2023/9/26 1:54, Mike Christie wrote:
>> On 9/25/23 10:07 AM, Wenchao Hao wrote:
>>> On 2023/9/25 22:55, Christoph Hellwig wrote:
>>>> Before we add another new error handling mechanism we need to fix the
>>>> old one first.  Hannes' work on not passing the scsi_cmnd to the various
>>>> reset handlers hasn't made a lot of progress in the last five years and
>>>> we'll need to urgently fix that first before adding even more
>>>> complexity.
>>>>
>>> I observed Hannes's patches posted about one year ago, it has not been
>>> applied yet. I don't know if he is still working on it.
>>>
>>> My patches do not depend much on that work, I think the conflict can be
>>> solved fast between two changes.
>>
>> I think we want to figure out Hannes's patches first.
>>
>> For a new EH design we will want to be able to do multiple TMFs in parallel
>> on the same host/target right?
>>
>
> It's not necessary to do multiple TMFs in parallel, it's ok to make sure
> each TMFs do not affect each other.
>
> For example, we have two devices: 0:0:0:0 and 0:0:0:1
>
> Both of them request device reset, they do not happened in parallel, but
> would in serial. If 0:0:0:0 is performing device reset in progress, 0:0:0:1
> just wait 0:0:0:0 to finish.

I see. I guess we still get the benefit of not having to stop other devices
when doing TMFs.

I think we still want a common way to allocate/free and manage resources
drivers will use during this time. Maybe have a init_device/target/cmd/eh_priv
and exit_device/target/eh_priv (I'm not sure of the name, but something similar
to the init_cmd_priv/exit_cmd_priv we have for normal commands.

scsi-ml then calls into the new eh with the priv data. Drivers don't have to
do the preallocation and worry if it's per device/target/host.

I'm not 100% sure about the low level details. Check out how Hannes's is
handling tag management for TMFs as well.


>
>> The problem is that we need to be able to make forward progress in the EH
>> path and not fail just because we can't allocate memory for a TMF related
>> struct. To accomplish this now, drivers will use mempools, preallocate TMF
>> related structs/mem/tags with their scsi_cmnd related structs, preallocate
>> per host/target/device related structs or ignore what I wrote above and just
>> fail.
>>
>> Hannes's patches fix up the eh callouts so they don't pass in a scsi_cmnd
>> when it's not needed. That seems nice because after that, then for your new
>> EH we can begin to standardize on how to handle preallocation of drivers
>> resources needed to perform TMFs for your new EH. It could be a per
>> device/target/host callout to allow drivers to preallocate, then scsi-ml calls
>> into the drivers with that data. It doesn't have to be exactly like that or
>> anything close. It would be nice for drivers to not have to think about this
>> type of thing and scsi-ml just to handle the resource management for us when
>> there are multiple TMFs in progress.
>>
>

2023-09-27 09:49:01

by Wenchao Hao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 2023/9/27 1:37, Mike Christie wrote:
> On 9/26/23 7:57 AM, Wenchao Hao wrote:
>> On 2023/9/26 1:54, Mike Christie wrote:
>>> On 9/25/23 10:07 AM, Wenchao Hao wrote:
>>>> On 2023/9/25 22:55, Christoph Hellwig wrote:
>>>>> Before we add another new error handling mechanism we need to fix the
>>>>> old one first.  Hannes' work on not passing the scsi_cmnd to the various
>>>>> reset handlers hasn't made a lot of progress in the last five years and
>>>>> we'll need to urgently fix that first before adding even more
>>>>> complexity.
>>>>>
>>>> I observed Hannes's patches posted about one year ago, it has not been
>>>> applied yet. I don't know if he is still working on it.
>>>>
>>>> My patches do not depend much on that work, I think the conflict can be
>>>> solved fast between two changes.
>>>
>>> I think we want to figure out Hannes's patches first.
>>>
>>> For a new EH design we will want to be able to do multiple TMFs in parallel
>>> on the same host/target right?
>>>
>>
>> It's not necessary to do multiple TMFs in parallel, it's ok to make sure
>> each TMFs do not affect each other.
>>
>> For example, we have two devices: 0:0:0:0 and 0:0:0:1
>>
>> Both of them request device reset, they do not happened in parallel, but
>> would in serial. If 0:0:0:0 is performing device reset in progress, 0:0:0:1
>> just wait 0:0:0:0 to finish.
>
> I see. I guess we still get the benefit of not having to stop other devices
> when doing TMFs.
>

Yes, it's better to support multiple TMFs in parallel than just run in serial.
I would wait for Hannes's changes to be applied and send my change again.

> I think we still want a common way to allocate/free and manage resources
> drivers will use during this time. Maybe have a init_device/target/cmd/eh_priv
> and exit_device/target/eh_priv (I'm not sure of the name, but something similar
> to the init_cmd_priv/exit_cmd_priv we have for normal commands.
>
> scsi-ml then calls into the new eh with the priv data. Drivers don't have to
> do the preallocation and worry if it's per device/target/host.
>
> I'm not 100% sure about the low level details. Check out how Hannes's is
> handling tag management for TMFs as well.
>
>
>>
>>> The problem is that we need to be able to make forward progress in the EH
>>> path and not fail just because we can't allocate memory for a TMF related
>>> struct. To accomplish this now, drivers will use mempools, preallocate TMF
>>> related structs/mem/tags with their scsi_cmnd related structs, preallocate
>>> per host/target/device related structs or ignore what I wrote above and just
>>> fail.
>>>
>>> Hannes's patches fix up the eh callouts so they don't pass in a scsi_cmnd
>>> when it's not needed. That seems nice because after that, then for your new
>>> EH we can begin to standardize on how to handle preallocation of drivers
>>> resources needed to perform TMFs for your new EH. It could be a per
>>> device/target/host callout to allow drivers to preallocate, then scsi-ml calls
>>> into the drivers with that data. It doesn't have to be exactly like that or
>>> anything close. It would be nice for drivers to not have to think about this
>>> type of thing and scsi-ml just to handle the resource management for us when
>>> there are multiple TMFs in progress.
>>>
>>
>

2023-09-27 10:39:19

by Wenchao Hao

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 2023/9/27 15:59, Hannes Reinecke wrote:
> On 9/26/23 14:57, Wenchao Hao wrote:
>> On 2023/9/26 1:54, Mike Christie wrote:
>>> On 9/25/23 10:07 AM, Wenchao Hao wrote:
>>>> On 2023/9/25 22:55, Christoph Hellwig wrote:
>>>>> Before we add another new error handling mechanism we need to fix the
>>>>> old one first.  Hannes' work on not passing the scsi_cmnd to the various
>>>>> reset handlers hasn't made a lot of progress in the last five years and
>>>>> we'll need to urgently fix that first before adding even more
>>>>> complexity.
>>>>>
>>>> I observed Hannes's patches posted about one year ago, it has not been
>>>> applied yet. I don't know if he is still working on it.
>>>>
>>>> My patches do not depend much on that work, I think the conflict can be
>>>> solved fast between two changes.
>>>
>>> I think we want to figure out Hannes's patches first.
>>>
>>> For a new EH design we will want to be able to do multiple TMFs in parallel
>>> on the same host/target right?
>>>
>>
>> It's not necessary to do multiple TMFs in parallel, it's ok to make sure
>> each TMFs do not affect each other.
>>
>> For example, we have two devices: 0:0:0:0 and 0:0:0:1
>>
>> Both of them request device reset, they do not happened in parallel, but
>> would in serial. If 0:0:0:0 is performing device reset in progress, 0:0:0:1
>> just wait 0:0:0:0 to finish.
>>
> Well, not quite. Any higher-order TMFs are serialized by virtue of SCSI-EH, but command aborts (which also devolve down to TMFs on certain drivers) do run in parallel, and there we will be requiring multiple TMFs.
>

It's best that multiple TMFs can run in parallel, again, looking forwarding
to your changes.

> Cheers,
>
> Hannes

2023-09-27 11:49:31

by Hannes Reinecke

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 9/26/23 09:26, Christoph Hellwig wrote:
> On Mon, Sep 25, 2023 at 12:54:48PM -0500, Mike Christie wrote:
>> I think we want to figure out Hannes's patches first.
>
> Yes.
>
>> For a new EH design we will want to be able to do multiple TMFs in parallel
>> on the same host/target right?
>>
>> The problem is that we need to be able to make forward progress in the EH
>> path and not fail just because we can't allocate memory for a TMF related
>> struct. To accomplish this now, drivers will use mempools, preallocate TMF
>> related structs/mem/tags with their scsi_cmnd related structs, preallocate
>> per host/target/device related structs or ignore what I wrote above and just
>> fail.
>>
>> Hannes's patches fix up the eh callouts so they don't pass in a scsi_cmnd
>> when it's not needed. That seems nice because after that, then for your new
>> EH we can begin to standardize on how to handle preallocation of drivers
>> resources needed to perform TMFs for your new EH. It could be a per
>> device/target/host callout to allow drivers to preallocate, then scsi-ml calls
>> into the drivers with that data. It doesn't have to be exactly like that or
>> anything close. It would be nice for drivers to not have to think about this
>> type of thing and scsi-ml just to handle the resource management for us when
>> there are multiple TMFs in progress.
>
> Exactly!

Yeah, thanks for the vote of support.
Last time I tried the attempt got shot down, as it had been using the
'wrong' interface. But seeing that there's renewed interest I'll be
reposting them.

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman

2023-09-27 18:56:07

by Hannes Reinecke

[permalink] [raw]
Subject: Re: [RFC PATCH v2 00/18] scsi: scsi_error: Introduce new error handle mechanism

On 9/26/23 14:57, Wenchao Hao wrote:
> On 2023/9/26 1:54, Mike Christie wrote:
>> On 9/25/23 10:07 AM, Wenchao Hao wrote:
>>> On 2023/9/25 22:55, Christoph Hellwig wrote:
>>>> Before we add another new error handling mechanism we need to fix the
>>>> old one first.  Hannes' work on not passing the scsi_cmnd to the
>>>> various
>>>> reset handlers hasn't made a lot of progress in the last five years and
>>>> we'll need to urgently fix that first before adding even more
>>>> complexity.
>>>>
>>> I observed Hannes's patches posted about one year ago, it has not been
>>> applied yet. I don't know if he is still working on it.
>>>
>>> My patches do not depend much on that work, I think the conflict can be
>>> solved fast between two changes.
>>
>> I think we want to figure out Hannes's patches first.
>>
>> For a new EH design we will want to be able to do multiple TMFs in
>> parallel
>> on the same host/target right?
>>
>
> It's not necessary to do multiple TMFs in parallel, it's ok to make sure
> each TMFs do not affect each other.
>
> For example, we have two devices: 0:0:0:0 and 0:0:0:1
>
> Both of them request device reset, they do not happened in parallel, but
> would in serial. If 0:0:0:0 is performing device reset in progress, 0:0:0:1
> just wait 0:0:0:0 to finish.
>
Well, not quite. Any higher-order TMFs are serialized by virtue of
SCSI-EH, but command aborts (which also devolve down to TMFs on certain
drivers) do run in parallel, and there we will be requiring multiple TMFs.

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman