2023-04-27 12:13:32

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v2 0/6] scsi:scsi_debug: Add error injection for single device

The original error injection mechanism was based on scsi_host which
could not inject fault for a single SCSI device.

This patchset provides the ability to inject errors for a single
SCSI device. Now we supports inject timeout errors, queuecommand
errors, and hostbyte, driverbyte, statusbyte, and sense data for
specific SCSI Command.

The first two patch add an debugfs interface to add and inquiry single
device's error injection info; the third patch defined how to remove
an injection which has been added. The following 3 patches use the
injection info and generate the related error type.

V2:
- Using debugfs rather than sysfs attribute interface to manage error

Wenchao Hao (6):
scsi:scsi_debug: create scsi_debug directory in the debugfs filesystem
scsi:scsi_debug: Add interface to manage single device's error inject
scsi:scsi_debug: Define grammar to remove added error injection
scsi:scsi_debug: timeout command if the error is injected
scsi:scsi_debug: Return failed value if the error is injected
scsi:scsi_debug: set command's result and sense data if the error is
injected

drivers/scsi/scsi_debug.c | 318 ++++++++++++++++++++++++++++++++++++++
1 file changed, 318 insertions(+)


--
2.32.0


2023-04-27 12:13:51

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v2 1/6] scsi:scsi_debug: create scsi_debug directory in the debugfs filesystem

Create directory scsi_debug in the root of the debugfs filesystem.
Prepare to add interface for manage error injection.

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_debug.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 8c58128ad32a..d4bf67f52fbb 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -41,6 +41,7 @@
#include <linux/random.h>
#include <linux/xarray.h>
#include <linux/prefetch.h>
+#include <linux/debugfs.h>

#include <net/checksum.h>

@@ -867,6 +868,8 @@ static const int device_qfull_result =

static const int condition_met_result = SAM_STAT_CONDITION_MET;

+static struct dentry *sdebug_debugfs_root;
+

/* Only do the extra work involved in logical block provisioning if one or
* more of the lbpu, lbpws or lbpws10 parameters are given and we are doing
@@ -7019,6 +7022,8 @@ static int __init scsi_debug_init(void)
goto driver_unreg;
}

+ sdebug_debugfs_root = debugfs_create_dir("scsi_debug", NULL);
+
for (k = 0; k < hosts_to_add; k++) {
if (want_store && k == 0) {
ret = sdebug_add_host_helper(idx);
@@ -7065,6 +7070,7 @@ static void __exit scsi_debug_exit(void)

sdebug_erase_all_stores(false);
xa_destroy(per_store_ap);
+ debugfs_remove(sdebug_debugfs_root);
}

device_initcall(scsi_debug_init);
--
2.32.0

2023-04-27 12:14:02

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v2 2/6] scsi:scsi_debug: Add interface to manage single device's error inject

Add interface "/sys/kernel/debug/scsi_debug/<h:c:t:l>/error" to
manage scsi_device's error injection rules. Write it to add/remove an error
injection rule, read it to get error injection rules.

The interface format is line-by-line, each line is an error injection rule.
Each rule contains integers separated by spaces, first three columns are
fixed to "Error code", "Error count" and "SCSI command", other columns
depend on error code.

+--------+------+-------------------------------------------------------+
| Column | Type | Description |
+--------+------+-------------------------------------------------------+
| 1 | u8 | Error code |
| | | 0: timeout SCSI command |
| | | 1: fail queuecommand, make queuecommand return |
| | | specific value |
| | | 2: fail command, finish command with specific status |
| | | and sense date |
+--------+------+-------------------------------------------------------+
| 2 | s32 | Error Count |
| | | 0: the rule would not take effect |
| | | positive: the rule would always take effect |
| | | negative: the rule take effect for the absolute of |
| | | this value |
+--------+------+-------------------------------------------------------+
| 3 | x8 | SCSI command this rule is for, 0xff for all command |
+--------+------+-------------------------------------------------------+
| ... | xxx | Error type specific fields |
+--------+------+-------------------------------------------------------+

- When multiple error injects are added to one scsi command, the one
with smaller error code would take effect.
- If an same error is added to one device, old one would be overwritten.
- Currently, basic types are (u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal
types (x8/x16/x32/x64)

Timeout SCSI command:

+--------+------+-------------------------------------------------------+
| Column | Type | Description |
+--------+------+-------------------------------------------------------+
| 1 | u8 | Error type, fixed to 0x0 |
+--------+------+-------------------------------------------------------+
| 2 | s32 | Error Count |
| | | 0: the rule would not take effect |
| | | positive: the rule would always take effect |
| | | negative: the rule take effect for the absolute of |
| | | this value |
+--------+------+-------------------------------------------------------+
| 3 | x8 | SCSI command this rule is for, 0xff for all command |
+--------+------+-------------------------------------------------------+

Examples:
error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
echo "0 -10 0x12" > $error
would make the device's inquiry command timeout for 10 time.
error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
echo "0 1 0x12" > $error
would make the device's inquiry timeout forever.

Make queuecommand return specific value:

+--------+------+-------------------------------------------------------+
| Column | Type | Description |
+--------+------+-------------------------------------------------------+
| 1 | u8 | Error type, fixed to 0x1 |
+--------+------+-------------------------------------------------------+
| 2 | s32 | Error Count |
| | | 0: the rule would not take effect |
| | | positive: the rule would always take effect |
| | | negative: the rule take effect for the absolute of |
| | | this value |
+--------+------+-------------------------------------------------------+
| 3 | x8 | SCSI command this rule is for, 0xff for all command |
+--------+------+-------------------------------------------------------+
| 4 | x32 | The return value of queuecommand we desired |
+--------+------+-------------------------------------------------------+

Examples:
error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
echo "1 1 0x12 0x1055" > $error
would make device's inquiry command return 0x1055(SCSI_MLQUEUE_HOST_BUSY)
forever.

Set scsi_cmd's status and sense data to specific value:

+--------+------+-------------------------------------------------------+
| Column | Type | Description |
+--------+------+-------------------------------------------------------+
| 1 | u8 | Error type, fixed to 0x2 |
+--------+------+-------------------------------------------------------+
| 2 | s32 | Error Count |
| | | 0: the rule would not take effect |
| | | positive: the rule would always take effect |
| | | negative: the rule take effect for the absolute of |
| | | this value |
+--------+------+-------------------------------------------------------+
| 3 | x8 | SCSI command this rule is for, 0xff for all command |
+--------+------+-------------------------------------------------------+
| 4 | x8 | Host byte in scsi_cmd's status |
+--------+------+-------------------------------------------------------+
| 5 | x8 | Driver byte in scsi_cmd's status |
+--------+------+-------------------------------------------------------+
| 6 | x8 | Status byte in scsi_cmd's status |
+--------+------+-------------------------------------------------------+
| 7 | x8 | Sense Key of scsi_cmnd |
+--------+------+-------------------------------------------------------+
| 8 | x8 | ASC of scsi_cmnd |
+--------+------+-------------------------------------------------------+
| 9 | x8 | ASQ of scsi_cmnd |
+--------+------+-------------------------------------------------------+

Examples:
error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
echo "2 -10 0x88 0 0 0x2 0x3 0x11 0x0" >$error
would make device's read command return with media error UNC:

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_debug.c | 174 ++++++++++++++++++++++++++++++++++++++
1 file changed, 174 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index d4bf67f52fbb..1d19a2f1f5f2 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -286,6 +286,41 @@ struct sdeb_zone_state { /* ZBC: per zone state */
sector_t z_wp;
};

+enum sdebug_err_type {
+ ERR_TMOUT_CMD = 0, /* make specific scsi command timeout */
+ ERR_FAIL_QUEUE_CMD = 1, /* make specific scsi command's */
+ /* queuecmd return failed */
+ ERR_FAIL_CMD = 2, /* make specific scsi command's */
+ /* queuecmd return succeed but */
+ /* with errors set in scsi_cmnd */
+};
+
+struct sdebug_err_inject {
+ int type;
+ struct list_head list;
+ int cnt;
+ unsigned char cmd;
+
+ union {
+ /*
+ * For ERR_FAIL_QUEUE_CMD
+ */
+ int queuecmd_ret;
+
+ /*
+ * For ERR_FAIL_CMD
+ */
+ struct {
+ unsigned char host_byte;
+ unsigned char driver_byte;
+ unsigned char status_byte;
+ unsigned char sense_key;
+ unsigned char asc;
+ unsigned char asq;
+ };
+ };
+};
+
struct sdebug_dev_info {
struct list_head dev_list;
unsigned int channel;
@@ -311,6 +346,9 @@ struct sdebug_dev_info {
unsigned int max_open;
ktime_t create_ts; /* time since bootup that this device was created */
struct sdeb_zone_state *zstate;
+
+ struct dentry *debugfs_entry;
+ struct list_head inject_err_list;
};

struct sdebug_host_info {
@@ -870,6 +908,132 @@ static const int condition_met_result = SAM_STAT_CONDITION_MET;

static struct dentry *sdebug_debugfs_root;

+static void sdebug_err_add(struct scsi_device *sdev, struct sdebug_err_inject *new)
+{
+ struct sdebug_dev_info *devip = (struct sdebug_dev_info *)sdev->hostdata;
+ struct sdebug_err_inject *tmp, *err;
+
+ list_for_each_entry_safe(err, tmp, &devip->inject_err_list, list) {
+ if (err->type == new->type && err->cmd == new->cmd) {
+ sdev_printk(KERN_INFO, sdev, "Substituted %d 0x%x\n",
+ err->type, err->cmd);
+ list_del(&err->list);
+ kfree(err);
+ }
+ }
+
+ list_add_tail(&new->list, &devip->inject_err_list);
+}
+
+static int sdebug_error_show(struct seq_file *m, void *p)
+{
+ struct scsi_device *sdev = (struct scsi_device *)m->private;
+ struct sdebug_dev_info *devip = (struct sdebug_dev_info *)sdev->hostdata;
+ struct sdebug_err_inject *err;
+
+ seq_puts(m, "Type\tCount\tCommand\n");
+
+ list_for_each_entry(err, &devip->inject_err_list, list) {
+ switch (err->type) {
+ case ERR_TMOUT_CMD:
+ seq_printf(m, "%d\t%d\t0x%x\n", err->type, err->cnt,
+ err->cmd);
+ break;
+
+ case ERR_FAIL_QUEUE_CMD:
+ seq_printf(m, "%d\t%d\t0x%x\t0x%x\n", err->type,
+ err->cnt, err->cmd, err->queuecmd_ret);
+ break;
+
+ case ERR_FAIL_CMD:
+ seq_printf(m, "%d\t%d\t0x%x\t0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
+ err->type, err->cnt, err->cmd,
+ err->host_byte, err->driver_byte,
+ err->status_byte, err->sense_key,
+ err->asc, err->asq);
+ break;
+ }
+ }
+
+ return 0;
+}
+
+static int sdebug_error_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, sdebug_error_show, inode->i_private);
+}
+
+static ssize_t sdebug_error_write(struct file *file, const char __user *ubuf,
+ size_t count, loff_t *ppos)
+{
+ char *buf;
+ unsigned int inject_type;
+ struct sdebug_err_inject *inject;
+ struct scsi_device *sdev = (struct scsi_device *)file->f_inode->i_private;
+
+ buf = kmalloc(count, GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ if (copy_from_user(buf, ubuf, count)) {
+ kfree(buf);
+ return -EFAULT;
+ }
+
+ if (sscanf(buf, "%d", &inject_type) != 1) {
+ kfree(buf);
+ return -EINVAL;
+ }
+
+ inject = kzalloc(sizeof(struct sdebug_err_inject), GFP_KERNEL);
+ if (!inject) {
+ kfree(buf);
+ return -ENOMEM;
+ }
+
+ switch (inject_type) {
+ case ERR_TMOUT_CMD:
+ if (sscanf(buf, "%d %d %hhx", &inject->type, &inject->cnt,
+ &inject->cmd) != 3)
+ goto out_error;
+ break;
+
+ case ERR_FAIL_QUEUE_CMD:
+ if (sscanf(buf, "%d %d %hhx %x", &inject->type, &inject->cnt,
+ &inject->cmd, &inject->queuecmd_ret) != 4)
+ goto out_error;
+ break;
+
+ case ERR_FAIL_CMD:
+ if (sscanf(buf, "%d %d %hhx %hhx %hhx %hhx %hhx %hhx %hhx",
+ &inject->type, &inject->cnt, &inject->cmd,
+ &inject->host_byte, &inject->driver_byte,
+ &inject->status_byte, &inject->sense_key,
+ &inject->asc, &inject->asq) != 9)
+ goto out_error;
+ break;
+
+ default:
+ goto out_error;
+ break;
+ }
+
+ kfree(buf);
+ sdebug_err_add(sdev, inject);
+
+ return count;
+
+out_error:
+ kfree(buf);
+ kfree(inject);
+ return -EINVAL;
+}
+
+static const struct file_operations sdebug_error_fops = {
+ .open = sdebug_error_open,
+ .read = seq_read,
+ .write = sdebug_error_write,
+};

/* Only do the extra work involved in logical block provisioning if one or
* more of the lbpu, lbpws or lbpws10 parameters are given and we are doing
@@ -5104,6 +5268,7 @@ static struct sdebug_dev_info *sdebug_device_create(
}
devip->create_ts = ktime_get_boottime();
atomic_set(&devip->stopped, (sdeb_tur_ms_to_ready > 0 ? 2 : 0));
+ INIT_LIST_HEAD(&devip->inject_err_list);
list_add_tail(&devip->dev_list, &sdbg_host->dev_info_list);
}
return devip;
@@ -5149,6 +5314,7 @@ static int scsi_debug_slave_alloc(struct scsi_device *sdp)
if (sdebug_verbose)
pr_info("slave_alloc <%u %u %u %llu>\n",
sdp->host->host_no, sdp->channel, sdp->id, sdp->lun);
+
return 0;
}

@@ -5171,6 +5337,12 @@ static int scsi_debug_slave_configure(struct scsi_device *sdp)
if (sdebug_no_uld)
sdp->no_uld_attach = 1;
config_cdb_len(sdp);
+
+ devip->debugfs_entry = debugfs_create_dir(dev_name(&sdp->sdev_dev),
+ sdebug_debugfs_root);
+ debugfs_create_file("error", 0600, devip->debugfs_entry, sdp,
+ &sdebug_error_fops);
+
return 0;
}

@@ -5182,6 +5354,8 @@ static void scsi_debug_slave_destroy(struct scsi_device *sdp)
if (sdebug_verbose)
pr_info("slave_destroy <%u %u %u %llu>\n",
sdp->host->host_no, sdp->channel, sdp->id, sdp->lun);
+
+ debugfs_remove(devip->debugfs_entry);
if (devip) {
/* make this slot available for re-use */
devip->used = false;
--
2.32.0

2023-04-27 12:14:37

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v2 4/6] scsi:scsi_debug: timeout command if the error is injected

If a timeout error is injected, return 0 from scsi_debug_queuecommand to
make the command timeout.

For example, the following command would inject timeout error for
inquiry(0x12) command for 10 times:
error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
echo "0 -10 0x12" > $error

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_debug.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 2c380cddcb84..2d31ae15bd97 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -7714,6 +7714,30 @@ static int sdebug_blk_mq_poll(struct Scsi_Host *shost, unsigned int queue_num)
return num_entries;
}

+static int sdebug_timeout_cmd(struct scsi_cmnd *cmnd)
+{
+ struct scsi_device *sdp = cmnd->device;
+ struct sdebug_dev_info *devip = (struct sdebug_dev_info *)sdp->hostdata;
+ struct sdebug_err_inject *err;
+ unsigned char *cmd = cmnd->cmnd;
+ int ret = 0;
+
+ if (devip == NULL)
+ return 0;
+
+ list_for_each_entry(err, &devip->inject_err_list, list) {
+ if (err->type == ERR_TMOUT_CMD &&
+ (err->cmd == cmd[0] || err->cmd == 0xff)) {
+ ret = !!err->cnt;
+ if (err->cnt < 0)
+ err->cnt++;
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
static int scsi_debug_queuecommand(struct Scsi_Host *shost,
struct scsi_cmnd *scp)
{
@@ -7772,6 +7796,12 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
if (NULL == devip)
goto err_out;
}
+
+ if (sdebug_timeout_cmd(scp)) {
+ scmd_printk(KERN_INFO, scp, "timeout command 0x%x\n", opcode);
+ return 0;
+ }
+
if (unlikely(inject_now && !atomic_read(&sdeb_inject_pending)))
atomic_set(&sdeb_inject_pending, 1);

--
2.32.0

2023-04-27 12:25:32

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v2 3/6] scsi:scsi_debug: Define grammar to remove added error injection

The grammar to remove error injection is a line with fixed 3 columns
separated by spaces.

First column is fixed to "-". It tells this is a removal operation.
Second column is the error code to match.
Third column is the scsi command to match.

For example the following command would remove timeout injection of
inquiry command.

echo "- 0 0x12" > /sys/kernel/debug/scsi_debug/0:0:0:1/error

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_debug.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 1d19a2f1f5f2..2c380cddcb84 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -925,6 +925,33 @@ static void sdebug_err_add(struct scsi_device *sdev, struct sdebug_err_inject *n
list_add_tail(&new->list, &devip->inject_err_list);
}

+static int sdebug_err_remove(struct scsi_device *sdev, const char *buf, size_t count)
+{
+ struct sdebug_dev_info *devip = (struct sdebug_dev_info *)sdev->hostdata;
+ struct sdebug_err_inject *tmp, *err;
+ int type;
+ unsigned char cmd;
+
+ if (sscanf(buf, "- %d %hhx", &type, &cmd) != 2) {
+ kfree(buf);
+ return -EINVAL;
+ }
+
+ list_for_each_entry_safe(err, tmp, &devip->inject_err_list, list) {
+ if (err->type == type && err->cmd == cmd) {
+ sdev_printk(KERN_INFO, sdev, "Remove %d 0x%x\n",
+ err->type, err->cmd);
+ list_del(&err->list);
+ kfree(err);
+ kfree(buf);
+ return count;
+ }
+ }
+
+ kfree(buf);
+ return -EINVAL;
+}
+
static int sdebug_error_show(struct seq_file *m, void *p)
{
struct scsi_device *sdev = (struct scsi_device *)m->private;
@@ -980,6 +1007,9 @@ static ssize_t sdebug_error_write(struct file *file, const char __user *ubuf,
return -EFAULT;
}

+ if (buf[0] == '-')
+ return sdebug_err_remove(sdev, buf, count);
+
if (sscanf(buf, "%d", &inject_type) != 1) {
kfree(buf);
return -EINVAL;
--
2.32.0

2023-04-27 12:27:51

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v2 5/6] scsi:scsi_debug: Return failed value if the error is injected

If a fail queuecommand error is injected, return the failed value defined
in the rule from queuecommand.

We can make any scsi command's queuecommand return the value we desired,
for example make it return SCSI_MLQUEUE_HOST_BUSY.

error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
echo "1 1 0x12 0x1055" > $error

would make all inquiry(0x12) command's queuecommand return 0x1055:

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_debug.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 2d31ae15bd97..340299e63069 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -7738,6 +7738,30 @@ static int sdebug_timeout_cmd(struct scsi_cmnd *cmnd)
return 0;
}

+static int sdebug_fail_queue_cmd(struct scsi_cmnd *cmnd)
+{
+ struct scsi_device *sdp = cmnd->device;
+ struct sdebug_dev_info *devip = (struct sdebug_dev_info *)sdp->hostdata;
+ struct sdebug_err_inject *err;
+ unsigned char *cmd = cmnd->cmnd;
+ int ret = 0;
+
+ if (devip == NULL)
+ return 0;
+
+ list_for_each_entry(err, &devip->inject_err_list, list) {
+ if (err->type == ERR_FAIL_QUEUE_CMD &&
+ (err->cmd == cmd[0] || err->cmd == 0xff)) {
+ ret = err->cnt ? err->queuecmd_ret : 0;
+ if (err->cnt < 0)
+ err->cnt++;
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
static int scsi_debug_queuecommand(struct Scsi_Host *shost,
struct scsi_cmnd *scp)
{
@@ -7757,6 +7781,7 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
u8 opcode = cmd[0];
bool has_wlun_rl;
bool inject_now;
+ int ret = 0;

scsi_set_resid(scp, 0);
if (sdebug_statistics) {
@@ -7802,6 +7827,13 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
return 0;
}

+ ret = sdebug_fail_queue_cmd(scp);
+ if (ret) {
+ scmd_printk(KERN_INFO, scp, "fail queue command 0x%x with 0x%x\n",
+ opcode, ret);
+ return ret;
+ }
+
if (unlikely(inject_now && !atomic_read(&sdeb_inject_pending)))
atomic_set(&sdeb_inject_pending, 1);

--
2.32.0

2023-04-27 12:30:20

by Wenchao Hao

[permalink] [raw]
Subject: [PATCH v2 6/6] scsi:scsi_debug: set command's result and sense data if the error is injected

If a fail commnd error is injected, set the command's status and sense
data then finish this scsi command.

For example, the following command would make read(0x88) command finished
with UNC for 8 times:
error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
echo "2 -8 0x88 0 0 0x2 0x3 0x11 0x0" >$error

Signed-off-by: Wenchao Hao <[email protected]>
---
drivers/scsi/scsi_debug.c | 46 +++++++++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 340299e63069..1e2aab708a8b 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -7762,6 +7762,41 @@ static int sdebug_fail_queue_cmd(struct scsi_cmnd *cmnd)
return 0;
}

+static int sdebug_fail_cmd(struct scsi_cmnd *cmnd, int *retval,
+ struct sdebug_err_inject *info)
+{
+ struct scsi_device *sdp = cmnd->device;
+ struct sdebug_dev_info *devip = (struct sdebug_dev_info *)sdp->hostdata;
+ struct sdebug_err_inject *err;
+ unsigned char *cmd = cmnd->cmnd;
+ int ret = 0;
+ int result;
+
+ if (devip == NULL)
+ return 0;
+
+ list_for_each_entry(err, &devip->inject_err_list, list) {
+ if (err->type == ERR_FAIL_CMD &&
+ (err->cmd == cmd[0] || err->cmd == 0xff)) {
+ if (!err->cnt)
+ return 0;
+ ret = !!err->cnt;
+ goto out_handle;
+ }
+ }
+ return 0;
+
+out_handle:
+ if (err->cnt < 0)
+ err->cnt++;
+ mk_sense_buffer(cmnd, err->sense_key, err->asc, err->asq);
+ result = err->status_byte | err->host_byte << 16 | err->driver_byte << 24;
+ *info = *err;
+ *retval = schedule_resp(cmnd, devip, result, NULL, 0, 0);
+
+ return ret;
+}
+
static int scsi_debug_queuecommand(struct Scsi_Host *shost,
struct scsi_cmnd *scp)
{
@@ -7782,6 +7817,7 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
bool has_wlun_rl;
bool inject_now;
int ret = 0;
+ struct sdebug_err_inject err;

scsi_set_resid(scp, 0);
if (sdebug_statistics) {
@@ -7834,6 +7870,16 @@ static int scsi_debug_queuecommand(struct Scsi_Host *shost,
return ret;
}

+ if (sdebug_fail_cmd(scp, &ret, &err)) {
+ scmd_printk(KERN_INFO, scp,
+ "fail command 0x%x with hostbyte=0x%x, "
+ "driverbyte=0x%x, statusbyte=0x%x, "
+ "sense_key=0x%x, asc=0x%x, asq=0x%x\n",
+ opcode, err.host_byte, err.driver_byte,
+ err.status_byte, err.sense_key, err.asc, err.asq);
+ return ret;
+ }
+
if (unlikely(inject_now && !atomic_read(&sdeb_inject_pending)))
atomic_set(&sdeb_inject_pending, 1);

--
2.32.0

2023-05-02 23:57:09

by Douglas Gilbert

[permalink] [raw]
Subject: Re: [PATCH v2 0/6] scsi:scsi_debug: Add error injection for single device

On 2023-04-27 21:33, Wenchao Hao wrote:
> The original error injection mechanism was based on scsi_host which
> could not inject fault for a single SCSI device.
>
> This patchset provides the ability to inject errors for a single
> SCSI device. Now we supports inject timeout errors, queuecommand
> errors, and hostbyte, driverbyte, statusbyte, and sense data for
> specific SCSI Command.
>
> The first two patch add an debugfs interface to add and inquiry single
> device's error injection info; the third patch defined how to remove
> an injection which has been added. The following 3 patches use the
> injection info and generate the related error type.
>
> V2:
> - Using debugfs rather than sysfs attribute interface to manage error
>
> Wenchao Hao (6):
> scsi:scsi_debug: create scsi_debug directory in the debugfs filesystem
> scsi:scsi_debug: Add interface to manage single device's error inject
> scsi:scsi_debug: Define grammar to remove added error injection
> scsi:scsi_debug: timeout command if the error is injected
> scsi:scsi_debug: Return failed value if the error is injected
> scsi:scsi_debug: set command's result and sense data if the error is
> injected
>
> drivers/scsi/scsi_debug.c | 318 ++++++++++++++++++++++++++++++++++++++
> 1 file changed, 318 insertions(+)

Been playing around with this patchset and it seems to work as expected. Took me
a while to work my way through interface description at the beginning of
[PATCH v2 2/6] scsi:scsi_debug: Add interface to manage single device's error
inject

so I cut and paste it into my scsi_debug.html page and did some work on it, see:
https://doug-gilbert.github.io/scsi_debug.html

There is a new chapter titled: Per device error injection
Kept the ASCII art so it could be ported back to [PATCH v2 2/6]'s description
if Wenchao is agreeable.

So for the whole series:
Acked-by: Douglas Gilbert <[email protected]>


One suggestion for later work: perhaps the Command opcode field could be
expanded to: x8[,x16] so optionally a Service Action (in hex) could be
given (e.g. '9e,10' for the READ CAPACITY (16) command).

Doug Gilbert

2023-05-04 13:30:36

by Wenchao Hao

[permalink] [raw]
Subject: Re: [PATCH v2 0/6] scsi:scsi_debug: Add error injection for single device

On 2023/5/3 7:52, Douglas Gilbert wrote:
> On 2023-04-27 21:33, Wenchao Hao wrote:
>> The original error injection mechanism was based on scsi_host which
>> could not inject fault for a single SCSI device.
>>
>> This patchset provides the ability to inject errors for a single
>> SCSI device. Now we supports inject timeout errors, queuecommand
>> errors, and hostbyte, driverbyte, statusbyte, and sense data for
>> specific SCSI Command.
>>
>> The first two patch add an debugfs interface to add and inquiry single
>> device's error injection info; the third patch defined how to remove
>> an injection which has been added. The following 3 patches use the
>> injection info and generate the related error type.
>>
>> V2:
>>    - Using debugfs rather than sysfs attribute interface to manage error
>>
>> Wenchao Hao (6):
>>    scsi:scsi_debug: create scsi_debug directory in the debugfs filesystem
>>    scsi:scsi_debug: Add interface to manage single device's error inject
>>    scsi:scsi_debug: Define grammar to remove added error injection
>>    scsi:scsi_debug: timeout command if the error is injected
>>    scsi:scsi_debug: Return failed value if the error is injected
>>    scsi:scsi_debug: set command's result and sense data if the error is
>>      injected
>>
>>   drivers/scsi/scsi_debug.c | 318 ++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 318 insertions(+)
>
> Been playing around with this patchset and it seems to work as expected. Took me
> a while to work my way through interface description at the beginning of
>   [PATCH v2 2/6] scsi:scsi_debug: Add interface to manage single device's error inject
>
> so I cut and paste it into my scsi_debug.html page and did some work on it, see:
>    https://doug-gilbert.github.io/scsi_debug.html
>
> There is a new chapter titled: Per device error injection
> Kept the ASCII art so it could be ported back to [PATCH v2 2/6]'s description
> if Wenchao is agreeable.
>

Thank you a lot, I would update the patch's description in next version.

> So for the whole series:
>   Acked-by: Douglas Gilbert <[email protected]>
>
>
> One suggestion for later work: perhaps the Command opcode field could be
> expanded to: x8[,x16] so optionally a Service Action (in hex) could be
> given (e.g. '9e,10' for the READ CAPACITY (16) command).
>
> Doug Gilbert
>
>

Would you help me to check if my understanding is correct:

Define Command opcode as x16, split this 16bit as two parts, one for actually
SCSI Command opcode,, another one for Service Action. If so, it would make this
interface complex to use. I want to make it easy and we do not need to calculate
data.

I think there are other methods to support specify a Service Action:

method1. Redefine the General rule format and append Service Action to SCSI
command opcode as following:

+--------+------+-------------------------------------------------------+
| Column | Type | Description |
+--------+------+-------------------------------------------------------+
| 1 | u8 | Error code |
| | | 0: timeout SCSI command |
| | | 1: fail queuecommand, make queuecommand return |
| | | given value |
| | | 2: fail command, finish command with SCSI status, |
| | | sense key and ASC/ASCQ values |
+--------+------+-------------------------------------------------------+
| 2 | s32 | Error count |
| | | 0: this rule will be ignored |
| | | positive: the rule will always take effect |
| | | negative: the rule takes effect n times where -n is |
| | | the value given. Ignored after n times |
+--------+------+-------------------------------------------------------+
| 3 | x8 | SCSI command opcode, 0xff for all commands |
+--------+------+-------------------------------------------------------+
| 4 | x8 | specify a Service Action, 0xff for all commands |
+--------+------+-------------------------------------------------------+
| ... | xxx | Error type specific fields |
+--------+------+-------------------------------------------------------+

method2. define new Error code for commands which need a Service Action,
for example: define Error code 3 as the following format to timeout a
command commands which need a Service Action:

+--------+------+-------------------------------------------------------+
| Column | Type | Description |
+--------+------+-------------------------------------------------------+
| 1 | u8 | Fix to 3 |
+--------+------+-------------------------------------------------------+
| 2 | s32 | Error count |
| | | 0: this rule will be ignored |
| | | positive: the rule will always take effect |
| | | negative: the rule takes effect n times where -n is |
| | | the value given. Ignored after n times |
+--------+------+-------------------------------------------------------+
| 3 | x8 | SCSI command opcode, 0xff for all commands |
+--------+------+-------------------------------------------------------+
| 4 | x8 | specify a Service Action, 0xff for all commands |
+--------+------+-------------------------------------------------------+

We can inject timeout error for the READ CAPACITY (16) command with following:
echo "3 -10 0x9e 0x10" > ${error}

2023-05-11 12:09:30

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH v2 6/6] scsi:scsi_debug: set command's result and sense data if the error is injected

Hi Wenchao,

kernel test robot noticed the following build warnings:

https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Wenchao-Hao/scsi-scsi_debug-create-scsi_debug-directory-in-the-debugfs-filesystem/20230427-201534
base: https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
patch link: https://lore.kernel.org/r/20230428013320.347050-7-haowenchao2%40huawei.com
patch subject: [PATCH v2 6/6] scsi:scsi_debug: set command's result and sense data if the error is injected
config: mips-randconfig-m041-20230509 (https://download.01.org/0day-ci/archive/20230511/[email protected]/config)
compiler: mips-linux-gcc (GCC) 12.1.0

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Reported-by: Dan Carpenter <[email protected]>
| Link: https://lore.kernel.org/r/[email protected]/

New smatch warnings:
drivers/scsi/scsi_debug.c:7880 scsi_debug_queuecommand() warn: missing error code? 'ret'

Old smatch warnings:
drivers/scsi/scsi_debug.c:5389 scsi_debug_slave_destroy() warn: variable dereferenced before check 'devip' (see line 5388)

vim +/ret +7880 drivers/scsi/scsi_debug.c

70abb3e2434633 Wenchao Hao 2023-04-28 7861 if (sdebug_timeout_cmd(scp)) {
70abb3e2434633 Wenchao Hao 2023-04-28 7862 scmd_printk(KERN_INFO, scp, "timeout command 0x%x\n", opcode);
70abb3e2434633 Wenchao Hao 2023-04-28 7863 return 0;
70abb3e2434633 Wenchao Hao 2023-04-28 7864 }
70abb3e2434633 Wenchao Hao 2023-04-28 7865
b3f5d28c11bee2 Wenchao Hao 2023-04-28 7866 ret = sdebug_fail_queue_cmd(scp);
b3f5d28c11bee2 Wenchao Hao 2023-04-28 7867 if (ret) {
^^^

b3f5d28c11bee2 Wenchao Hao 2023-04-28 7868 scmd_printk(KERN_INFO, scp, "fail queue command 0x%x with 0x%x\n",
b3f5d28c11bee2 Wenchao Hao 2023-04-28 7869 opcode, ret);
b3f5d28c11bee2 Wenchao Hao 2023-04-28 7870 return ret;
b3f5d28c11bee2 Wenchao Hao 2023-04-28 7871 }
b3f5d28c11bee2 Wenchao Hao 2023-04-28 7872
ef1cd466d439a1 Wenchao Hao 2023-04-28 7873 if (sdebug_fail_cmd(scp, &ret, &err)) {
ef1cd466d439a1 Wenchao Hao 2023-04-28 7874 scmd_printk(KERN_INFO, scp,
ef1cd466d439a1 Wenchao Hao 2023-04-28 7875 "fail command 0x%x with hostbyte=0x%x, "
ef1cd466d439a1 Wenchao Hao 2023-04-28 7876 "driverbyte=0x%x, statusbyte=0x%x, "
ef1cd466d439a1 Wenchao Hao 2023-04-28 7877 "sense_key=0x%x, asc=0x%x, asq=0x%x\n",
ef1cd466d439a1 Wenchao Hao 2023-04-28 7878 opcode, err.host_byte, err.driver_byte,
ef1cd466d439a1 Wenchao Hao 2023-04-28 7879 err.status_byte, err.sense_key, err.asc, err.asq);
ef1cd466d439a1 Wenchao Hao 2023-04-28 @7880 return ret;

ret is success.

ef1cd466d439a1 Wenchao Hao 2023-04-28 7881 }
ef1cd466d439a1 Wenchao Hao 2023-04-28 7882
3a90a63d02b8b7 Douglas Gilbert 2020-07-12 7883 if (unlikely(inject_now && !atomic_read(&sdeb_inject_pending)))
3a90a63d02b8b7 Douglas Gilbert 2020-07-12 7884 atomic_set(&sdeb_inject_pending, 1);
3a90a63d02b8b7 Douglas Gilbert 2020-07-12 7885
c2248fc974df7b Douglas Gilbert 2014-11-24 7886 na = oip->num_attached;
c2248fc974df7b Douglas Gilbert 2014-11-24 7887 r_pfp = oip->pfp;

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


2023-05-11 12:11:38

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] scsi:scsi_debug: Add interface to manage single device's error inject

Hi Wenchao,

kernel test robot noticed the following build warnings:

https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Wenchao-Hao/scsi-scsi_debug-create-scsi_debug-directory-in-the-debugfs-filesystem/20230427-201534
base: https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
patch link: https://lore.kernel.org/r/20230428013320.347050-3-haowenchao2%40huawei.com
patch subject: [PATCH v2 2/6] scsi:scsi_debug: Add interface to manage single device's error inject
config: mips-randconfig-m041-20230509 (https://download.01.org/0day-ci/archive/20230511/[email protected]/config)
compiler: mips-linux-gcc (GCC) 12.1.0

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Reported-by: Dan Carpenter <[email protected]>
| Link: https://lore.kernel.org/r/[email protected]/

smatch warnings:
drivers/scsi/scsi_debug.c:5359 scsi_debug_slave_destroy() warn: variable dereferenced before check 'devip' (see line 5358)

vim +/devip +5359 drivers/scsi/scsi_debug.c

8dea0d02f8bb71 FUJITA Tomonori 2008-03-30 5349 static void scsi_debug_slave_destroy(struct scsi_device *sdp)
8dea0d02f8bb71 FUJITA Tomonori 2008-03-30 5350 {
8dea0d02f8bb71 FUJITA Tomonori 2008-03-30 5351 struct sdebug_dev_info *devip =
8dea0d02f8bb71 FUJITA Tomonori 2008-03-30 5352 (struct sdebug_dev_info *)sdp->hostdata;
a34c4e98367965 FUJITA Tomonori 2008-03-25 5353
773642d95b8220 Douglas Gilbert 2016-04-25 5354 if (sdebug_verbose)
c1287970f4847a Tomas Winkler 2015-07-28 5355 pr_info("slave_destroy <%u %u %u %llu>\n",
8dea0d02f8bb71 FUJITA Tomonori 2008-03-30 5356 sdp->host->host_no, sdp->channel, sdp->id, sdp->lun);
46ab9018f5b07d Wenchao Hao 2023-04-28 5357
46ab9018f5b07d Wenchao Hao 2023-04-28 @5358 debugfs_remove(devip->debugfs_entry);
^^^^^^^^^^^^^^^^^^^^
Dereference

8dea0d02f8bb71 FUJITA Tomonori 2008-03-30 @5359 if (devip) {
^^^^^
Checked too late.

25985edcedea63 Lucas De Marchi 2011-03-30 5360 /* make this slot available for re-use */
c2248fc974df7b Douglas Gilbert 2014-11-24 5361 devip->used = false;
8dea0d02f8bb71 FUJITA Tomonori 2008-03-30 5362 sdp->hostdata = NULL;
8dea0d02f8bb71 FUJITA Tomonori 2008-03-30 5363 }
8dea0d02f8bb71 FUJITA Tomonori 2008-03-30 5364 }

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests