2023-03-13 09:31:50

by John Garry

[permalink] [raw]
Subject: [PATCH RESEND v2 00/11] scsi_debug: Some minor improvements

This series contains a bunch of minor improvements to the driver. I have
another bunch waiting with more major changes.

Most of the changes are quite straightforward, and the only patches of note are
as follows:
- Fix the command abort feature, enabled with host option SDEBUG_OPT_CMD_ABORT
- Drop driver count of queued commands per device
- Add poll mode completions to statistics. We already have poll mode callback
call count, so maybe it was intentional to omit poll mode from the
statistics.

Difference to v1:
- rename function to get sdebug host from shost (Bart)

Based on scsi-staging 6.4 @ commit 0b31b77f281a ("Merge patch series "PCI/AER: ...")

John Garry (11):
scsi: scsi_debug: Don't hold driver host struct pointer in
host->hostdata[]
scsi: scsi_debug: Stop setting devip->sdbg_host twice
scsi: scsi_debug: Drop scsi_debug_abort() NULL pointer checks
scsi: scsi_debug: Drop scsi_debug_device_reset() NULL pointer checks
scsi: scsi_debug: Drop scsi_debug_target_reset() NULL pointer checks
scsi: scsi_debug: Drop scsi_debug_bus_reset() NULL pointer checks
scsi: scsi_debug: Drop scsi_debug_host_reset() device NULL pointer
check
scsi: scsi_debug: Drop check for num_in_q exceeding queue depth
scsi: scsi_debug: Drop sdebug_dev_info.num_in_q
scsi: scsi_debug: Get command abort feature working again
scsi: scsi_debug: Add poll mode deferred completions to statistics

drivers/scsi/scsi_debug.c | 209 ++++++++++++++------------------------
1 file changed, 76 insertions(+), 133 deletions(-)

--
2.35.3



2023-03-13 09:31:54

by John Garry

[permalink] [raw]
Subject: [PATCH RESEND v2 03/11] scsi: scsi_debug: Drop scsi_debug_abort() NULL pointer checks

The SCSI cmnd pointer arg would never be NULL, so drop the check. In
addition, its SCSI device pointer would never be NULL.

The only caller is scsi_send_eh_cmnd() -> scsi_abort_eh_cmnd() ->
scsi_try_to_abort_cmd() -> scsi_try_to_abort_cmd(), and in the origin of
that chain those pointers cannot be NULL.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/scsi_debug.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 4c60a055610a..2c2a41b99641 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -5360,13 +5360,13 @@ static int scsi_debug_abort(struct scsi_cmnd *SCpnt)
bool ok;

++num_aborts;
- if (SCpnt) {
- ok = stop_queued_cmnd(SCpnt);
- if (SCpnt->device && (SDEBUG_OPT_ALL_NOISE & sdebug_opts))
- sdev_printk(KERN_INFO, SCpnt->device,
- "%s: command%s found\n", __func__,
- ok ? "" : " not");
- }
+
+ ok = stop_queued_cmnd(SCpnt);
+ if (SDEBUG_OPT_ALL_NOISE & sdebug_opts)
+ sdev_printk(KERN_INFO, SCpnt->device,
+ "%s: command%s found\n", __func__,
+ ok ? "" : " not");
+
return SUCCESS;
}

--
2.35.3


2023-03-13 09:31:59

by John Garry

[permalink] [raw]
Subject: [PATCH RESEND v2 04/11] scsi: scsi_debug: Drop scsi_debug_device_reset() NULL pointer checks

The SCSI cmnd pointer arg would never be NULL, so drop the check. In
addition, its SCSI device pointer would never be NULL (so drop that check
also).

The only caller is scsi_try_bus_device_reset(), and the command and its
device pointer could not be NULL when calling eh_device_reset_handler()
there.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/scsi_debug.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 2c2a41b99641..5b51c24f7d09 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -5372,17 +5372,16 @@ static int scsi_debug_abort(struct scsi_cmnd *SCpnt)

static int scsi_debug_device_reset(struct scsi_cmnd *SCpnt)
{
+ struct scsi_device *sdp = SCpnt->device;
+ struct sdebug_dev_info *devip = sdp->hostdata;
+
++num_dev_resets;
- if (SCpnt && SCpnt->device) {
- struct scsi_device *sdp = SCpnt->device;
- struct sdebug_dev_info *devip =
- (struct sdebug_dev_info *)sdp->hostdata;

- if (SDEBUG_OPT_ALL_NOISE & sdebug_opts)
- sdev_printk(KERN_INFO, sdp, "%s\n", __func__);
- if (devip)
- set_bit(SDEBUG_UA_POR, devip->uas_bm);
- }
+ if (SDEBUG_OPT_ALL_NOISE & sdebug_opts)
+ sdev_printk(KERN_INFO, sdp, "%s\n", __func__);
+ if (devip)
+ set_bit(SDEBUG_UA_POR, devip->uas_bm);
+
return SUCCESS;
}

--
2.35.3


2023-03-13 09:32:29

by John Garry

[permalink] [raw]
Subject: [PATCH RESEND v2 05/11] scsi: scsi_debug: Drop scsi_debug_target_reset() NULL pointer checks

The checks for SCSI cmnd, SCSI device, and SCSI host are unnecessary, so
drop them. Likewise, drop the NULL check for sdbg_host.

The only caller is scsi_try_target_reset() -> eh_target_reset_handler(),
and there those pointers cannot be NULL.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/scsi_debug.c | 31 ++++++++++---------------------
1 file changed, 10 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 5b51c24f7d09..6364d6f08861 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -5387,37 +5387,26 @@ static int scsi_debug_device_reset(struct scsi_cmnd *SCpnt)

static int scsi_debug_target_reset(struct scsi_cmnd *SCpnt)
{
- struct sdebug_host_info *sdbg_host;
+ struct scsi_device *sdp = SCpnt->device;
+ struct sdebug_host_info *sdbg_host = shost_to_sdebug_host(sdp->host);
struct sdebug_dev_info *devip;
- struct scsi_device *sdp;
- struct Scsi_Host *hp;
int k = 0;

++num_target_resets;
- if (!SCpnt)
- goto lie;
- sdp = SCpnt->device;
- if (!sdp)
- goto lie;
if (SDEBUG_OPT_ALL_NOISE & sdebug_opts)
sdev_printk(KERN_INFO, sdp, "%s\n", __func__);
- hp = sdp->host;
- if (!hp)
- goto lie;
- sdbg_host = shost_to_sdebug_host(hp);
- if (sdbg_host) {
- list_for_each_entry(devip,
- &sdbg_host->dev_info_list,
- dev_list)
- if (devip->target == sdp->id) {
- set_bit(SDEBUG_UA_BUS_RESET, devip->uas_bm);
- ++k;
- }
+
+ list_for_each_entry(devip, &sdbg_host->dev_info_list, dev_list) {
+ if (devip->target == sdp->id) {
+ set_bit(SDEBUG_UA_BUS_RESET, devip->uas_bm);
+ ++k;
+ }
}
+
if (SDEBUG_OPT_RESET_NOISE & sdebug_opts)
sdev_printk(KERN_INFO, sdp,
"%s: %d device(s) found in target\n", __func__, k);
-lie:
+
return SUCCESS;
}

--
2.35.3


2023-03-13 09:32:52

by John Garry

[permalink] [raw]
Subject: [PATCH RESEND v2 06/11] scsi: scsi_debug: Drop scsi_debug_bus_reset() NULL pointer checks

The checks for SCSI cmnd, SCSI device, and SCSI host are unnecessary, so
drop them. Likewise, drop the NULL check for sdbg_host.

The only caller is scsi_try_bus_reset() -> eh_bus_reset_handler(),
and there those pointers cannot be NULL.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/scsi_debug.c | 26 ++++++++------------------
1 file changed, 8 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 6364d6f08861..749358b48335 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -5412,34 +5412,24 @@ static int scsi_debug_target_reset(struct scsi_cmnd *SCpnt)

static int scsi_debug_bus_reset(struct scsi_cmnd *SCpnt)
{
- struct sdebug_host_info *sdbg_host;
+ struct scsi_device *sdp = SCpnt->device;
+ struct sdebug_host_info *sdbg_host = shost_to_sdebug_host(sdp->host);
struct sdebug_dev_info *devip;
- struct scsi_device *sdp;
- struct Scsi_Host *hp;
int k = 0;

++num_bus_resets;
- if (!(SCpnt && SCpnt->device))
- goto lie;
- sdp = SCpnt->device;
+
if (SDEBUG_OPT_ALL_NOISE & sdebug_opts)
sdev_printk(KERN_INFO, sdp, "%s\n", __func__);
- hp = sdp->host;
- if (hp) {
- sdbg_host = shost_to_sdebug_host(hp);
- if (sdbg_host) {
- list_for_each_entry(devip,
- &sdbg_host->dev_info_list,
- dev_list) {
- set_bit(SDEBUG_UA_BUS_RESET, devip->uas_bm);
- ++k;
- }
- }
+
+ list_for_each_entry(devip, &sdbg_host->dev_info_list, dev_list) {
+ set_bit(SDEBUG_UA_BUS_RESET, devip->uas_bm);
+ ++k;
}
+
if (SDEBUG_OPT_RESET_NOISE & sdebug_opts)
sdev_printk(KERN_INFO, sdp,
"%s: %d device(s) found in host\n", __func__, k);
-lie:
return SUCCESS;
}

--
2.35.3


2023-03-13 09:33:27

by John Garry

[permalink] [raw]
Subject: [PATCH RESEND v2 07/11] scsi: scsi_debug: Drop scsi_debug_host_reset() device NULL pointer check

The check for device pointer for the SCSI command is unnecessary, so drop
it.

The only caller is scsi_try_host_reset() -> eh_host_reset_handler(),
and there that pointer cannot be NULL.

Indeed, there is already code later in the same function which does not
check the device pointer for the SCSI command.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/scsi_debug.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 749358b48335..47820b9f6326 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -5440,7 +5440,7 @@ static int scsi_debug_host_reset(struct scsi_cmnd *SCpnt)
int k = 0;

++num_host_resets;
- if ((SCpnt->device) && (SDEBUG_OPT_ALL_NOISE & sdebug_opts))
+ if (SDEBUG_OPT_ALL_NOISE & sdebug_opts)
sdev_printk(KERN_INFO, SCpnt->device, "%s\n", __func__);
spin_lock(&sdebug_host_list_lock);
list_for_each_entry(sdbg_host, &sdebug_host_list, host_list) {
--
2.35.3


2023-03-13 09:33:45

by John Garry

[permalink] [raw]
Subject: [PATCH RESEND v2 08/11] scsi: scsi_debug: Drop check for num_in_q exceeding queue depth

The per-device num_in_q value cannot exceed the device queue depth, so drop
the check.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/scsi_debug.c | 11 ++---------
1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 47820b9f6326..0d515bac93bf 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -5593,15 +5593,8 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
}
num_in_q = atomic_read(&devip->num_in_q);
qdepth = cmnd->device->queue_depth;
- if (unlikely((qdepth > 0) && (num_in_q >= qdepth))) {
- if (scsi_result) {
- spin_unlock_irqrestore(&sqp->qc_lock, iflags);
- goto respond_in_thread;
- } else
- scsi_result = device_qfull_result;
- } else if (unlikely(sdebug_every_nth &&
- (SDEBUG_OPT_RARE_TSF & sdebug_opts) &&
- (scsi_result == 0))) {
+ if (unlikely(sdebug_every_nth && (SDEBUG_OPT_RARE_TSF & sdebug_opts) &&
+ (scsi_result == 0))) {
if ((num_in_q == (qdepth - 1)) &&
(atomic_inc_return(&sdebug_a_tsf) >=
abs(sdebug_every_nth))) {
--
2.35.3


2023-03-13 09:34:15

by John Garry

[permalink] [raw]
Subject: [PATCH RESEND v2 09/11] scsi: scsi_debug: Drop sdebug_dev_info.num_in_q

In schedule_resp(), under certain conditions we check whether the
per-device queue is full (num_in_q == queue depth - 1) and we may inject a
"task set full" (TSF) error if it is.

However how we read num_in_q is racy - many threads may see the same
"queue is full" value (and also issue a TSF).

There is per-queue locking in reading per-device num_in_q, but that would
not help.

Replace how we read num_in_q at this location with a call to
scsi_device_busy(). Calling scsi_device_busy() is likewise racy (as reading
num_in_q), so nothing lost or gained. Calling scsi_device_busy() is also
slow as it needs to read all bits in the per-device budget bitmap, but we
can live with that since we're just a simulator and it's only under
a certain configs which we would see this.

Also move the "task set full" print earlier as it would only be called
now under this condition. However, previously it may not have been
called - like returning early - but keep it simple and always call it.

At this point we can drop sdebug_dev_info.num_in_q - it is difficult to
maintain properly and adds extra normal case command processing.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/scsi_debug.c | 63 ++++++++++-----------------------------
1 file changed, 16 insertions(+), 47 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 0d515bac93bf..449b460e4c1b 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -288,7 +288,6 @@ struct sdebug_dev_info {
uuid_t lu_name;
struct sdebug_host_info *sdbg_host;
unsigned long uas_bm[1];
- atomic_t num_in_q;
atomic_t stopped; /* 1: by SSU, 2: device start */
bool used;

@@ -4931,7 +4930,6 @@ static void sdebug_q_cmd_complete(struct sdebug_defer *sd_dp)
struct sdebug_queue *sqp;
struct sdebug_queued_cmd *sqcp;
struct scsi_cmnd *scp;
- struct sdebug_dev_info *devip;

if (unlikely(aborted))
sd_dp->aborted = false;
@@ -4956,11 +4954,7 @@ static void sdebug_q_cmd_complete(struct sdebug_defer *sd_dp)
sd_dp->sqa_idx, qc_idx, sd_dp->hc_idx);
return;
}
- devip = (struct sdebug_dev_info *)scp->device->hostdata;
- if (likely(devip))
- atomic_dec(&devip->num_in_q);
- else
- pr_err("devip=NULL\n");
+
if (unlikely(atomic_read(&retired_max_queue) > 0))
retiring = 1;

@@ -5192,7 +5186,6 @@ static struct sdebug_dev_info *find_build_dev_info(struct scsi_device *sdev)
open_devip->target = sdev->id;
open_devip->lun = sdev->lun;
open_devip->sdbg_host = sdbg_host;
- atomic_set(&open_devip->num_in_q, 0);
set_bit(SDEBUG_UA_POOCCUR, open_devip->uas_bm);
open_devip->used = true;
return open_devip;
@@ -5263,7 +5256,6 @@ static bool stop_queued_cmnd(struct scsi_cmnd *cmnd)
enum sdeb_defer_type l_defer_t;
struct sdebug_queue *sqp;
struct sdebug_queued_cmd *sqcp;
- struct sdebug_dev_info *devip;
struct sdebug_defer *sd_dp;

for (j = 0, sqp = sdebug_q_arr; j < submit_queues; ++j, ++sqp) {
@@ -5278,10 +5270,6 @@ static bool stop_queued_cmnd(struct scsi_cmnd *cmnd)
if (cmnd != sqcp->a_cmnd)
continue;
/* found */
- devip = (struct sdebug_dev_info *)
- cmnd->device->hostdata;
- if (devip)
- atomic_dec(&devip->num_in_q);
sqcp->a_cmnd = NULL;
sd_dp = sqcp->sd_dp;
if (sd_dp) {
@@ -5308,7 +5296,6 @@ static void stop_all_queued(void)
enum sdeb_defer_type l_defer_t;
struct sdebug_queue *sqp;
struct sdebug_queued_cmd *sqcp;
- struct sdebug_dev_info *devip;
struct sdebug_defer *sd_dp;

for (j = 0, sqp = sdebug_q_arr; j < submit_queues; ++j, ++sqp) {
@@ -5318,10 +5305,6 @@ static void stop_all_queued(void)
sqcp = &sqp->qc_arr[k];
if (sqcp->a_cmnd == NULL)
continue;
- devip = (struct sdebug_dev_info *)
- sqcp->a_cmnd->device->hostdata;
- if (devip)
- atomic_dec(&devip->num_in_q);
sqcp->a_cmnd = NULL;
sd_dp = sqcp->sd_dp;
if (sd_dp) {
@@ -5565,9 +5548,8 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
int delta_jiff, int ndelay)
{
bool new_sd_dp;
- bool inject = false;
bool polled = scsi_cmd_to_rq(cmnd)->cmd_flags & REQ_POLLED;
- int k, num_in_q, qdepth;
+ int k;
unsigned long iflags;
u64 ns_from_boot = 0;
struct sdebug_queue *sqp;
@@ -5591,16 +5573,21 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
spin_unlock_irqrestore(&sqp->qc_lock, iflags);
return SCSI_MLQUEUE_HOST_BUSY;
}
- num_in_q = atomic_read(&devip->num_in_q);
- qdepth = cmnd->device->queue_depth;
+
if (unlikely(sdebug_every_nth && (SDEBUG_OPT_RARE_TSF & sdebug_opts) &&
(scsi_result == 0))) {
+ int num_in_q = scsi_device_busy(sdp);
+ int qdepth = cmnd->device->queue_depth;
+
if ((num_in_q == (qdepth - 1)) &&
(atomic_inc_return(&sdebug_a_tsf) >=
abs(sdebug_every_nth))) {
atomic_set(&sdebug_a_tsf, 0);
- inject = true;
scsi_result = device_qfull_result;
+
+ if (unlikely(SDEBUG_OPT_Q_NOISE & sdebug_opts))
+ sdev_printk(KERN_INFO, sdp, "%s: num_in_q=%d +1, <inject> status: TASK SET FULL\n",
+ __func__, num_in_q);
}
}

@@ -5616,7 +5603,6 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
goto respond_in_thread;
}
set_bit(k, sqp->in_use_bm);
- atomic_inc(&devip->num_in_q);
sqcp = &sqp->qc_arr[k];
sqcp->a_cmnd = cmnd;
cmnd->host_scribble = (unsigned char *)sqcp;
@@ -5626,7 +5612,6 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
if (!sd_dp) {
sd_dp = kzalloc(sizeof(*sd_dp), GFP_ATOMIC);
if (!sd_dp) {
- atomic_dec(&devip->num_in_q);
clear_bit(k, sqp->in_use_bm);
return SCSI_MLQUEUE_HOST_BUSY;
}
@@ -5686,7 +5671,6 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
if (kt <= d) { /* elapsed duration >= kt */
spin_lock_irqsave(&sqp->qc_lock, iflags);
sqcp->a_cmnd = NULL;
- atomic_dec(&devip->num_in_q);
clear_bit(k, sqp->in_use_bm);
spin_unlock_irqrestore(&sqp->qc_lock, iflags);
if (new_sd_dp)
@@ -5762,9 +5746,7 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
sd_dp->aborted = false;
}
}
- if (unlikely((SDEBUG_OPT_Q_NOISE & sdebug_opts) && scsi_result == device_qfull_result))
- sdev_printk(KERN_INFO, sdp, "%s: num_in_q=%d +1, %s%s\n", __func__,
- num_in_q, (inject ? "<inject> " : ""), "status: TASK SET FULL");
+
return 0;

respond_in_thread: /* call back to mid-layer using invocation thread */
@@ -7369,17 +7351,12 @@ static void sdebug_do_remove_host(bool the_end)

static int sdebug_change_qdepth(struct scsi_device *sdev, int qdepth)
{
- int num_in_q = 0;
- struct sdebug_dev_info *devip;
+ struct sdebug_dev_info *devip = sdev->hostdata;

- block_unblock_all_queues(true);
- devip = (struct sdebug_dev_info *)sdev->hostdata;
- if (NULL == devip) {
- block_unblock_all_queues(false);
+ if (!devip)
return -ENODEV;
- }
- num_in_q = atomic_read(&devip->num_in_q);

+ block_unblock_all_queues(true);
if (qdepth > SDEBUG_CANQUEUE) {
qdepth = SDEBUG_CANQUEUE;
pr_warn("%s: requested qdepth [%d] exceeds canqueue [%d], trim\n", __func__,
@@ -7390,10 +7367,8 @@ static int sdebug_change_qdepth(struct scsi_device *sdev, int qdepth)
if (qdepth != sdev->queue_depth)
scsi_change_queue_depth(sdev, qdepth);

- if (SDEBUG_OPT_Q_NOISE & sdebug_opts) {
- sdev_printk(KERN_INFO, sdev, "%s: qdepth=%d, num_in_q=%d\n",
- __func__, qdepth, num_in_q);
- }
+ if (SDEBUG_OPT_Q_NOISE & sdebug_opts)
+ sdev_printk(KERN_INFO, sdev, "%s: qdepth=%d\n", __func__, qdepth);
block_unblock_all_queues(false);
return sdev->queue_depth;
}
@@ -7495,7 +7470,6 @@ static int sdebug_blk_mq_poll(struct Scsi_Host *shost, unsigned int queue_num)
struct sdebug_queue *sqp;
struct sdebug_queued_cmd *sqcp;
struct scsi_cmnd *scp;
- struct sdebug_dev_info *devip;
struct sdebug_defer *sd_dp;

sqp = sdebug_q_arr + queue_num;
@@ -7533,11 +7507,6 @@ static int sdebug_blk_mq_poll(struct Scsi_Host *shost, unsigned int queue_num)

} else /* ignoring non REQ_POLLED requests */
continue;
- devip = (struct sdebug_dev_info *)scp->device->hostdata;
- if (likely(devip))
- atomic_dec(&devip->num_in_q);
- else
- pr_err("devip=NULL from %s\n", __func__);
if (unlikely(atomic_read(&retired_max_queue) > 0))
retiring = true;

--
2.35.3


2023-03-13 09:34:30

by John Garry

[permalink] [raw]
Subject: [PATCH RESEND v2 10/11] scsi: scsi_debug: Get command abort feature working again

The command abort feature allows us to test aborting a command which has
timed-out.

The idea is that for specific commands we just don't call scsi_done() and
allow the request to timeout, which ensures SCSI EH kicks-in we try to
abort the command.

Since commit 4a0c6f432d15 ("scsi: scsi_debug: Add new defer type for
mq_poll") this does not seem to work. The issue is that we clear the
sd_dp->aborted flag in schedule_resp() before the completion callback has
run. When the completion callback actually runs, it calls scsi_done() as
normal as sd_dp->aborted unset. This is all very racy.

Fix by not clearing sd_dp->aborted in schedule_resp(). Also move the call
to blk_abort_request() from schedule_resp() to sdebug_q_cmd_complete(),
which makes the code have a more logical sequence.

I also note that this feature only works for commands which are classed
as "SDEG_RES_IMMED_MASK", but only practically triggered with prior RW
commands. So for my experiment I need to run fio to trigger the error on
the "nth" command (see inject_on_this_cmd()), and then run something like
sg_sync to queue a command to actually trigger the abort.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/scsi_debug.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 449b460e4c1b..1463e54179bf 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -4983,7 +4983,8 @@ static void sdebug_q_cmd_complete(struct sdebug_defer *sd_dp)
spin_unlock_irqrestore(&sqp->qc_lock, iflags);
if (unlikely(aborted)) {
if (sdebug_verbose)
- pr_info("bypassing scsi_done() due to aborted cmd\n");
+ pr_info("bypassing scsi_done() due to aborted cmd, kicking-off EH\n");
+ blk_abort_request(scsi_cmd_to_rq(scp));
return;
}
scsi_done(scp); /* callback to mid level */
@@ -5712,8 +5713,13 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
sd_dp->issuing_cpu = raw_smp_processor_id();
} else { /* jdelay < 0, use work queue */
if (unlikely((sdebug_opts & SDEBUG_OPT_CMD_ABORT) &&
- atomic_read(&sdeb_inject_pending)))
+ atomic_read(&sdeb_inject_pending))) {
sd_dp->aborted = true;
+ atomic_set(&sdeb_inject_pending, 0);
+ sdev_printk(KERN_INFO, sdp, "abort request tag=%#x\n",
+ blk_mq_unique_tag_to_tag(get_tag(cmnd)));
+ }
+
if (polled) {
sd_dp->cmpl_ts = ns_to_ktime(ns_from_boot);
spin_lock_irqsave(&sqp->qc_lock, iflags);
@@ -5738,13 +5744,6 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
}
if (sdebug_statistics)
sd_dp->issuing_cpu = raw_smp_processor_id();
- if (unlikely(sd_dp->aborted)) {
- sdev_printk(KERN_INFO, sdp, "abort request tag %d\n",
- scsi_cmd_to_rq(cmnd)->tag);
- blk_abort_request(scsi_cmd_to_rq(cmnd));
- atomic_set(&sdeb_inject_pending, 0);
- sd_dp->aborted = false;
- }
}

return 0;
--
2.35.3


2023-03-13 09:34:37

by John Garry

[permalink] [raw]
Subject: [PATCH RESEND v2 11/11] scsi: scsi_debug: Add poll mode deferred completions to statistics

Currently commands completed via poll mode are not included in the
statistics gathering for deferred completions and missed CPUs.

Poll mode completions should be treated the same as other deferred
completion types, so add poll mode completions to the statistics.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/scsi_debug.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 1463e54179bf..073fc02f9fed 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -7531,6 +7531,13 @@ static int sdebug_blk_mq_poll(struct Scsi_Host *shost, unsigned int queue_num)
}
WRITE_ONCE(sd_dp->defer_t, SDEB_DEFER_NONE);
spin_unlock_irqrestore(&sqp->qc_lock, iflags);
+
+ if (sdebug_statistics) {
+ atomic_inc(&sdebug_completions);
+ if (raw_smp_processor_id() != sd_dp->issuing_cpu)
+ atomic_inc(&sdebug_miss_cpus);
+ }
+
scsi_done(scp); /* callback to mid level */
num_entries++;
spin_lock_irqsave(&sqp->qc_lock, iflags);
--
2.35.3


2023-03-17 03:27:57

by Martin K. Petersen

[permalink] [raw]
Subject: Re: [PATCH RESEND v2 00/11] scsi_debug: Some minor improvements


John,

> This series contains a bunch of minor improvements to the driver. I
> have another bunch waiting with more major changes.

Applied to 6.4/scsi-staging, thanks!

--
Martin K. Petersen Oracle Linux Engineering

2023-03-20 05:34:28

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH RESEND v2 09/11] scsi: scsi_debug: Drop sdebug_dev_info.num_in_q


Greeting,

FYI, we noticed blktests.scsi/004.fail due to commit (built with gcc-11):

commit: 76fe86662c283bf322ac1e032c46ddaa0da5cb16 ("[PATCH RESEND v2 09/11] scsi: scsi_debug: Drop sdebug_dev_info.num_in_q")
url: https://github.com/intel-lab-lkp/linux/commits/John-Garry/scsi-scsi_debug-Don-t-hold-driver-host-struct-pointer-in-host-hostdata/20230313-173528
base: https://git.kernel.org/cgit/linux/kernel/git/mkp/scsi.git for-next
patch link: https://lore.kernel.org/all/[email protected]/
patch subject: [PATCH RESEND v2 09/11] scsi: scsi_debug: Drop sdebug_dev_info.num_in_q

in testcase: blktests
version: blktests-x86_64-e3f2ca5-1_20230228
with following parameters:

disk: 1HDD
test: scsi-group-02



on test machine: 16 threads 1 sockets Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz (Broadwell-DE) with 48G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):




If you fix the issue, kindly add following tag
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-lkp/[email protected]


2023-03-18 00:18:58 sed "s:^:scsi/:" /lkp/benchmarks/blktests/tests/scsi-group-02
2023-03-18 00:18:58 ./check scsi/004 scsi/005
scsi/004 (ensure repeated TASK SET FULL results in EIO on timing out command)
scsi/004 (ensure repeated TASK SET FULL results in EIO on timing out command) [failed]
runtime ... 1.467s
--- tests/scsi/004.out 2023-02-28 16:52:49.000000000 +0000
+++ /lkp/benchmarks/blktests/results/nodev/scsi/004.out.bad 2023-03-18 00:19:00.232079954 +0000
@@ -1,3 +1,2 @@
Running scsi/004
-Input/output error
Test complete



To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests



Attachments:
(No filename) (2.16 kB)
config-6.3.0-rc1-00044-g76fe86662c28 (166.32 kB)
job-script (5.86 kB)
dmesg.xz (40.50 kB)
blktests (726.00 B)
job.yaml (4.96 kB)
reproduce (88.00 B)
Download all attachments

2023-03-20 11:42:19

by John Garry

[permalink] [raw]
Subject: Re: [PATCH RESEND v2 09/11] scsi: scsi_debug: Drop sdebug_dev_info.num_in_q

On 20/03/2023 05:31, kernel test robot wrote:
>
> Greeting,
>
> FYI, we noticed blktests.scsi/004.fail due to commit (built with gcc-11):
>
> commit: 76fe86662c283bf322ac1e032c46ddaa0da5cb16 ("[PATCH RESEND v2 09/11] scsi: scsi_debug: Drop sdebug_dev_info.num_in_q")
> url: https://github.com/intel-lab-lkp/linux/commits/John-Garry/scsi-scsi_debug-Don-t-hold-driver-host-struct-pointer-in-host-hostdata/20230313-173528
> base: https://git.kernel.org/cgit/linux/kernel/git/mkp/scsi.git for-next
> patch link: https://lore.kernel.org/all/[email protected]/
> patch subject: [PATCH RESEND v2 09/11] scsi: scsi_debug: Drop sdebug_dev_info.num_in_q
>
> in testcase: blktests
> version: blktests-x86_64-e3f2ca5-1_20230228
> with following parameters:
>
> disk: 1HDD
> test: scsi-group-02
>
>
>
> on test machine: 16 threads 1 sockets Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz (Broadwell-DE) with 48G memory
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
>
>
> If you fix the issue, kindly add following tag
> | Reported-by: kernel test robot <[email protected]>
> | Link: https://lore.kernel.org/oe-lkp/[email protected]
>
>
> 2023-03-18 00:18:58 sed "s:^:scsi/:" /lkp/benchmarks/blktests/tests/scsi-group-02
> 2023-03-18 00:18:58 ./check scsi/004 scsi/005
> scsi/004 (ensure repeated TASK SET FULL results in EIO on timing out command)
> scsi/004 (ensure repeated TASK SET FULL results in EIO on timing out command) [failed]
> runtime ... 1.467s
> --- tests/scsi/004.out 2023-02-28 16:52:49.000000000 +0000
> +++ /lkp/benchmarks/blktests/results/nodev/scsi/004.out.bad 2023-03-18 00:19:00.232079954 +0000
> @@ -1,3 +1,2 @@
> Running scsi/004
> -Input/output error
> Test complete
>

I'll check this now.

>
>
> To reproduce:
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> sudo bin/lkp install job.yaml # job file is attached in this email
> bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
> sudo bin/lkp run generated-yaml-file
>
> # if come across any failure that blocks the test,
> # please remove ~/.lkp and /lkp dir to run from a clean state.
>
>
>


2023-03-22 13:37:57

by John Garry

[permalink] [raw]
Subject: Re: [PATCH RESEND v2 09/11] scsi: scsi_debug: Drop sdebug_dev_info.num_in_q

On 20/03/2023 11:41, John Garry wrote:
>> scsi/004 (ensure repeated TASK SET FULL results in EIO on timing out
>> command) [failed]
>>      runtime    ...  1.467s
>>      --- tests/scsi/004.out    2023-02-28 16:52:49.000000000 +0000
>>      +++ /lkp/benchmarks/blktests/results/nodev/scsi/004.out.bad
>> 2023-03-18 00:19:00.232079954 +0000
>>      @@ -1,3 +1,2 @@
>>       Running scsi/004
>>      -Input/output error
>>       Test complete
>>
>
> I'll check this now.

This should fix it:

--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -5580,7 +5580,7 @@ static int schedule_resp(struct scsi_cmnd *cmnd,
struct sdebug_dev_info *devip,
int num_in_q = scsi_device_busy(sdp);
int qdepth = cmnd->device->queue_depth;

- if ((num_in_q == (qdepth - 1)) &&
+ if ((num_in_q == qdepth) &&
(atomic_inc_return(&sdebug_a_tsf) >=


I'll send that change in my other scsi_debug series after I test further.