2021-12-15 14:43:03

by John Garry

[permalink] [raw]
Subject: [PATCH 0/8] hisi_sas: Some misc patches

This is a small series of misc patches. Please consider for 5.17.

Briefly the series covers:
- Factoring out the common and internal abort delivery code
- Fix some races with controller reset (again!)
- Fix out-of-order interrupt handling on FPGA

I also included a pretty straightforward libsas tidy-up.

This series conflicts with [0], so whoever gets accepted second
needs to rebase.

[0] https://lore.kernel.org/linux-scsi/[email protected]/T/#m3e21275f70f62bb123cb56f1db2cc25334d5117f

Thanks in advance!

John Garry (5):
scsi: hisi_sas: Start delivery hisi_sas_task_exec() directly
scsi: hisi_sas: Make internal abort have no task proto
scsi: hisi_sas: Pass abort structure for internal abort
scsi: hisi_sas: Factor out task prep and delivery code
scsi: libsas: Decode SAM status and host byte codes

Qi Liu (3):
scsi: hisi_sas: Prevent parallel controller reset and control phy
command
scsi: hisi_sas: Prevent parallel FLR and controller reset
scsi: hisi_sas: Fix phyup timeout on FPGA

drivers/scsi/hisi_sas/hisi_sas.h | 5 +
drivers/scsi/hisi_sas/hisi_sas_main.c | 339 ++++++++++++-------------
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 11 +-
drivers/scsi/libsas/sas_scsi_host.c | 7 +-
4 files changed, 176 insertions(+), 186 deletions(-)

--
2.26.2



2021-12-15 14:43:07

by John Garry

[permalink] [raw]
Subject: [PATCH 1/8] scsi: hisi_sas: Start delivery hisi_sas_task_exec() directly

Currently we start delivery of commands to the DQ after returning from
hisi_sas_task_exec() with success.

Let's just start delivery directly in that function without having to
check if some local variable is set.

Signed-off-by: John Garry <[email protected]>
Reviewed-by: Xiang Chen <[email protected]>
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 17 ++++++-----------
1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 889c36fa9309..0137ce7c544e 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -397,8 +397,7 @@ static int hisi_sas_dif_dma_map(struct hisi_hba *hisi_hba,

static int hisi_sas_task_prep(struct sas_task *task,
struct hisi_sas_dq **dq_pointer,
- bool is_tmf, struct hisi_sas_tmf_task *tmf,
- int *pass)
+ bool is_tmf, struct hisi_sas_tmf_task *tmf)
{
struct domain_device *device = task->dev;
struct hisi_hba *hisi_hba = dev_to_hisi_hba(device);
@@ -536,9 +535,12 @@ static int hisi_sas_task_prep(struct sas_task *task,
task->task_state_flags |= SAS_TASK_AT_INITIATOR;
spin_unlock_irqrestore(&task->task_state_lock, flags);

- ++(*pass);
WRITE_ONCE(slot->ready, 1);

+ spin_lock(&dq->lock);
+ hisi_hba->hw->start_delivery(dq);
+ spin_unlock(&dq->lock);
+
return 0;

err_out_dif_dma_unmap:
@@ -556,7 +558,6 @@ static int hisi_sas_task_exec(struct sas_task *task, gfp_t gfp_flags,
bool is_tmf, struct hisi_sas_tmf_task *tmf)
{
u32 rc;
- u32 pass = 0;
struct hisi_hba *hisi_hba;
struct device *dev;
struct domain_device *device = task->dev;
@@ -589,16 +590,10 @@ static int hisi_sas_task_exec(struct sas_task *task, gfp_t gfp_flags,
}

/* protect task_prep and start_delivery sequence */
- rc = hisi_sas_task_prep(task, &dq, is_tmf, tmf, &pass);
+ rc = hisi_sas_task_prep(task, &dq, is_tmf, tmf);
if (rc)
dev_err(dev, "task exec: failed[%d]!\n", rc);

- if (likely(pass)) {
- spin_lock(&dq->lock);
- hisi_hba->hw->start_delivery(dq);
- spin_unlock(&dq->lock);
- }
-
return rc;
}

--
2.26.2


2021-12-15 14:43:10

by John Garry

[permalink] [raw]
Subject: [PATCH 2/8] scsi: hisi_sas: Make internal abort have no task proto

For an internal abort, the task does not have a protocol, so set to none.

This will make it easier to differentiate internal abort tasks in future.

Signed-off-by: John Garry <[email protected]>
Reviewed-by: Xiang Chen <[email protected]>
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 0137ce7c544e..b2299db01a80 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -2125,7 +2125,7 @@ _hisi_sas_internal_task_abort(struct hisi_hba *hisi_hba,
return -ENOMEM;

task->dev = device;
- task->task_proto = device->tproto;
+ task->task_proto = SAS_PROTOCOL_NONE;
task->task_done = hisi_sas_task_done;
task->slow_task->timer.function = hisi_sas_tmf_timedout;
task->slow_task->timer.expires = jiffies + INTERNAL_ABORT_TIMEOUT;
--
2.26.2


2021-12-15 14:43:14

by John Garry

[permalink] [raw]
Subject: [PATCH 3/8] scsi: hisi_sas: Pass abort structure for internal abort

To help factor out code in future, it's useful to know if we're executing
an internal abort, so pass a pointer to the structure. The idea is that
a NULL pointer means not an internal abort.

Signed-off-by: John Garry <[email protected]>
Reviewed-by: Xiang Chen <[email protected]>
---
drivers/scsi/hisi_sas/hisi_sas.h | 5 +++++
drivers/scsi/hisi_sas/hisi_sas_main.c | 21 ++++++++++++---------
2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index ed9419643235..07b473de9136 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -133,6 +133,11 @@ struct hisi_sas_rst {
bool done;
};

+struct hisi_sas_internal_abort {
+ unsigned int flag;
+ unsigned int tag;
+};
+
#define HISI_SAS_RST_WORK_INIT(r, c) \
{ .hisi_hba = hisi_hba, \
.completion = &c, \
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index b2299db01a80..9bad59e77eae 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -265,11 +265,11 @@ static void hisi_sas_task_prep_ata(struct hisi_hba *hisi_hba,
}

static void hisi_sas_task_prep_abort(struct hisi_hba *hisi_hba,
- struct hisi_sas_slot *slot,
- int device_id, int abort_flag, int tag_to_abort)
+ struct hisi_sas_internal_abort *abort,
+ struct hisi_sas_slot *slot, int device_id)
{
hisi_hba->hw->prep_abort(hisi_hba, slot,
- device_id, abort_flag, tag_to_abort);
+ device_id, abort->flag, abort->tag);
}

static void hisi_sas_dma_unmap(struct hisi_hba *hisi_hba,
@@ -2008,8 +2008,9 @@ static int hisi_sas_query_task(struct sas_task *task)

static int
hisi_sas_internal_abort_task_exec(struct hisi_hba *hisi_hba, int device_id,
- struct sas_task *task, int abort_flag,
- int task_tag, struct hisi_sas_dq *dq)
+ struct hisi_sas_internal_abort *abort,
+ struct sas_task *task,
+ struct hisi_sas_dq *dq)
{
struct domain_device *device = task->dev;
struct hisi_sas_device *sas_dev = device->lldd_dev;
@@ -2066,8 +2067,7 @@ hisi_sas_internal_abort_task_exec(struct hisi_hba *hisi_hba, int device_id,
memset(hisi_sas_status_buf_addr_mem(slot), 0,
sizeof(struct hisi_sas_err_record));

- hisi_sas_task_prep_abort(hisi_hba, slot, device_id,
- abort_flag, task_tag);
+ hisi_sas_task_prep_abort(hisi_hba, abort, slot, device_id);

spin_lock_irqsave(&task->task_state_lock, flags);
task->task_state_flags |= SAS_TASK_AT_INITIATOR;
@@ -2105,9 +2105,12 @@ _hisi_sas_internal_task_abort(struct hisi_hba *hisi_hba,
{
struct sas_task *task;
struct hisi_sas_device *sas_dev = device->lldd_dev;
+ struct hisi_sas_internal_abort abort = {
+ .flag = abort_flag,
+ .tag = tag,
+ };
struct device *dev = hisi_hba->dev;
int res;
-
/*
* The interface is not realized means this HW don't support internal
* abort, or don't need to do internal abort. Then here, we return
@@ -2132,7 +2135,7 @@ _hisi_sas_internal_task_abort(struct hisi_hba *hisi_hba,
add_timer(&task->slow_task->timer);

res = hisi_sas_internal_abort_task_exec(hisi_hba, sas_dev->device_id,
- task, abort_flag, tag, dq);
+ &abort, task, dq);
if (res) {
del_timer_sync(&task->slow_task->timer);
dev_err(dev, "internal task abort: executing internal task failed: %d\n",
--
2.26.2


2021-12-15 14:43:20

by John Garry

[permalink] [raw]
Subject: [PATCH 4/8] scsi: hisi_sas: Factor out task prep and delivery code

The task prep code is the same between the normal path (in
hisi_sas_task_prep()) and the internal abort path, so factor is out into
a common function.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 281 ++++++++++++--------------
1 file changed, 124 insertions(+), 157 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 9bad59e77eae..8df1fd680eac 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -395,94 +395,20 @@ static int hisi_sas_dif_dma_map(struct hisi_hba *hisi_hba,
return rc;
}

-static int hisi_sas_task_prep(struct sas_task *task,
- struct hisi_sas_dq **dq_pointer,
- bool is_tmf, struct hisi_sas_tmf_task *tmf)
+static
+void hisi_sas_task_deliver(struct hisi_hba *hisi_hba,
+ struct hisi_sas_slot *slot,
+ struct hisi_sas_dq *dq,
+ struct hisi_sas_device *sas_dev,
+ struct hisi_sas_internal_abort *abort,
+ struct hisi_sas_tmf_task *tmf)
{
- struct domain_device *device = task->dev;
- struct hisi_hba *hisi_hba = dev_to_hisi_hba(device);
- struct hisi_sas_device *sas_dev = device->lldd_dev;
- struct hisi_sas_port *port;
- struct hisi_sas_slot *slot;
- struct hisi_sas_cmd_hdr *cmd_hdr_base;
- struct asd_sas_port *sas_port = device->port;
- struct device *dev = hisi_hba->dev;
- int dlvry_queue_slot, dlvry_queue, rc, slot_idx;
- int n_elem = 0, n_elem_dif = 0, n_elem_req = 0;
- struct scsi_cmnd *scmd = NULL;
- struct hisi_sas_dq *dq;
+ struct hisi_sas_cmd_hdr *cmd_hdr_base;
+ int dlvry_queue_slot, dlvry_queue;
+ struct sas_task *task = slot->task;
unsigned long flags;
int wr_q_index;

- if (DEV_IS_GONE(sas_dev)) {
- if (sas_dev)
- dev_info(dev, "task prep: device %d not ready\n",
- sas_dev->device_id);
- else
- dev_info(dev, "task prep: device %016llx not ready\n",
- SAS_ADDR(device->sas_addr));
-
- return -ECOMM;
- }
-
- if (task->uldd_task) {
- struct ata_queued_cmd *qc;
-
- if (dev_is_sata(device)) {
- qc = task->uldd_task;
- scmd = qc->scsicmd;
- } else {
- scmd = task->uldd_task;
- }
- }
-
- if (scmd) {
- unsigned int dq_index;
- u32 blk_tag;
-
- blk_tag = blk_mq_unique_tag(scsi_cmd_to_rq(scmd));
- dq_index = blk_mq_unique_tag_to_hwq(blk_tag);
- *dq_pointer = dq = &hisi_hba->dq[dq_index];
- } else {
- struct Scsi_Host *shost = hisi_hba->shost;
- struct blk_mq_queue_map *qmap = &shost->tag_set.map[HCTX_TYPE_DEFAULT];
- int queue = qmap->mq_map[raw_smp_processor_id()];
-
- *dq_pointer = dq = &hisi_hba->dq[queue];
- }
-
- port = to_hisi_sas_port(sas_port);
- if (port && !port->port_attached) {
- dev_info(dev, "task prep: %s port%d not attach device\n",
- (dev_is_sata(device)) ?
- "SATA/STP" : "SAS",
- device->port->id);
-
- return -ECOMM;
- }
-
- rc = hisi_sas_dma_map(hisi_hba, task, &n_elem,
- &n_elem_req);
- if (rc < 0)
- goto prep_out;
-
- if (!sas_protocol_ata(task->task_proto)) {
- rc = hisi_sas_dif_dma_map(hisi_hba, &n_elem_dif, task);
- if (rc < 0)
- goto err_out_dma_unmap;
- }
-
- if (hisi_hba->hw->slot_index_alloc)
- rc = hisi_hba->hw->slot_index_alloc(hisi_hba, device);
- else
- rc = hisi_sas_slot_index_alloc(hisi_hba, scmd);
-
- if (rc < 0)
- goto err_out_dif_dma_unmap;
-
- slot_idx = rc;
- slot = &hisi_hba->slot_info[slot_idx];
-
spin_lock(&dq->lock);
wr_q_index = dq->wr_point;
dq->wr_point = (dq->wr_point + 1) % HISI_SAS_QUEUE_SLOTS;
@@ -496,16 +422,13 @@ static int hisi_sas_task_prep(struct sas_task *task,
dlvry_queue_slot = wr_q_index;

slot->device_id = sas_dev->device_id;
- slot->n_elem = n_elem;
- slot->n_elem_dif = n_elem_dif;
slot->dlvry_queue = dlvry_queue;
slot->dlvry_queue_slot = dlvry_queue_slot;
cmd_hdr_base = hisi_hba->cmd_hdr[dlvry_queue];
slot->cmd_hdr = &cmd_hdr_base[dlvry_queue_slot];
- slot->task = task;
- slot->port = port;
+
slot->tmf = tmf;
- slot->is_internal = is_tmf;
+ slot->is_internal = tmf;
task->lldd_task = slot;

memset(slot->cmd_hdr, 0, sizeof(struct hisi_sas_cmd_hdr));
@@ -525,8 +448,14 @@ static int hisi_sas_task_prep(struct sas_task *task,
case SAS_PROTOCOL_SATA | SAS_PROTOCOL_STP:
hisi_sas_task_prep_ata(hisi_hba, slot);
break;
+ case SAS_PROTOCOL_NONE:
+ if (abort) {
+ hisi_sas_task_prep_abort(hisi_hba, abort, slot, sas_dev->device_id);
+ break;
+ }
+ fallthrough;
default:
- dev_err(dev, "task prep: unknown/unsupported proto (0x%x)\n",
+ dev_err(hisi_hba->dev, "task prep: unknown/unsupported proto (0x%x)\n",
task->task_proto);
break;
}
@@ -540,29 +469,22 @@ static int hisi_sas_task_prep(struct sas_task *task,
spin_lock(&dq->lock);
hisi_hba->hw->start_delivery(dq);
spin_unlock(&dq->lock);
-
- return 0;
-
-err_out_dif_dma_unmap:
- if (!sas_protocol_ata(task->task_proto))
- hisi_sas_dif_dma_unmap(hisi_hba, task, n_elem_dif);
-err_out_dma_unmap:
- hisi_sas_dma_unmap(hisi_hba, task, n_elem,
- n_elem_req);
-prep_out:
- dev_err(dev, "task prep: failed[%d]!\n", rc);
- return rc;
}

static int hisi_sas_task_exec(struct sas_task *task, gfp_t gfp_flags,
- bool is_tmf, struct hisi_sas_tmf_task *tmf)
+ struct hisi_sas_tmf_task *tmf)
{
- u32 rc;
- struct hisi_hba *hisi_hba;
- struct device *dev;
+ int n_elem = 0, n_elem_dif = 0, n_elem_req = 0;
struct domain_device *device = task->dev;
struct asd_sas_port *sas_port = device->port;
+ struct hisi_sas_device *sas_dev = device->lldd_dev;
+ struct scsi_cmnd *scmd = NULL;
struct hisi_sas_dq *dq = NULL;
+ struct hisi_sas_port *port;
+ struct hisi_hba *hisi_hba;
+ struct hisi_sas_slot *slot;
+ struct device *dev;
+ int rc;

if (!sas_port) {
struct task_status_struct *ts = &task->task_status;
@@ -589,11 +511,94 @@ static int hisi_sas_task_exec(struct sas_task *task, gfp_t gfp_flags,
up(&hisi_hba->sem);
}

+ if (DEV_IS_GONE(sas_dev)) {
+ if (sas_dev)
+ dev_info(dev, "task prep: device %d not ready\n",
+ sas_dev->device_id);
+ else
+ dev_info(dev, "task prep: device %016llx not ready\n",
+ SAS_ADDR(device->sas_addr));
+
+ return -ECOMM;
+ }
+
+ if (task->uldd_task) {
+ struct ata_queued_cmd *qc;
+
+ if (dev_is_sata(device)) {
+ qc = task->uldd_task;
+ scmd = qc->scsicmd;
+ } else {
+ scmd = task->uldd_task;
+ }
+ }
+
+ if (scmd) {
+ unsigned int dq_index;
+ u32 blk_tag;
+
+ blk_tag = blk_mq_unique_tag(scsi_cmd_to_rq(scmd));
+ dq_index = blk_mq_unique_tag_to_hwq(blk_tag);
+ dq = &hisi_hba->dq[dq_index];
+ } else {
+ struct Scsi_Host *shost = hisi_hba->shost;
+ struct blk_mq_queue_map *qmap = &shost->tag_set.map[HCTX_TYPE_DEFAULT];
+ int queue = qmap->mq_map[raw_smp_processor_id()];
+
+ dq = &hisi_hba->dq[queue];
+ }
+
+ port = to_hisi_sas_port(sas_port);
+ if (port && !port->port_attached) {
+ dev_info(dev, "task prep: %s port%d not attach device\n",
+ (dev_is_sata(device)) ?
+ "SATA/STP" : "SAS",
+ device->port->id);
+
+ return -ECOMM;
+ }
+
+ rc = hisi_sas_dma_map(hisi_hba, task, &n_elem,
+ &n_elem_req);
+ if (rc < 0)
+ goto prep_out;
+
+ if (!sas_protocol_ata(task->task_proto)) {
+ rc = hisi_sas_dif_dma_map(hisi_hba, &n_elem_dif, task);
+ if (rc < 0)
+ goto err_out_dma_unmap;
+ }
+
+ if (hisi_hba->hw->slot_index_alloc)
+ rc = hisi_hba->hw->slot_index_alloc(hisi_hba, device);
+ else
+ rc = hisi_sas_slot_index_alloc(hisi_hba, scmd);
+
+ if (rc < 0)
+ goto err_out_dif_dma_unmap;
+
+ slot = &hisi_hba->slot_info[rc];
+ slot->n_elem = n_elem;
+ slot->n_elem_dif = n_elem_dif;
+ slot->task = task;
+ slot->port = port;
+
+ slot->tmf = tmf;
+ slot->is_internal = tmf;
+
/* protect task_prep and start_delivery sequence */
- rc = hisi_sas_task_prep(task, &dq, is_tmf, tmf);
- if (rc)
- dev_err(dev, "task exec: failed[%d]!\n", rc);
+ hisi_sas_task_deliver(hisi_hba, slot, dq, sas_dev, NULL, tmf);

+ return 0;
+
+err_out_dif_dma_unmap:
+ if (!sas_protocol_ata(task->task_proto))
+ hisi_sas_dif_dma_unmap(hisi_hba, task, n_elem_dif);
+err_out_dma_unmap:
+ hisi_sas_dma_unmap(hisi_hba, task, n_elem,
+ n_elem_req);
+prep_out:
+ dev_err(dev, "task exec: failed[%d]!\n", rc);
return rc;
}

@@ -1092,7 +1097,7 @@ static void hisi_sas_dev_gone(struct domain_device *device)

static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
{
- return hisi_sas_task_exec(task, gfp_flags, 0, NULL);
+ return hisi_sas_task_exec(task, gfp_flags, NULL);
}

static int hisi_sas_phy_set_linkrate(struct hisi_hba *hisi_hba, int phy_no,
@@ -1246,8 +1251,7 @@ static int hisi_sas_exec_internal_tmf_task(struct domain_device *device,
task->slow_task->timer.expires = jiffies + TASK_TIMEOUT;
add_timer(&task->slow_task->timer);

- res = hisi_sas_task_exec(task, GFP_KERNEL, 1, tmf);
-
+ res = hisi_sas_task_exec(task, GFP_KERNEL, tmf);
if (res) {
del_timer_sync(&task->slow_task->timer);
dev_err(dev, "abort tmf: executing internal task failed: %d\n",
@@ -2016,12 +2020,9 @@ hisi_sas_internal_abort_task_exec(struct hisi_hba *hisi_hba, int device_id,
struct hisi_sas_device *sas_dev = device->lldd_dev;
struct device *dev = hisi_hba->dev;
struct hisi_sas_port *port;
- struct hisi_sas_slot *slot;
struct asd_sas_port *sas_port = device->port;
- struct hisi_sas_cmd_hdr *cmd_hdr_base;
- int dlvry_queue_slot, dlvry_queue, n_elem = 0, rc, slot_idx;
- unsigned long flags;
- int wr_q_index;
+ struct hisi_sas_slot *slot;
+ int slot_idx;

if (unlikely(test_bit(HISI_SAS_REJECT_CMD_BIT, &hisi_hba->flags)))
return -EINVAL;
@@ -2032,58 +2033,24 @@ hisi_sas_internal_abort_task_exec(struct hisi_hba *hisi_hba, int device_id,
port = to_hisi_sas_port(sas_port);

/* simply get a slot and send abort command */
- rc = hisi_sas_slot_index_alloc(hisi_hba, NULL);
- if (rc < 0)
+ slot_idx = hisi_sas_slot_index_alloc(hisi_hba, NULL);
+ if (slot_idx < 0)
goto err_out;

- slot_idx = rc;
slot = &hisi_hba->slot_info[slot_idx];
-
- spin_lock(&dq->lock);
- wr_q_index = dq->wr_point;
- dq->wr_point = (dq->wr_point + 1) % HISI_SAS_QUEUE_SLOTS;
- list_add_tail(&slot->delivery, &dq->list);
- spin_unlock(&dq->lock);
- spin_lock(&sas_dev->lock);
- list_add_tail(&slot->entry, &sas_dev->list);
- spin_unlock(&sas_dev->lock);
-
- dlvry_queue = dq->id;
- dlvry_queue_slot = wr_q_index;
-
- slot->device_id = sas_dev->device_id;
- slot->n_elem = n_elem;
- slot->dlvry_queue = dlvry_queue;
- slot->dlvry_queue_slot = dlvry_queue_slot;
- cmd_hdr_base = hisi_hba->cmd_hdr[dlvry_queue];
- slot->cmd_hdr = &cmd_hdr_base[dlvry_queue_slot];
+ slot->n_elem = 0;
slot->task = task;
slot->port = port;
slot->is_internal = true;
- task->lldd_task = slot;

- memset(slot->cmd_hdr, 0, sizeof(struct hisi_sas_cmd_hdr));
- memset(hisi_sas_cmd_hdr_addr_mem(slot), 0, HISI_SAS_COMMAND_TABLE_SZ);
- memset(hisi_sas_status_buf_addr_mem(slot), 0,
- sizeof(struct hisi_sas_err_record));
-
- hisi_sas_task_prep_abort(hisi_hba, abort, slot, device_id);
-
- spin_lock_irqsave(&task->task_state_lock, flags);
- task->task_state_flags |= SAS_TASK_AT_INITIATOR;
- spin_unlock_irqrestore(&task->task_state_lock, flags);
- WRITE_ONCE(slot->ready, 1);
- /* send abort command to the chip */
- spin_lock(&dq->lock);
- hisi_hba->hw->start_delivery(dq);
- spin_unlock(&dq->lock);
+ hisi_sas_task_deliver(hisi_hba, slot, dq, sas_dev, abort, NULL);

return 0;

err_out:
- dev_err(dev, "internal abort task prep: failed[%d]!\n", rc);
+ dev_err(dev, "internal abort task prep: failed[%d]!\n", slot_idx);

- return rc;
+ return slot_idx;
}

/**
--
2.26.2


2021-12-15 14:43:24

by John Garry

[permalink] [raw]
Subject: [PATCH 7/8] scsi: hisi_sas: Fix phyup timeout on FPGA

From: Qi Liu <[email protected]>

The OOB interrupt and phyup interrupt handlers may run out-of-order in
high CPU usage scenarios. Since the hisi_sas_phy.timer is added in
hisi_sas_phy_oob_ready() and disarmed in phy_up_v3_hw(), this
out-of-order execution will cause hisi_sas_phy.timer timeout to trigger.

To solve, protect hisi_sas_phy.timer and .attached with a lock, and ensure
that the timer won't be added after phyup handler completes.

Signed-off-by: Qi Liu <[email protected]>
Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 18 +++++++++++++-----
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 10 ++++++++--
2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 0e14f90dbb1e..66e63a336770 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -909,10 +909,14 @@ void hisi_sas_phy_oob_ready(struct hisi_hba *hisi_hba, int phy_no)
{
struct hisi_sas_phy *phy = &hisi_hba->phy[phy_no];
struct device *dev = hisi_hba->dev;
+ unsigned long flags;

dev_dbg(dev, "phy%d OOB ready\n", phy_no);
- if (phy->phy_attached)
+ spin_lock_irqsave(&phy->lock, flags);
+ if (phy->phy_attached) {
+ spin_unlock_irqrestore(&phy->lock, flags);
return;
+ }

if (!timer_pending(&phy->timer)) {
if (phy->wait_phyup_cnt < HISI_SAS_WAIT_PHYUP_RETRIES) {
@@ -920,13 +924,17 @@ void hisi_sas_phy_oob_ready(struct hisi_hba *hisi_hba, int phy_no)
phy->timer.expires = jiffies +
HISI_SAS_WAIT_PHYUP_TIMEOUT;
add_timer(&phy->timer);
- } else {
- dev_warn(dev, "phy%d failed to come up %d times, giving up\n",
- phy_no, phy->wait_phyup_cnt);
- phy->wait_phyup_cnt = 0;
+ spin_unlock_irqrestore(&phy->lock, flags);
+ return;
}
+
+ dev_warn(dev, "phy%d failed to come up %d times, giving up\n",
+ phy_no, phy->wait_phyup_cnt);
+ phy->wait_phyup_cnt = 0;
}
+ spin_unlock_irqrestore(&phy->lock, flags);
}
+
EXPORT_SYMBOL_GPL(hisi_sas_phy_oob_ready);

static void hisi_sas_phy_init(struct hisi_hba *hisi_hba, int phy_no)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 11a44d9dd9b2..0239e2b4b84f 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -1484,7 +1484,6 @@ static irqreturn_t phy_up_v3_hw(int phy_no, struct hisi_hba *hisi_hba)
struct asd_sas_phy *sas_phy = &phy->sas_phy;
struct device *dev = hisi_hba->dev;

- del_timer(&phy->timer);
hisi_sas_phy_write32(hisi_hba, phy_no, PHYCTRL_PHY_ENA_MSK, 1);

port_id = hisi_sas_read32(hisi_hba, PHY_PORT_NUM_MA);
@@ -1561,9 +1560,16 @@ static irqreturn_t phy_up_v3_hw(int phy_no, struct hisi_hba *hisi_hba)
}

phy->port_id = port_id;
- phy->phy_attached = 1;
+
hisi_sas_notify_phy_event(phy, HISI_PHYE_PHY_UP);
+
res = IRQ_HANDLED;
+
+ spin_lock(&phy->lock);
+ /* Delete timer and set phy_attached atomically */
+ del_timer(&phy->timer);
+ phy->phy_attached = 1;
+ spin_unlock(&phy->lock);
end:
if (phy->reset_completion)
complete(phy->reset_completion);
--
2.26.2


2021-12-15 14:43:27

by John Garry

[permalink] [raw]
Subject: [PATCH 6/8] scsi: hisi_sas: Prevent parallel FLR and controller reset

From: Qi Liu <[email protected]>

If we issue a controller reset command during executing a FLR a hung task
may be found:

Call trace:
__switch_to+0x158/0x1cc
__schedule+0x2e8/0x85c
schedule+0x7c/0x110
schedule_timeout+0x190/0x1cc
__down+0x7c/0xd4
down+0x5c/0x7c
hisi_sas_task_exec+0x510/0x680 [hisi_sas_main]
hisi_sas_queue_command+0x24/0x30 [hisi_sas_main]
smp_execute_task_sg+0xf4/0x23c [libsas]
sas_smp_phy_control+0x110/0x1e0 [libsas]
transport_sas_phy_reset+0xc8/0x190 [libsas]
phy_reset_work+0x2c/0x40 [libsas]
process_one_work+0x1dc/0x48c
worker_thread+0x15c/0x464
kthread+0x160/0x170
ret_from_fork+0x10/0x18

This is a race condition which occurs when the FLR completes first.

Here the host HISI_SAS_RESETTING_BIT flag out gets of sync as
HISI_SAS_RESETTING_BIT is not always cleared with the hisi_hba.sem held,
so now only set/unset HISI_SAS_RESETTING_BIT under hisi_hba.sem .

Signed-off-by: Qi Liu <[email protected]>
Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 8 +++++---
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 1 +
2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 977911580d8f..0e14f90dbb1e 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1574,7 +1574,6 @@ void hisi_sas_controller_reset_prepare(struct hisi_hba *hisi_hba)
{
struct Scsi_Host *shost = hisi_hba->shost;

- down(&hisi_hba->sem);
hisi_hba->phy_state = hisi_hba->hw->get_phys_state(hisi_hba);

scsi_block_requests(shost);
@@ -1599,9 +1598,9 @@ void hisi_sas_controller_reset_done(struct hisi_hba *hisi_hba)
if (hisi_hba->reject_stp_links_msk)
hisi_sas_terminate_stp_reject(hisi_hba);
hisi_sas_reset_init_all_devices(hisi_hba);
- up(&hisi_hba->sem);
scsi_unblock_requests(shost);
clear_bit(HISI_SAS_RESETTING_BIT, &hisi_hba->flags);
+ up(&hisi_hba->sem);

hisi_sas_rescan_topology(hisi_hba, hisi_hba->phy_state);
}
@@ -1612,8 +1611,11 @@ static int hisi_sas_controller_prereset(struct hisi_hba *hisi_hba)
if (!hisi_hba->hw->soft_reset)
return -1;

- if (test_and_set_bit(HISI_SAS_RESETTING_BIT, &hisi_hba->flags))
+ down(&hisi_hba->sem);
+ if (test_and_set_bit(HISI_SAS_RESETTING_BIT, &hisi_hba->flags)) {
+ up(&hisi_hba->sem);
return -1;
+ }

if (hisi_sas_debugfs_enable && hisi_hba->debugfs_itct[0].itct)
hisi_hba->hw->debugfs_snapshot_regs(hisi_hba);
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 0ef6c21bf081..11a44d9dd9b2 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -4848,6 +4848,7 @@ static void hisi_sas_reset_prepare_v3_hw(struct pci_dev *pdev)
int rc;

dev_info(dev, "FLR prepare\n");
+ down(&hisi_hba->sem);
set_bit(HISI_SAS_RESETTING_BIT, &hisi_hba->flags);
hisi_sas_controller_reset_prepare(hisi_hba);

--
2.26.2


2021-12-15 14:43:31

by John Garry

[permalink] [raw]
Subject: [PATCH 5/8] scsi: hisi_sas: Prevent parallel controller reset and control phy command

From: Qi Liu <[email protected]>

A user may issue a control phy command from sysfs at any time, even if the
controller is resetting.

If a phy is disabled by hardreset/linkreset command before calling
get_phys_state() in the reset path, the saved phy state may be incorrect.

To avoid incorrectly recording the phy state, use hisi_hba.sem to ensure
that the controller reset may not run at the same time as when the phy
control function is running.

Signed-off-by: Qi Liu <[email protected]>
Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 8df1fd680eac..977911580d8f 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1148,6 +1148,7 @@ static int hisi_sas_control_phy(struct asd_sas_phy *sas_phy, enum phy_func func,
u8 sts = phy->phy_attached;
int ret = 0;

+ down(&hisi_hba->sem);
phy->reset_completion = &completion;

switch (func) {
@@ -1191,6 +1192,7 @@ static int hisi_sas_control_phy(struct asd_sas_phy *sas_phy, enum phy_func func,
out:
phy->reset_completion = NULL;

+ up(&hisi_hba->sem);
return ret;
}

--
2.26.2


2021-12-15 14:43:32

by John Garry

[permalink] [raw]
Subject: [PATCH 8/8] scsi: libsas: Decode SAM status and host byte codes

Value 0 is used for SAM status and libsas exec_status bytes codes in
sas_end_task() - use defined macros instead. In addition, change to
proper enum types.

Also replace SAM_STAT_CHECK_CONDITION with SAS_SAM_STAT_CHECK_CONDITION,
the former being a proper member of enum exec_status.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/libsas/sas_scsi_host.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index d337fdf1b9ca..fb19e739a39c 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -37,7 +37,8 @@
static void sas_end_task(struct scsi_cmnd *sc, struct sas_task *task)
{
struct task_status_struct *ts = &task->task_status;
- int hs = 0, stat = 0;
+ enum scsi_host_status hs = DID_OK;
+ enum exec_status stat = SAS_SAM_STAT_GOOD;

if (ts->resp == SAS_TASK_UNDELIVERED) {
/* transport error */
@@ -82,10 +83,10 @@ static void sas_end_task(struct scsi_cmnd *sc, struct sas_task *task)
case SAS_ABORTED_TASK:
hs = DID_ABORT;
break;
- case SAM_STAT_CHECK_CONDITION:
+ case SAS_SAM_STAT_CHECK_CONDITION:
memcpy(sc->sense_buffer, ts->buf,
min(SCSI_SENSE_BUFFERSIZE, ts->buf_valid_size));
- stat = SAM_STAT_CHECK_CONDITION;
+ stat = SAS_SAM_STAT_CHECK_CONDITION;
break;
default:
stat = ts->stat;
--
2.26.2


2021-12-17 04:00:46

by Martin K. Petersen

[permalink] [raw]
Subject: Re: [PATCH 0/8] hisi_sas: Some misc patches


John,

> This is a small series of misc patches. Please consider for 5.17.
>
> Briefly the series covers:
> - Factoring out the common and internal abort delivery code
> - Fix some races with controller reset (again!)
> - Fix out-of-order interrupt handling on FPGA
>
> I also included a pretty straightforward libsas tidy-up.

Applied to 5.17/scsi-staging, thanks!

--
Martin K. Petersen Oracle Linux Engineering

2021-12-23 05:09:38

by Martin K. Petersen

[permalink] [raw]
Subject: Re: [PATCH 0/8] hisi_sas: Some misc patches

On Wed, 15 Dec 2021 22:37:33 +0800, John Garry wrote:

> This is a small series of misc patches. Please consider for 5.17.
>
> Briefly the series covers:
> - Factoring out the common and internal abort delivery code
> - Fix some races with controller reset (again!)
> - Fix out-of-order interrupt handling on FPGA
>
> [...]

Applied to 5.17/scsi-queue, thanks!

[1/8] scsi: hisi_sas: Start delivery hisi_sas_task_exec() directly
https://git.kernel.org/mkp/scsi/c/0e4620856b89
[2/8] scsi: hisi_sas: Make internal abort have no task proto
https://git.kernel.org/mkp/scsi/c/934385a4fd59
[3/8] scsi: hisi_sas: Pass abort structure for internal abort
https://git.kernel.org/mkp/scsi/c/08c61b5d902b
[4/8] scsi: hisi_sas: Factor out task prep and delivery code
https://git.kernel.org/mkp/scsi/c/dc313f6b125b
[5/8] scsi: hisi_sas: Prevent parallel controller reset and control phy command
https://git.kernel.org/mkp/scsi/c/20c634932ae8
[6/8] scsi: hisi_sas: Prevent parallel FLR and controller reset
https://git.kernel.org/mkp/scsi/c/16775db613c2
[7/8] scsi: hisi_sas: Fix phyup timeout on FPGA
https://git.kernel.org/mkp/scsi/c/37310bad7fa6
[8/8] scsi: libsas: Decode SAM status and host byte codes
https://git.kernel.org/mkp/scsi/c/4be6181fea1d

--
Martin K. Petersen Oracle Linux Engineering