2022-08-17 15:49:01

by John Garry

[permalink] [raw]
Subject: [PATCH v2 0/6] libsas and drivers: NCQ error handling

As reported in [0], the pm8001 driver NCQ error handling more or less
duplicates what libata does in link error handling, as follows:
- abort all commands
- do autopsy with read log ext 10 command
- reset the target to recover

Indeed for the hisi_sas driver we want to add similar handling for NCQ
errors.

This series add a new libsas API - sas_ata_device_link_abort() - to handle
host NCQ errors, and fixes up pm8001 and hisi_sas drivers to use it. As
mentioned in the pm8001 changeover patch, I would prefer a better place to
locate the SATA ABORT command (rather that nexus reset callback).

Damien kindly tested the v1 series for pm8001, but any further pm8001
testing would be appreciated as I have since tweaked
sas_ata_device_link_abort(). This is because the pm8001 driver hangs on my
arm64 machine read log ext10 command.

Finally with these changes we can make the libsas task alloc/free APIs
private, which they should always have been.

Based on v6.0-rc1

[0] https://lore.kernel.org/linux-scsi/[email protected]/

Changes since v1:
- Rename sas_ata_link_abort() -> sas_ata_device_link_abort()
- Set EH RESET flag in sas_ata_device_link_abort()
- Add Jack's Ack tags
- Rebase

John Garry (5):
scsi: pm8001: Modify task abort handling for SATA task
scsi: libsas: Add sas_ata_device_link_abort()
scsi: pm8001: Use sas_ata_device_link_abort() to handle NCQ errors
scsi: hisi_sas: Don't issue ATA softreset in hisi_sas_abort_task()
scsi: libsas: Make sas_{alloc, alloc_slow, free}_task() private

Xingui Yang (1):
scsi: hisi_sas: Add SATA_DISK_ERR bit handling for v3 hw

drivers/scsi/hisi_sas/hisi_sas_main.c | 5 +-
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 22 ++-
drivers/scsi/libsas/sas_ata.c | 11 ++
drivers/scsi/libsas/sas_init.c | 3 -
drivers/scsi/libsas/sas_internal.h | 4 +
drivers/scsi/pm8001/pm8001_hwi.c | 194 +++++++------------------
drivers/scsi/pm8001/pm8001_sas.c | 13 ++
drivers/scsi/pm8001/pm8001_sas.h | 8 +-
drivers/scsi/pm8001/pm80xx_hwi.c | 177 ++--------------------
include/scsi/libsas.h | 4 -
include/scsi/sas_ata.h | 5 +
11 files changed, 133 insertions(+), 313 deletions(-)

--
2.35.3


2022-08-17 15:51:21

by John Garry

[permalink] [raw]
Subject: [PATCH v2 4/6] scsi: hisi_sas: Don't issue ATA softreset in hisi_sas_abort_task()

It is not necessary to issue an ATA softreset in hisi_sas_abort_task().
This is because in the libsas EH scheme we will defer to ATA EH which will
deal with any device reset during recovery.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 33af5b8dede2..2fe5290ac56c 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1598,13 +1598,16 @@ static int hisi_sas_abort_task(struct sas_task *task)
} else if (task->task_proto & SAS_PROTOCOL_SATA ||
task->task_proto & SAS_PROTOCOL_STP) {
if (task->dev->dev_type == SAS_SATA_DEV) {
+ struct hisi_sas_slot *slot = task->lldd_task;
+
+ slot->task = NULL;
rc = hisi_sas_internal_task_abort_dev(sas_dev, false);
if (rc < 0) {
dev_err(dev, "abort task: internal abort failed\n");
goto out;
}
hisi_sas_dereg_device(hisi_hba, device);
- rc = hisi_sas_softreset_ata_disk(device);
+ rc = TMF_RESP_FUNC_COMPLETE;
}
} else if (task->lldd_task && task->task_proto & SAS_PROTOCOL_SMP) {
/* SMP */
--
2.35.3

2022-08-17 15:56:45

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 0/6] libsas and drivers: NCQ error handling

On 2022/08/17 7:52, John Garry wrote:
> As reported in [0], the pm8001 driver NCQ error handling more or less
> duplicates what libata does in link error handling, as follows:
> - abort all commands
> - do autopsy with read log ext 10 command
> - reset the target to recover
>
> Indeed for the hisi_sas driver we want to add similar handling for NCQ
> errors.
>
> This series add a new libsas API - sas_ata_device_link_abort() - to handle
> host NCQ errors, and fixes up pm8001 and hisi_sas drivers to use it. As
> mentioned in the pm8001 changeover patch, I would prefer a better place to
> locate the SATA ABORT command (rather that nexus reset callback).
>
> Damien kindly tested the v1 series for pm8001, but any further pm8001
> testing would be appreciated as I have since tweaked
> sas_ata_device_link_abort(). This is because the pm8001 driver hangs on my
> arm64 machine read log ext10 command.

I will run more tests with this series.

>
> Finally with these changes we can make the libsas task alloc/free APIs
> private, which they should always have been.
>
> Based on v6.0-rc1
>
> [0] https://lore.kernel.org/linux-scsi/[email protected]/
>
> Changes since v1:
> - Rename sas_ata_link_abort() -> sas_ata_device_link_abort()
> - Set EH RESET flag in sas_ata_device_link_abort()
> - Add Jack's Ack tags
> - Rebase
>
> John Garry (5):
> scsi: pm8001: Modify task abort handling for SATA task
> scsi: libsas: Add sas_ata_device_link_abort()
> scsi: pm8001: Use sas_ata_device_link_abort() to handle NCQ errors
> scsi: hisi_sas: Don't issue ATA softreset in hisi_sas_abort_task()
> scsi: libsas: Make sas_{alloc, alloc_slow, free}_task() private
>
> Xingui Yang (1):
> scsi: hisi_sas: Add SATA_DISK_ERR bit handling for v3 hw
>
> drivers/scsi/hisi_sas/hisi_sas_main.c | 5 +-
> drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 22 ++-
> drivers/scsi/libsas/sas_ata.c | 11 ++
> drivers/scsi/libsas/sas_init.c | 3 -
> drivers/scsi/libsas/sas_internal.h | 4 +
> drivers/scsi/pm8001/pm8001_hwi.c | 194 +++++++------------------
> drivers/scsi/pm8001/pm8001_sas.c | 13 ++
> drivers/scsi/pm8001/pm8001_sas.h | 8 +-
> drivers/scsi/pm8001/pm80xx_hwi.c | 177 ++--------------------
> include/scsi/libsas.h | 4 -
> include/scsi/sas_ata.h | 5 +
> 11 files changed, 133 insertions(+), 313 deletions(-)
>


--
Damien Le Moal
Western Digital Research