As reported in [0], the pm8001 driver NCQ error handling more or less
duplicates what libata does in link error handling, as follows:
- abort all commands
- do autopsy with read log ext 10 command
- reset the target to recover
Indeed for the hisi_sas driver we want to add similar handling for NCQ
errors.
This series add a new libsas API - sas_ata_device_link_abort() - to handle
host NCQ errors, and fixes up pm8001 and hisi_sas drivers to use it. As
mentioned in the pm8001 changeover patch, I would prefer a better place to
locate the SATA ABORT command (rather that nexus reset callback).
Damien kindly tested the v1 series for pm8001, but any further pm8001
testing would be appreciated as I have since tweaked
sas_ata_device_link_abort(). This is because the pm8001 driver hangs on my
arm64 machine read log ext10 command.
Finally with these changes we can make the libsas task alloc/free APIs
private, which they should always have been.
Based on v6.0-rc1
[0] https://lore.kernel.org/linux-scsi/[email protected]/
Changes since v1:
- Rename sas_ata_link_abort() -> sas_ata_device_link_abort()
- Set EH RESET flag in sas_ata_device_link_abort()
- Add Jack's Ack tags
- Rebase
John Garry (5):
scsi: pm8001: Modify task abort handling for SATA task
scsi: libsas: Add sas_ata_device_link_abort()
scsi: pm8001: Use sas_ata_device_link_abort() to handle NCQ errors
scsi: hisi_sas: Don't issue ATA softreset in hisi_sas_abort_task()
scsi: libsas: Make sas_{alloc, alloc_slow, free}_task() private
Xingui Yang (1):
scsi: hisi_sas: Add SATA_DISK_ERR bit handling for v3 hw
drivers/scsi/hisi_sas/hisi_sas_main.c | 5 +-
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 22 ++-
drivers/scsi/libsas/sas_ata.c | 11 ++
drivers/scsi/libsas/sas_init.c | 3 -
drivers/scsi/libsas/sas_internal.h | 4 +
drivers/scsi/pm8001/pm8001_hwi.c | 194 +++++++------------------
drivers/scsi/pm8001/pm8001_sas.c | 13 ++
drivers/scsi/pm8001/pm8001_sas.h | 8 +-
drivers/scsi/pm8001/pm80xx_hwi.c | 177 ++--------------------
include/scsi/libsas.h | 4 -
include/scsi/sas_ata.h | 5 +
11 files changed, 133 insertions(+), 313 deletions(-)
--
2.35.3
It is not necessary to issue an ATA softreset in hisi_sas_abort_task().
This is because in the libsas EH scheme we will defer to ATA EH which will
deal with any device reset during recovery.
Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 33af5b8dede2..2fe5290ac56c 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1598,13 +1598,16 @@ static int hisi_sas_abort_task(struct sas_task *task)
} else if (task->task_proto & SAS_PROTOCOL_SATA ||
task->task_proto & SAS_PROTOCOL_STP) {
if (task->dev->dev_type == SAS_SATA_DEV) {
+ struct hisi_sas_slot *slot = task->lldd_task;
+
+ slot->task = NULL;
rc = hisi_sas_internal_task_abort_dev(sas_dev, false);
if (rc < 0) {
dev_err(dev, "abort task: internal abort failed\n");
goto out;
}
hisi_sas_dereg_device(hisi_hba, device);
- rc = hisi_sas_softreset_ata_disk(device);
+ rc = TMF_RESP_FUNC_COMPLETE;
}
} else if (task->lldd_task && task->task_proto & SAS_PROTOCOL_SMP) {
/* SMP */
--
2.35.3
On 2022/08/17 7:52, John Garry wrote:
> As reported in [0], the pm8001 driver NCQ error handling more or less
> duplicates what libata does in link error handling, as follows:
> - abort all commands
> - do autopsy with read log ext 10 command
> - reset the target to recover
>
> Indeed for the hisi_sas driver we want to add similar handling for NCQ
> errors.
>
> This series add a new libsas API - sas_ata_device_link_abort() - to handle
> host NCQ errors, and fixes up pm8001 and hisi_sas drivers to use it. As
> mentioned in the pm8001 changeover patch, I would prefer a better place to
> locate the SATA ABORT command (rather that nexus reset callback).
>
> Damien kindly tested the v1 series for pm8001, but any further pm8001
> testing would be appreciated as I have since tweaked
> sas_ata_device_link_abort(). This is because the pm8001 driver hangs on my
> arm64 machine read log ext10 command.
I will run more tests with this series.
>
> Finally with these changes we can make the libsas task alloc/free APIs
> private, which they should always have been.
>
> Based on v6.0-rc1
>
> [0] https://lore.kernel.org/linux-scsi/[email protected]/
>
> Changes since v1:
> - Rename sas_ata_link_abort() -> sas_ata_device_link_abort()
> - Set EH RESET flag in sas_ata_device_link_abort()
> - Add Jack's Ack tags
> - Rebase
>
> John Garry (5):
> scsi: pm8001: Modify task abort handling for SATA task
> scsi: libsas: Add sas_ata_device_link_abort()
> scsi: pm8001: Use sas_ata_device_link_abort() to handle NCQ errors
> scsi: hisi_sas: Don't issue ATA softreset in hisi_sas_abort_task()
> scsi: libsas: Make sas_{alloc, alloc_slow, free}_task() private
>
> Xingui Yang (1):
> scsi: hisi_sas: Add SATA_DISK_ERR bit handling for v3 hw
>
> drivers/scsi/hisi_sas/hisi_sas_main.c | 5 +-
> drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 22 ++-
> drivers/scsi/libsas/sas_ata.c | 11 ++
> drivers/scsi/libsas/sas_init.c | 3 -
> drivers/scsi/libsas/sas_internal.h | 4 +
> drivers/scsi/pm8001/pm8001_hwi.c | 194 +++++++------------------
> drivers/scsi/pm8001/pm8001_sas.c | 13 ++
> drivers/scsi/pm8001/pm8001_sas.h | 8 +-
> drivers/scsi/pm8001/pm80xx_hwi.c | 177 ++--------------------
> include/scsi/libsas.h | 4 -
> include/scsi/sas_ata.h | 5 +
> 11 files changed, 133 insertions(+), 313 deletions(-)
>
--
Damien Le Moal
Western Digital Research