LinuxLists.cc - [PATCH v5 0/7] libsas and drivers: NCQ error handling

2022-09-27 07:28:18

[permalink] [raw]

Subject: [PATCH v5 0/7] libsas and drivers: NCQ error handling

As reported in [0], the pm8001 driver NCQ error handling more or less
duplicates what libata does in link error handling, as follows:
- abort all commands
- do autopsy with read log ext 10 command
- reset the target to recover, if necessary

Indeed for the hisi_sas driver we want to add similar handling for NCQ
errors.

This series add a new libsas API - sas_ata_device_link_abort() - to handle
host NCQ errors, and fixes up pm8001 and hisi_sas drivers to use it.

A difference in the pm8001 driver NCQ error handling is that we send
SATA_ABORT per-task prior to read log ext10, but I feel that this should
not make a difference to the error handling.

Damien kindly tested previous the series for pm8001, but any further pm8001
testing would be appreciated as I have since tweaked pm8001 handling again.
This is because the pm8001 driver hangs on my arm64 machine read log ext10
command.

Finally with these changes we can make the libsas task alloc/free APIs
private, which they should always have been.

Based on mkp-scsi @ 6.1/scsi-staging 57569c37f0ad ("scsi: iscsi:
iscsi_tcp: Fix null-ptr-deref while calling getpeername()")

[0] https://lore.kernel.org/linux-scsi/[email protected]/

Changes since v4:
- Add Jason's tags (thanks)
- Rebase

Changes since v3:
- Add Damien's tags (thanks)
- Modify hisi_sas processing as follows:
- use sas_task_abort() for rejected IO
- Modify abort task processing to issue softreset in certain circumstances
- rebase

Changes since v2:
- Stop sending SATA_ABORT all for pm8001 handling
- Make "reset" optional in sas_ata_device_link_abort()
- Drop Jack's ACK

John Garry (5):
scsi: libsas: Add sas_ata_device_link_abort()
scsi: hisi_sas: Move slot variable definition in hisi_sas_abort_task()
scsi: pm8001: Modify task abort handling for SATA task
scsi: pm8001: Use sas_ata_device_link_abort() to handle NCQ errors
scsi: libsas: Make sas_{alloc, alloc_slow, free}_task() private

Xingui Yang (2):
scsi: hisi_sas: Add SATA_DISK_ERR bit handling for v3 hw
scsi: hisi_sas: Modify v3 HW SATA disk error state completion
processing

drivers/scsi/hisi_sas/hisi_sas.h | 1 +
drivers/scsi/hisi_sas/hisi_sas_main.c | 26 +++-
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 53 ++++++-
drivers/scsi/libsas/sas_ata.c | 12 ++
drivers/scsi/libsas/sas_init.c | 3 -
drivers/scsi/libsas/sas_internal.h | 4 +
drivers/scsi/pm8001/pm8001_hwi.c | 186 ++++---------------------
drivers/scsi/pm8001/pm8001_sas.c | 8 ++
drivers/scsi/pm8001/pm8001_sas.h | 4 -
drivers/scsi/pm8001/pm80xx_hwi.c | 177 +++--------------------
include/scsi/libsas.h | 4 -
include/scsi/sas_ata.h | 6 +
12 files changed, 143 insertions(+), 341 deletions(-)

--
2.35.3

2022-09-27 07:28:18

[permalink] [raw]

Subject: [PATCH v5 3/7] scsi: hisi_sas: Add SATA_DISK_ERR bit handling for v3 hw

From: Xingui Yang <[email protected]>

When CQ header dw3 SATA_DISK_ERR is set it means this SATA disk is in error
state and the current IPTT is invalid. An invalid IPTT does not correspond
to any slot.

In this scenario, new I/Os that delivered to disk will be rejected by the
controller and all I/Os remaining in the disk should be aborted, which we
add here with the sas_ata_device_link_abort() call.

In hisi_sas_abort_task() we don't want to issue a softreset as it may
cause info to be lost in the target disk for the ATA EH autopsy. In this
case, just release resources - the disk won't return other I/Os normally
after NCQ Error, so this is safe.

Signed-off-by: Xingui Yang <[email protected]>
Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/hisi_sas/hisi_sas.h | 1 +
drivers/scsi/hisi_sas/hisi_sas_main.c | 18 +++++++++-
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 48 ++++++++++++++++++++++++--
3 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index 9aebf4a26b13..6f8a52a1b808 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -104,6 +104,7 @@ enum {
enum dev_status {
HISI_SAS_DEV_INIT,
HISI_SAS_DEV_NORMAL,
+ HISI_SAS_DEV_NCQ_ERR,
};

enum {
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 8303aa5eaf25..4c37ae9eb6b6 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1604,13 +1604,26 @@ static int hisi_sas_abort_task(struct sas_task *task)
} else if (task->task_proto & SAS_PROTOCOL_SATA ||
task->task_proto & SAS_PROTOCOL_STP) {
if (task->dev->dev_type == SAS_SATA_DEV) {
+ struct ata_queued_cmd *qc = task->uldd_task;
+
rc = hisi_sas_internal_task_abort_dev(sas_dev, false);
if (rc < 0) {
dev_err(dev, "abort task: internal abort failed\n");
goto out;
}
hisi_sas_dereg_device(hisi_hba, device);
- rc = hisi_sas_softreset_ata_disk(device);
+
+ /*
+ * If an ATA internal command times out in ATA EH, it
+ * need to execute soft reset, so check the scsicmd
+ */
+ if ((sas_dev->dev_status == HISI_SAS_DEV_NCQ_ERR) &&
+ qc && qc->scsicmd) {
+ hisi_sas_do_release_task(hisi_hba, task, slot);
+ rc = TMF_RESP_FUNC_COMPLETE;
+ } else {
+ rc = hisi_sas_softreset_ata_disk(device);
+ }
}
} else if (slot && task->task_proto & SAS_PROTOCOL_SMP) {
/* SMP */
@@ -1727,6 +1740,9 @@ static int hisi_sas_I_T_nexus_reset(struct domain_device *device)
struct device *dev = hisi_hba->dev;
int rc;

+ if (sas_dev->dev_status == HISI_SAS_DEV_NCQ_ERR)
+ sas_dev->dev_status = HISI_SAS_DEV_NORMAL;
+
rc = hisi_sas_internal_task_abort_dev(sas_dev, false);
if (rc < 0) {
dev_err(dev, "I_T nexus reset: internal abort (%d)\n", rc);
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 10b55cef5cf5..4278d3482ebb 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -404,6 +404,11 @@
#define CMPLT_HDR_CMPLT_MSK (0x3 << CMPLT_HDR_CMPLT_OFF)
#define CMPLT_HDR_ERROR_PHASE_OFF 2
#define CMPLT_HDR_ERROR_PHASE_MSK (0xff << CMPLT_HDR_ERROR_PHASE_OFF)
+/* bit[9:2] Error Phase */
+#define ERR_PHASE_RESPONSE_FRAME_REV_STAGE_OFF \
+ 8
+#define ERR_PHASE_RESPONSE_FRAME_REV_STAGE_MSK \
+ (0x1 << ERR_PHASE_RESPONSE_FRAME_REV_STAGE_OFF)
#define CMPLT_HDR_RSPNS_XFRD_OFF 10
#define CMPLT_HDR_RSPNS_XFRD_MSK (0x1 << CMPLT_HDR_RSPNS_XFRD_OFF)
#define CMPLT_HDR_RSPNS_GOOD_OFF 11
@@ -423,8 +428,15 @@
#define CMPLT_HDR_DEV_ID_OFF 16
#define CMPLT_HDR_DEV_ID_MSK (0xffff << CMPLT_HDR_DEV_ID_OFF)
/* dw3 */
+#define CMPLT_HDR_SATA_DISK_ERR_OFF 16
+#define CMPLT_HDR_SATA_DISK_ERR_MSK (0x1 << CMPLT_HDR_SATA_DISK_ERR_OFF)
#define CMPLT_HDR_IO_IN_TARGET_OFF 17
#define CMPLT_HDR_IO_IN_TARGET_MSK (0x1 << CMPLT_HDR_IO_IN_TARGET_OFF)
+/* bit[23:18] ERR_FIS_ATA_STATUS */
+#define FIS_ATA_STATUS_ERR_OFF 18
+#define FIS_ATA_STATUS_ERR_MSK (0x1 << FIS_ATA_STATUS_ERR_OFF)
+#define FIS_TYPE_SDB_OFF 31
+#define FIS_TYPE_SDB_MSK (0x1 << FIS_TYPE_SDB_OFF)

/* ITCT header */
/* qw0 */
@@ -2148,6 +2160,18 @@ static irqreturn_t fatal_axi_int_v3_hw(int irq_no, void *p)
return IRQ_HANDLED;
}

+static bool is_ncq_err_v3_hw(struct hisi_sas_complete_v3_hdr *complete_hdr)
+{
+ u32 dw0, dw3;
+
+ dw0 = le32_to_cpu(complete_hdr->dw0);
+ dw3 = le32_to_cpu(complete_hdr->dw3);
+
+ return (dw0 & ERR_PHASE_RESPONSE_FRAME_REV_STAGE_MSK) &&
+ (dw3 & FIS_TYPE_SDB_MSK) &&
+ (dw3 & FIS_ATA_STATUS_ERR_MSK);
+}
+
static bool
slot_err_v3_hw(struct hisi_hba *hisi_hba, struct sas_task *task,
struct hisi_sas_slot *slot)
@@ -2381,14 +2405,34 @@ static irqreturn_t cq_thread_v3_hw(int irq_no, void *p)
while (rd_point != wr_point) {
struct hisi_sas_complete_v3_hdr *complete_hdr;
struct device *dev = hisi_hba->dev;
- u32 dw1;
+ u32 dw0, dw1, dw3;
int iptt;

complete_hdr = &complete_queue[rd_point];
+ dw0 = le32_to_cpu(complete_hdr->dw0);
dw1 = le32_to_cpu(complete_hdr->dw1);
+ dw3 = le32_to_cpu(complete_hdr->dw3);

iptt = dw1 & CMPLT_HDR_IPTT_MSK;
- if (likely(iptt < HISI_SAS_COMMAND_ENTRIES_V3_HW)) {
+ if (unlikely((dw0 & CMPLT_HDR_CMPLT_MSK) == 0x3) &&
+ (dw3 & CMPLT_HDR_SATA_DISK_ERR_MSK)) {
+ int device_id = (dw1 & CMPLT_HDR_DEV_ID_MSK) >>
+ CMPLT_HDR_DEV_ID_OFF;
+ struct hisi_sas_itct *itct =
+ &hisi_hba->itct[device_id];
+ struct hisi_sas_device *sas_dev =
+ &hisi_hba->devices[device_id];
+ struct domain_device *device = sas_dev->sas_device;
+
+ dev_err(dev, "erroneous completion disk err dev id=%d sas_addr=0x%llx CQ hdr: 0x%x 0x%x 0x%x 0x%x\n",
+ device_id, itct->sas_addr, dw0, dw1,
+ complete_hdr->act, dw3);
+
+ if (is_ncq_err_v3_hw(complete_hdr))
+ sas_dev->dev_status = HISI_SAS_DEV_NCQ_ERR;
+
+ sas_ata_device_link_abort(device, true);
+ } else if (likely(iptt < HISI_SAS_COMMAND_ENTRIES_V3_HW)) {
slot = &hisi_hba->slot_info[iptt];
slot->cmplt_queue_slot = rd_point;
slot->cmplt_queue = queue;
--
2.35.3

2022-09-27 07:31:04

[permalink] [raw]

Subject: [PATCH v5 5/7] scsi: pm8001: Modify task abort handling for SATA task

When we try to abort a SATA task, the CCB of the task which we are trying
to avoid may still complete. In this case, we should not touch the task
associated with that CCB as we can race with libsas freeing the last later
in sas_eh_handle_sas_errors() -> sas_eh_finish_cmd() for when
TASK_IS_ABORTED is returned from sas_scsi_find_task()

Signed-off-by: John Garry <[email protected]>
Acked-by: Jack Wang <[email protected]>
Tested-by: Damien Le Moal <[email protected]>
---
drivers/scsi/pm8001/pm8001_hwi.c | 15 +++++++++++++--
drivers/scsi/pm8001/pm8001_sas.c | 8 ++++++++
drivers/scsi/pm8001/pm80xx_hwi.c | 14 ++++++++++----
3 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
index 628b08ba6770..c0adc3a9d196 100644
--- a/drivers/scsi/pm8001/pm8001_hwi.c
+++ b/drivers/scsi/pm8001/pm8001_hwi.c
@@ -2295,7 +2295,9 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha, void *piomb)
if (t->dev && (t->dev->lldd_dev))
pm8001_dev = t->dev->lldd_dev;
} else {
- pm8001_dbg(pm8001_ha, FAIL, "task null\n");
+ pm8001_dbg(pm8001_ha, FAIL, "task null, freeing CCB tag %d\n",
+ ccb->ccb_tag);
+ pm8001_ccb_free(pm8001_ha, ccb);
return;
}

@@ -2675,8 +2677,17 @@ static void mpi_sata_event(struct pm8001_hba_info *pm8001_ha, void *piomb)
pm8001_dev = ccb->device;
if (event)
pm8001_dbg(pm8001_ha, FAIL, "sata IO status 0x%x\n", event);
- if (unlikely(!t || !t->lldd_task || !t->dev))
+
+ if (unlikely(!t)) {
+ pm8001_dbg(pm8001_ha, FAIL, "task null, freeing CCB tag %d\n",
+ ccb->ccb_tag);
+ pm8001_ccb_free(pm8001_ha, ccb);
return;
+ }
+
+ if (unlikely(!t->lldd_task || !t->dev))
+ return;
+
ts = &t->task_status;
pm8001_dbg(pm8001_ha, DEVIO,
"port_id:0x%x, device_id:0x%x, tag:0x%x, event:0x%x\n",
diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
index 8e3f2f9ddaac..d5ec29f69be3 100644
--- a/drivers/scsi/pm8001/pm8001_sas.c
+++ b/drivers/scsi/pm8001/pm8001_sas.c
@@ -983,6 +983,7 @@ int pm8001_query_task(struct sas_task *task)
/* mandatory SAM-3, still need free task/ccb info, abort the specified task */
int pm8001_abort_task(struct sas_task *task)
{
+ struct pm8001_ccb_info *ccb = task->lldd_task;
unsigned long flags;
u32 tag;
struct domain_device *dev ;
@@ -1113,6 +1114,13 @@ int pm8001_abort_task(struct sas_task *task)
pm8001_dev, DS_OPERATIONAL);
wait_for_completion(&completion);
} else {
+ /*
+ * Ensure that if we see a completion for the ccb
+ * associated with the task which we are trying to
+ * abort then we should not touch the sas_task as it
+ * may race with libsas freeing it when return here.
+ */
+ ccb->task = NULL;
ret = sas_execute_internal_abort_single(dev, tag, 0, NULL);
}
rc = TMF_RESP_FUNC_COMPLETE;
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index f8b8624458f7..dd0e06983cd3 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -2396,7 +2396,9 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha,
if (t->dev && (t->dev->lldd_dev))
pm8001_dev = t->dev->lldd_dev;
} else {
- pm8001_dbg(pm8001_ha, FAIL, "task null\n");
+ pm8001_dbg(pm8001_ha, FAIL, "task null, freeing CCB tag %d\n",
+ ccb->ccb_tag);
+ pm8001_ccb_free(pm8001_ha, ccb);
return;
}

@@ -2813,12 +2815,16 @@ static void mpi_sata_event(struct pm8001_hba_info *pm8001_ha,
ccb = &pm8001_ha->ccb_info[tag];
t = ccb->task;
pm8001_dev = ccb->device;
-
- if (unlikely(!t || !t->lldd_task || !t->dev)) {
- pm8001_dbg(pm8001_ha, FAIL, "task or dev null\n");
+ if (unlikely(!t)) {
+ pm8001_dbg(pm8001_ha, FAIL, "task null, freeing CCB tag %d\n",
+ ccb->ccb_tag);
+ pm8001_ccb_free(pm8001_ha, ccb);
return;
}

+ if (unlikely(!t->lldd_task || !t->dev))
+ return;
+
ts = &t->task_status;
pm8001_dbg(pm8001_ha, IOERR, "port_id:0x%x, tag:0x%x, event:0x%x\n",
port_id, tag, event);
--
2.35.3

2022-09-27 07:52:48

[permalink] [raw]

Subject: [PATCH v5 7/7] scsi: libsas: Make sas_{alloc, alloc_slow, free}_task() private

We have no users outside libsas any longer, so make sas_alloc_task(),
sas_alloc_slow_task(), and sas_free_task() private.

Signed-off-by: John Garry <[email protected]>
Tested-by: Damien Le Moal <[email protected]>
Reviewed-by: Jason Yan <[email protected]>
---
drivers/scsi/libsas/sas_init.c | 3 ---
drivers/scsi/libsas/sas_internal.h | 4 ++++
include/scsi/libsas.h | 4 ----
3 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/libsas/sas_init.c b/drivers/scsi/libsas/sas_init.c
index e4f77072a58d..f2c05ebeb72f 100644
--- a/drivers/scsi/libsas/sas_init.c
+++ b/drivers/scsi/libsas/sas_init.c
@@ -35,7 +35,6 @@ struct sas_task *sas_alloc_task(gfp_t flags)

return task;
}
-EXPORT_SYMBOL_GPL(sas_alloc_task);

struct sas_task *sas_alloc_slow_task(gfp_t flags)
{
@@ -56,7 +55,6 @@ struct sas_task *sas_alloc_slow_task(gfp_t flags)

return task;
}
-EXPORT_SYMBOL_GPL(sas_alloc_slow_task);

void sas_free_task(struct sas_task *task)
{
@@ -65,7 +63,6 @@ void sas_free_task(struct sas_task *task)
kmem_cache_free(sas_task_cache, task);
}
}
-EXPORT_SYMBOL_GPL(sas_free_task);

/*------------ SAS addr hash -----------*/
void sas_hash_addr(u8 *hashed, const u8 *sas_addr)
diff --git a/drivers/scsi/libsas/sas_internal.h b/drivers/scsi/libsas/sas_internal.h
index 8d0ad3abc7b5..b54bcf3c9a9d 100644
--- a/drivers/scsi/libsas/sas_internal.h
+++ b/drivers/scsi/libsas/sas_internal.h
@@ -52,6 +52,10 @@ void sas_unregister_phys(struct sas_ha_struct *sas_ha);
struct asd_sas_event *sas_alloc_event(struct asd_sas_phy *phy, gfp_t gfp_flags);
void sas_free_event(struct asd_sas_event *event);

+struct sas_task *sas_alloc_task(gfp_t flags);
+struct sas_task *sas_alloc_slow_task(gfp_t flags);
+void sas_free_task(struct sas_task *task);
+
int sas_register_ports(struct sas_ha_struct *sas_ha);
void sas_unregister_ports(struct sas_ha_struct *sas_ha);

diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index 2dbead74a2af..f86b56bf7833 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -639,10 +639,6 @@ struct sas_task_slow {
#define SAS_TASK_STATE_ABORTED 4
#define SAS_TASK_NEED_DEV_RESET 8

-extern struct sas_task *sas_alloc_task(gfp_t flags);
-extern struct sas_task *sas_alloc_slow_task(gfp_t flags);
-extern void sas_free_task(struct sas_task *task);
-
static inline bool sas_is_internal_abort(struct sas_task *task)
{
return task->task_proto == SAS_PROTOCOL_INTERNAL_ABORT;
--
2.35.3

2022-09-27 07:53:14

[permalink] [raw]

Subject: [PATCH v5 1/7] scsi: libsas: Add sas_ata_device_link_abort()

Similar to how AHCI handles NCQ errors in ahci_error_intr() ->
ata_port_abort() -> ata_do_link_abort(), add an NCQ error handler for LLDDs
to call to initiate a link abort.

This will mark all outstanding QCs as failed and kick-off EH.

Note:
A "force reset" argument is added for drivers which require the ATA error
handling to always reset the device.

A driver may require this feature for when SATA device per-SCSI cmnd
resources are only released during reset for ATA EH. As such, we need an
option to force reset to be done, regardless of what any EH autopsy
decides.

Suggested-by: Damien Le Moal <[email protected]>
Signed-off-by: John Garry <[email protected]>
Tested-by: Damien Le Moal <[email protected]>
Reviewed-by: Jason Yan <[email protected]>
---
drivers/scsi/libsas/sas_ata.c | 12 ++++++++++++
include/scsi/sas_ata.h | 6 ++++++
2 files changed, 18 insertions(+)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index d35c9296f738..bdffb6852dcf 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -861,6 +861,18 @@ void sas_ata_wait_eh(struct domain_device *dev)
ata_port_wait_eh(ap);
}

+void sas_ata_device_link_abort(struct domain_device *device, bool force_reset)
+{
+ struct ata_port *ap = device->sata_dev.ap;
+ struct ata_link *link = &ap->link;
+
+ link->eh_info.err_mask |= AC_ERR_DEV;
+ if (force_reset)
+ link->eh_info.action |= ATA_EH_RESET;
+ ata_link_abort(link);
+}
+EXPORT_SYMBOL_GPL(sas_ata_device_link_abort);
+
int sas_execute_ata_cmd(struct domain_device *device, u8 *fis, int force_phy_id)
{
struct sas_tmf_task tmf_task = {};
diff --git a/include/scsi/sas_ata.h b/include/scsi/sas_ata.h
index a1df4f9d57a3..e47f0aec0722 100644
--- a/include/scsi/sas_ata.h
+++ b/include/scsi/sas_ata.h
@@ -32,6 +32,7 @@ void sas_probe_sata(struct asd_sas_port *port);
void sas_suspend_sata(struct asd_sas_port *port);
void sas_resume_sata(struct asd_sas_port *port);
void sas_ata_end_eh(struct ata_port *ap);
+void sas_ata_device_link_abort(struct domain_device *dev, bool force_reset);
int sas_execute_ata_cmd(struct domain_device *device, u8 *fis,
int force_phy_id);
int sas_ata_wait_after_reset(struct domain_device *dev, unsigned long deadline);
@@ -87,6 +88,11 @@ static inline void sas_ata_end_eh(struct ata_port *ap)
{
}

+static inline void sas_ata_device_link_abort(struct domain_device *dev,
+ bool force_reset)
+{
+}
+
static inline int sas_execute_ata_cmd(struct domain_device *device, u8 *fis,
int force_phy_id)
{
--
2.35.3

2022-09-27 07:56:13

[permalink] [raw]

Subject: [PATCH v5 6/7] scsi: pm8001: Use sas_ata_device_link_abort() to handle NCQ errors

In commit c6b9ef5779c3 ("[SCSI] pm80xx: NCQ error handling changes") the
driver had support added to handle NCQ errors but much of what is done
in this handling is duplicated from the libata EH.

In that named commit we handle in 2x main steps:
a. Issue read log ext10 to examine and clear the errors
b. Issue SATA_ABORT all command

Indeed, in libata EH, we do similar to above:
a. ata_do_eh() -> ata_eh_autopsy() -> ata_eh_link_autopsy() ->
ata_eh_analyze_ncq_error() -> ata_eh_read_log_10h()
b. ata_do_eh() -> ata_eh_recover() which will issue a device soft reset
or hard reset

Since there is so much duplication, use sas_ata_device_link_abort() which
will abort all pending IOs and kick of ATA EH which will do the steps,
above.

However we will not follow the advisory to send the SATA_ABORT all command
after the autopsy in read log ext10. Indeed, in libsas EH, we already send
a per-task SATA_ABORT command, and this is prior to the ATA EH kicking in
and issuing the read log ext10 in the recovery process. I judge that this
is ok as the SATA_ABORT command does not actually send any protocol on the
link to abort IO on the other side, so would not change any state on the
disk (for the read log ext10 command).

Signed-off-by: John Garry <[email protected]>
Tested-by: Damien Le Moal <[email protected]>
---
drivers/scsi/pm8001/pm8001_hwi.c | 171 +++----------------------------
drivers/scsi/pm8001/pm8001_sas.h | 4 -
drivers/scsi/pm8001/pm80xx_hwi.c | 163 ++---------------------------
3 files changed, 19 insertions(+), 319 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
index c0adc3a9d196..ec1a9ab61814 100644
--- a/drivers/scsi/pm8001/pm8001_hwi.c
+++ b/drivers/scsi/pm8001/pm8001_hwi.c
@@ -1724,7 +1724,14 @@ void pm8001_work_fn(struct work_struct *work)
pm8001_free_dev(pm8001_dev);
}
}
- } break;
+ }
+ break;
+ case IO_XFER_ERROR_ABORTED_NCQ_MODE:
+ {
+ dev = pm8001_dev->sas_device;
+ sas_ata_device_link_abort(dev, false);
+ }
+ break;
}
kfree(pw);
}
@@ -1748,110 +1755,6 @@ int pm8001_handle_event(struct pm8001_hba_info *pm8001_ha, void *data,
return ret;
}

-static void pm8001_send_abort_all(struct pm8001_hba_info *pm8001_ha,
- struct pm8001_device *pm8001_ha_dev)
-{
- struct pm8001_ccb_info *ccb;
- struct sas_task *task;
- struct task_abort_req task_abort;
- u32 opc = OPC_INB_SATA_ABORT;
- int ret;
-
- pm8001_ha_dev->id |= NCQ_ABORT_ALL_FLAG;
- pm8001_ha_dev->id &= ~NCQ_READ_LOG_FLAG;
-
- task = sas_alloc_slow_task(GFP_ATOMIC);
- if (!task) {
- pm8001_dbg(pm8001_ha, FAIL, "cannot allocate task\n");
- return;
- }
-
- task->task_done = pm8001_task_done;
-
- ccb = pm8001_ccb_alloc(pm8001_ha, pm8001_ha_dev, task);
- if (!ccb) {
- sas_free_task(task);
- return;
- }
-
- memset(&task_abort, 0, sizeof(task_abort));
- task_abort.abort_all = cpu_to_le32(1);
- task_abort.device_id = cpu_to_le32(pm8001_ha_dev->device_id);
- task_abort.tag = cpu_to_le32(ccb->ccb_tag);
-
- ret = pm8001_mpi_build_cmd(pm8001_ha, 0, opc, &task_abort,
- sizeof(task_abort), 0);
- if (ret) {
- sas_free_task(task);
- pm8001_ccb_free(pm8001_ha, ccb);
- }
-}
-
-static void pm8001_send_read_log(struct pm8001_hba_info *pm8001_ha,
- struct pm8001_device *pm8001_ha_dev)
-{
- struct sata_start_req sata_cmd;
- int res;
- struct pm8001_ccb_info *ccb;
- struct sas_task *task = NULL;
- struct host_to_dev_fis fis;
- struct domain_device *dev;
- u32 opc = OPC_INB_SATA_HOST_OPSTART;
-
- task = sas_alloc_slow_task(GFP_ATOMIC);
- if (!task) {
- pm8001_dbg(pm8001_ha, FAIL, "cannot allocate task !!!\n");
- return;
- }
- task->task_done = pm8001_task_done;
-
- /*
- * Allocate domain device by ourselves as libsas is not going to
- * provide any.
- */
- dev = kzalloc(sizeof(struct domain_device), GFP_ATOMIC);
- if (!dev) {
- sas_free_task(task);
- pm8001_dbg(pm8001_ha, FAIL,
- "Domain device cannot be allocated\n");
- return;
- }
- task->dev = dev;
- task->dev->lldd_dev = pm8001_ha_dev;
-
- ccb = pm8001_ccb_alloc(pm8001_ha, pm8001_ha_dev, task);
- if (!ccb) {
- sas_free_task(task);
- kfree(dev);
- return;
- }
-
- pm8001_ha_dev->id |= NCQ_READ_LOG_FLAG;
- pm8001_ha_dev->id |= NCQ_2ND_RLE_FLAG;
-
- /* construct read log FIS */
- memset(&fis, 0, sizeof(struct host_to_dev_fis));
- fis.fis_type = 0x27;
- fis.flags = 0x80;
- fis.command = ATA_CMD_READ_LOG_EXT;
- fis.lbal = 0x10;
- fis.sector_count = 0x1;
-
- memset(&sata_cmd, 0, sizeof(sata_cmd));
- sata_cmd.tag = cpu_to_le32(ccb->ccb_tag);
- sata_cmd.device_id = cpu_to_le32(pm8001_ha_dev->device_id);
- sata_cmd.ncqtag_atap_dir_m = cpu_to_le32((0x1 << 7) | (0x5 << 9));
- memcpy(&sata_cmd.sata_fis, &fis, sizeof(struct host_to_dev_fis));
-
- res = pm8001_mpi_build_cmd(pm8001_ha, 0, opc, &sata_cmd,
- sizeof(sata_cmd), 0);
- if (res) {
- sas_free_task(task);
- pm8001_ccb_free(pm8001_ha, ccb);
- kfree(dev);
- }
-}
-
/**
* mpi_ssp_completion- process the event that FW response to the SSP request.
* @pm8001_ha: our hba card information
@@ -2301,8 +2204,7 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha, void *piomb)
return;
}

- if ((pm8001_dev && !(pm8001_dev->id & NCQ_READ_LOG_FLAG))
- && unlikely(!t || !t->lldd_task || !t->dev)) {
+ if (pm8001_dev && unlikely(!t || !t->lldd_task || !t->dev)) {
pm8001_dbg(pm8001_ha, FAIL, "task or dev null\n");
return;
}
@@ -2360,15 +2262,6 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha, void *piomb)
if (param == 0) {
ts->resp = SAS_TASK_COMPLETE;
ts->stat = SAS_SAM_STAT_GOOD;
- /* check if response is for SEND READ LOG */
- if (pm8001_dev &&
- (pm8001_dev->id & NCQ_READ_LOG_FLAG)) {
- pm8001_send_abort_all(pm8001_ha, pm8001_dev);
- /* Free the tag */
- pm8001_tag_free(pm8001_ha, tag);
- sas_free_task(t);
- return;
- }
} else {
u8 len;
ts->resp = SAS_TASK_COMPLETE;
@@ -2666,9 +2559,10 @@ static void mpi_sata_event(struct pm8001_hba_info *pm8001_ha, void *piomb)
if (event == IO_XFER_ERROR_ABORTED_NCQ_MODE) {
/* find device using device id */
pm8001_dev = pm8001_find_dev(pm8001_ha, dev_id);
- /* send read log extension */
if (pm8001_dev)
- pm8001_send_read_log(pm8001_ha, pm8001_dev);
+ pm8001_handle_event(pm8001_ha,
+ pm8001_dev,
+ IO_XFER_ERROR_ABORTED_NCQ_MODE);
return;
}

@@ -3649,12 +3543,7 @@ int pm8001_mpi_task_abort_resp(struct pm8001_hba_info *pm8001_ha, void *piomb)
pm8001_ccb_task_free(pm8001_ha, ccb);
mb();

- if (pm8001_dev->id & NCQ_ABORT_ALL_FLAG) {
- sas_free_task(t);
- pm8001_dev->id &= ~NCQ_ABORT_ALL_FLAG;
- } else {
- t->task_done(t);
- }
+ t->task_done(t);

return 0;
}
@@ -4206,7 +4095,6 @@ static int pm8001_chip_sata_req(struct pm8001_hba_info *pm8001_ha,
u64 phys_addr;
u32 ATAP = 0x0;
u32 dir;
- unsigned long flags;
u32 opc = OPC_INB_SATA_HOST_OPSTART;

memset(&sata_cmd, 0, sizeof(sata_cmd));
@@ -4261,39 +4149,6 @@ static int pm8001_chip_sata_req(struct pm8001_hba_info *pm8001_ha,
sata_cmd.esgl = 0;
}

- /* Check for read log for failed drive and return */
- if (sata_cmd.sata_fis.command == 0x2f) {
- if (((pm8001_ha_dev->id & NCQ_READ_LOG_FLAG) ||
- (pm8001_ha_dev->id & NCQ_ABORT_ALL_FLAG) ||
- (pm8001_ha_dev->id & NCQ_2ND_RLE_FLAG))) {
- struct task_status_struct *ts;
-
- pm8001_ha_dev->id &= 0xDFFFFFFF;
- ts = &task->task_status;
-
- spin_lock_irqsave(&task->task_state_lock, flags);
- ts->resp = SAS_TASK_COMPLETE;
- ts->stat = SAS_SAM_STAT_GOOD;
- task->task_state_flags &= ~SAS_TASK_STATE_PENDING;
- task->task_state_flags |= SAS_TASK_STATE_DONE;
- if (unlikely((task->task_state_flags &
- SAS_TASK_STATE_ABORTED))) {
- spin_unlock_irqrestore(&task->task_state_lock,
- flags);
- pm8001_dbg(pm8001_ha, FAIL,
- "task 0x%p resp 0x%x stat 0x%x but aborted by upper layer\n",
- task, ts->resp,
- ts->stat);
- pm8001_ccb_task_free(pm8001_ha, ccb);
- } else {
- spin_unlock_irqrestore(&task->task_state_lock,
- flags);
- pm8001_ccb_task_free_done(pm8001_ha, ccb);
- return 0;
- }
- }
- }
-
return pm8001_mpi_build_cmd(pm8001_ha, 0, opc, &sata_cmd,
sizeof(sata_cmd), 0);
}
diff --git a/drivers/scsi/pm8001/pm8001_sas.h b/drivers/scsi/pm8001/pm8001_sas.h
index b08f52673889..8ab0654327f9 100644
--- a/drivers/scsi/pm8001/pm8001_sas.h
+++ b/drivers/scsi/pm8001/pm8001_sas.h
@@ -579,10 +579,6 @@ struct pm8001_fw_image_header {
#define FLASH_UPDATE_DNLD_NOT_SUPPORTED 0x10
#define FLASH_UPDATE_DISABLED 0x11

-#define NCQ_READ_LOG_FLAG 0x80000000
-#define NCQ_ABORT_ALL_FLAG 0x40000000
-#define NCQ_2ND_RLE_FLAG 0x20000000
-
/* Device states */
#define DS_OPERATIONAL 0x01
#define DS_PORT_IN_RESET 0x02
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index dd0e06983cd3..4484c498bcb6 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -1778,113 +1778,6 @@ pm80xx_chip_interrupt_disable(struct pm8001_hba_info *pm8001_ha, u8 vec)
pm80xx_chip_intx_interrupt_disable(pm8001_ha);
}

-static void pm80xx_send_abort_all(struct pm8001_hba_info *pm8001_ha,
- struct pm8001_device *pm8001_ha_dev)
-{
- struct pm8001_ccb_info *ccb;
- struct sas_task *task;
- struct task_abort_req task_abort;
- u32 opc = OPC_INB_SATA_ABORT;
- int ret;
-
- pm8001_ha_dev->id |= NCQ_ABORT_ALL_FLAG;
- pm8001_ha_dev->id &= ~NCQ_READ_LOG_FLAG;
-
- task = sas_alloc_slow_task(GFP_ATOMIC);
- if (!task) {
- pm8001_dbg(pm8001_ha, FAIL, "cannot allocate task\n");
- return;
- }
- task->task_done = pm8001_task_done;
-
- ccb = pm8001_ccb_alloc(pm8001_ha, pm8001_ha_dev, task);
- if (!ccb) {
- sas_free_task(task);
- return;
- }
-
- memset(&task_abort, 0, sizeof(task_abort));
- task_abort.abort_all = cpu_to_le32(1);
- task_abort.device_id = cpu_to_le32(pm8001_ha_dev->device_id);
- task_abort.tag = cpu_to_le32(ccb->ccb_tag);
-
- ret = pm8001_mpi_build_cmd(pm8001_ha, 0, opc, &task_abort,
- sizeof(task_abort), 0);
- pm8001_dbg(pm8001_ha, FAIL, "Executing abort task end\n");
- if (ret) {
- sas_free_task(task);
- pm8001_ccb_free(pm8001_ha, ccb);
- }
-}
-
-static void pm80xx_send_read_log(struct pm8001_hba_info *pm8001_ha,
- struct pm8001_device *pm8001_ha_dev)
-{
- struct sata_start_req sata_cmd;
- int res;
- struct pm8001_ccb_info *ccb;
- struct sas_task *task = NULL;
- struct host_to_dev_fis fis;
- struct domain_device *dev;
- u32 opc = OPC_INB_SATA_HOST_OPSTART;
-
- task = sas_alloc_slow_task(GFP_ATOMIC);
- if (!task) {
- pm8001_dbg(pm8001_ha, FAIL, "cannot allocate task !!!\n");
- return;
- }
- task->task_done = pm8001_task_done;
-
- /*
- * Allocate domain device by ourselves as libsas is not going to
- * provide any.
- */
- dev = kzalloc(sizeof(struct domain_device), GFP_ATOMIC);
- if (!dev) {
- sas_free_task(task);
- pm8001_dbg(pm8001_ha, FAIL,
- "Domain device cannot be allocated\n");
- return;
- }
-
- task->dev = dev;
- task->dev->lldd_dev = pm8001_ha_dev;
-
- ccb = pm8001_ccb_alloc(pm8001_ha, pm8001_ha_dev, task);
- if (!ccb) {
- sas_free_task(task);
- kfree(dev);
- return;
- }
-
- pm8001_ha_dev->id |= NCQ_READ_LOG_FLAG;
- pm8001_ha_dev->id |= NCQ_2ND_RLE_FLAG;
-
- memset(&sata_cmd, 0, sizeof(sata_cmd));
-
- /* construct read log FIS */
- memset(&fis, 0, sizeof(struct host_to_dev_fis));
- fis.fis_type = 0x27;
- fis.flags = 0x80;
- fis.command = ATA_CMD_READ_LOG_EXT;
- fis.lbal = 0x10;
- fis.sector_count = 0x1;
-
- sata_cmd.tag = cpu_to_le32(ccb->ccb_tag);
- sata_cmd.device_id = cpu_to_le32(pm8001_ha_dev->device_id);
- sata_cmd.ncqtag_atap_dir_m_dad = cpu_to_le32(((0x1 << 7) | (0x5 << 9)));
- memcpy(&sata_cmd.sata_fis, &fis, sizeof(struct host_to_dev_fis));
-
- res = pm8001_mpi_build_cmd(pm8001_ha, 0, opc, &sata_cmd,
- sizeof(sata_cmd), 0);
- pm8001_dbg(pm8001_ha, FAIL, "Executing read log end\n");
- if (res) {
- sas_free_task(task);
- pm8001_ccb_free(pm8001_ha, ccb);
- kfree(dev);
- }
-}
-
/**
* mpi_ssp_completion - process the event that FW response to the SSP request.
* @pm8001_ha: our hba card information
@@ -2402,11 +2295,9 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha,
return;
}

- if ((pm8001_dev && !(pm8001_dev->id & NCQ_READ_LOG_FLAG))
- && unlikely(!t || !t->lldd_task || !t->dev)) {
- pm8001_dbg(pm8001_ha, FAIL, "task or dev null\n");
+
+ if (pm8001_dev && unlikely(!t->lldd_task || !t->dev))
return;
- }

ts = &t->task_status;

@@ -2463,15 +2354,6 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha,
if (param == 0) {
ts->resp = SAS_TASK_COMPLETE;
ts->stat = SAS_SAM_STAT_GOOD;
- /* check if response is for SEND READ LOG */
- if (pm8001_dev &&
- (pm8001_dev->id & NCQ_READ_LOG_FLAG)) {
- pm80xx_send_abort_all(pm8001_ha, pm8001_dev);
- /* Free the tag */
- pm8001_tag_free(pm8001_ha, tag);
- sas_free_task(t);
- return;
- }
} else {
u8 len;
ts->resp = SAS_TASK_COMPLETE;
@@ -2806,9 +2688,11 @@ static void mpi_sata_event(struct pm8001_hba_info *pm8001_ha,
if (event == IO_XFER_ERROR_ABORTED_NCQ_MODE) {
/* find device using device id */
pm8001_dev = pm8001_find_dev(pm8001_ha, dev_id);
- /* send read log extension */
+ /* send read log extension by aborting the link - libata does what we want */
if (pm8001_dev)
- pm80xx_send_read_log(pm8001_ha, pm8001_dev);
+ pm8001_handle_event(pm8001_ha,
+ pm8001_dev,
+ IO_XFER_ERROR_ABORTED_NCQ_MODE);
return;
}

@@ -4556,7 +4440,6 @@ static int pm80xx_chip_sata_req(struct pm8001_hba_info *pm8001_ha,
u32 end_addr_high, end_addr_low;
u32 ATAP = 0x0;
u32 dir;
- unsigned long flags;
u32 opc = OPC_INB_SATA_HOST_OPSTART;
memset(&sata_cmd, 0, sizeof(sata_cmd));

@@ -4735,40 +4618,6 @@ static int pm80xx_chip_sata_req(struct pm8001_hba_info *pm8001_ha,
(task->ata_task.atapi_packet[15] << 24)));
}

- /* Check for read log for failed drive and return */
- if (sata_cmd.sata_fis.command == 0x2f) {
- if (pm8001_ha_dev && ((pm8001_ha_dev->id & NCQ_READ_LOG_FLAG) ||
- (pm8001_ha_dev->id & NCQ_ABORT_ALL_FLAG) ||
- (pm8001_ha_dev->id & NCQ_2ND_RLE_FLAG))) {
- struct task_status_struct *ts;
-
- pm8001_ha_dev->id &= 0xDFFFFFFF;
- ts = &task->task_status;
-
- spin_lock_irqsave(&task->task_state_lock, flags);
- ts->resp = SAS_TASK_COMPLETE;
- ts->stat = SAS_SAM_STAT_GOOD;
- task->task_state_flags &= ~SAS_TASK_STATE_PENDING;
- task->task_state_flags |= SAS_TASK_STATE_DONE;
- if (unlikely((task->task_state_flags &
- SAS_TASK_STATE_ABORTED))) {
- spin_unlock_irqrestore(&task->task_state_lock,
- flags);
- pm8001_dbg(pm8001_ha, FAIL,
- "task 0x%p resp 0x%x stat 0x%x but aborted by upper layer\n",
- task, ts->resp,
- ts->stat);
- pm8001_ccb_task_free(pm8001_ha, ccb);
- return 0;
- } else {
- spin_unlock_irqrestore(&task->task_state_lock,
- flags);
- pm8001_ccb_task_free_done(pm8001_ha, ccb);
- atomic_dec(&pm8001_ha_dev->running_req);
- return 0;
- }
- }
- }
trace_pm80xx_request_issue(pm8001_ha->id,
ccb->device ? ccb->device->attached_phy : PM8001_MAX_PHYS,
ccb->ccb_tag, opc,
--
2.35.3

2022-10-04 13:20:13

by Niklas Cassel

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On Tue, Sep 27, 2022 at 03:04:51PM +0800, John Garry wrote:
> As reported in [0], the pm8001 driver NCQ error handling more or less
> duplicates what libata does in link error handling, as follows:
> - abort all commands
> - do autopsy with read log ext 10 command
> - reset the target to recover, if necessary
>
> Indeed for the hisi_sas driver we want to add similar handling for NCQ
> errors.
>
> This series add a new libsas API - sas_ata_device_link_abort() - to handle
> host NCQ errors, and fixes up pm8001 and hisi_sas drivers to use it.
>
> A difference in the pm8001 driver NCQ error handling is that we send
> SATA_ABORT per-task prior to read log ext10, but I feel that this should
> not make a difference to the error handling.
>
> Damien kindly tested previous the series for pm8001, but any further pm8001
> testing would be appreciated as I have since tweaked pm8001 handling again.
> This is because the pm8001 driver hangs on my arm64 machine read log ext10
> command.
>
> Finally with these changes we can make the libsas task alloc/free APIs
> private, which they should always have been.
>
> Based on mkp-scsi @ 6.1/scsi-staging 57569c37f0ad ("scsi: iscsi:
> iscsi_tcp: Fix null-ptr-deref while calling getpeername()")
>
> [0] https://lore.kernel.org/linux-scsi/[email protected]/
>
> Changes since v4:
> - Add Jason's tags (thanks)
> - Rebase
>
> Changes since v3:
> - Add Damien's tags (thanks)
> - Modify hisi_sas processing as follows:
> - use sas_task_abort() for rejected IO
> - Modify abort task processing to issue softreset in certain circumstances
> - rebase
>
> Changes since v2:
> - Stop sending SATA_ABORT all for pm8001 handling
> - Make "reset" optional in sas_ata_device_link_abort()
> - Drop Jack's ACK
>
> John Garry (5):
> scsi: libsas: Add sas_ata_device_link_abort()
> scsi: hisi_sas: Move slot variable definition in hisi_sas_abort_task()
> scsi: pm8001: Modify task abort handling for SATA task
> scsi: pm8001: Use sas_ata_device_link_abort() to handle NCQ errors
> scsi: libsas: Make sas_{alloc, alloc_slow, free}_task() private
>
> Xingui Yang (2):
> scsi: hisi_sas: Add SATA_DISK_ERR bit handling for v3 hw
> scsi: hisi_sas: Modify v3 HW SATA disk error state completion
> processing
>
> drivers/scsi/hisi_sas/hisi_sas.h | 1 +
> drivers/scsi/hisi_sas/hisi_sas_main.c | 26 +++-
> drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 53 ++++++-
> drivers/scsi/libsas/sas_ata.c | 12 ++
> drivers/scsi/libsas/sas_init.c | 3 -
> drivers/scsi/libsas/sas_internal.h | 4 +
> drivers/scsi/pm8001/pm8001_hwi.c | 186 ++++---------------------
> drivers/scsi/pm8001/pm8001_sas.c | 8 ++
> drivers/scsi/pm8001/pm8001_sas.h | 4 -
> drivers/scsi/pm8001/pm80xx_hwi.c | 177 +++--------------------
> include/scsi/libsas.h | 4 -
> include/scsi/sas_ata.h | 6 +
> 12 files changed, 143 insertions(+), 341 deletions(-)
>
> --
> 2.35.3
>

For pm80xx (pm8001 changes untested):
Tested-by: Niklas Cassel <[email protected]>

Notes unrelated to this patch:

Both before and after this series, this driver prints:
[ 215.845053] ata21.00: exception Emask 0x0 SAct 0xfc0000 SErr 0x0 action 0x6
[ 215.852308] ata21.00: failed command: WRITE FPDMA QUEUED
[ 215.857801] ata21.00: cmd 61/00:00:00:3a:d3/01:00:b3:04:00/40 tag 18 ncq dma 131072 out
res 43/04:00:ff:3a:d3/00:00:b3:04:00/40 Emask 0x400 (NCQ error) <F>
[ 215.874396] ata21.00: status: { DRDY SENSE ERR }
[ 215.879192] ata21.00: error: { ABRT }
[ 215.882997] ata21.00: failed command: WRITE FPDMA QUEUED
[ 215.888479] ata21.00: cmd 61/00:00:00:3b:d3/01:00:b3:04:00/40 tag 19 ncq dma 131072 out
res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
[ 215.904814] ata21.00: failed command: WRITE FPDMA QUEUED
[ 215.910311] ata21.00: cmd 61/00:00:00:3c:d3/01:00:b3:04:00/40 tag 20 ncq dma 131072 out
res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
[ 215.932679] ata21.00: failed command: WRITE FPDMA QUEUED
[ 215.941203] ata21.00: cmd 61/00:00:00:3d:d3/01:00:b3:04:00/40 tag 21 ncq dma 131072 out
res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
[ 215.963616] ata21.00: failed command: WRITE FPDMA QUEUED
[ 215.972150] ata21.00: cmd 61/00:00:00:3e:d3/01:00:b3:04:00/40 tag 22 ncq dma 131072 out
res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
[ 215.994532] ata21.00: failed command: WRITE FPDMA QUEUED
[ 216.003124] ata21.00: cmd 61/00:00:00:3f:d3/01:00:b3:04:00/40 tag 23 ncq dma 131072 out
res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)

HSM (Host State Machine) violation errors.

For the same SATA drive connected via AHCI this will instead give:

[ 3796.944923] ata14.00: exception Emask 0x0 SAct 0x80800003 SErr 0xc0000 action 0x0
[ 3796.959375] ata14.00: irq_stat 0x40000008
[ 3796.970140] ata14: SError: { CommWake 10B8B }
[ 3796.981231] ata14.00: failed command: WRITE FPDMA QUEUED
[ 3796.993237] ata14.00: cmd 61/00:08:00:7e:73/02:00:8e:08:00/40 tag 1 ncq dma 262144 out
res 43/04:01:00:00:00/00:00:00:00:00/40 Emask 0x1 (device error)
[ 3797.017984] ata14.00: status: { DRDY SENSE ERR }
[ 3797.026833] ata14.00: error: { ABRT }
[ 3797.034664] ata14.00: failed command: WRITE FPDMA QUEUED
[ 3797.043015] ata14.00: cmd 61/00:b8:00:60:73/0a:00:8e:08:00/40 tag 23 ncq dma 1310720 out
res 43/04:00:df:67:73/00:00:8e:08:00/40 Emask 0x400 (NCQ error) <F>
[ 3797.065224] ata14.00: status: { DRDY SENSE ERR }
[ 3797.072914] ata14.00: error: { ABRT }
[ 3797.079598] ata14.00: failed command: WRITE FPDMA QUEUED
[ 3797.087920] ata14.00: cmd 61/00:f8:00:6a:73/0a:00:8e:08:00/40 tag 31 ncq dma 1310720 out
res 43/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
[ 3797.109800] ata14.00: status: { DRDY SENSE ERR }
[ 3797.117451] ata14.00: error: { ABRT }

device error errors.

Except for the I/O that caused the NCQ error, the remaining outstanding I/Os,
regardless if they were aborted by the drive, as a side-effect of reading the
NCQ error log (see 13.7.4 Queued Error Log (10h) in SATA 3.5a spec),
or if they were aborted by the host (by sas_ata_device_link_abort()),
I don't think it is correct to report these as HSM violation errors.

HSM violation errors are e.g. when you try to issue a command to a drive
that has ATA_BUSY bit set.

Kind regards,
Niklas

2022-10-04 14:35:04

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On 04/10/2022 14:05, Niklas Cassel wrote:
>> 2.35.3
>>
> For pm80xx (pm8001 changes untested):
> Tested-by: Niklas Cassel<[email protected]>
>
>

Thanks!

>
> Notes unrelated to this patch:
>
> Both before and after this series, this driver prints:
> [ 215.845053] ata21.00: exception Emask 0x0 SAct 0xfc0000 SErr 0x0 action 0x6
> [ 215.852308] ata21.00: failed command: WRITE FPDMA QUEUED
> [ 215.857801] ata21.00: cmd 61/00:00:00:3a:d3/01:00:b3:04:00/40 tag 18 ncq dma 131072 out
> res 43/04:00:ff:3a:d3/00:00:b3:04:00/40 Emask 0x400 (NCQ error) <F>
> [ 215.874396] ata21.00: status: { DRDY SENSE ERR }
> [ 215.879192] ata21.00: error: { ABRT }
> [ 215.882997] ata21.00: failed command: WRITE FPDMA QUEUED
> [ 215.888479] ata21.00: cmd 61/00:00:00:3b:d3/01:00:b3:04:00/40 tag 19 ncq dma 131072 out
> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
> [ 215.904814] ata21.00: failed command: WRITE FPDMA QUEUED
> [ 215.910311] ata21.00: cmd 61/00:00:00:3c:d3/01:00:b3:04:00/40 tag 20 ncq dma 131072 out
> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
> [ 215.932679] ata21.00: failed command: WRITE FPDMA QUEUED
> [ 215.941203] ata21.00: cmd 61/00:00:00:3d:d3/01:00:b3:04:00/40 tag 21 ncq dma 131072 out
> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
> [ 215.963616] ata21.00: failed command: WRITE FPDMA QUEUED
> [ 215.972150] ata21.00: cmd 61/00:00:00:3e:d3/01:00:b3:04:00/40 tag 22 ncq dma 131072 out
> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
> [ 215.994532] ata21.00: failed command: WRITE FPDMA QUEUED
> [ 216.003124] ata21.00: cmd 61/00:00:00:3f:d3/01:00:b3:04:00/40 tag 23 ncq dma 131072 out
> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
>
> HSM (Host State Machine) violation errors.
>
> For the same SATA drive connected via AHCI this will instead give:
>
> [ 3796.944923] ata14.00: exception Emask 0x0 SAct 0x80800003 SErr 0xc0000 action 0x0
> [ 3796.959375] ata14.00: irq_stat 0x40000008
> [ 3796.970140] ata14: SError: { CommWake 10B8B }
> [ 3796.981231] ata14.00: failed command: WRITE FPDMA QUEUED
> [ 3796.993237] ata14.00: cmd 61/00:08:00:7e:73/02:00:8e:08:00/40 tag 1 ncq dma 262144 out
> res 43/04:01:00:00:00/00:00:00:00:00/40 Emask 0x1 (device error)
> [ 3797.017984] ata14.00: status: { DRDY SENSE ERR }
> [ 3797.026833] ata14.00: error: { ABRT }
> [ 3797.034664] ata14.00: failed command: WRITE FPDMA QUEUED
> [ 3797.043015] ata14.00: cmd 61/00:b8:00:60:73/0a:00:8e:08:00/40 tag 23 ncq dma 1310720 out
> res 43/04:00:df:67:73/00:00:8e:08:00/40 Emask 0x400 (NCQ error) <F>
> [ 3797.065224] ata14.00: status: { DRDY SENSE ERR }
> [ 3797.072914] ata14.00: error: { ABRT }
> [ 3797.079598] ata14.00: failed command: WRITE FPDMA QUEUED
> [ 3797.087920] ata14.00: cmd 61/00:f8:00:6a:73/0a:00:8e:08:00/40 tag 31 ncq dma 1310720 out
> res 43/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
> [ 3797.109800] ata14.00: status: { DRDY SENSE ERR }
> [ 3797.117451] ata14.00: error: { ABRT }
>
> device error errors.
>
>
> Except for the I/O that caused the NCQ error, the remaining outstanding I/Os,
> regardless if they were aborted by the drive, as a side-effect of reading the
> NCQ error log (see 13.7.4 Queued Error Log (10h) in SATA 3.5a spec),
> or if they were aborted by the host (by sas_ata_device_link_abort()),
> I don't think it is correct to report these as HSM violation errors.
>
> HSM violation errors are e.g. when you try to issue a command to a drive
> that has ATA_BUSY bit set.

We had a similar issue for hisi_sas and solved in patch 4/7: don't set
ATA_ERR in the fis for those IO which complete with error, but abort the
IO via sas_abort_task().

For pm80xx the IO is either rejected (actually completes with rejection)
or is aborted via internal abort command. Maybe we can do similar for
pm8001 as we allow the IO to complete in both cases with error. I'll check.

Thanks,
John

2022-10-05 10:07:22

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On 04/10/2022 15:04, John Garry wrote:
>> Notes unrelated to this patch:
>>
>> Both before and after this series, this driver prints:
>> [ 215.845053] ata21.00: exception Emask 0x0 SAct 0xfc0000 SErr 0x0 action 0x6
>> [ 215.852308] ata21.00: failed command: WRITE FPDMA QUEUED
>> [ 215.857801] ata21.00: cmd 61/00:00:00:3a:d3/01:00:b3:04:00/40 tag 18 ncq dma 131072 out
>> res 43/04:00:ff:3a:d3/00:00:b3:04:00/40 Emask 0x400 (NCQ error) <F>
>> [ 215.874396] ata21.00: status: { DRDY SENSE ERR }
>> [ 215.879192] ata21.00: error: { ABRT }
>> [ 215.882997] ata21.00: failed command: WRITE FPDMA QUEUED
>> [ 215.888479] ata21.00: cmd 61/00:00:00:3b:d3/01:00:b3:04:00/40 tag 19 ncq dma 131072 out
>> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
>> [ 215.904814] ata21.00: failed command: WRITE FPDMA QUEUED
>> [ 215.910311] ata21.00: cmd 61/00:00:00:3c:d3/01:00:b3:04:00/40 tag 20 ncq dma 131072 out
>> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
>> [ 215.932679] ata21.00: failed command: WRITE FPDMA QUEUED
>> [ 215.941203] ata21.00: cmd 61/00:00:00:3d:d3/01:00:b3:04:00/40 tag 21 ncq dma 131072 out
>> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
>> [ 215.963616] ata21.00: failed command: WRITE FPDMA QUEUED
>> [ 215.972150] ata21.00: cmd 61/00:00:00:3e:d3/01:00:b3:04:00/40 tag 22 ncq dma 131072 out
>> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
>> [ 215.994532] ata21.00: failed command: WRITE FPDMA QUEUED
>> [ 216.003124] ata21.00: cmd 61/00:00:00:3f:d3/01:00:b3:04:00/40 tag 23 ncq dma 131072 out
>> res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation)
>>
>> HSM (Host State Machine) violation errors.
>>
>> For the same SATA drive connected via AHCI this will instead give:
>>
>> [ 3796.944923] ata14.00: exception Emask 0x0 SAct 0x80800003 SErr 0xc0000 action 0x0
>> [ 3796.959375] ata14.00: irq_stat 0x40000008
>> [ 3796.970140] ata14: SError: { CommWake 10B8B }
>> [ 3796.981231] ata14.00: failed command: WRITE FPDMA QUEUED
>> [ 3796.993237] ata14.00: cmd 61/00:08:00:7e:73/02:00:8e:08:00/40 tag 1 ncq dma 262144 out
>> res 43/04:01:00:00:00/00:00:00:00:00/40 Emask 0x1 (device error)
>> [ 3797.017984] ata14.00: status: { DRDY SENSE ERR }
>> [ 3797.026833] ata14.00: error: { ABRT }
>> [ 3797.034664] ata14.00: failed command: WRITE FPDMA QUEUED
>> [ 3797.043015] ata14.00: cmd 61/00:b8:00:60:73/0a:00:8e:08:00/40 tag 23 ncq dma 1310720 out
>> res 43/04:00:df:67:73/00:00:8e:08:00/40 Emask 0x400 (NCQ error) <F>
>> [ 3797.065224] ata14.00: status: { DRDY SENSE ERR }
>> [ 3797.072914] ata14.00: error: { ABRT }
>> [ 3797.079598] ata14.00: failed command: WRITE FPDMA QUEUED
>> [ 3797.087920] ata14.00: cmd 61/00:f8:00:6a:73/0a:00:8e:08:00/40 tag 31 ncq dma 1310720 out
>> res 43/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
>> [ 3797.109800] ata14.00: status: { DRDY SENSE ERR }
>> [ 3797.117451] ata14.00: error: { ABRT }
>>
>> device error errors.
>>
>>
>> Except for the I/O that caused the NCQ error, the remaining outstanding I/Os,
>> regardless if they were aborted by the drive, as a side-effect of reading the
>> NCQ error log (see 13.7.4 Queued Error Log (10h) in SATA 3.5a spec),
>> or if they were aborted by the host (by sas_ata_device_link_abort()),
>> I don't think it is correct to report these as HSM violation errors.
>>
>> HSM violation errors are e.g. when you try to issue a command to a drive
>> that has ATA_BUSY bit set.
> We had a similar issue for hisi_sas and solved in patch 4/7: don't set
> ATA_ERR in the fis for those IO which complete with error, but abort the
> IO via sas_abort_task().
>
> For pm80xx the IO is either rejected (actually completes with rejection)
> or is aborted via internal abort command. Maybe we can do similar for
> pm8001 as we allow the IO to complete in both cases with error. I'll check.

Hi Niklas,

Could you try a change like this on top:

void sas_ata_device_link_abort(struct domain_device *device, bool
force_reset)
{
struct ata_port *ap = device->sata_dev.ap;
struct ata_link *link = &ap->link;

+ device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY;
+ device->sata_dev.fis[3] = 0x04;

link->eh_info.err_mask |= AC_ERR_DEV;
if (force_reset)
link->eh_info.action |= ATA_EH_RESET;
ata_link_abort(link);
}
EXPORT_SYMBOL_GPL(sas_ata_device_link_abort);

I tried it myself and it looked to work ok, except I have a problem with
my arm64 system in that the read log ext times-out and all TF show
"device error", like:

[ 350.257870] ata1.00: qc timeout (cmd 0x47)
[ 350.262054] pm80xx0:: mpi_sata_completion 2293: task null, freeing
CCB tag 2
[ 350.269128] ata1.00: Read log 0x10 page 0x00 failed, Emask 0x40

[ 350.281581] ata1: failed to read log page 10h (errno=-5)
[ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0
action 0x6 frozen
[ 350.584836] ata1.00: failed command: READ FPDMA QUEUED
[ 350.589970] ata1.00: cmd 60/00:00:80:26:00/01:00:00:00:00/40 tag 0
ncq dma 131072 in
res 41/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device
error)
[ 350.605533] ata1.00: status: { DRDY ERR }
[ 350.609541] ata1.00: error: { ABRT }
[ 350.613115] ata1.00: failed command: READ FPDMA QUEUED
[ 350.618248] ata1.00: cmd 60/00:00:80:26:00/01:00:00:00:00/40 tag 1
ncq dma 131072 in
res 41/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device
error)
[ 350.633809] ata1.00: status: { DRDY ERR
[ 350.637813] ata1.00: error: { ABRT }
[ 350.641384] ata1.00: failed command: READ FPDMA QUEUED
[ 350.646515] ata1.00: cmd 60/00:00:80:26:00/01:00:00:00:00/40 tag 2
ncq dma 131072 in
res 41/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device
error)
[ 350.662076] ata1.00: status: { DRDY ERR
[ 350.666080] ata1.00: error: { ABRT }
[ 350.669652] ata1.00: failed command: READ FPDMA QUEUED
[ 350.674784] ata1.00: cmd 60/00:00:d8:26:00/01:00:00:00:00/40 tag 3
ncq dma 131072 in
res 41/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device
error)
[ 350.690344] ata1.00: status: { DRDY ERR
[ 350.694348] ata1.00: error: { ABRT }
[ 350.697919] ata1.00: failed command: READ FPDMA QUEUED
[ 350.703051] ata1.00: cmd 60/00:00:e0:26:00/01:00:00:00:00/40 tag 4
ncq dma 131072 in
res 41/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device
error)
[ 350.718612] ata1.00: status: { DRDY ERR
[ 350.722623] ata1.00: error: { ABRT }
[ 350.726196] ata1.00: failed command: READ FPDMA QUEUED
[ 350.731329] ata1.00: cmd 60/00:00:c8:26:00/01:00:00:00:00/40 tag 5
ncq dma 131072 in
res 41/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device
error)

...

Thanks,
John

2022-10-05 21:54:43

by Niklas Cassel

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On Wed, Oct 05, 2022 at 09:53:52AM +0100, John Garry wrote:
> On 04/10/2022 15:04, John Garry wrote:
>
> Hi Niklas,
>
> Could you try a change like this on top:
>
> void sas_ata_device_link_abort(struct domain_device *device, bool
> force_reset)
> {
> struct ata_port *ap = device->sata_dev.ap;
> struct ata_link *link = &ap->link;
>
> + device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY;
> + device->sata_dev.fis[3] = 0x04;
>
> link->eh_info.err_mask |= AC_ERR_DEV;
> if (force_reset)
> link->eh_info.action |= ATA_EH_RESET;
> ata_link_abort(link);
> }
> EXPORT_SYMBOL_GPL(sas_ata_device_link_abort);
>
> I tried it myself and it looked to work ok, except I have a problem with my
> arm64 system in that the read log ext times-out and all TF show "device
> error", like:

Do you know why it fails to read the log?
Can you read the NCQ Command Error log using ATA16 passthrough commands?

sudo sg_sat_read_gplog -d --log=0x10 /dev/sdc

The first byte is the last NCQ tag (in hex) that failed.

I tried your patch, and it looks good:

[ 6656.228131] ata5.00: exception Emask 0x0 SAct 0x460000 SErr 0x0 action 0x0
[ 6656.252759] ata5.00: failed command: WRITE FPDMA QUEUED
[ 6656.271554] ata5.00: cmd 61/00:00:00:d8:8a/04:00:ce:03:00/40 tag 17 ncq dma 524288 out
res 41/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
[ 6656.309308] ata5.00: status: { DRDY ERR }
[ 6656.316403] ata5.00: error: { ABRT }
[ 6656.322300] ata5.00: failed command: WRITE FPDMA QUEUED
[ 6656.330871] ata5.00: cmd 61/00:00:00:dc:8a/04:00:ce:03:00/40 tag 18 ncq dma 524288 out
res 41/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
[ 6656.356295] ata5.00: status: { DRDY ERR }
[ 6656.362931] ata5.00: error: { ABRT }
[ 6656.368897] ata5.00: failed command: WRITE FPDMA QUEUED
[ 6656.377471] ata5.00: cmd 61/00:00:00:d4:8a/04:00:ce:03:00/40 tag 22 ncq dma 524288 out
res 43/04:00:ff:d7:8a/00:00:ce:03:00/40 Emask 0x400 (NCQ error) <F>
[ 6656.403149] ata5.00: status: { DRDY SENSE ERR }
[ 6656.410624] ata5.00: error: { ABRT }

However, since this is a change from the existing behavior of this driver,
this could go as a separate patch, and does not need to delay this series.

I also think that we should do a similar patch for sas_ata_task_done():

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index d35c9296f738..648d0693ceee 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -140,7 +140,7 @@ static void sas_ata_task_done(struct sas_task *task)
}

dev->sata_dev.fis[3] = 0x04; /* status err */
- dev->sata_dev.fis[2] = ATA_ERR;
+ dev->sata_dev.fis[2] = ATA_ERR | ATA_DRDY;
}
}

To avoid all SAS errors from being reported as HSM errors.

Kind regards,
Niklas

2022-10-05 22:14:14

by Damien Le Moal

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On 10/6/22 06:28, Niklas Cassel wrote:
> On Wed, Oct 05, 2022 at 09:53:52AM +0100, John Garry wrote:
>> On 04/10/2022 15:04, John Garry wrote:
>>
>> Hi Niklas,
>>
>> Could you try a change like this on top:
>>
>> void sas_ata_device_link_abort(struct domain_device *device, bool
>> force_reset)
>> {
>> struct ata_port *ap = device->sata_dev.ap;
>> struct ata_link *link = &ap->link;
>>
>> + device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY;
>> + device->sata_dev.fis[3] = 0x04;
>>
>> link->eh_info.err_mask |= AC_ERR_DEV;
>> if (force_reset)
>> link->eh_info.action |= ATA_EH_RESET;
>> ata_link_abort(link);
>> }
>> EXPORT_SYMBOL_GPL(sas_ata_device_link_abort);
>>
>> I tried it myself and it looked to work ok, except I have a problem with my
>> arm64 system in that the read log ext times-out and all TF show "device
>> error", like:
>
> Do you know why it fails to read the log?
> Can you read the NCQ Command Error log using ATA16 passthrough commands?
>
> sudo sg_sat_read_gplog -d --log=0x10 /dev/sdc
>
> The first byte is the last NCQ tag (in hex) that failed.

libata issues read log as a non-ncq command under EH. So the NCQ error log
will not help.

>
>
> I tried your patch, and it looks good:
>
> [ 6656.228131] ata5.00: exception Emask 0x0 SAct 0x460000 SErr 0x0 action 0x0
> [ 6656.252759] ata5.00: failed command: WRITE FPDMA QUEUED
> [ 6656.271554] ata5.00: cmd 61/00:00:00:d8:8a/04:00:ce:03:00/40 tag 17 ncq dma 524288 out
> res 41/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
> [ 6656.309308] ata5.00: status: { DRDY ERR }
> [ 6656.316403] ata5.00: error: { ABRT }
> [ 6656.322300] ata5.00: failed command: WRITE FPDMA QUEUED
> [ 6656.330871] ata5.00: cmd 61/00:00:00:dc:8a/04:00:ce:03:00/40 tag 18 ncq dma 524288 out
> res 41/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
> [ 6656.356295] ata5.00: status: { DRDY ERR }
> [ 6656.362931] ata5.00: error: { ABRT }
> [ 6656.368897] ata5.00: failed command: WRITE FPDMA QUEUED
> [ 6656.377471] ata5.00: cmd 61/00:00:00:d4:8a/04:00:ce:03:00/40 tag 22 ncq dma 524288 out
> res 43/04:00:ff:d7:8a/00:00:ce:03:00/40 Emask 0x400 (NCQ error) <F>
> [ 6656.403149] ata5.00: status: { DRDY SENSE ERR }
> [ 6656.410624] ata5.00: error: { ABRT }
>
> However, since this is a change from the existing behavior of this driver,
> this could go as a separate patch, and does not need to delay this series.
>
>
>
> I also think that we should do a similar patch for sas_ata_task_done():
>
> diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
> index d35c9296f738..648d0693ceee 100644
> --- a/drivers/scsi/libsas/sas_ata.c
> +++ b/drivers/scsi/libsas/sas_ata.c
> @@ -140,7 +140,7 @@ static void sas_ata_task_done(struct sas_task *task)
> }
>
> dev->sata_dev.fis[3] = 0x04; /* status err */
> - dev->sata_dev.fis[2] = ATA_ERR;
> + dev->sata_dev.fis[2] = ATA_ERR | ATA_DRDY;
> }
> }
>
> To avoid all SAS errors from being reported as HSM errors.
>
>
> Kind regards,
> Niklas

--
Damien Le Moal
Western Digital Research

2022-10-05 22:15:20

by Niklas Cassel

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On Thu, Oct 06, 2022 at 06:36:05AM +0900, Damien Le Moal wrote:
> On 10/6/22 06:28, Niklas Cassel wrote:
> > On Wed, Oct 05, 2022 at 09:53:52AM +0100, John Garry wrote:
> >> On 04/10/2022 15:04, John Garry wrote:
> >>
> >> Hi Niklas,
> >>
> >> Could you try a change like this on top:
> >>
> >> void sas_ata_device_link_abort(struct domain_device *device, bool
> >> force_reset)
> >> {
> >> struct ata_port *ap = device->sata_dev.ap;
> >> struct ata_link *link = &ap->link;
> >>
> >> + device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY;
> >> + device->sata_dev.fis[3] = 0x04;
> >>
> >> link->eh_info.err_mask |= AC_ERR_DEV;
> >> if (force_reset)
> >> link->eh_info.action |= ATA_EH_RESET;
> >> ata_link_abort(link);
> >> }
> >> EXPORT_SYMBOL_GPL(sas_ata_device_link_abort);
> >>
> >> I tried it myself and it looked to work ok, except I have a problem with my
> >> arm64 system in that the read log ext times-out and all TF show "device
> >> error", like:
> >
> > Do you know why it fails to read the log?
> > Can you read the NCQ Command Error log using ATA16 passthrough commands?
> >
> > sudo sg_sat_read_gplog -d --log=0x10 /dev/sdc
> >
> > The first byte is the last NCQ tag (in hex) that failed.
>
> libata issues read log as a non-ncq command under EH. So the NCQ error log
> will not help.

Hello Damien,

John explained that he got a timeout from EH when reading the log:
[ 350.281581] ata1: failed to read log page 10h (errno=-5)
[ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen

ata_eh_read_log_10h() uses ata_read_log_page(), which will first try to read
the log using READ LOG DMA EXT. If that fails, it will retry using READ LOG EXT.

Therefore, to see if this is a driver specific bug, I suggested to try to read
the NCQ Command Error log using ATA16 passthrough commands:

$ sudo sg_sat_read_gplog -d --log=0x10 /dev/sdc
will read the log using READ LOG DMA EXT.

$ sudo sg_sat_read_gplog --log=0x10 /dev/sdc
will read the log using READ LOG EXT.

Neither of these two suggested commands are NCQ commands.
(Neither command is encapsulated in a RECEIVE FPDMA QUEUED,
so I'm not sure what you mean.)

Garry, I now see that:
[ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen
Your port is frozen.

ata_read_log_page() calls ata_exec_internal() which calls ata_exec_internal_sg(),
which will simply return an error without sending down the command to the drive,
if the port is frozen.

Not sure why your port is frozen, mine is obviously not.

ata_do_link_abort() calls ata_eh_set_pending() without activating fast drain:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-eh.c?h=v6.0#n989

So I'm not sure why your port is frozen.
(The fast drain timer does freeze the port, but it shouldn't be enabled.)
It might be worthwhile to see who freezes the port in your case.

Kind regards,
Niklas

2022-10-05 22:49:13

by Damien Le Moal

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On 10/6/22 07:11, Niklas Cassel wrote:
> On Thu, Oct 06, 2022 at 06:36:05AM +0900, Damien Le Moal wrote:
>> On 10/6/22 06:28, Niklas Cassel wrote:
>>> On Wed, Oct 05, 2022 at 09:53:52AM +0100, John Garry wrote:
>>>> On 04/10/2022 15:04, John Garry wrote:
>>>>
>>>> Hi Niklas,
>>>>
>>>> Could you try a change like this on top:
>>>>
>>>> void sas_ata_device_link_abort(struct domain_device *device, bool
>>>> force_reset)
>>>> {
>>>> struct ata_port *ap = device->sata_dev.ap;
>>>> struct ata_link *link = &ap->link;
>>>>
>>>> + device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY;
>>>> + device->sata_dev.fis[3] = 0x04;
>>>>
>>>> link->eh_info.err_mask |= AC_ERR_DEV;
>>>> if (force_reset)
>>>> link->eh_info.action |= ATA_EH_RESET;
>>>> ata_link_abort(link);
>>>> }
>>>> EXPORT_SYMBOL_GPL(sas_ata_device_link_abort);
>>>>
>>>> I tried it myself and it looked to work ok, except I have a problem with my
>>>> arm64 system in that the read log ext times-out and all TF show "device
>>>> error", like:
>>>
>>> Do you know why it fails to read the log?
>>> Can you read the NCQ Command Error log using ATA16 passthrough commands?
>>>
>>> sudo sg_sat_read_gplog -d --log=0x10 /dev/sdc
>>>
>>> The first byte is the last NCQ tag (in hex) that failed.
>>
>> libata issues read log as a non-ncq command under EH. So the NCQ error log
>> will not help.
>
> Hello Damien,
>
> John explained that he got a timeout from EH when reading the log:
> [ 350.281581] ata1: failed to read log page 10h (errno=-5)
> [ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen
>
> ata_eh_read_log_10h() uses ata_read_log_page(), which will first try to read
> the log using READ LOG DMA EXT. If that fails, it will retry using READ LOG EXT.
>
> Therefore, to see if this is a driver specific bug, I suggested to try to read
> the NCQ Command Error log using ATA16 passthrough commands:
>
> $ sudo sg_sat_read_gplog -d --log=0x10 /dev/sdc
> will read the log using READ LOG DMA EXT.
>
> $ sudo sg_sat_read_gplog --log=0x10 /dev/sdc
> will read the log using READ LOG EXT.
>
> Neither of these two suggested commands are NCQ commands.
> (Neither command is encapsulated in a RECEIVE FPDMA QUEUED,
> so I'm not sure what you mean.)
>
>
> Garry, I now see that:
> [ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen
> Your port is frozen.
>
> ata_read_log_page() calls ata_exec_internal() which calls ata_exec_internal_sg(),
> which will simply return an error without sending down the command to the drive,
> if the port is frozen.
>
> Not sure why your port is frozen, mine is obviously not.
>
> ata_do_link_abort() calls ata_eh_set_pending() without activating fast drain:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-eh.c?h=v6.0#n989
>
> So I'm not sure why your port is frozen.
> (The fast drain timer does freeze the port, but it shouldn't be enabled.)
> It might be worthwhile to see who freezes the port in your case.

Might come from the command timeout. John has had many problems with the
pm80xx HBA in his Arm machine from a while back. Likely not a driver issue
but a hw one... No-one seems to be able to recreate the same problem.

We need to try the HBA on our Arm board to see what happens.

>
>
> Kind regards,
> Niklas

--
Damien Le Moal
Western Digital Research

2022-10-06 08:54:21

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On 05/10/2022 22:28, Niklas Cassel wrote:
> Do you know why it fails to read the log?
> Can you read the NCQ Command Error log using ATA16 passthrough commands?
>
> sudo sg_sat_read_gplog -d --log=0x10 /dev/sdc
>
> The first byte is the last NCQ tag (in hex) that failed.

Please see other reply.

>
>
> I tried your patch, and it looks good:

Thanks, good to know.

>
> [ 6656.228131] ata5.00: exception Emask 0x0 SAct 0x460000 SErr 0x0 action 0x0
> [ 6656.252759] ata5.00: failed command: WRITE FPDMA QUEUED
> [ 6656.271554] ata5.00: cmd 61/00:00:00:d8:8a/04:00:ce:03:00/40 tag 17 ncq dma 524288 out
> res 41/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
> [ 6656.309308] ata5.00: status: { DRDY ERR }
> [ 6656.316403] ata5.00: error: { ABRT }
> [ 6656.322300] ata5.00: failed command: WRITE FPDMA QUEUED
> [ 6656.330871] ata5.00: cmd 61/00:00:00:dc:8a/04:00:ce:03:00/40 tag 18 ncq dma 524288 out
> res 41/04:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
> [ 6656.356295] ata5.00: status: { DRDY ERR }
> [ 6656.362931] ata5.00: error: { ABRT }
> [ 6656.368897] ata5.00: failed command: WRITE FPDMA QUEUED
> [ 6656.377471] ata5.00: cmd 61/00:00:00:d4:8a/04:00:ce:03:00/40 tag 22 ncq dma 524288 out
> res 43/04:00:ff:d7:8a/00:00:ce:03:00/40 Emask 0x400 (NCQ error) <F>
> [ 6656.403149] ata5.00: status: { DRDY SENSE ERR }
> [ 6656.410624] ata5.00: error: { ABRT }
>
> However, since this is a change from the existing behavior of this driver,
> this could go as a separate patch, and does not need to delay this series.
>
>

ok, but I am not sure about this series for 6.1 since it's now, so I
will just wait.

>
> I also think that we should do a similar patch for sas_ata_task_done():
>
> diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
> index d35c9296f738..648d0693ceee 100644
> --- a/drivers/scsi/libsas/sas_ata.c
> +++ b/drivers/scsi/libsas/sas_ata.c
> @@ -140,7 +140,7 @@ static void sas_ata_task_done(struct sas_task *task)
> }
>
> dev->sata_dev.fis[3] = 0x04; /* status err */
> - dev->sata_dev.fis[2] = ATA_ERR;
> + dev->sata_dev.fis[2] = ATA_ERR | ATA_DRDY;
> }
> }
>
> To avoid all SAS errors from being reported as HSM errors.

Yeah, I tend to agree. I can put that change in another patch.

Thanks,
John

2022-10-06 08:57:29

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On 05/10/2022 23:42, Damien Le Moal wrote:
>> Hello Damien,
>>
>> John explained that he got a timeout from EH when reading the log:
>> [ 350.281581] ata1: failed to read log page 10h (errno=-5)
>> [ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen
>>
>> ata_eh_read_log_10h() uses ata_read_log_page(), which will first try to read
>> the log using READ LOG DMA EXT. If that fails, it will retry using READ LOG EXT.
>>
>> Therefore, to see if this is a driver specific bug, I suggested to try to read
>> the NCQ Command Error log using ATA16 passthrough commands:
>>
>> $ sudo sg_sat_read_gplog -d --log=0x10 /dev/sdc
>> will read the log using READ LOG DMA EXT.
>>
>> $ sudo sg_sat_read_gplog --log=0x10 /dev/sdc
>> will read the log using READ LOG EXT.

Note that I can't get a distro to boot on this system from the HDD for
the same timeout problem (so no tools easily available).

>>
>> Neither of these two suggested commands are NCQ commands.
>> (Neither command is encapsulated in a RECEIVE FPDMA QUEUED,
>> so I'm not sure what you mean.)
>>
>>
>> Garry, I now see that:
>> [ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen
>> Your port is frozen.
>>
>> ata_read_log_page() calls ata_exec_internal() which calls ata_exec_internal_sg(),
>> which will simply return an error without sending down the command to the drive,
>> if the port is frozen.
>>
>> Not sure why your port is frozen, mine is obviously not.

I think that it gets frozen when the internal command for read log ext
times out. More below about that timeout.

>>
>> ata_do_link_abort() calls ata_eh_set_pending() without activating fast drain:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-eh.c?h=v6.0#n989
>>
>> So I'm not sure why your port is frozen.
>> (The fast drain timer does freeze the port, but it shouldn't be enabled.)
>> It might be worthwhile to see who freezes the port in your case.
> Might come from the command timeout. John has had many problems with the
> pm80xx HBA in his Arm machine from a while back. Likely not a driver issue
> but a hw one... No-one seems to be able to recreate the same problem.
>
> We need to try the HBA on our Arm board to see what happens.
>

Yeah, it just looks to be the longstanding issue of using this card on
my arm64 machine - that is that I get IO timeouts quite regularly. I
should have mentioned that yesterday. This just seems to be a driver issue.

Interestingly this read log ext always seems to timeout, so maybe I
could see if there is anything specific about this command which could
give a clue to the underlying issue. But I have spent much time trying
to debug this issue, so not too motivated any more if I’m completely
honest ...

Thanks,
John

2022-10-06 14:51:36

by Niklas Cassel

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On Thu, Oct 06, 2022 at 09:33:23AM +0100, John Garry wrote:
> On 05/10/2022 23:42, Damien Le Moal wrote:
> > > Hello Damien,
> > >
> > > John explained that he got a timeout from EH when reading the log:
> > > [ 350.281581] ata1: failed to read log page 10h (errno=-5)
> > > [ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen
> > >
> > > ata_eh_read_log_10h() uses ata_read_log_page(), which will first try to read
> > > the log using READ LOG DMA EXT. If that fails, it will retry using READ LOG EXT.
> > >
> > > Therefore, to see if this is a driver specific bug, I suggested to try to read
> > > the NCQ Command Error log using ATA16 passthrough commands:
> > >
> > > $ sudo sg_sat_read_gplog -d --log=0x10 /dev/sdc
> > > will read the log using READ LOG DMA EXT.
> > >
> > > $ sudo sg_sat_read_gplog --log=0x10 /dev/sdc
> > > will read the log using READ LOG EXT.
>
> Note that I can't get a distro to boot on this system from the HDD for the
> same timeout problem (so no tools easily available).
>
> > >
> > > Neither of these two suggested commands are NCQ commands.
> > > (Neither command is encapsulated in a RECEIVE FPDMA QUEUED,
> > > so I'm not sure what you mean.)
> > >
> > >
> > > Garry, I now see that:
> > > [ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen
> > > Your port is frozen.
> > >
> > > ata_read_log_page() calls ata_exec_internal() which calls ata_exec_internal_sg(),
> > > which will simply return an error without sending down the command to the drive,
> > > if the port is frozen.
> > >
> > > Not sure why your port is frozen, mine is obviously not.
>
> I think that it gets frozen when the internal command for read log ext times
> out. More below about that timeout.

ata_read_log_page() will first try to read using READ LOG DMA EXT.
If that fails it will retry with READ LOG EXT.

Your log has this:
[ 350.257870] ata1.00: qc timeout (cmd 0x47)

So it is definitely ATA_CMD_READ_LOG_DMA_EXT that times out.

On timeout, ata_exec_internal_sg() will freeze the port:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-core.c?h=v6.0#n1577

When ata_read_log_page() retries with the port frozen,
READ LOG EXT will obviously fail (since the port is frozen).

Not sure why READ LOG DMA EXT would timeout for you...
Perhaps your drive does not implement this command,
and incorrectly reports supporting this command via
ata_id_has_read_log_dma_ext().

Perhaps you could try boot your kernel with libata.force=nodmalog
on the kernel command line, so that ata_read_log_page() will use
READ LOG EXT on the first try.

Damien, it seems that there is no use in retrying if the port
is frozen/we got a timeout, so perhaps:

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index e74ab6c0f1a0..1aa628332c8e 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -2035,7 +2035,8 @@ unsigned int ata_read_log_page(struct ata_device *dev, u8 log,
if (err_mask) {
if (dma) {
dev->horkage |= ATA_HORKAGE_NO_DMA_LOG;
- goto retry;
+ if (err_mask != AC_ERR_TIMEOUT)
+ goto retry;
}

or:

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index e74ab6c0f1a0..2fa03b7573ac 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -2035,7 +2035,8 @@ unsigned int ata_read_log_page(struct ata_device *dev, u8 log,
if (err_mask) {
if (dma) {
dev->horkage |= ATA_HORKAGE_NO_DMA_LOG;
- goto retry;
+ if (!(dev->link->ap->pflags & ATA_PFLAG_FROZEN))
+ goto retry;
}

would be in order, so that we actually print the real error, instead of a bogus
AC_ERR_SYSTEM (returned by ata_exec_internal_sg()) when the port is frozen.

>
> > >
> > > ata_do_link_abort() calls ata_eh_set_pending() without activating fast drain:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-eh.c?h=v6.0#n989
> > >
> > > So I'm not sure why your port is frozen.
> > > (The fast drain timer does freeze the port, but it shouldn't be enabled.)
> > > It might be worthwhile to see who freezes the port in your case.
> > Might come from the command timeout. John has had many problems with the
> > pm80xx HBA in his Arm machine from a while back. Likely not a driver issue
> > but a hw one... No-one seems to be able to recreate the same problem.
> >
> > We need to try the HBA on our Arm board to see what happens.
> >
>
> Yeah, it just looks to be the longstanding issue of using this card on my
> arm64 machine - that is that I get IO timeouts quite regularly. I should
> have mentioned that yesterday. This just seems to be a driver issue.

Out of curiosity, which arm64 SoC is this?

While it is very unlikely that this is your problem, but I've encountered
an issue on an ARM board before, where the PCIe controller was incorrectly
configured in device tree, causing the controller to miss interrrupts,
which presented itself to the user as timeouts in the WiFi driver:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=97131f85c08e024df49480ed499aae8fb754067f

Kind regards,
Niklas

2022-10-06 16:57:24

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On 06/10/2022 15:45, Niklas Cassel wrote:
>> I think that it gets frozen when the internal command for read log ext times
>> out. More below about that timeout.
> ata_read_log_page() will first try to read using READ LOG DMA EXT.
> If that fails it will retry with READ LOG EXT.
>
> Your log has this:
> [ 350.257870] ata1.00: qc timeout (cmd 0x47)
>
> So it is definitely ATA_CMD_READ_LOG_DMA_EXT that times out.
>
> On timeout, ata_exec_internal_sg() will freeze the port:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-core.c?h=v6.0#n1577
>
> When ata_read_log_page() retries with the port frozen,
> READ LOG EXT will obviously fail (since the port is frozen).
>
> Not sure why READ LOG DMA EXT would timeout for you...
> Perhaps your drive does not implement this command,
> and incorrectly reports supporting this command via
> ata_id_has_read_log_dma_ext().
>
> Perhaps you could try boot your kernel with libata.force=nodmalog
> on the kernel command line, so that ata_read_log_page() will use
> READ LOG EXT on the first try.
>

I tried that and unfortunately it does not appear to help.

I get this log, which proves no dmalog

[ 15.757617] ata1.00: FORCE: horkage modified (nodmalog)

but then still fails with timeout:

[ 123.094430] ata1.00: qc timeout (cmd 0x2f)
[ 123.098637] pm80xx0:: mpi_sata_completion 2293: task null, freeing
CCB tag 2
[ 123.105711] ata1.00: Read log 0x10 page 0x00 failed, Emask 0x5
[ 123.118081] ata1: failed to read log page 10h (errno=-5)

>
> Damien, it seems that there is no use in retrying if the port
> is frozen/we got a timeout, so perhaps:
>
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index e74ab6c0f1a0..1aa628332c8e 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -2035,7 +2035,8 @@ unsigned int ata_read_log_page(struct ata_device *dev, u8 log,
> if (err_mask) {
> if (dma) {
> dev->horkage |= ATA_HORKAGE_NO_DMA_LOG;
> - goto retry;
> + if (err_mask != AC_ERR_TIMEOUT)
> + goto retry;
> }
>
> or:
>
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index e74ab6c0f1a0..2fa03b7573ac 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -2035,7 +2035,8 @@ unsigned int ata_read_log_page(struct ata_device *dev, u8 log,
> if (err_mask) {
> if (dma) {
> dev->horkage |= ATA_HORKAGE_NO_DMA_LOG;
> - goto retry;
> + if (!(dev->link->ap->pflags & ATA_PFLAG_FROZEN))
> + goto retry;
> }
>
> would be in order, so that we actually print the real error, instead of a bogus
> AC_ERR_SYSTEM (returned by ata_exec_internal_sg()) when the port is frozen.
>
>>>> ata_do_link_abort() calls ata_eh_set_pending() without activating fast drain:
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-eh.c?h=v6.0#n989
>>>>
>>>> So I'm not sure why your port is frozen.
>>>> (The fast drain timer does freeze the port, but it shouldn't be enabled.)
>>>> It might be worthwhile to see who freezes the port in your case.
>>> Might come from the command timeout. John has had many problems with the
>>> pm80xx HBA in his Arm machine from a while back. Likely not a driver issue
>>> but a hw one... No-one seems to be able to recreate the same problem.
>>>
>>> We need to try the HBA on our Arm board to see what happens.
>>>
>> Yeah, it just looks to be the longstanding issue of using this card on my
>> arm64 machine - that is that I get IO timeouts quite regularly. I should
>> have mentioned that yesterday. This just seems to be a driver issue.
> Out of curiosity, which arm64 SoC is this?

HiSilicon hi1620 which contains a custom arm v8 implementation. Note
that others have also seen the issue with this card on other arm
implementations.

>
> While it is very unlikely that this is your problem, but I've encountered
> an issue on an ARM board before, where the PCIe controller was incorrectly
> configured in device tree, causing the controller to miss interrrupts,
> which presented itself to the user as timeouts in the WiFi driver:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=97131f85c08e024df49480ed499aae8fb754067f

Unlikely. Indeed, when I was checking this issue some time go, I found
that not only was there no completion interrupt but also no completion
when I manually examine the completion ring buffer read and write pointers.

Here's where I discuss this issue earlier a bit:
https://lore.kernel.org/linux-scsi/PH0PR11MB511238B8FF7B44C375DDDFADEC519@PH0PR11MB5112.namprd11.prod.outlook.com/

Thanks,
John

2022-10-06 23:09:06

by Damien Le Moal

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On 10/6/22 23:45, Niklas Cassel wrote:
> On Thu, Oct 06, 2022 at 09:33:23AM +0100, John Garry wrote:
>> On 05/10/2022 23:42, Damien Le Moal wrote:
>>>> Hello Damien,
>>>>
>>>> John explained that he got a timeout from EH when reading the log:
>>>> [ 350.281581] ata1: failed to read log page 10h (errno=-5)
>>>> [ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen
>>>>
>>>> ata_eh_read_log_10h() uses ata_read_log_page(), which will first try to read
>>>> the log using READ LOG DMA EXT. If that fails, it will retry using READ LOG EXT.
>>>>
>>>> Therefore, to see if this is a driver specific bug, I suggested to try to read
>>>> the NCQ Command Error log using ATA16 passthrough commands:
>>>>
>>>> $ sudo sg_sat_read_gplog -d --log=0x10 /dev/sdc
>>>> will read the log using READ LOG DMA EXT.
>>>>
>>>> $ sudo sg_sat_read_gplog --log=0x10 /dev/sdc
>>>> will read the log using READ LOG EXT.
>>
>> Note that I can't get a distro to boot on this system from the HDD for the
>> same timeout problem (so no tools easily available).
>>
>>>>
>>>> Neither of these two suggested commands are NCQ commands.
>>>> (Neither command is encapsulated in a RECEIVE FPDMA QUEUED,
>>>> so I'm not sure what you mean.)
>>>>
>>>>
>>>> Garry, I now see that:
>>>> [ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen
>>>> Your port is frozen.
>>>>
>>>> ata_read_log_page() calls ata_exec_internal() which calls ata_exec_internal_sg(),
>>>> which will simply return an error without sending down the command to the drive,
>>>> if the port is frozen.
>>>>
>>>> Not sure why your port is frozen, mine is obviously not.
>>
>> I think that it gets frozen when the internal command for read log ext times
>> out. More below about that timeout.
>
> ata_read_log_page() will first try to read using READ LOG DMA EXT.
> If that fails it will retry with READ LOG EXT.
>
> Your log has this:
> [ 350.257870] ata1.00: qc timeout (cmd 0x47)
>
> So it is definitely ATA_CMD_READ_LOG_DMA_EXT that times out.
>
> On timeout, ata_exec_internal_sg() will freeze the port:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-core.c?h=v6.0#n1577
>
> When ata_read_log_page() retries with the port frozen,
> READ LOG EXT will obviously fail (since the port is frozen).
>
> Not sure why READ LOG DMA EXT would timeout for you...
> Perhaps your drive does not implement this command,
> and incorrectly reports supporting this command via
> ata_id_has_read_log_dma_ext().
>
> Perhaps you could try boot your kernel with libata.force=nodmalog
> on the kernel command line, so that ata_read_log_page() will use
> READ LOG EXT on the first try.
>
>
> Damien, it seems that there is no use in retrying if the port
> is frozen/we got a timeout, so perhaps:
>
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index e74ab6c0f1a0..1aa628332c8e 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -2035,7 +2035,8 @@ unsigned int ata_read_log_page(struct ata_device *dev, u8 log,
> if (err_mask) {
> if (dma) {
> dev->horkage |= ATA_HORKAGE_NO_DMA_LOG;
> - goto retry;
> + if (err_mask != AC_ERR_TIMEOUT)
> + goto retry;
> }
>
> or:
>
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index e74ab6c0f1a0..2fa03b7573ac 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -2035,7 +2035,8 @@ unsigned int ata_read_log_page(struct ata_device *dev, u8 log,
> if (err_mask) {
> if (dma) {
> dev->horkage |= ATA_HORKAGE_NO_DMA_LOG;
> - goto retry;
> + if (!(dev->link->ap->pflags & ATA_PFLAG_FROZEN))
> + goto retry;

Yes, something like this is needed. Though I would prefer having a little
ata_port_frozen() helper for the condition.

> }
>
> would be in order, so that we actually print the real error, instead of a bogus
> AC_ERR_SYSTEM (returned by ata_exec_internal_sg()) when the port is frozen.

yep.

>>
>>>>
>>>> ata_do_link_abort() calls ata_eh_set_pending() without activating fast drain:
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-eh.c?h=v6.0#n989
>>>>
>>>> So I'm not sure why your port is frozen.
>>>> (The fast drain timer does freeze the port, but it shouldn't be enabled.)
>>>> It might be worthwhile to see who freezes the port in your case.
>>> Might come from the command timeout. John has had many problems with the
>>> pm80xx HBA in his Arm machine from a while back. Likely not a driver issue
>>> but a hw one... No-one seems to be able to recreate the same problem.
>>>
>>> We need to try the HBA on our Arm board to see what happens.
>>>
>>
>> Yeah, it just looks to be the longstanding issue of using this card on my
>> arm64 machine - that is that I get IO timeouts quite regularly. I should
>> have mentioned that yesterday. This just seems to be a driver issue.
>
> Out of curiosity, which arm64 SoC is this?
>
> While it is very unlikely that this is your problem, but I've encountered
> an issue on an ARM board before, where the PCIe controller was incorrectly
> configured in device tree, causing the controller to miss interrrupts,
> which presented itself to the user as timeouts in the WiFi driver:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=97131f85c08e024df49480ed499aae8fb754067f
>
>
> Kind regards,
> Niklas

--
Damien Le Moal
Western Digital Research

2022-10-24 14:09:30

by Niklas Cassel

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On Thu, Oct 06, 2022 at 05:41:40PM +0100, John Garry wrote:
> > > >
> > > Yeah, it just looks to be the longstanding issue of using this card on my
> > > arm64 machine - that is that I get IO timeouts quite regularly. I should
> > > have mentioned that yesterday. This just seems to be a driver issue.
> > Out of curiosity, which arm64 SoC is this?
>
> HiSilicon hi1620 which contains a custom arm v8 implementation. Note that
> others have also seen the issue with this card on other arm implementations.
>
> >
> > While it is very unlikely that this is your problem, but I've encountered
> > an issue on an ARM board before, where the PCIe controller was incorrectly
> > configured in device tree, causing the controller to miss interrrupts,
> > which presented itself to the user as timeouts in the WiFi driver:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=97131f85c08e024df49480ed499aae8fb754067f
>
> Unlikely. Indeed, when I was checking this issue some time go, I found that
> not only was there no completion interrupt but also no completion when I
> manually examine the completion ring buffer read and write pointers.
>
> Here's where I discuss this issue earlier a bit:
> https://lore.kernel.org/linux-scsi/PH0PR11MB511238B8FF7B44C375DDDFADEC519@PH0PR11MB5112.namprd11.prod.outlook.com/
>

Hello John,

For the record, I tested the pm80xx driver on a HoneyComb LX2 board
(an arm64 board using ACPI).

I tried v6.1-rc1 both with and without your series in $subject.

I couldn't see any issues.

What I tried:
-Running fio:
fio --name=test --filename=/dev/sdc --ioengine=io_uring --rw=randrw --direct=1 --iodepth=32 --bs=1M
on three different HDDs simultaneously for 15+ minutes,
without any errors in fio or dmesg.

-Creating and mounting a btrfs volume, doing a huge dd to the filesystem
without issues.

-sg_sat_read_gplog -d --log=0x10 /dev/sda
which successfully returned the log.

It is worth mentioning that this arm64 board has reserved memory regions,
but does not yet have a firmware that supplies a IORT RMR (reserved memory
regions) revision E.d node, which means that in order to get this board to
boot successfully, we need to supply:
"arm-smmu.disable_bypass=0 iommu.passthrough=1"
on the kernel command line.

It could be worth trying the same on your arm64 machine, to see that also
makes the pm80xx driver play nice for you.

Kind regards,
Niklas

2022-10-24 16:25:40

by Niklas Cassel

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On Mon, Oct 24, 2022 at 01:44:56PM +0100, John Garry wrote:
> Hi Niklas,
>
> >
> > For the record, I tested the pm80xx driver on a HoneyComb LX2 board
> > (an arm64 board using ACPI).
> >
> > I tried v6.1-rc1 both with and without your series in $subject.
> >
> > I couldn't see any issues.
>
> ok, thanks for the effort.
>
> >
> >
> > What I tried:
> > -Running fio:
> > fio --name=test --filename=/dev/sdc --ioengine=io_uring --rw=randrw --direct=1 --iodepth=32 --bs=1M
> > on three different HDDs simultaneously for 15+ minutes,
> > without any errors in fio or dmesg.
> >
> > -Creating and mounting a btrfs volume, doing a huge dd to the filesystem
> > without issues.
> >
> > -sg_sat_read_gplog -d --log=0x10 /dev/sda
> > which successfully returned the log.
> >
> >
> > It is worth mentioning that this arm64 board has reserved memory regions,
> > but does not yet have a firmware that supplies a IORT RMR (reserved memory
> > regions) revision E.d node, which means that in order to get this board to
> > boot successfully, we need to supply:
> > "arm-smmu.disable_bypass=0 iommu.passthrough=1"
> > on the kernel command line.
>
> hmmm... that's interesting. I can try again with the IOMMU turned off, but,
> as I recall, it did not make a difference before. I think that requiring
> reserved memory regions would totally bust the driver (if not present) with
> IOMMU enabled. As I recall, sas 3008 card would not work without RMR for us.

At least on the HoneyComb LX2,
running with "arm-smmu.disable_bypass=0 iommu.passthrough=1" gives a working
system (and working pm80xx).

The ACPI IOMMU code that parses the ACPI IORT RMR revision E.d node
was first included in kernel v6.0:
https://lore.kernel.org/linux-iommu/[email protected]/

However, the HoneyComb edk2-platforms code has not yet been updated to add
a ACPI IORT RMR revision E.d node.

Considering that it works with "arm-smmu.disable_bypass=0 iommu.passthrough=1",
I assume that the ACPI IORT RMR node basically just defines a number of
memory regions which the IOMMU should treat as "bypass", while all other
memory has to be re-mapped via the IOMMU as per usual.

Kind regards,
Niklas

2022-10-24 19:26:29

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

On 24/10/2022 14:10, Niklas Cassel wrote:
> At least on the HoneyComb LX2,
> running with "arm-smmu.disable_bypass=0 iommu.passthrough=1" gives a working
> system (and working pm80xx).
>
> The ACPI IOMMU code that parses the ACPI IORT RMR revision E.d node
> was first included in kernel v6.0:
> https://lore.kernel.org/linux-iommu/[email protected]/
>
> However, the HoneyComb edk2-platforms code has not yet been updated to add
> a ACPI IORT RMR revision E.d node.
>
> Considering that it works with "arm-smmu.disable_bypass=0 iommu.passthrough=1",
> I assume that the ACPI IORT RMR node basically just defines a number of
> memory regions which the IOMMU should treat as "bypass", while all other
> memory has to be re-mapped via the IOMMU as per usual.

Hi Niklas,

As I expected, unfortunately that did not help. Note that I actually
can't use passthrough on my platform due to SMMUv3 implementation bug,
so I just disable SMMUv3 entirely via kernel config.

I'm still betting on a memory ordering issue for me.

Thanks again,
John

2022-10-24 22:34:42

[permalink] [raw]

Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling

Hi Niklas,

>
> For the record, I tested the pm80xx driver on a HoneyComb LX2 board
> (an arm64 board using ACPI).
>
> I tried v6.1-rc1 both with and without your series in $subject.
>
> I couldn't see any issues.

ok, thanks for the effort.

>
>
> What I tried:
> -Running fio:
> fio --name=test --filename=/dev/sdc --ioengine=io_uring --rw=randrw --direct=1 --iodepth=32 --bs=1M
> on three different HDDs simultaneously for 15+ minutes,
> without any errors in fio or dmesg.
>
> -Creating and mounting a btrfs volume, doing a huge dd to the filesystem
> without issues.
>
> -sg_sat_read_gplog -d --log=0x10 /dev/sda
> which successfully returned the log.
>
>
> It is worth mentioning that this arm64 board has reserved memory regions,
> but does not yet have a firmware that supplies a IORT RMR (reserved memory
> regions) revision E.d node, which means that in order to get this board to
> boot successfully, we need to supply:
> "arm-smmu.disable_bypass=0 iommu.passthrough=1"
> on the kernel command line.

hmmm... that's interesting. I can try again with the IOMMU turned off,
but, as I recall, it did not make a difference before. I think that
requiring reserved memory regions would totally bust the driver (if not
present) with IOMMU enabled. As I recall, sas 3008 card would not work
without RMR for us.

It's also interesting that this LX2 board has A72 cores. For my system,
we have newer custom arm v8 cores with quite weak memory ordering
implementation. With that same system, I have detected a couple of other
driver memory ordering bugs which we did not see on our A72-based platforms.

I always suspected that this issue was a memory ordering issue, but
since the hang so reliably occurs I ruled it out. Maybe it is...

thanks,
John