2011-05-03 20:09:33

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 00/16] hpsa: May 3, 2011 updates

The following series mostly contains fixes to improve kdump behavior
and esp. to make older controllers which cannot be hard reset work by
doing a soft reset instead. There are a few patches factoring out
various functionality into individual functions make way for the soft
reset functionality. There is also a bugfix which prevents PCI write
combining from potentially causing commands to get lost.

---

Stephen M. Cameron (16):
hpsa: do readl after writel in main i/o path to ensure commands don't get lost.
hpsa: add readl after writel in interrupt mask setting code
hpsa: remove unused parameter from hpsa_complete_scsi_command()
hpsa: delete old unused padding garbage
hpsa: do a better job of detecting controller reset failure
hpsa: wait longer for no-op to complete after resetting controller
hpsa: factor out cmd pool allocation functions
hpsa: factor out irq request code
hpsa: increase time to wait for board reset
hpsa: clarify messages around reset behavior
hpsa: remove atrophied hpsa_scsi_setup function
hpsa: use new doorbell-bit-5 reset method
hpsa: do soft reset if hard reset is broken
hpsa: remove superfluous sleeps around reset code
hpsa: do not attempt PCI power management reset method if we know it won't work.
hpsa: add P2000 to list of shared SAS devices


drivers/scsi/hpsa.c | 494 ++++++++++++++++++++++++++++++++++++++---------
drivers/scsi/hpsa.h | 15 +
drivers/scsi/hpsa_cmd.h | 11 -
3 files changed, 412 insertions(+), 108 deletions(-)

--
-- steve


2011-05-03 19:58:52

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 01/16] hpsa: do readl after writel in main i/o path to ensure commands don't get lost.

From: Stephen M. Cameron <[email protected]>

Apparently we've been doin it rong for a decade, but only lately do we
run into problems.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 621a153..98c97ca 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -212,6 +212,7 @@ static void SA5_submit_command(struct ctlr_info *h,
dev_dbg(&h->pdev->dev, "Sending %x, tag = %x\n", c->busaddr,
c->Header.Tag.lower);
writel(c->busaddr, h->vaddr + SA5_REQUEST_PORT_OFFSET);
+ (void) readl(h->vaddr + SA5_REQUEST_PORT_OFFSET);
h->commands_outstanding++;
if (h->commands_outstanding > h->max_outstanding)
h->max_outstanding = h->commands_outstanding;

2011-05-03 19:58:58

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 02/16] hpsa: add readl after writel in interrupt mask setting code

From: Stephen M. Cameron <[email protected]>

This is to ensure the board interrupts are really off when
these functions return.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.h | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 98c97ca..7eec397 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -228,10 +228,12 @@ static void SA5_intr_mask(struct ctlr_info *h, unsigned long val)
if (val) { /* Turn interrupts on */
h->interrupts_enabled = 1;
writel(0, h->vaddr + SA5_REPLY_INTR_MASK_OFFSET);
+ (void) readl(h->vaddr + SA5_REPLY_INTR_MASK_OFFSET);
} else { /* Turn them off */
h->interrupts_enabled = 0;
writel(SA5_INTR_OFF,
h->vaddr + SA5_REPLY_INTR_MASK_OFFSET);
+ (void) readl(h->vaddr + SA5_REPLY_INTR_MASK_OFFSET);
}
}

@@ -240,10 +242,12 @@ static void SA5_performant_intr_mask(struct ctlr_info *h, unsigned long val)
if (val) { /* turn on interrupts */
h->interrupts_enabled = 1;
writel(0, h->vaddr + SA5_REPLY_INTR_MASK_OFFSET);
+ (void) readl(h->vaddr + SA5_REPLY_INTR_MASK_OFFSET);
} else {
h->interrupts_enabled = 0;
writel(SA5_PERF_INTR_OFF,
h->vaddr + SA5_REPLY_INTR_MASK_OFFSET);
+ (void) readl(h->vaddr + SA5_REPLY_INTR_MASK_OFFSET);
}
}

2011-05-03 19:59:05

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 03/16] hpsa: remove unused parameter from hpsa_complete_scsi_command()

From: Stephen M. Cameron <[email protected]>

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 415ad4f..8266850 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1006,8 +1006,7 @@ static void hpsa_unmap_sg_chain_block(struct ctlr_info *h,
pci_unmap_single(h->pdev, temp64.val, chain_sg->Len, PCI_DMA_TODEVICE);
}

-static void complete_scsi_command(struct CommandList *cp,
- int timeout, u32 tag)
+static void complete_scsi_command(struct CommandList *cp)
{
struct scsi_cmnd *cmd;
struct ctlr_info *h;
@@ -2936,7 +2935,7 @@ static inline void finish_cmd(struct CommandList *c, u32 raw_tag)
{
removeQ(c);
if (likely(c->cmd_type == CMD_SCSI))
- complete_scsi_command(c, 0, raw_tag);
+ complete_scsi_command(c);
else if (c->cmd_type == CMD_IOCTL_PEND)
complete(c->waiting);
}

2011-05-03 20:09:34

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 04/16] hpsa: delete old unused padding garbage

From: Stephen M. Cameron <[email protected]>

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa_cmd.h | 8 --------
1 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index 1846490..0500740 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -256,14 +256,6 @@ struct ErrorInfo {
#define CMD_IOCTL_PEND 0x01
#define CMD_SCSI 0x03

-/* This structure needs to be divisible by 32 for new
- * indexing method and performant mode.
- */
-#define PAD32 32
-#define PAD64DIFF 0
-#define USEEXTRA ((sizeof(void *) - 4)/4)
-#define PADSIZE (PAD32 + PAD64DIFF * USEEXTRA)
-
#define DIRECT_LOOKUP_SHIFT 5
#define DIRECT_LOOKUP_BIT 0x10
#define DIRECT_LOOKUP_MASK (~((1 << DIRECT_LOOKUP_SHIFT) - 1))

2011-05-03 19:59:13

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 05/16] hpsa: do a better job of detecting controller reset failure

From: Stephen M. Cameron <[email protected]>

Detect failure of controller reset by noticing if the 32 bytes of
"driver version" we store on the hardware in the config table
fail to get zeroed out. Previously we noticed if the controller
did not transition to "simple mode", but this did not detect reset
failure if the controller was already in simple mode prior to
the reset attempt (e.g. due to module parameter hpsa_simple_mode=1).

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 76 +++++++++++++++++++++++++++++++++++++++++------
drivers/scsi/hpsa_cmd.h | 1 +
2 files changed, 67 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 8266850..2aeb210 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3184,6 +3184,59 @@ static int hpsa_controller_hard_reset(struct pci_dev *pdev,
return 0;
}

+static __devinit void init_driver_version(char *driver_version, int len)
+{
+ memset(driver_version, 0, len);
+ strncpy(driver_version, "hpsa " HPSA_DRIVER_VERSION, len - 1);
+}
+
+static __devinit int write_driver_ver_to_cfgtable(
+ struct CfgTable __iomem *cfgtable)
+{
+ char *driver_version;
+ int i, size = sizeof(cfgtable->driver_version);
+
+ driver_version = kmalloc(size, GFP_KERNEL);
+ if (!driver_version)
+ return -ENOMEM;
+
+ init_driver_version(driver_version, size);
+ for (i = 0; i < size; i++)
+ writeb(driver_version[i], &cfgtable->driver_version[i]);
+ kfree(driver_version);
+ return 0;
+}
+
+static __devinit void read_driver_ver_from_cfgtable(
+ struct CfgTable __iomem *cfgtable, unsigned char *driver_ver)
+{
+ int i;
+
+ for (i = 0; i < sizeof(cfgtable->driver_version); i++)
+ driver_ver[i] = readb(&cfgtable->driver_version[i]);
+}
+
+static __devinit int controller_reset_failed(
+ struct CfgTable __iomem *cfgtable)
+{
+
+ char *driver_ver, *old_driver_ver;
+ int rc, size = sizeof(cfgtable->driver_version);
+
+ old_driver_ver = kmalloc(2 * size, GFP_KERNEL);
+ if (!old_driver_ver)
+ return -ENOMEM;
+ driver_ver = old_driver_ver + size;
+
+ /* After a reset, the 32 bytes of "driver version" in the cfgtable
+ * should have been changed, otherwise we know the reset failed.
+ */
+ init_driver_version(old_driver_ver, size);
+ read_driver_ver_from_cfgtable(cfgtable, driver_ver);
+ rc = !memcmp(driver_ver, old_driver_ver, size);
+ kfree(old_driver_ver);
+ return rc;
+}
/* This does a hard reset of the controller using PCI power management
* states or the using the doorbell register.
*/
@@ -3194,7 +3247,7 @@ static __devinit int hpsa_kdump_hard_reset_controller(struct pci_dev *pdev)
u64 cfg_base_addr_index;
void __iomem *vaddr;
unsigned long paddr;
- u32 misc_fw_support, active_transport;
+ u32 misc_fw_support;
int rc;
struct CfgTable __iomem *cfgtable;
bool use_doorbell;
@@ -3256,6 +3309,9 @@ static __devinit int hpsa_kdump_hard_reset_controller(struct pci_dev *pdev)
rc = -ENOMEM;
goto unmap_vaddr;
}
+ rc = write_driver_ver_to_cfgtable(cfgtable);
+ if (rc)
+ goto unmap_vaddr;

/* If reset via doorbell register is supported, use that. */
misc_fw_support = readl(&cfgtable->misc_fw_support);
@@ -3289,19 +3345,16 @@ static __devinit int hpsa_kdump_hard_reset_controller(struct pci_dev *pdev)
"failed waiting for board to become ready\n");
goto unmap_cfgtable;
}
- dev_info(&pdev->dev, "board ready.\n");

- /* Controller should be in simple mode at this point. If it's not,
- * It means we're on one of those controllers which doesn't support
- * the doorbell reset method and on which the PCI power management reset
- * method doesn't work (P800, for example.)
- * In those cases, don't try to proceed, as it generally doesn't work.
- */
- active_transport = readl(&cfgtable->TransportActive);
- if (active_transport & PERFORMANT_MODE) {
+ rc = controller_reset_failed(vaddr);
+ if (rc < 0)
+ goto unmap_cfgtable;
+ if (rc) {
dev_warn(&pdev->dev, "Unable to successfully reset controller,"
" Ignoring controller.\n");
rc = -ENODEV;
+ } else {
+ dev_info(&pdev->dev, "board ready.\n");
}

unmap_cfgtable:
@@ -3542,6 +3595,9 @@ static int __devinit hpsa_find_cfgtables(struct ctlr_info *h)
cfg_base_addr_index) + cfg_offset, sizeof(*h->cfgtable));
if (!h->cfgtable)
return -ENOMEM;
+ rc = write_driver_ver_to_cfgtable(h->cfgtable);
+ if (rc)
+ return rc;
/* Find performant mode table. */
trans_offset = readl(&h->cfgtable->TransMethodOffset);
h->transtable = remap_pci_mem(pci_resource_start(h->pdev,
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index 0500740..8fd35a7 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -337,6 +337,7 @@ struct CfgTable {
u8 reserved[0x78 - 0x58];
u32 misc_fw_support; /* offset 0x78 */
#define MISC_FW_DOORBELL_RESET (0x02)
+ u8 driver_version[32];
};

#define NUM_BLOCKFETCH_ENTRIES 8

2011-05-03 19:59:19

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 06/16] hpsa: wait longer for no-op to complete after resetting controller

From: Stephen M. Cameron <[email protected]>

This is to avoid the usual two or three messages about the command
timing out. We're obviously not waiting long enough.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 7eec397..7a4ee73 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -130,7 +130,7 @@ struct ctlr_info {
#define HPSA_BUS_RESET_MSG 2
#define HPSA_HOST_RESET_MSG 3
#define HPSA_MSG_SEND_RETRY_LIMIT 10
-#define HPSA_MSG_SEND_RETRY_INTERVAL_MSECS 1000
+#define HPSA_MSG_SEND_RETRY_INTERVAL_MSECS (10000)

/* Maximum time in seconds driver will wait for command completions
* when polling before giving up.

2011-05-03 19:59:24

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 07/16] hpsa: factor out cmd pool allocation functions

From: Stephen M. Cameron <[email protected]>

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 66 ++++++++++++++++++++++++++++-----------------------
1 files changed, 36 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 2aeb210..7336f3c 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3847,6 +3847,40 @@ static __devinit int hpsa_init_reset_devices(struct pci_dev *pdev)
return 0;
}

+static __devinit int hpsa_allocate_cmd_pool(struct ctlr_info *h)
+{
+ h->cmd_pool_bits = kzalloc(
+ DIV_ROUND_UP(h->nr_cmds, BITS_PER_LONG) *
+ sizeof(unsigned long), GFP_KERNEL);
+ h->cmd_pool = pci_alloc_consistent(h->pdev,
+ h->nr_cmds * sizeof(*h->cmd_pool),
+ &(h->cmd_pool_dhandle));
+ h->errinfo_pool = pci_alloc_consistent(h->pdev,
+ h->nr_cmds * sizeof(*h->errinfo_pool),
+ &(h->errinfo_pool_dhandle));
+ if ((h->cmd_pool_bits == NULL)
+ || (h->cmd_pool == NULL)
+ || (h->errinfo_pool == NULL)) {
+ dev_err(&h->pdev->dev, "out of memory in %s", __func__);
+ return -ENOMEM;
+ }
+ return 0;
+}
+
+static void hpsa_free_cmd_pool(struct ctlr_info *h)
+{
+ kfree(h->cmd_pool_bits);
+ if (h->cmd_pool)
+ pci_free_consistent(h->pdev,
+ h->nr_cmds * sizeof(struct CommandList),
+ h->cmd_pool, h->cmd_pool_dhandle);
+ if (h->errinfo_pool)
+ pci_free_consistent(h->pdev,
+ h->nr_cmds * sizeof(struct ErrorInfo),
+ h->errinfo_pool,
+ h->errinfo_pool_dhandle);
+}
+
static int __devinit hpsa_init_one(struct pci_dev *pdev,
const struct pci_device_id *ent)
{
@@ -3917,33 +3951,14 @@ static int __devinit hpsa_init_one(struct pci_dev *pdev,
dev_info(&pdev->dev, "%s: <0x%x> at IRQ %d%s using DAC\n",
h->devname, pdev->device,
h->intr[h->intr_mode], dac ? "" : " not");
-
- h->cmd_pool_bits =
- kmalloc(((h->nr_cmds + BITS_PER_LONG -
- 1) / BITS_PER_LONG) * sizeof(unsigned long), GFP_KERNEL);
- h->cmd_pool = pci_alloc_consistent(h->pdev,
- h->nr_cmds * sizeof(*h->cmd_pool),
- &(h->cmd_pool_dhandle));
- h->errinfo_pool = pci_alloc_consistent(h->pdev,
- h->nr_cmds * sizeof(*h->errinfo_pool),
- &(h->errinfo_pool_dhandle));
- if ((h->cmd_pool_bits == NULL)
- || (h->cmd_pool == NULL)
- || (h->errinfo_pool == NULL)) {
- dev_err(&pdev->dev, "out of memory");
- rc = -ENOMEM;
+ if (hpsa_allocate_cmd_pool(h))
goto clean4;
- }
if (hpsa_allocate_sg_chain_blocks(h))
goto clean4;
init_waitqueue_head(&h->scan_wait_queue);
h->scan_finished = 1; /* no scan currently in progress */

pci_set_drvdata(pdev, h);
- memset(h->cmd_pool_bits, 0,
- ((h->nr_cmds + BITS_PER_LONG -
- 1) / BITS_PER_LONG) * sizeof(unsigned long));
-
hpsa_scsi_setup(h);

/* Turn the interrupts on so we can service requests */
@@ -3957,16 +3972,7 @@ static int __devinit hpsa_init_one(struct pci_dev *pdev,

clean4:
hpsa_free_sg_chain_blocks(h);
- kfree(h->cmd_pool_bits);
- if (h->cmd_pool)
- pci_free_consistent(h->pdev,
- h->nr_cmds * sizeof(struct CommandList),
- h->cmd_pool, h->cmd_pool_dhandle);
- if (h->errinfo_pool)
- pci_free_consistent(h->pdev,
- h->nr_cmds * sizeof(struct ErrorInfo),
- h->errinfo_pool,
- h->errinfo_pool_dhandle);
+ hpsa_free_cmd_pool(h);
free_irq(h->intr[h->intr_mode], h);
clean2:
clean1:

2011-05-03 19:59:30

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 08/16] hpsa: factor out irq request code

From: Stephen M. Cameron <[email protected]>

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 32 +++++++++++++++++++++-----------
1 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 7336f3c..97db2e5 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3881,6 +3881,26 @@ static void hpsa_free_cmd_pool(struct ctlr_info *h)
h->errinfo_pool_dhandle);
}

+static int hpsa_request_irq(struct ctlr_info *h,
+ irqreturn_t (*msixhandler)(int, void *),
+ irqreturn_t (*intxhandler)(int, void *))
+{
+ int rc;
+
+ if (h->msix_vector || h->msi_vector)
+ rc = request_irq(h->intr[h->intr_mode], msixhandler,
+ IRQF_DISABLED, h->devname, h);
+ else
+ rc = request_irq(h->intr[h->intr_mode], intxhandler,
+ IRQF_DISABLED, h->devname, h);
+ if (rc) {
+ dev_err(&h->pdev->dev, "unable to get irq %d for %s\n",
+ h->intr[h->intr_mode], h->devname);
+ return -ENODEV;
+ }
+ return 0;
+}
+
static int __devinit hpsa_init_one(struct pci_dev *pdev,
const struct pci_device_id *ent)
{
@@ -3936,18 +3956,8 @@ static int __devinit hpsa_init_one(struct pci_dev *pdev,
/* make sure the board interrupts are off */
h->access.set_intr_mask(h, HPSA_INTR_OFF);

- if (h->msix_vector || h->msi_vector)
- rc = request_irq(h->intr[h->intr_mode], do_hpsa_intr_msi,
- IRQF_DISABLED, h->devname, h);
- else
- rc = request_irq(h->intr[h->intr_mode], do_hpsa_intr_intx,
- IRQF_DISABLED, h->devname, h);
- if (rc) {
- dev_err(&pdev->dev, "unable to get irq %d for %s\n",
- h->intr[h->intr_mode], h->devname);
+ if (hpsa_request_irq(h, do_hpsa_intr_msi, do_hpsa_intr_intx))
goto clean2;
- }
-
dev_info(&pdev->dev, "%s: <0x%x> at IRQ %d%s using DAC\n",
h->devname, pdev->device,
h->intr[h->intr_mode], dac ? "" : " not");

2011-05-03 20:06:38

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 09/16] hpsa: increase time to wait for board reset

From: Stephen M. Cameron <[email protected]>

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 7a4ee73..b1412a7 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -155,7 +155,7 @@ struct ctlr_info {
* HPSA_BOARD_READY_ITERATIONS are derived from those.
*/
#define HPSA_BOARD_READY_WAIT_SECS (120)
-#define HPSA_BOARD_NOT_READY_WAIT_SECS (10)
+#define HPSA_BOARD_NOT_READY_WAIT_SECS (100)
#define HPSA_BOARD_READY_POLL_INTERVAL_MSECS (100)
#define HPSA_BOARD_READY_POLL_INTERVAL \
((HPSA_BOARD_READY_POLL_INTERVAL_MSECS * HZ) / 1000)

2011-05-03 19:59:38

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 10/16] hpsa: clarify messages around reset behavior

From: Stephen M. Cameron <[email protected]>

When waiting for the board to become "not ready"
don't print a message saying "waiting for board to
become ready" (possibly followed by a message saying
"failed waiting for board to become not ready". Instead,
it should be "waiting for board to reset" and "failed
waiting for board to reset."

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 97db2e5..2a90e9a 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3334,11 +3334,11 @@ static __devinit int hpsa_kdump_hard_reset_controller(struct pci_dev *pdev)
msleep(HPSA_POST_RESET_PAUSE_MSECS);

/* Wait for board to become not ready, then ready. */
- dev_info(&pdev->dev, "Waiting for board to become ready.\n");
+ dev_info(&pdev->dev, "Waiting for board to reset.\n");
rc = hpsa_wait_for_board_state(pdev, vaddr, BOARD_NOT_READY);
if (rc)
dev_warn(&pdev->dev,
- "failed waiting for board to become not ready\n");
+ "failed waiting for board to reset\n");
rc = hpsa_wait_for_board_state(pdev, vaddr, BOARD_READY);
if (rc) {
dev_warn(&pdev->dev,
@@ -3837,6 +3837,7 @@ static __devinit int hpsa_init_reset_devices(struct pci_dev *pdev)
return -ENODEV;

/* Now try to get the controller to respond to a no-op */
+ dev_warn(&pdev->dev, "Waiting for controller to respond to no-op\n");
for (i = 0; i < HPSA_POST_RESET_NOOP_RETRIES; i++) {
if (hpsa_noop(pdev) == 0)
break;

2011-05-03 19:59:44

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 11/16] hpsa: remove atrophied hpsa_scsi_setup function

From: Stephen M. Cameron <[email protected]>

hpsa_scsi_setup at one time contained enough code to justify
its existence, but that time has passed.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 11 +++--------
1 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 2a90e9a..ca92141 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -929,13 +929,6 @@ static void hpsa_slave_destroy(struct scsi_device *sdev)
/* nothing to do. */
}

-static void hpsa_scsi_setup(struct ctlr_info *h)
-{
- h->ndevices = 0;
- h->scsi_host = NULL;
- spin_lock_init(&h->devlock);
-}
-
static void hpsa_free_sg_chain_blocks(struct ctlr_info *h)
{
int i;
@@ -3970,7 +3963,9 @@ static int __devinit hpsa_init_one(struct pci_dev *pdev,
h->scan_finished = 1; /* no scan currently in progress */

pci_set_drvdata(pdev, h);
- hpsa_scsi_setup(h);
+ h->ndevices = 0;
+ h->scsi_host = NULL;
+ spin_lock_init(&h->devlock);

/* Turn the interrupts on so we can service requests */
h->access.set_intr_mask(h, HPSA_INTR_ON);

2011-05-03 19:59:49

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 12/16] hpsa: use new doorbell-bit-5 reset method

From: Stephen M. Cameron <[email protected]>

The bit-2-doorbell reset method seemed to cause (survivable) NMIs
on some systems and (unsurvivable) IOCK NMIs on some G7 servers.
Firmware guys implemented a new doorbell method to alleviate these
problems triggered by bit 5 of the doorbell register. We want to
use it if it's available.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 25 ++++++++++++++++++++-----
drivers/scsi/hpsa_cmd.h | 2 ++
2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index ca92141..c096cda 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3128,7 +3128,7 @@ static __devinit int hpsa_message(struct pci_dev *pdev, unsigned char opcode,
#define hpsa_noop(p) hpsa_message(p, 3, 0)

static int hpsa_controller_hard_reset(struct pci_dev *pdev,
- void * __iomem vaddr, bool use_doorbell)
+ void * __iomem vaddr, u32 use_doorbell)
{
u16 pmcsr;
int pos;
@@ -3139,7 +3139,7 @@ static int hpsa_controller_hard_reset(struct pci_dev *pdev,
* other way using the doorbell register.
*/
dev_info(&pdev->dev, "using doorbell to reset controller\n");
- writel(DOORBELL_CTLR_RESET, vaddr + SA5_DOORBELL);
+ writel(use_doorbell, vaddr + SA5_DOORBELL);
msleep(1000);
} else { /* Try to do it the PCI power state way */

@@ -3243,7 +3243,7 @@ static __devinit int hpsa_kdump_hard_reset_controller(struct pci_dev *pdev)
u32 misc_fw_support;
int rc;
struct CfgTable __iomem *cfgtable;
- bool use_doorbell;
+ u32 use_doorbell;
u32 board_id;
u16 command_register;

@@ -3306,9 +3306,24 @@ static __devinit int hpsa_kdump_hard_reset_controller(struct pci_dev *pdev)
if (rc)
goto unmap_vaddr;

- /* If reset via doorbell register is supported, use that. */
+ /* If reset via doorbell register is supported, use that.
+ * There are two such methods. Favor the newest method.
+ */
misc_fw_support = readl(&cfgtable->misc_fw_support);
- use_doorbell = misc_fw_support & MISC_FW_DOORBELL_RESET;
+ use_doorbell = misc_fw_support & MISC_FW_DOORBELL_RESET2;
+ if (use_doorbell) {
+ use_doorbell = DOORBELL_CTLR_RESET2;
+ } else {
+ use_doorbell = misc_fw_support & MISC_FW_DOORBELL_RESET;
+ if (use_doorbell) {
+ dev_warn(&pdev->dev, "Controller claims that "
+ "'Bit 2 doorbell reset' is "
+ "supported, but not 'bit 5 doorbell reset'. "
+ "Firmware update is recommended.\n");
+ rc = -ENODEV;
+ goto unmap_cfgtable;
+ }
+ }

rc = hpsa_controller_hard_reset(pdev, vaddr, use_doorbell);
if (rc)
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index 8fd35a7..55d741b 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -101,6 +101,7 @@
#define CFGTBL_ChangeReq 0x00000001l
#define CFGTBL_AccCmds 0x00000001l
#define DOORBELL_CTLR_RESET 0x00000004l
+#define DOORBELL_CTLR_RESET2 0x00000020l

#define CFGTBL_Trans_Simple 0x00000002l
#define CFGTBL_Trans_Performant 0x00000004l
@@ -337,6 +338,7 @@ struct CfgTable {
u8 reserved[0x78 - 0x58];
u32 misc_fw_support; /* offset 0x78 */
#define MISC_FW_DOORBELL_RESET (0x02)
+#define MISC_FW_DOORBELL_RESET2 (0x010)
u8 driver_version[32];
};

2011-05-03 19:59:55

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 13/16] hpsa: do soft reset if hard reset is broken

From: Stephen M. Cameron <[email protected]>

on driver load, if reset_devices is set, and the hard reset
attempts fail, try to bring up the controller to the point that
a command can be sent, and send it a soft reset command, then
after the reset undo whatever driver initialization was done to get
it to the point to take a command, and re-do it after the reset.

This is to get kdump to work on all the "non-resettable" controllers
(except 64xx controllers which can't be reset due to the potentially
shared cache module.)

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 226 ++++++++++++++++++++++++++++++++++++++++++++++++---
drivers/scsi/hpsa.h | 6 +
2 files changed, 216 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index c096cda..6fe77d0 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -2743,6 +2743,26 @@ static int hpsa_ioctl(struct scsi_device *dev, int cmd, void *arg)
}
}

+static int __devinit hpsa_send_host_reset(struct ctlr_info *h,
+ unsigned char *scsi3addr, u8 reset_type)
+{
+ struct CommandList *c;
+
+ c = cmd_alloc(h);
+ if (!c)
+ return -ENOMEM;
+ fill_cmd(c, HPSA_DEVICE_RESET_MSG, h, NULL, 0, 0,
+ RAID_CTLR_LUNID, TYPE_MSG);
+ c->Request.CDB[1] = reset_type; /* fill_cmd defaults to target reset */
+ c->waiting = NULL;
+ enqueue_cmd_and_start_io(h, c);
+ /* Don't wait for completion, the reset won't complete. Don't free
+ * the command either. This is the last command we will send before
+ * re-initializing everything, so it doesn't matter and won't leak.
+ */
+ return 0;
+}
+
static void fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
void *buff, size_t size, u8 page_code, unsigned char *scsi3addr,
int cmd_type)
@@ -2820,7 +2840,8 @@ static void fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
c->Request.Type.Attribute = ATTR_SIMPLE;
c->Request.Type.Direction = XFER_NONE;
c->Request.Timeout = 0; /* Don't time out */
- c->Request.CDB[0] = 0x01; /* RESET_MSG is 0x01 */
+ memset(&c->Request.CDB[0], 0, sizeof(c->Request.CDB));
+ c->Request.CDB[0] = cmd;
c->Request.CDB[1] = 0x03; /* Reset target above */
/* If bytes 4-7 are zero, it means reset the */
/* LunID device */
@@ -2986,6 +3007,63 @@ static inline u32 process_nonindexed_cmd(struct ctlr_info *h,
return next_command(h);
}

+/* Some controllers, like p400, will give us one interrupt
+ * after a soft reset, even if we turned interrupts off.
+ * Only need to check for this in the hpsa_xxx_discard_completions
+ * functions.
+ */
+static int ignore_bogus_interrupt(struct ctlr_info *h)
+{
+ if (likely(!reset_devices))
+ return 0;
+
+ if (likely(h->interrupts_enabled))
+ return 0;
+
+ dev_info(&h->pdev->dev, "Received interrupt while interrupts disabled "
+ "(known firmware bug.) Ignoring.\n");
+
+ return 1;
+}
+
+static irqreturn_t hpsa_intx_discard_completions(int irq, void *dev_id)
+{
+ struct ctlr_info *h = dev_id;
+ unsigned long flags;
+ u32 raw_tag;
+
+ if (ignore_bogus_interrupt(h))
+ return IRQ_NONE;
+
+ if (interrupt_not_for_us(h))
+ return IRQ_NONE;
+ spin_lock_irqsave(&h->lock, flags);
+ while (interrupt_pending(h)) {
+ raw_tag = get_next_completion(h);
+ while (raw_tag != FIFO_EMPTY)
+ raw_tag = next_command(h);
+ }
+ spin_unlock_irqrestore(&h->lock, flags);
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t hpsa_msix_discard_completions(int irq, void *dev_id)
+{
+ struct ctlr_info *h = dev_id;
+ unsigned long flags;
+ u32 raw_tag;
+
+ if (ignore_bogus_interrupt(h))
+ return IRQ_NONE;
+
+ spin_lock_irqsave(&h->lock, flags);
+ raw_tag = get_next_completion(h);
+ while (raw_tag != FIFO_EMPTY)
+ raw_tag = next_command(h);
+ spin_unlock_irqrestore(&h->lock, flags);
+ return IRQ_HANDLED;
+}
+
static irqreturn_t do_hpsa_intr_intx(int irq, void *dev_id)
{
struct ctlr_info *h = dev_id;
@@ -3124,7 +3202,6 @@ static __devinit int hpsa_message(struct pci_dev *pdev, unsigned char opcode,
return 0;
}

-#define hpsa_soft_reset_controller(p) hpsa_message(p, 1, 0)
#define hpsa_noop(p) hpsa_message(p, 3, 0)

static int hpsa_controller_hard_reset(struct pci_dev *pdev,
@@ -3320,7 +3397,7 @@ static __devinit int hpsa_kdump_hard_reset_controller(struct pci_dev *pdev)
"'Bit 2 doorbell reset' is "
"supported, but not 'bit 5 doorbell reset'. "
"Firmware update is recommended.\n");
- rc = -ENODEV;
+ rc = -ENOTSUPP; /* try soft reset */
goto unmap_cfgtable;
}
}
@@ -3344,13 +3421,18 @@ static __devinit int hpsa_kdump_hard_reset_controller(struct pci_dev *pdev)
/* Wait for board to become not ready, then ready. */
dev_info(&pdev->dev, "Waiting for board to reset.\n");
rc = hpsa_wait_for_board_state(pdev, vaddr, BOARD_NOT_READY);
- if (rc)
+ if (rc) {
dev_warn(&pdev->dev,
- "failed waiting for board to reset\n");
+ "failed waiting for board to reset."
+ " Will try soft reset.\n");
+ rc = -ENOTSUPP; /* Not expected, but try soft reset later */
+ goto unmap_cfgtable;
+ }
rc = hpsa_wait_for_board_state(pdev, vaddr, BOARD_READY);
if (rc) {
dev_warn(&pdev->dev,
- "failed waiting for board to become ready\n");
+ "failed waiting for board to become ready "
+ "after hard reset\n");
goto unmap_cfgtable;
}

@@ -3358,11 +3440,11 @@ static __devinit int hpsa_kdump_hard_reset_controller(struct pci_dev *pdev)
if (rc < 0)
goto unmap_cfgtable;
if (rc) {
- dev_warn(&pdev->dev, "Unable to successfully reset controller,"
- " Ignoring controller.\n");
- rc = -ENODEV;
+ dev_warn(&pdev->dev, "Unable to successfully reset "
+ "controller. Will try soft reset.\n");
+ rc = -ENOTSUPP;
} else {
- dev_info(&pdev->dev, "board ready.\n");
+ dev_info(&pdev->dev, "board ready after hard reset.\n");
}

unmap_cfgtable:
@@ -3840,7 +3922,7 @@ static __devinit int hpsa_init_reset_devices(struct pci_dev *pdev)
* due to concerns about shared bbwc between 6402/6404 pair.
*/
if (rc == -ENOTSUPP)
- return 0; /* just try to do the kdump anyhow. */
+ return rc; /* just try to do the kdump anyhow. */
if (rc)
return -ENODEV;

@@ -3910,18 +3992,79 @@ static int hpsa_request_irq(struct ctlr_info *h,
return 0;
}

+static int __devinit hpsa_kdump_soft_reset(struct ctlr_info *h)
+{
+ if (hpsa_send_host_reset(h, RAID_CTLR_LUNID,
+ HPSA_RESET_TYPE_CONTROLLER)) {
+ dev_warn(&h->pdev->dev, "Resetting array controller failed.\n");
+ return -EIO;
+ }
+
+ dev_info(&h->pdev->dev, "Waiting for board to soft reset.\n");
+ if (hpsa_wait_for_board_state(h->pdev, h->vaddr, BOARD_NOT_READY)) {
+ dev_warn(&h->pdev->dev, "Soft reset had no effect.\n");
+ return -1;
+ }
+
+ dev_info(&h->pdev->dev, "Board reset, awaiting READY status.\n");
+ if (hpsa_wait_for_board_state(h->pdev, h->vaddr, BOARD_READY)) {
+ dev_warn(&h->pdev->dev, "Board failed to become ready "
+ "after soft reset.\n");
+ return -1;
+ }
+
+ return 0;
+}
+
+static void hpsa_undo_allocations_after_kdump_soft_reset(struct ctlr_info *h)
+{
+ free_irq(h->intr[h->intr_mode], h);
+#ifdef CONFIG_PCI_MSI
+ if (h->msix_vector)
+ pci_disable_msix(h->pdev);
+ else if (h->msi_vector)
+ pci_disable_msi(h->pdev);
+#endif /* CONFIG_PCI_MSI */
+ hpsa_free_sg_chain_blocks(h);
+ hpsa_free_cmd_pool(h);
+ kfree(h->blockFetchTable);
+ pci_free_consistent(h->pdev, h->reply_pool_size,
+ h->reply_pool, h->reply_pool_dhandle);
+ if (h->vaddr)
+ iounmap(h->vaddr);
+ if (h->transtable)
+ iounmap(h->transtable);
+ if (h->cfgtable)
+ iounmap(h->cfgtable);
+ pci_release_regions(h->pdev);
+ kfree(h);
+}
+
static int __devinit hpsa_init_one(struct pci_dev *pdev,
const struct pci_device_id *ent)
{
int dac, rc;
struct ctlr_info *h;
+ int try_soft_reset = 0;
+ unsigned long flags;

if (number_of_controllers == 0)
printk(KERN_INFO DRIVER_NAME "\n");

rc = hpsa_init_reset_devices(pdev);
- if (rc)
- return rc;
+ if (rc) {
+ if (rc != -ENOTSUPP)
+ return rc;
+ /* If the reset fails in a particular way (it has no way to do
+ * a proper hard reset, so returns -ENOTSUPP) we can try to do
+ * a soft reset once we get the controller configured up to the
+ * point that it can accept a command.
+ */
+ try_soft_reset = 1;
+ rc = 0;
+ }
+
+reinit_after_soft_reset:

/* Command structures must be aligned on a 32-byte boundary because
* the 5 lower bits of the address are used by the hardware. and by
@@ -3981,11 +4124,66 @@ static int __devinit hpsa_init_one(struct pci_dev *pdev,
h->ndevices = 0;
h->scsi_host = NULL;
spin_lock_init(&h->devlock);
+ hpsa_put_ctlr_into_performant_mode(h);
+
+ /* At this point, the controller is ready to take commands.
+ * Now, if reset_devices and the hard reset didn't work, try
+ * the soft reset and see if that works.
+ */
+ if (try_soft_reset) {
+
+ /* This is kind of gross. We may or may not get a completion
+ * from the soft reset command, and if we do, then the value
+ * from the fifo may or may not be valid. So, we wait 10 secs
+ * after the reset throwing away any completions we get during
+ * that time. Unregister the interrupt handler and register
+ * fake ones to scoop up any residual completions.
+ */
+ spin_lock_irqsave(&h->lock, flags);
+ h->access.set_intr_mask(h, HPSA_INTR_OFF);
+ spin_unlock_irqrestore(&h->lock, flags);
+ free_irq(h->intr[h->intr_mode], h);
+ rc = hpsa_request_irq(h, hpsa_msix_discard_completions,
+ hpsa_intx_discard_completions);
+ if (rc) {
+ dev_warn(&h->pdev->dev, "Failed to request_irq after "
+ "soft reset.\n");
+ goto clean4;
+ }
+
+ rc = hpsa_kdump_soft_reset(h);
+ if (rc)
+ /* Neither hard nor soft reset worked, we're hosed. */
+ goto clean4;
+
+ dev_info(&h->pdev->dev, "Board READY.\n");
+ dev_info(&h->pdev->dev,
+ "Waiting for stale completions to drain.\n");
+ h->access.set_intr_mask(h, HPSA_INTR_ON);
+ msleep(10000);
+ h->access.set_intr_mask(h, HPSA_INTR_OFF);
+
+ rc = controller_reset_failed(h->cfgtable);
+ if (rc)
+ dev_info(&h->pdev->dev,
+ "Soft reset appears to have failed.\n");
+
+ /* since the controller's reset, we have to go back and re-init
+ * everything. Easiest to just forget what we've done and do it
+ * all over again.
+ */
+ hpsa_undo_allocations_after_kdump_soft_reset(h);
+ try_soft_reset = 0;
+ if (rc)
+ /* don't go to clean4, we already unallocated */
+ return -ENODEV;
+
+ goto reinit_after_soft_reset;
+ }

/* Turn the interrupts on so we can service requests */
h->access.set_intr_mask(h, HPSA_INTR_ON);

- hpsa_put_ctlr_into_performant_mode(h);
hpsa_hba_inquiry(h);
hpsa_register_scsi(h); /* hook ourselves into SCSI subsystem */
h->busy_initializing = 0;
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index b1412a7..6d8dcd4 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -127,8 +127,10 @@ struct ctlr_info {
};
#define HPSA_ABORT_MSG 0
#define HPSA_DEVICE_RESET_MSG 1
-#define HPSA_BUS_RESET_MSG 2
-#define HPSA_HOST_RESET_MSG 3
+#define HPSA_RESET_TYPE_CONTROLLER 0x00
+#define HPSA_RESET_TYPE_BUS 0x01
+#define HPSA_RESET_TYPE_TARGET 0x03
+#define HPSA_RESET_TYPE_LUN 0x04
#define HPSA_MSG_SEND_RETRY_LIMIT 10
#define HPSA_MSG_SEND_RETRY_INTERVAL_MSECS (10000)

2011-05-03 20:00:03

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 14/16] hpsa: remove superfluous sleeps around reset code

From: Stephen M. Cameron <[email protected]>

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 3 ---
1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 6fe77d0..ca8ee92 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3217,7 +3217,6 @@ static int hpsa_controller_hard_reset(struct pci_dev *pdev,
*/
dev_info(&pdev->dev, "using doorbell to reset controller\n");
writel(use_doorbell, vaddr + SA5_DOORBELL);
- msleep(1000);
} else { /* Try to do it the PCI power state way */

/* Quoting from the Open CISS Specification: "The Power
@@ -3248,8 +3247,6 @@ static int hpsa_controller_hard_reset(struct pci_dev *pdev,
pmcsr &= ~PCI_PM_CTRL_STATE_MASK;
pmcsr |= PCI_D0;
pci_write_config_word(pdev, pos + PCI_PM_CTRL, pmcsr);
-
- msleep(500);
}
return 0;
}

2011-05-03 20:06:37

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 15/16] hpsa: do not attempt PCI power management reset method if we know it won't work.

From: Stephen M. Cameron <[email protected]>

Just go straight to the soft-reset method instead.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 52 +++++++++++++++++++++++++++++++++++++--------------
1 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index ca8ee92..5c03cb5 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -273,7 +273,7 @@ static ssize_t host_show_transport_mode(struct device *dev,
"performant" : "simple");
}

-/* List of controllers which cannot be reset on kexec with reset_devices */
+/* List of controllers which cannot be hard reset on kexec with reset_devices */
static u32 unresettable_controller[] = {
0x324a103C, /* Smart Array P712m */
0x324b103C, /* SmartArray P711m */
@@ -291,16 +291,45 @@ static u32 unresettable_controller[] = {
0x409D0E11, /* Smart Array 6400 EM */
};

-static int ctlr_is_resettable(struct ctlr_info *h)
+/* List of controllers which cannot even be soft reset */
+static u32 soft_unresettable_controller[] = {
+ /* Exclude 640x boards. These are two pci devices in one slot
+ * which share a battery backed cache module. One controls the
+ * cache, the other accesses the cache through the one that controls
+ * it. If we reset the one controlling the cache, the other will
+ * likely not be happy. Just forbid resetting this conjoined mess.
+ * The 640x isn't really supported by hpsa anyway.
+ */
+ 0x409C0E11, /* Smart Array 6400 */
+ 0x409D0E11, /* Smart Array 6400 EM */
+};
+
+static int ctlr_is_hard_resettable(u32 board_id)
{
int i;

for (i = 0; i < ARRAY_SIZE(unresettable_controller); i++)
- if (unresettable_controller[i] == h->board_id)
+ if (unresettable_controller[i] == board_id)
+ return 0;
+ return 1;
+}
+
+static int ctlr_is_soft_resettable(u32 board_id)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(soft_unresettable_controller); i++)
+ if (soft_unresettable_controller[i] == board_id)
return 0;
return 1;
}

+static int ctlr_is_resettable(u32 board_id)
+{
+ return ctlr_is_hard_resettable(board_id) ||
+ ctlr_is_soft_resettable(board_id);
+}
+
static ssize_t host_show_resettable(struct device *dev,
struct device_attribute *attr, char *buf)
{
@@ -308,7 +337,7 @@ static ssize_t host_show_resettable(struct device *dev,
struct Scsi_Host *shost = class_to_shost(dev);

h = shost_to_hba(shost);
- return snprintf(buf, 20, "%d\n", ctlr_is_resettable(h));
+ return snprintf(buf, 20, "%d\n", ctlr_is_resettable(h->board_id));
}

static inline int is_logical_dev_addr_mode(unsigned char scsi3addr[])
@@ -3334,20 +3363,15 @@ static __devinit int hpsa_kdump_hard_reset_controller(struct pci_dev *pdev)
* using the doorbell register.
*/

- /* Exclude 640x boards. These are two pci devices in one slot
- * which share a battery backed cache module. One controls the
- * cache, the other accesses the cache through the one that controls
- * it. If we reset the one controlling the cache, the other will
- * likely not be happy. Just forbid resetting this conjoined mess.
- * The 640x isn't really supported by hpsa anyway.
- */
rc = hpsa_lookup_board_id(pdev, &board_id);
- if (rc < 0) {
+ if (rc < 0 || !ctlr_is_resettable(board_id)) {
dev_warn(&pdev->dev, "Not resetting device.\n");
return -ENODEV;
}
- if (board_id == 0x409C0E11 || board_id == 0x409D0E11)
- return -ENOTSUPP;
+
+ /* if controller is soft- but not hard resettable... */
+ if (!ctlr_is_hard_resettable(board_id))
+ return -ENOTSUPP; /* try soft reset later. */

/* Save the PCI command register */
pci_read_config_word(pdev, 4, &command_register);

2011-05-03 20:00:11

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 16/16] hpsa: add P2000 to list of shared SAS devices

From: Stephen M. Cameron <[email protected]>

Signed-off-by: Scott Teel <[email protected]>
---
drivers/scsi/hpsa.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 5c03cb5..cffc7bb 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1591,6 +1591,7 @@ static unsigned char *msa2xxx_model[] = {
"MSA2024",
"MSA2312",
"MSA2324",
+ "P2000 G3 SAS",
NULL,
};

2011-05-04 11:16:07

by Tomas Henzl

[permalink] [raw]
Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o path to ensure commands don't get lost.

On 05/03/2011 09:58 PM, Stephen M. Cameron wrote:
> From: Stephen M. Cameron <[email protected]>
>
> Apparently we've been doin it rong for a decade, but only lately do we
> run into problems.
>
> Signed-off-by: Stephen M. Cameron <[email protected]>
> ---
> drivers/scsi/hpsa.h | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
> index 621a153..98c97ca 100644
> --- a/drivers/scsi/hpsa.h
> +++ b/drivers/scsi/hpsa.h
> @@ -212,6 +212,7 @@ static void SA5_submit_command(struct ctlr_info *h,
> dev_dbg(&h->pdev->dev, "Sending %x, tag = %x\n", c->busaddr,
> c->Header.Tag.lower);
> writel(c->busaddr, h->vaddr + SA5_REQUEST_PORT_OFFSET);
> + (void) readl(h->vaddr + SA5_REQUEST_PORT_OFFSET);
>
Hi,
a small nit -
the (void) ^ is I think not needed for gcc and isn't present in the cciss.h patch

Tomas

> h->commands_outstanding++;
> if (h->commands_outstanding > h->max_outstanding)
> h->max_outstanding = h->commands_outstanding;
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2011-05-04 12:52:18

by Stephen M. Cameron

[permalink] [raw]
Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o path to ensure commands don't get lost.

On Wed, May 04, 2011 at 01:15:50PM +0200, Tomas Henzl wrote:
> On 05/03/2011 09:58 PM, Stephen M. Cameron wrote:
> > From: Stephen M. Cameron <[email protected]>
> >
> > Apparently we've been doin it rong for a decade, but only lately do we
> > run into problems.
> >
> > Signed-off-by: Stephen M. Cameron <[email protected]>
> > ---
> > drivers/scsi/hpsa.h | 1 +
> > 1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
> > index 621a153..98c97ca 100644
> > --- a/drivers/scsi/hpsa.h
> > +++ b/drivers/scsi/hpsa.h
> > @@ -212,6 +212,7 @@ static void SA5_submit_command(struct ctlr_info *h,
> > dev_dbg(&h->pdev->dev, "Sending %x, tag = %x\n", c->busaddr,
> > c->Header.Tag.lower);
> > writel(c->busaddr, h->vaddr + SA5_REQUEST_PORT_OFFSET);
> > + (void) readl(h->vaddr + SA5_REQUEST_PORT_OFFSET);
> >
> Hi,
> a small nit -
> the (void) ^ is I think not needed for gcc and isn't present in the cciss.h patch

I just put it there to make it clear that it ignoring the return of readl is
done intentionally, not accidentally. If this goes against some coding convention,
whatever, I'm not super attached to the (void), but I did put it there on purpose,
and would have done it in cciss as well, had I thought of it at the time.

-- steve

>
> Tomas
>
> > h->commands_outstanding++;
> > if (h->commands_outstanding > h->max_outstanding)
> > h->max_outstanding = h->commands_outstanding;
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >

2011-05-04 13:34:47

by Tomas Henzl

[permalink] [raw]
Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o path to ensure commands don't get lost.

On 05/04/2011 02:52 PM, [email protected] wrote:
> On Wed, May 04, 2011 at 01:15:50PM +0200, Tomas Henzl wrote:
>
>> On 05/03/2011 09:58 PM, Stephen M. Cameron wrote:
>>
>>> From: Stephen M. Cameron <[email protected]>
>>>
>>> Apparently we've been doin it rong for a decade, but only lately do we
>>> run into problems.
>>>
>>> Signed-off-by: Stephen M. Cameron <[email protected]>
>>> ---
>>> drivers/scsi/hpsa.h | 1 +
>>> 1 files changed, 1 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
>>> index 621a153..98c97ca 100644
>>> --- a/drivers/scsi/hpsa.h
>>> +++ b/drivers/scsi/hpsa.h
>>> @@ -212,6 +212,7 @@ static void SA5_submit_command(struct ctlr_info *h,
>>> dev_dbg(&h->pdev->dev, "Sending %x, tag = %x\n", c->busaddr,
>>> c->Header.Tag.lower);
>>> writel(c->busaddr, h->vaddr + SA5_REQUEST_PORT_OFFSET);
>>> + (void) readl(h->vaddr + SA5_REQUEST_PORT_OFFSET);
>>>
>>>
>> Hi,
>> a small nit -
>> the (void) ^ is I think not needed for gcc and isn't present in the cciss.h patch
>>
> I just put it there to make it clear that it ignoring the return of readl is
> done intentionally, not accidentally. If this goes against some coding convention,
> whatever, I'm not super attached to the (void), but I did put it there on purpose,
> and would have done it in cciss as well, had I thought of it at the time.
>
I'm not sure what the coding convention says about this, I personally would omit it,
both ways are used in the kernel - so it is fine for me.

Ack - Tomas



> -- steve
>
>
>> Tomas
>>
>>
>>> h->commands_outstanding++;
>>> if (h->commands_outstanding > h->max_outstanding)
>>> h->max_outstanding = h->commands_outstanding;
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2011-05-04 17:29:35

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o path to ensure commands don't get lost.

On Wed, 04 May 2011 07:52:12 CDT, [email protected] said:
> On Wed, May 04, 2011 at 01:15:50PM +0200, Tomas Henzl wrote:
> > On 05/03/2011 09:58 PM, Stephen M. Cameron wrote:
> > > From: Stephen M. Cameron <[email protected]>

> > > dev_dbg(&h->pdev->dev, "Sending %x, tag = %x\n", c->busaddr,
> > > c->Header.Tag.lower);
> > > writel(c->busaddr, h->vaddr + SA5_REQUEST_PORT_OFFSET);
> > > + (void) readl(h->vaddr + SA5_REQUEST_PORT_OFFSET);

> I just put it there to make it clear that it ignoring the return of readl is
> done intentionally, not accidentally. If this goes against some coding convention,
> whatever, I'm not super attached to the (void), but I did put it there on purpose,
> and would have done it in cciss as well, had I thought of it at the time.

This probably needs a comment like
/* don't care - dummy read just to force write posting to chipset */
or similar. I'm assuming it's just functioning as a barrier-type flush of some sort?




Attachments:
(No filename) (227.00 B)

2011-05-04 17:37:38

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o path to ensure commands don't get lost.

On Wed, May 04, 2011 at 01:28:21PM -0400, [email protected] wrote:
> On Wed, 04 May 2011 07:52:12 CDT, [email protected] said:
> > On Wed, May 04, 2011 at 01:15:50PM +0200, Tomas Henzl wrote:
> > > On 05/03/2011 09:58 PM, Stephen M. Cameron wrote:
> > > > From: Stephen M. Cameron <[email protected]>
>
> > > > dev_dbg(&h->pdev->dev, "Sending %x, tag = %x\n", c->busaddr,
> > > > c->Header.Tag.lower);
> > > > writel(c->busaddr, h->vaddr + SA5_REQUEST_PORT_OFFSET);
> > > > + (void) readl(h->vaddr + SA5_REQUEST_PORT_OFFSET);
>
> > I just put it there to make it clear that it ignoring the return of readl is
> > done intentionally, not accidentally. If this goes against some coding convention,
> > whatever, I'm not super attached to the (void), but I did put it there on purpose,
> > and would have done it in cciss as well, had I thought of it at the time.
>
> This probably needs a comment like
> /* don't care - dummy read just to force write posting to chipset */
> or similar. I'm assuming it's just functioning as a barrier-type flush of some sort?

It's a PCI write flush. It's not clear to me why it's needed here,
though. The write will eventually get to the device; why we need to
make the CPU wait around for it to actually get there doesn't make sense.

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."

2011-05-04 17:55:34

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o path to ensure commands don't get lost.

On Wed, 04 May 2011 11:37:35 MDT, Matthew Wilcox said:
> > This probably needs a comment like
> > /* don't care - dummy read just to force write posting to chipset */
> > or similar. I'm assuming it's just functioning as a barrier-type flush of some sort?
>
> It's a PCI write flush. It's not clear to me why it's needed here,
> though. The write will eventually get to the device; why we need to
> make the CPU wait around for it to actually get there doesn't make sense.

Exactly why I think it needs a one-liner comment. :)



Attachments:
(No filename) (227.00 B)

2011-05-05 18:41:29

by Mike Miller

[permalink] [raw]
Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o path to ensure commands don't get lost.

On Wed, May 04, 2011 at 01:54:22PM -0400, [email protected] wrote:
> On Wed, 04 May 2011 11:37:35 MDT, Matthew Wilcox said:
> > > This probably needs a comment like
> > > /* don't care - dummy read just to force write posting to chipset */
> > > or similar. I'm assuming it's just functioning as a barrier-type flush of some sort?
> >
> > It's a PCI write flush. It's not clear to me why it's needed here,
> > though. The write will eventually get to the device; why we need to
> > make the CPU wait around for it to actually get there doesn't make sense.
>
> Exactly why I think it needs a one-liner comment. :)
>
So we're not exactly sure why it's needed either. We've had reports of
commands getting "lost" or "stuck" under some workloads. The extra readl
works around the issue but certainly may have negative side effects.

I'm not sure I understand how writel works.

>From linux-2.6/arch/x86/include/asm/io.h:

#define build_mmio_write(name, size, type, reg, barrier) \
static inline void name(type val, volatile void __iomem *addr) \
{ asm volatile("mov" size " %0,%1": :reg (val), \
"m" (*(volatile type __force *)addr) barrier); }

This implies (at least to me) that a barrier is part of writel. I don't know
why a write operation needs a barrier but thats essentially what we've done
by adding the extra readl. Can someone confirm or deny that a barrier is
actually built into writel? Or used by writel? If so, does this indicate
that barrier is broken?

At this point we (the software guys) are pretty much at a loss as to how to
continue debugging. We don't know what to trigger on for the PCIe analyzer.
If we track outstanding commands then trigger on one that doesn't complete in
some amount of time the problem could conceivably be far in the past and
difficult to correlate to the data in the trace.

If anyone has any thoughts, suggestions, or flames they would be greatly
appreciated.

-- mikem

2011-05-17 10:12:36

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH 16/16] hpsa: add P2000 to list of shared SAS devices

On Tue, 2011-05-03 at 15:00 -0500, Stephen M. Cameron wrote:
> From: Stephen M. Cameron <[email protected]>
>
> Signed-off-by: Scott Teel <[email protected]>

Just FYI for future, this is missing your signoff.; it needs to be there
because you're sending me the patch. I also suspect this patch is
actually by Stott Teel? In which case it should show as From: him.

James

2011-05-17 13:26:15

by Stephen M. Cameron

[permalink] [raw]
Subject: Re: [PATCH 16/16] hpsa: add P2000 to list of shared SAS devices

On Tue, May 17, 2011 at 02:12:27PM +0400, James Bottomley wrote:
> On Tue, 2011-05-03 at 15:00 -0500, Stephen M. Cameron wrote:
> > From: Stephen M. Cameron <[email protected]>
> >
> > Signed-off-by: Scott Teel <[email protected]>
>
> Just FYI for future, this is missing your signoff.; it needs to be there
> because you're sending me the patch. I also suspect this patch is
> actually by Stott Teel? In which case it should show as From: him.

Ok thanks, and sorry about that. Yes, it is from Scott Teel. I'm not
very accustomed to forwarding patches from other people.

-- steve

2011-05-23 11:37:56

by Tomas Henzl

[permalink] [raw]
Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o path to ensure commands don't get lost.

On 05/05/2011 08:35 PM, Mike Miller wrote:
> On Wed, May 04, 2011 at 01:54:22PM -0400, [email protected] wrote:
>
>> On Wed, 04 May 2011 11:37:35 MDT, Matthew Wilcox said:
>>
>>>> This probably needs a comment like
>>>> /* don't care - dummy read just to force write posting to chipset */
>>>> or similar. I'm assuming it's just functioning as a barrier-type flush of some sort?
>>>>
>>> It's a PCI write flush. It's not clear to me why it's needed here,
>>> though. The write will eventually get to the device; why we need to
>>> make the CPU wait around for it to actually get there doesn't make sense.
>>>
>> Exactly why I think it needs a one-liner comment. :)
>>
>>
> So we're not exactly sure why it's needed either. We've had reports of
> commands getting "lost" or "stuck" under some workloads. The extra readl
> works around the issue but certainly may have negative side effects.
>
> I'm not sure I understand how writel works.
>
> From linux-2.6/arch/x86/include/asm/io.h:
>
> #define build_mmio_write(name, size, type, reg, barrier) \
> static inline void name(type val, volatile void __iomem *addr) \
> { asm volatile("mov" size " %0,%1": :reg (val), \
> "m" (*(volatile type __force *)addr) barrier); }
>
> This implies (at least to me) that a barrier is part of writel. I don't know
> why a write operation needs a barrier but thats essentially what we've done
> by adding the extra readl. Can someone confirm or deny that a barrier is
> actually built into writel? Or used by writel? If so, does this indicate
> that barrier is broken?
>
> At this point we (the software guys) are pretty much at a loss as to how to
> continue debugging. We don't know what to trigger on for the PCIe analyzer.
> If we track outstanding commands then trigger on one that doesn't complete in
> some amount of time the problem could conceivably be far in the past and
> difficult to correlate to the data in the trace.
>
I'd look at the firmware part, you could check what happens for example when
the firmware gets send a command it doesn't understand.
You could also change the communication with the fw by adding a count field, which can
be then checked for the !(next_value == previous_value + 1) and raise an event.
tomas


> If anyone has any thoughts, suggestions, or flames they would be greatly
> appreciated.
>
> -- mikem
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2011-05-25 15:26:48

by Mike Miller

[permalink] [raw]
Subject: RE: [PATCH 01/16] hpsa: do readl after writel in main i/o path to ensure commands don't get lost.

Tomas wrote:

> -----Original Message-----
> From: Tomas Henzl [mailto:[email protected]]
> Sent: Monday, May 23, 2011 6:38 AM
> To: Miller, Mike (OS Dev)
> Cc: [email protected]; [email protected]; Andrew Morton;
> LKML; LKML-scsi; Jens Axboe
> Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o path
> to ensure commands don't get lost.
>
> On 05/05/2011 08:35 PM, Mike Miller wrote:
> > On Wed, May 04, 2011 at 01:54:22PM -0400, [email protected]
> wrote:
> >
> >> On Wed, 04 May 2011 11:37:35 MDT, Matthew Wilcox said:
> >>
> >>>> This probably needs a comment like
> >>>> /* don't care - dummy read just to force write posting to chipset
> */
> >>>> or similar. I'm assuming it's just functioning as a barrier-type
> flush of some sort?
> >>>>
> >>> It's a PCI write flush. It's not clear to me why it's needed here,
> >>> though. The write will eventually get to the device; why we need to
> >>> make the CPU wait around for it to actually get there doesn't make
> sense.
> >>>
> >> Exactly why I think it needs a one-liner comment. :)
> >>
> >>
> > So we're not exactly sure why it's needed either. We've had reports of
> > commands getting "lost" or "stuck" under some workloads. The extra
> readl
> > works around the issue but certainly may have negative side effects.
> >
> > I'm not sure I understand how writel works.
> >
> > From linux-2.6/arch/x86/include/asm/io.h:
> >
> > #define build_mmio_write(name, size, type, reg, barrier) \
> > static inline void name(type val, volatile void __iomem *addr) \
> > { asm volatile("mov" size " %0,%1": :reg (val), \
> > "m" (*(volatile type __force *)addr) barrier); }
> >
> > This implies (at least to me) that a barrier is part of writel. I
> don't know
> > why a write operation needs a barrier but thats essentially what we've
> done
> > by adding the extra readl. Can someone confirm or deny that a barrier
> is
> > actually built into writel? Or used by writel? If so, does this
> indicate
> > that barrier is broken?
> >
> > At this point we (the software guys) are pretty much at a loss as to
> how to
> > continue debugging. We don't know what to trigger on for the PCIe
> analyzer.
> > If we track outstanding commands then trigger on one that doesn't
> complete in
> > some amount of time the problem could conceivably be far in the past
> and
> > difficult to correlate to the data in the trace.
> >
> I'd look at the firmware part, you could check what happens for example
> when
> the firmware gets send a command it doesn't understand.
> You could also change the communication with the fw by adding a count
> field, which can
> be then checked for the !(next_value == previous_value + 1) and raise an
> event.
> tomas

Tomas,
We've tried something very similar to the counter idea in fw. It doesn't help because the controller thinks he's done with the request. We have a (pretty crude) counter in the driver but no timing mechanism. We could add a timer. But what's a suitable timeout value? Is 2 seconds too short, too long? Suggestions, please.

-- mikem


>
>
> > If anyone has any thoughts, suggestions, or flames they would be
> greatly
> > appreciated.
> >
> > -- mikem
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi"
> in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >

2011-05-26 12:14:28

by Tomas Henzl

[permalink] [raw]
Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o path to ensure commands don't get lost.

On 05/25/2011 05:20 PM, Miller, Mike (OS Dev) wrote:
> Tomas wrote:
>
>
>> -----Original Message-----
>> From: Tomas Henzl [mailto:[email protected]]
>> Sent: Monday, May 23, 2011 6:38 AM
>> To: Miller, Mike (OS Dev)
>> Cc: [email protected]; [email protected]; Andrew Morton;
>> LKML; LKML-scsi; Jens Axboe
>> Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o path
>> to ensure commands don't get lost.
>>
>> On 05/05/2011 08:35 PM, Mike Miller wrote:
>>
>>> On Wed, May 04, 2011 at 01:54:22PM -0400, [email protected]
>>>
>> wrote:
>>
>>>
>>>> On Wed, 04 May 2011 11:37:35 MDT, Matthew Wilcox said:
>>>>
>>>>
>>>>>> This probably needs a comment like
>>>>>> /* don't care - dummy read just to force write posting to chipset
>>>>>>
>> */
>>
>>>>>> or similar. I'm assuming it's just functioning as a barrier-type
>>>>>>
>> flush of some sort?
>>
>>>>>>
>>>>> It's a PCI write flush. It's not clear to me why it's needed here,
>>>>> though. The write will eventually get to the device; why we need to
>>>>> make the CPU wait around for it to actually get there doesn't make
>>>>>
>> sense.
>>
>>>>>
>>>> Exactly why I think it needs a one-liner comment. :)
>>>>
>>>>
>>>>
>>> So we're not exactly sure why it's needed either. We've had reports of
>>> commands getting "lost" or "stuck" under some workloads. The extra
>>>
>> readl
>>
>>> works around the issue but certainly may have negative side effects.
>>>
>>> I'm not sure I understand how writel works.
>>>
>>> From linux-2.6/arch/x86/include/asm/io.h:
>>>
>>> #define build_mmio_write(name, size, type, reg, barrier) \
>>> static inline void name(type val, volatile void __iomem *addr) \
>>> { asm volatile("mov" size " %0,%1": :reg (val), \
>>> "m" (*(volatile type __force *)addr) barrier); }
>>>
>>> This implies (at least to me) that a barrier is part of writel. I
>>>
>> don't know
>>
>>> why a write operation needs a barrier but thats essentially what we've
>>>
>> done
>>
>>> by adding the extra readl. Can someone confirm or deny that a barrier
>>>
>> is
>>
>>> actually built into writel? Or used by writel? If so, does this
>>>
>> indicate
>>
>>> that barrier is broken?
>>>
>>> At this point we (the software guys) are pretty much at a loss as to
>>>
>> how to
>>
>>> continue debugging. We don't know what to trigger on for the PCIe
>>>
>> analyzer.
>>
>>> If we track outstanding commands then trigger on one that doesn't
>>>
>> complete in
>>
>>> some amount of time the problem could conceivably be far in the past
>>>
>> and
>>
>>> difficult to correlate to the data in the trace.
>>>
>>>
>> I'd look at the firmware part, you could check what happens for example
>> when
>> the firmware gets send a command it doesn't understand.
>> You could also change the communication with the fw by adding a count
>> field, which can
>> be then checked for the !(next_value == previous_value + 1) and raise an
>> event.
>> tomas
>>
> Tomas,
> We've tried something very similar to the counter idea in fw. It doesn't help because the controller thinks he's done with the request. We have a (pretty crude) counter in the driver but no timing mechanism. We could add a timer. But what's a suitable timeout value? Is 2 seconds too short, too long? Suggestions, please.
>
I know that a counter isn't a ground-breaking idea, just wanted to show some interest :)
The command can be either eaten by the firmware or during the communication in or out from the device.
I'd would start by the communication, by adding some fields to the command to detect if a command in the row(s) isn't
missing - I know even that isn't easy. The same could be done independently done for the other direction.

tomash

> -- mikem
>
>
>
>>
>>
>>> If anyone has any thoughts, suggestions, or flames they would be
>>>
>> greatly
>>
>>> appreciated.
>>>
>>> -- mikem
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
>>>
>> in
>>
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>>
>

2011-05-26 14:55:08

by Mike Miller

[permalink] [raw]
Subject: RE: [PATCH 01/16] hpsa: do readl after writel in main i/o path to ensure commands don't get lost.



> -----Original Message-----
> From: Tomas Henzl [mailto:[email protected]]
> Sent: Thursday, May 26, 2011 7:14 AM
> To: Miller, Mike (OS Dev)
> Cc: [email protected]; [email protected]; Andrew Morton;
> LKML; LKML-scsi; Jens Axboe
> Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o path
> to ensure commands don't get lost.
>
> On 05/25/2011 05:20 PM, Miller, Mike (OS Dev) wrote:
> > Tomas wrote:
> >
> >
> >> -----Original Message-----
> >> From: Tomas Henzl [mailto:[email protected]]
> >> Sent: Monday, May 23, 2011 6:38 AM
> >> To: Miller, Mike (OS Dev)
> >> Cc: [email protected]; [email protected]; Andrew
> Morton;
> >> LKML; LKML-scsi; Jens Axboe
> >> Subject: Re: [PATCH 01/16] hpsa: do readl after writel in main i/o
> path
> >> to ensure commands don't get lost.
> >>
> >> On 05/05/2011 08:35 PM, Mike Miller wrote:
> >>
> >>> On Wed, May 04, 2011 at 01:54:22PM -0400, [email protected]
> >>>
> >> wrote:
> >>
> >>>
> >>>> On Wed, 04 May 2011 11:37:35 MDT, Matthew Wilcox said:
> >>>>
> >>>>
> >>>>>> This probably needs a comment like
> >>>>>> /* don't care - dummy read just to force write posting to
> chipset
> >>>>>>
> >> */
> >>
> >>>>>> or similar. I'm assuming it's just functioning as a barrier-type
> >>>>>>
> >> flush of some sort?
> >>
> >>>>>>
> >>>>> It's a PCI write flush. It's not clear to me why it's needed
> here,
> >>>>> though. The write will eventually get to the device; why we need
> to
> >>>>> make the CPU wait around for it to actually get there doesn't make
> >>>>>
> >> sense.
> >>
> >>>>>
> >>>> Exactly why I think it needs a one-liner comment. :)
> >>>>
> >>>>
> >>>>
> >>> So we're not exactly sure why it's needed either. We've had reports
> of
> >>> commands getting "lost" or "stuck" under some workloads. The extra
> >>>
> >> readl
> >>
> >>> works around the issue but certainly may have negative side effects.
> >>>
> >>> I'm not sure I understand how writel works.
> >>>
> >>> From linux-2.6/arch/x86/include/asm/io.h:
> >>>
> >>> #define build_mmio_write(name, size, type, reg, barrier) \
> >>> static inline void name(type val, volatile void __iomem *addr) \
> >>> { asm volatile("mov" size " %0,%1": :reg (val), \
> >>> "m" (*(volatile type __force *)addr) barrier); }
> >>>
> >>> This implies (at least to me) that a barrier is part of writel. I
> >>>
> >> don't know
> >>
> >>> why a write operation needs a barrier but thats essentially what
> we've
> >>>
> >> done
> >>
> >>> by adding the extra readl. Can someone confirm or deny that a
> barrier
> >>>
> >> is
> >>
> >>> actually built into writel? Or used by writel? If so, does this
> >>>
> >> indicate
> >>
> >>> that barrier is broken?
> >>>
> >>> At this point we (the software guys) are pretty much at a loss as to
> >>>
> >> how to
> >>
> >>> continue debugging. We don't know what to trigger on for the PCIe
> >>>
> >> analyzer.
> >>
> >>> If we track outstanding commands then trigger on one that doesn't
> >>>
> >> complete in
> >>
> >>> some amount of time the problem could conceivably be far in the past
> >>>
> >> and
> >>
> >>> difficult to correlate to the data in the trace.
> >>>
> >>>
> >> I'd look at the firmware part, you could check what happens for
> example
> >> when
> >> the firmware gets send a command it doesn't understand.
> >> You could also change the communication with the fw by adding a count
> >> field, which can
> >> be then checked for the !(next_value == previous_value + 1) and raise
> an
> >> event.
> >> tomas
> >>
> > Tomas,
> > We've tried something very similar to the counter idea in fw. It
> doesn't help because the controller thinks he's done with the request.
> We have a (pretty crude) counter in the driver but no timing mechanism.
> We could add a timer. But what's a suitable timeout value? Is 2 seconds
> too short, too long? Suggestions, please.
> >
> I know that a counter isn't a ground-breaking idea, just wanted to show
> some interest :)

:)

> The command can be either eaten by the firmware or during the
> communication in or out from the device.
> I'd would start by the communication, by adding some fields to the
> command to detect if a command in the row(s) isn't
> missing - I know even that isn't easy. The same could be done
> independently done for the other direction.
>
> tomash

Thanks, Tomas.

>
> > -- mikem
> >
> >
> >
> >>
> >>
> >>> If anyone has any thoughts, suggestions, or flames they would be
> >>>
> >> greatly
> >>
> >>> appreciated.
> >>>
> >>> -- mikem
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe linux-
> scsi"
> >>>
> >> in
> >>
> >>> the body of a message to [email protected]
> >>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>
> >>>
> >