2012-05-01 16:42:08

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 00/17] hpsa: resend updates for April, 2012

The two big changes here are adding an abort handler, and using multiple
reply queues for command completions. Updated the patch set since last
time to address comments:

* Use pci_set_master(h->pdev) to enable bus mastering instead of
doing it manually.

* Retry driver initiated commands on SAM_STAT_TASK_SET_FULL
as well as SAM_STAT_BUSY and retry more than just 3 times,
backing off the retry rate.

* Dropped the changes that used the new IS_ENABLED macro.

Matt Bondurant (1):
hpsa: retry driver initiated commands on busy status

Matt Gates (2):
hpsa: use multiple reply queues
hpsa: refine interrupt handler locking for greater concurrency

Mike Miller (1):
hpsa: add new RAID level "1(ADM)"

Stephen M. Cameron (13):
hpsa: call pci_disable_device on driver unload
hpsa: do not skip disabled devices
hpsa: enable bus master bit after pci_enable_device
hpsa: suppress excessively chatty error messages
hpsa: do not read from controller unnecessarily in completion code
hpsa: do not give up retry of driver cmds after only 3 retries
hpsa: remove unused parameter from finish_cmd
hpsa: add abort error handler function
hpsa: do aborts two ways
hpsa: factor out tail calls to next_command() in process_(non)indexed_cmd()
hpsa: factor out hpsa_free_irqs_and_disable_msix
hpsa: removed unused member maxQsinceinit
hpsa: dial down lockup detection during firmware flash


drivers/block/cciss.c | 2
drivers/scsi/hpsa.c | 682 ++++++++++++++++++++++++++++++++++++++---------
drivers/scsi/hpsa.h | 85 ++++--
drivers/scsi/hpsa_cmd.h | 37 ++-
4 files changed, 642 insertions(+), 164 deletions(-)

--
-- steve


2012-05-01 16:42:24

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 03/17] hpsa: enable bus master bit after pci_enable_device

From: Stephen M. Cameron <[email protected]>

pci_disable_device() disables the bus master bit and pci_enable_device does
not re-enable it. It needs to be enabled.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 5119ce6..de9c8f3 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3933,6 +3933,9 @@ static int __devinit hpsa_pci_init(struct ctlr_info *h)
return err;
}

+ /* Enable bus mastering (pci_disable_device may disable this) */
+ pci_set_master(h->pdev);
+
err = pci_request_regions(h->pdev, HPSA);
if (err) {
dev_err(&h->pdev->dev,

2012-05-01 16:42:34

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 05/17] hpsa: do not read from controller unnecessarily in completion code

From: Stephen M. Cameron <[email protected]>

MSI/MSI-X interrupts can't race the DMA completion they are communicating
so no need to read from controller to flush the DMA to the host if
MSI or MSI-X interrupts are being used.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.h | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 7b28d54..48f7812 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -258,12 +258,12 @@ static unsigned long SA5_performant_completed(struct ctlr_info *h)
{
unsigned long register_value = FIFO_EMPTY;

- /* flush the controller write of the reply queue by reading
- * outbound doorbell status register.
- */
- register_value = readl(h->vaddr + SA5_OUTDB_STATUS);
/* msi auto clears the interrupt pending bit. */
if (!(h->msi_vector || h->msix_vector)) {
+ /* flush the controller write of the reply queue by reading
+ * outbound doorbell status register.
+ */
+ register_value = readl(h->vaddr + SA5_OUTDB_STATUS);
writel(SA5_OUTDB_CLEAR_PERF_BIT, h->vaddr + SA5_OUTDB_CLEAR);
/* Do a read in order to flush the write to the controller
* (as per spec.)

2012-05-01 16:42:29

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 04/17] hpsa: suppress excessively chatty error messages

From: Stephen M. Cameron <[email protected]>

Default behavior for any CHECK CONDITION excepting a few special cases is to
print out certain parts of the sense buffer and the CDB. Default behavior
should be to print nothing and let the upper layers or applications decide what
to do about these. The same information is already available by setting the
appropriate bits of the scsi_logging_level kernel parameter or via
/proc/sys/dev/scsi/logging_level.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index ab16c16..9e09717 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1193,7 +1193,7 @@ static void complete_scsi_command(struct CommandList *cp)
break;
}
/* Must be some other type of check condition */
- dev_warn(&h->pdev->dev, "cp %p has check condition: "
+ dev_dbg(&h->pdev->dev, "cp %p has check condition: "
"unknown type: "
"Sense: 0x%x, ASC: 0x%x, ASCQ: 0x%x, "
"Returning result: 0x%x, "

2012-05-01 16:42:39

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 06/17] hpsa: retry driver initiated commands on busy status

From: Matt Bondurant <[email protected]>

In shared SAS configurations we might get a busy status
during driver initiated commands (e.g. during rescan for
devices). We should retry the command in such cases rather
than giving up.

Signed-off-by: Matt Bondurant <[email protected]>
Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 13 ++++++++++++-
1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 9e09717..5015ac4 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -234,6 +234,16 @@ static int check_for_unit_attention(struct ctlr_info *h,
return 1;
}

+static int check_for_busy(struct ctlr_info *h, struct CommandList *c)
+{
+ if (c->err_info->CommandStatus != CMD_TARGET_STATUS ||
+ (c->err_info->ScsiStatus != SAM_STAT_BUSY &&
+ c->err_info->ScsiStatus != SAM_STAT_TASK_SET_FULL))
+ return 0;
+ dev_warn(&h->pdev->dev, HPSA "device busy");
+ return 1;
+}
+
static ssize_t host_store_rescan(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
@@ -1379,7 +1389,8 @@ static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
memset(c->err_info, 0, sizeof(*c->err_info));
hpsa_scsi_do_simple_cmd_core(h, c);
retry_count++;
- } while (check_for_unit_attention(h, c) && retry_count <= 3);
+ } while ((check_for_unit_attention(h, c) ||
+ check_for_busy(h, c)) && retry_count <= 3);
hpsa_pci_unmap(h->pdev, c, 1, data_direction);
}

2012-05-01 16:42:44

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 07/17] hpsa: do not give up retry of driver cmds after only 3 retries

From: Stephen M. Cameron <[email protected]>

Instead of giving up after 3 immediate retries of driver initiated
commands, back off the rate of retries and retry a bunch more times.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 11 +++++++++--
1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 5015ac4..cb9d619 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1380,17 +1380,24 @@ static void hpsa_scsi_do_simple_cmd_core_if_no_lockup(struct ctlr_info *h,
}
}

+#define MAX_DRIVER_CMD_RETRIES 25
static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
struct CommandList *c, int data_direction)
{
- int retry_count = 0;
+ int backoff_time = 10, retry_count = 0;

do {
memset(c->err_info, 0, sizeof(*c->err_info));
hpsa_scsi_do_simple_cmd_core(h, c);
retry_count++;
+ if (retry_count > 3) {
+ msleep(backoff_time);
+ if (backoff_time < 1000)
+ backoff_time *= 2;
+ }
} while ((check_for_unit_attention(h, c) ||
- check_for_busy(h, c)) && retry_count <= 3);
+ check_for_busy(h, c)) &&
+ retry_count <= MAX_DRIVER_CMD_RETRIES);
hpsa_pci_unmap(h->pdev, c, 1, data_direction);
}

2012-05-01 16:42:57

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 09/17] hpsa: add abort error handler function

From: Stephen M. Cameron <[email protected]>

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 221 +++++++++++++++++++++++++++++++++++++++++++++++
drivers/scsi/hpsa.h | 21 ++++
drivers/scsi/hpsa_cmd.h | 31 ++++++-
3 files changed, 271 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 49b31ad..abf5c7e 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -159,6 +159,7 @@ static int hpsa_change_queue_depth(struct scsi_device *sdev,
int qdepth, int reason);

static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd);
+static int hpsa_eh_abort_handler(struct scsi_cmnd *scsicmd);
static int hpsa_slave_alloc(struct scsi_device *sdev);
static void hpsa_slave_destroy(struct scsi_device *sdev);

@@ -180,6 +181,7 @@ static int __devinit hpsa_pci_find_memory_BAR(struct pci_dev *pdev,
static int __devinit hpsa_lookup_board_id(struct pci_dev *pdev, u32 *board_id);
static int __devinit hpsa_wait_for_board_state(struct pci_dev *pdev,
void __iomem *vaddr, int wait_for_ready);
+static inline void finish_cmd(struct CommandList *c);
#define BOARD_NOT_READY 0
#define BOARD_READY 1

@@ -507,6 +509,7 @@ static struct scsi_host_template hpsa_driver_template = {
.change_queue_depth = hpsa_change_queue_depth,
.this_id = -1,
.use_clustering = ENABLE_CLUSTERING,
+ .eh_abort_handler = hpsa_eh_abort_handler,
.eh_device_reset_handler = hpsa_eh_device_reset_handler,
.ioctl = hpsa_ioctl,
.slave_alloc = hpsa_slave_alloc,
@@ -2352,6 +2355,191 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
return FAILED;
}

+static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
+ struct CommandList *abort)
+{
+ int rc = IO_OK;
+ struct CommandList *c;
+ struct ErrorInfo *ei;
+
+ c = cmd_special_alloc(h);
+ if (c == NULL) { /* trouble... */
+ dev_warn(&h->pdev->dev, "cmd_special_alloc returned NULL!\n");
+ return -ENOMEM;
+ }
+
+ fill_cmd(c, HPSA_ABORT_MSG, h, abort, 0, 0, scsi3addr, TYPE_MSG);
+ hpsa_scsi_do_simple_cmd_core(h, c);
+ dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: do_simple_cmd_core completed.\n",
+ __func__, abort->Header.Tag.upper, abort->Header.Tag.lower);
+ /* no unmap needed here because no data xfer. */
+
+ ei = c->err_info;
+ switch (ei->CommandStatus) {
+ case CMD_SUCCESS:
+ break;
+ case CMD_UNABORTABLE: /* Very common, don't make noise. */
+ rc = -1;
+ break;
+ default:
+ dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: interpreting error.\n",
+ __func__, abort->Header.Tag.upper,
+ abort->Header.Tag.lower);
+ hpsa_scsi_interpret_error(c);
+ rc = -1;
+ break;
+ }
+ cmd_special_free(h, c);
+ dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: Finished.\n", __func__,
+ abort->Header.Tag.upper, abort->Header.Tag.lower);
+ return rc;
+}
+
+/*
+ * hpsa_find_cmd_in_queue
+ *
+ * Used to determine whether a command (find) is still present
+ * in queue_head. Optionally excludes the last element of queue_head.
+ *
+ * This is used to avoid unnecessary aborts. Commands in h->reqQ have
+ * not yet been submitted, and so can be aborted by the driver without
+ * sending an abort to the hardware.
+ *
+ * Returns pointer to command if found in queue, NULL otherwise.
+ */
+static struct CommandList *hpsa_find_cmd_in_queue(struct ctlr_info *h,
+ struct scsi_cmnd *find, struct list_head *queue_head)
+{
+ unsigned long flags;
+ struct CommandList *c = NULL; /* ptr into cmpQ */
+
+ if (!find)
+ return 0;
+ spin_lock_irqsave(&h->lock, flags);
+ list_for_each_entry(c, queue_head, list) {
+ if (c->scsi_cmd == NULL) /* e.g.: passthru ioctl */
+ continue;
+ if (c->scsi_cmd == find) {
+ spin_unlock_irqrestore(&h->lock, flags);
+ return c;
+ }
+ }
+ spin_unlock_irqrestore(&h->lock, flags);
+ return NULL;
+}
+
+/* Send an abort for the specified command.
+ * If the device and controller support it,
+ * send a task abort request.
+ */
+static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
+{
+
+ int i, rc;
+ struct ctlr_info *h;
+ struct hpsa_scsi_dev_t *dev;
+ struct CommandList *abort; /* pointer to command to be aborted */
+ struct CommandList *found;
+ struct scsi_cmnd *as; /* ptr to scsi cmd inside aborted command. */
+ char msg[256]; /* For debug messaging. */
+ int ml = 0;
+
+ /* Find the controller of the command to be aborted */
+ h = sdev_to_hba(sc->device);
+ if (WARN(h == NULL,
+ "ABORT REQUEST FAILED, Controller lookup failed.\n"))
+ return FAILED;
+
+ /* Check that controller supports some kind of task abort */
+ if (!(HPSATMF_PHYS_TASK_ABORT & h->TMFSupportFlags) &&
+ !(HPSATMF_LOG_TASK_ABORT & h->TMFSupportFlags))
+ return FAILED;
+
+ memset(msg, 0, sizeof(msg));
+ ml += sprintf(msg+ml, "ABORT REQUEST on C%d:B%d:T%d:L%d ",
+ h->scsi_host->host_no, sc->device->channel,
+ sc->device->id, sc->device->lun);
+
+ /* Find the device of the command to be aborted */
+ dev = sc->device->hostdata;
+ if (!dev) {
+ dev_err(&h->pdev->dev, "%s FAILED, Device lookup failed.\n",
+ msg);
+ return FAILED;
+ }
+
+ /* Get SCSI command to be aborted */
+ abort = (struct CommandList *) sc->host_scribble;
+ if (abort == NULL) {
+ dev_err(&h->pdev->dev, "%s FAILED, Command to abort is NULL.\n",
+ msg);
+ return FAILED;
+ }
+
+ ml += sprintf(msg+ml, "Tag:0x%08x:%08x ",
+ abort->Header.Tag.upper, abort->Header.Tag.lower);
+ as = (struct scsi_cmnd *) abort->scsi_cmd;
+ if (as != NULL)
+ ml += sprintf(msg+ml, "Command:0x%x SN:0x%lx ",
+ as->cmnd[0], as->serial_number);
+ dev_dbg(&h->pdev->dev, "%s\n", msg);
+ dev_warn(&h->pdev->dev, "Abort request on C%d:B%d:T%d:L%d\n",
+ h->scsi_host->host_no, dev->bus, dev->target, dev->lun);
+
+ /* Search reqQ to See if command is queued but not submitted,
+ * if so, complete the command with aborted status and remove
+ * it from the reqQ.
+ */
+ found = hpsa_find_cmd_in_queue(h, sc, &h->reqQ);
+ if (found) {
+ found->err_info->CommandStatus = CMD_ABORTED;
+ finish_cmd(found);
+ dev_info(&h->pdev->dev, "%s Request SUCCEEDED (driver queue).\n",
+ msg);
+ return SUCCESS;
+ }
+
+ /* not in reqQ, if also not in cmpQ, must have already completed */
+ found = hpsa_find_cmd_in_queue(h, sc, &h->cmpQ);
+ if (!found) {
+ dev_dbg(&h->pdev->dev, "%s Request FAILED (not known to driver).\n",
+ msg);
+ return SUCCESS;
+ }
+
+ /*
+ * Command is in flight, or possibly already completed
+ * by the firmware (but not to the scsi mid layer) but we can't
+ * distinguish which. Send the abort down.
+ */
+ rc = hpsa_send_abort(h, dev->scsi3addr, abort);
+ if (rc != 0) {
+ dev_dbg(&h->pdev->dev, "%s Request FAILED.\n", msg);
+ dev_warn(&h->pdev->dev, "FAILED abort on device C%d:B%d:T%d:L%d\n",
+ h->scsi_host->host_no,
+ dev->bus, dev->target, dev->lun);
+ return FAILED;
+ }
+ dev_info(&h->pdev->dev, "%s REQUEST SUCCEEDED.\n", msg);
+
+ /* If the abort(s) above completed and actually aborted the
+ * command, then the command to be aborted should already be
+ * completed. If not, wait around a bit more to see if they
+ * manage to complete normally.
+ */
+#define ABORT_COMPLETE_WAIT_SECS 30
+ for (i = 0; i < ABORT_COMPLETE_WAIT_SECS * 10; i++) {
+ found = hpsa_find_cmd_in_queue(h, sc, &h->cmpQ);
+ if (!found)
+ return SUCCESS;
+ msleep(100);
+ }
+ dev_warn(&h->pdev->dev, "%s FAILED. Aborted command has not completed after %d seconds.\n",
+ msg, ABORT_COMPLETE_WAIT_SECS);
+ return FAILED;
+}
+
+
/*
* For operations that cannot sleep, a command block is allocated at init,
* and managed by cmd_alloc() and cmd_free() using a simple bitmap to track
@@ -2884,6 +3072,7 @@ static void fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
int cmd_type)
{
int pci_dir = XFER_NONE;
+ struct CommandList *a; /* for commands to be aborted */

c->cmd_type = CMD_IOCTL_PEND;
c->Header.ReplyQueue = 0;
@@ -2967,8 +3156,35 @@ static void fill_cmd(struct CommandList *c, u8 cmd, struct ctlr_info *h,
c->Request.CDB[5] = 0x00;
c->Request.CDB[6] = 0x00;
c->Request.CDB[7] = 0x00;
+ break;
+ case HPSA_ABORT_MSG:
+ a = buff; /* point to command to be aborted */
+ dev_dbg(&h->pdev->dev, "Abort Tag:0x%08x:%08x using request Tag:0x%08x:%08x\n",
+ a->Header.Tag.upper, a->Header.Tag.lower,
+ c->Header.Tag.upper, c->Header.Tag.lower);
+ c->Request.CDBLen = 16;
+ c->Request.Type.Type = TYPE_MSG;
+ c->Request.Type.Attribute = ATTR_SIMPLE;
+ c->Request.Type.Direction = XFER_WRITE;
+ c->Request.Timeout = 0; /* Don't time out */
+ c->Request.CDB[0] = HPSA_TASK_MANAGEMENT;
+ c->Request.CDB[1] = HPSA_TMF_ABORT_TASK;
+ c->Request.CDB[2] = 0x00; /* reserved */
+ c->Request.CDB[3] = 0x00; /* reserved */
+ /* Tag to abort goes in CDB[4]-CDB[11] */
+ c->Request.CDB[4] = a->Header.Tag.lower & 0xFF;
+ c->Request.CDB[5] = (a->Header.Tag.lower >> 8) & 0xFF;
+ c->Request.CDB[6] = (a->Header.Tag.lower >> 16) & 0xFF;
+ c->Request.CDB[7] = (a->Header.Tag.lower >> 24) & 0xFF;
+ c->Request.CDB[8] = a->Header.Tag.upper & 0xFF;
+ c->Request.CDB[9] = (a->Header.Tag.upper >> 8) & 0xFF;
+ c->Request.CDB[10] = (a->Header.Tag.upper >> 16) & 0xFF;
+ c->Request.CDB[11] = (a->Header.Tag.upper >> 24) & 0xFF;
+ c->Request.CDB[12] = 0x00; /* reserved */
+ c->Request.CDB[13] = 0x00; /* reserved */
+ c->Request.CDB[14] = 0x00; /* reserved */
+ c->Request.CDB[15] = 0x00; /* reserved */
break;
-
default:
dev_warn(&h->pdev->dev, "unknown message type %d\n",
cmd);
@@ -3848,6 +4064,9 @@ static void __devinit hpsa_find_board_params(struct ctlr_info *h)
h->maxsgentries = 31; /* default to traditional values */
h->chainsize = 0;
}
+
+ /* Find out what task management functions are supported and cache */
+ h->TMFSupportFlags = readl(&(h->cfgtable->TMFSupportFlags));
}

static inline bool hpsa_CISS_signature_present(struct ctlr_info *h)
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 48f7812..d8aa95c 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -125,6 +125,27 @@ struct ctlr_info {
u64 last_heartbeat_timestamp;
u32 lockup_detected;
struct list_head lockup_list;
+ u32 TMFSupportFlags; /* cache what task mgmt funcs are supported. */
+#define HPSATMF_BITS_SUPPORTED (1 << 0)
+#define HPSATMF_PHYS_LUN_RESET (1 << 1)
+#define HPSATMF_PHYS_NEX_RESET (1 << 2)
+#define HPSATMF_PHYS_TASK_ABORT (1 << 3)
+#define HPSATMF_PHYS_TSET_ABORT (1 << 4)
+#define HPSATMF_PHYS_CLEAR_ACA (1 << 5)
+#define HPSATMF_PHYS_CLEAR_TSET (1 << 6)
+#define HPSATMF_PHYS_QRY_TASK (1 << 7)
+#define HPSATMF_PHYS_QRY_TSET (1 << 8)
+#define HPSATMF_PHYS_QRY_ASYNC (1 << 9)
+#define HPSATMF_MASK_SUPPORTED (1 << 16)
+#define HPSATMF_LOG_LUN_RESET (1 << 17)
+#define HPSATMF_LOG_NEX_RESET (1 << 18)
+#define HPSATMF_LOG_TASK_ABORT (1 << 19)
+#define HPSATMF_LOG_TSET_ABORT (1 << 20)
+#define HPSATMF_LOG_CLEAR_ACA (1 << 21)
+#define HPSATMF_LOG_CLEAR_TSET (1 << 22)
+#define HPSATMF_LOG_QRY_TASK (1 << 23)
+#define HPSATMF_LOG_QRY_TSET (1 << 24)
+#define HPSATMF_LOG_QRY_ASYNC (1 << 25)
};
#define HPSA_ABORT_MSG 0
#define HPSA_DEVICE_RESET_MSG 1
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index 8049815..14b56c9 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -82,6 +82,29 @@
#define TYPE_CMD 0x00
#define TYPE_MSG 0x01

+/* Message Types */
+#define HPSA_TASK_MANAGEMENT 0x00
+#define HPSA_RESET 0x01
+#define HPSA_SCAN 0x02
+#define HPSA_NOOP 0x03
+
+#define HPSA_CTLR_RESET_TYPE 0x00
+#define HPSA_BUS_RESET_TYPE 0x01
+#define HPSA_TARGET_RESET_TYPE 0x03
+#define HPSA_LUN_RESET_TYPE 0x04
+#define HPSA_NEXUS_RESET_TYPE 0x05
+
+/* Task Management Functions */
+#define HPSA_TMF_ABORT_TASK 0x00
+#define HPSA_TMF_ABORT_TASK_SET 0x01
+#define HPSA_TMF_CLEAR_ACA 0x02
+#define HPSA_TMF_CLEAR_TASK_SET 0x03
+#define HPSA_TMF_QUERY_TASK 0x04
+#define HPSA_TMF_QUERY_TASK_SET 0x05
+#define HPSA_TMF_QUERY_ASYNCEVENT 0x06
+
+
+
/* config space register offsets */
#define CFG_VENDORID 0x00
#define CFG_DEVICEID 0x02
@@ -337,11 +360,17 @@ struct CfgTable {
u32 MaxPhysicalDevices;
u32 MaxPhysicalDrivesPerLogicalUnit;
u32 MaxPerformantModeCommands;
- u8 reserved[0x78 - 0x58];
+ u32 MaxBlockFetch;
+ u32 PowerConservationSupport;
+ u32 PowerConservationEnable;
+ u32 TMFSupportFlags;
+ u8 TMFTagMask[8];
+ u8 reserved[0x78 - 0x70];
u32 misc_fw_support; /* offset 0x78 */
#define MISC_FW_DOORBELL_RESET (0x02)
#define MISC_FW_DOORBELL_RESET2 (0x010)
u8 driver_version[32];
+
};

#define NUM_BLOCKFETCH_ENTRIES 8

2012-05-01 16:42:49

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 08/17] hpsa: remove unused parameter from finish_cmd

From: Stephen M. Cameron <[email protected]>

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index cb9d619..49b31ad 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3063,7 +3063,7 @@ static inline int bad_tag(struct ctlr_info *h, u32 tag_index,
return 0;
}

-static inline void finish_cmd(struct CommandList *c, u32 raw_tag)
+static inline void finish_cmd(struct CommandList *c)
{
removeQ(c);
if (likely(c->cmd_type == CMD_SCSI))
@@ -3103,7 +3103,7 @@ static inline u32 process_indexed_cmd(struct ctlr_info *h,
if (bad_tag(h, tag_index, raw_tag))
return next_command(h);
c = h->cmd_pool + tag_index;
- finish_cmd(c, raw_tag);
+ finish_cmd(c);
return next_command(h);
}

@@ -3117,7 +3117,7 @@ static inline u32 process_nonindexed_cmd(struct ctlr_info *h,
tag = hpsa_tag_discard_error_bits(h, raw_tag);
list_for_each_entry(c, &h->cmpQ, list) {
if ((c->busaddr & 0xFFFFFFE0) == (tag & 0xFFFFFFE0)) {
- finish_cmd(c, raw_tag);
+ finish_cmd(c);
return next_command(h);
}
}
@@ -4170,7 +4170,7 @@ static void fail_all_cmds_on_list(struct ctlr_info *h, struct list_head *list)
while (!list_empty(list)) {
c = list_entry(list->next, struct CommandList, list);
c->err_info->CommandStatus = CMD_HARDWARE_ERR;
- finish_cmd(c, c->Header.Tag.lower);
+ finish_cmd(c);
}
}

2012-05-01 16:43:00

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 10/17] hpsa: do aborts two ways

From: Stephen M. Cameron <[email protected]>

When aborting a command, the tag is supposed to be
specified as 64-bit little endian. However, some smart
arrays expect the tag of the command to be aborted to be
specified in a strange byte order. How to tell which sort
of Smart Array firmware we're dealing with is not obvious.
However, because of the way we construct our tags, the values
of any outstanding tag when specified with the "strange" byte
order will not collide with the value specified in the correct
order. That means we can safely attempt the abort both ways.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index abf5c7e..3eda5f7 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -2355,8 +2355,23 @@ static int hpsa_eh_device_reset_handler(struct scsi_cmnd *scsicmd)
return FAILED;
}

+static void swizzle_abort_tag(u8 *tag)
+{
+ u8 original_tag[8];
+
+ memcpy(original_tag, tag, 8);
+ tag[0] = original_tag[3];
+ tag[1] = original_tag[2];
+ tag[2] = original_tag[1];
+ tag[3] = original_tag[0];
+ tag[4] = original_tag[7];
+ tag[5] = original_tag[6];
+ tag[6] = original_tag[5];
+ tag[7] = original_tag[4];
+}
+
static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
- struct CommandList *abort)
+ struct CommandList *abort, int swizzle)
{
int rc = IO_OK;
struct CommandList *c;
@@ -2369,6 +2384,8 @@ static int hpsa_send_abort(struct ctlr_info *h, unsigned char *scsi3addr,
}

fill_cmd(c, HPSA_ABORT_MSG, h, abort, 0, 0, scsi3addr, TYPE_MSG);
+ if (swizzle)
+ swizzle_abort_tag(&c->Request.CDB[4]);
hpsa_scsi_do_simple_cmd_core(h, c);
dev_dbg(&h->pdev->dev, "%s: Tag:0x%08x:%08x: do_simple_cmd_core completed.\n",
__func__, abort->Header.Tag.upper, abort->Header.Tag.lower);
@@ -2428,6 +2445,59 @@ static struct CommandList *hpsa_find_cmd_in_queue(struct ctlr_info *h,
return NULL;
}

+static struct CommandList *hpsa_find_cmd_in_queue_by_tag(struct ctlr_info *h,
+ u8 *tag, struct list_head *queue_head)
+{
+ unsigned long flags;
+ struct CommandList *c;
+
+ spin_lock_irqsave(&h->lock, flags);
+ list_for_each_entry(c, queue_head, list) {
+ if (memcmp(&c->Header.Tag, tag, 8) != 0)
+ continue;
+ spin_unlock_irqrestore(&h->lock, flags);
+ return c;
+ }
+ spin_unlock_irqrestore(&h->lock, flags);
+ return NULL;
+}
+
+/* Some Smart Arrays need the abort tag swizzled, and some don't. It's hard to
+ * tell which kind we're dealing with, so we send the abort both ways. There
+ * shouldn't be any collisions between swizzled and unswizzled tags due to the
+ * way we construct our tags but we check anyway in case the assumptions which
+ * make this true someday become false.
+ */
+static int hpsa_send_abort_both_ways(struct ctlr_info *h,
+ unsigned char *scsi3addr, struct CommandList *abort)
+{
+ u8 swizzled_tag[8];
+ struct CommandList *c;
+ int rc = 0, rc2 = 0;
+
+ /* we do not expect to find the swizzled tag in our queue, but
+ * check anyway just to be sure the assumptions which make this
+ * the case haven't become wrong.
+ */
+ memcpy(swizzled_tag, &abort->Request.CDB[4], 8);
+ swizzle_abort_tag(swizzled_tag);
+ c = hpsa_find_cmd_in_queue_by_tag(h, swizzled_tag, &h->cmpQ);
+ if (c != NULL) {
+ dev_warn(&h->pdev->dev, "Unexpectedly found byte-swapped tag in completion queue.\n");
+ return hpsa_send_abort(h, scsi3addr, abort, 0);
+ }
+ rc = hpsa_send_abort(h, scsi3addr, abort, 0);
+
+ /* if the command is still in our queue, we can't conclude that it was
+ * aborted (it might have just completed normally) but in any case
+ * we don't need to try to abort it another way.
+ */
+ c = hpsa_find_cmd_in_queue(h, abort->scsi_cmd, &h->cmpQ);
+ if (c)
+ rc2 = hpsa_send_abort(h, scsi3addr, abort, 1);
+ return rc && rc2;
+}
+
/* Send an abort for the specified command.
* If the device and controller support it,
* send a task abort request.
@@ -2512,7 +2582,7 @@ static int hpsa_eh_abort_handler(struct scsi_cmnd *sc)
* by the firmware (but not to the scsi mid layer) but we can't
* distinguish which. Send the abort down.
*/
- rc = hpsa_send_abort(h, dev->scsi3addr, abort);
+ rc = hpsa_send_abort_both_ways(h, dev->scsi3addr, abort);
if (rc != 0) {
dev_dbg(&h->pdev->dev, "%s Request FAILED.\n", msg);
dev_warn(&h->pdev->dev, "FAILED abort on device C%d:B%d:T%d:L%d\n",

2012-05-01 16:43:06

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 11/17] hpsa: factor out tail calls to next_command() in process_(non)indexed_cmd()

From: Stephen M. Cameron <[email protected]>

This is in order to smooth the way for upcoming changes to allow use of
multiple reply queues for command completions.

Signed-off-by: Stephen M. Cameron <[email protected]>
Signed-off-by: Matt Gates <[email protected]>
---
drivers/scsi/hpsa.c | 30 +++++++++++++++---------------
1 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 3eda5f7..7bf0327 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3379,22 +3379,21 @@ static inline u32 hpsa_tag_discard_error_bits(struct ctlr_info *h, u32 tag)
}

/* process completion of an indexed ("direct lookup") command */
-static inline u32 process_indexed_cmd(struct ctlr_info *h,
+static inline void process_indexed_cmd(struct ctlr_info *h,
u32 raw_tag)
{
u32 tag_index;
struct CommandList *c;

tag_index = hpsa_tag_to_index(raw_tag);
- if (bad_tag(h, tag_index, raw_tag))
- return next_command(h);
- c = h->cmd_pool + tag_index;
- finish_cmd(c);
- return next_command(h);
+ if (!bad_tag(h, tag_index, raw_tag)) {
+ c = h->cmd_pool + tag_index;
+ finish_cmd(c);
+ }
}

/* process completion of a non-indexed command */
-static inline u32 process_nonindexed_cmd(struct ctlr_info *h,
+static inline void process_nonindexed_cmd(struct ctlr_info *h,
u32 raw_tag)
{
u32 tag;
@@ -3404,11 +3403,10 @@ static inline u32 process_nonindexed_cmd(struct ctlr_info *h,
list_for_each_entry(c, &h->cmpQ, list) {
if ((c->busaddr & 0xFFFFFFE0) == (tag & 0xFFFFFFE0)) {
finish_cmd(c);
- return next_command(h);
+ return;
}
}
bad_tag(h, h->nr_cmds + 1, raw_tag);
- return next_command(h);
}

/* Some controllers, like p400, will give us one interrupt
@@ -3483,10 +3481,11 @@ static irqreturn_t do_hpsa_intr_intx(int irq, void *dev_id)
while (interrupt_pending(h)) {
raw_tag = get_next_completion(h);
while (raw_tag != FIFO_EMPTY) {
- if (hpsa_tag_contains_index(raw_tag))
- raw_tag = process_indexed_cmd(h, raw_tag);
+ if (likely(hpsa_tag_contains_index(raw_tag)))
+ process_indexed_cmd(h, raw_tag);
else
- raw_tag = process_nonindexed_cmd(h, raw_tag);
+ process_nonindexed_cmd(h, raw_tag);
+ raw_tag = next_command(h);
}
}
spin_unlock_irqrestore(&h->lock, flags);
@@ -3503,10 +3502,11 @@ static irqreturn_t do_hpsa_intr_msi(int irq, void *dev_id)
h->last_intr_timestamp = get_jiffies_64();
raw_tag = get_next_completion(h);
while (raw_tag != FIFO_EMPTY) {
- if (hpsa_tag_contains_index(raw_tag))
- raw_tag = process_indexed_cmd(h, raw_tag);
+ if (likely(hpsa_tag_contains_index(raw_tag)))
+ process_indexed_cmd(h, raw_tag);
else
- raw_tag = process_nonindexed_cmd(h, raw_tag);
+ process_nonindexed_cmd(h, raw_tag);
+ raw_tag = next_command(h);
}
spin_unlock_irqrestore(&h->lock, flags);
return IRQ_HANDLED;

2012-05-01 16:43:21

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 13/17] hpsa: refine interrupt handler locking for greater concurrency

From: Matt Gates <[email protected]>

Use spinlocks with finer granularity in the submission and
completion paths to allow concurrent execution for multiple
reply queues. In particular, do not hold a spin lock while
submitting a request to the device, nor during most of the
interrupt handler.

Signed-off-by: Matt Gates <[email protected]>
Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 61 +++++++++++++++++++++++++++++++++------------------
drivers/scsi/hpsa.h | 13 +++++++----
2 files changed, 48 insertions(+), 26 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 620e50a..cf2f60e 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -533,6 +533,7 @@ static inline u32 next_command(struct ctlr_info *h, u8 q)
{
u32 a;
struct reply_pool *rq = &h->reply_queue[q];
+ unsigned long flags;

if (unlikely(!(h->transMethod & CFGTBL_Trans_Performant)))
return h->access.command_completed(h, q);
@@ -540,7 +541,9 @@ static inline u32 next_command(struct ctlr_info *h, u8 q)
if ((rq->head[rq->current_entry] & 1) == rq->wraparound) {
a = rq->head[rq->current_entry];
rq->current_entry++;
+ spin_lock_irqsave(&h->lock, flags);
h->commands_outstanding--;
+ spin_unlock_irqrestore(&h->lock, flags);
} else {
a = FIFO_EMPTY;
}
@@ -575,8 +578,8 @@ static void enqueue_cmd_and_start_io(struct ctlr_info *h,
spin_lock_irqsave(&h->lock, flags);
addQ(&h->reqQ, c);
h->Qdepth++;
- start_io(h);
spin_unlock_irqrestore(&h->lock, flags);
+ start_io(h);
}

static inline void removeQ(struct CommandList *c)
@@ -2091,9 +2094,8 @@ static int hpsa_scsi_queue_command_lck(struct scsi_cmnd *cmd,
done(cmd);
return 0;
}
- /* Need a lock as this is being allocated from the pool */
- c = cmd_alloc(h);
spin_unlock_irqrestore(&h->lock, flags);
+ c = cmd_alloc(h);
if (c == NULL) { /* trouble... */
dev_err(&h->pdev->dev, "cmd_alloc returned NULL!\n");
return SCSI_MLQUEUE_HOST_BUSY;
@@ -2627,14 +2629,21 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
int i;
union u64bit temp64;
dma_addr_t cmd_dma_handle, err_dma_handle;
+ unsigned long flags;

+ spin_lock_irqsave(&h->lock, flags);
do {
i = find_first_zero_bit(h->cmd_pool_bits, h->nr_cmds);
- if (i == h->nr_cmds)
+ if (i == h->nr_cmds) {
+ spin_unlock_irqrestore(&h->lock, flags);
return NULL;
+ }
} while (test_and_set_bit
(i & (BITS_PER_LONG - 1),
h->cmd_pool_bits + (i / BITS_PER_LONG)) != 0);
+ h->nr_allocs++;
+ spin_unlock_irqrestore(&h->lock, flags);
+
c = h->cmd_pool + i;
memset(c, 0, sizeof(*c));
cmd_dma_handle = h->cmd_pool_dhandle
@@ -2643,7 +2652,6 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
memset(c->err_info, 0, sizeof(*c->err_info));
err_dma_handle = h->errinfo_pool_dhandle
+ i * sizeof(*c->err_info);
- h->nr_allocs++;

c->cmdindex = i;

@@ -2699,11 +2707,14 @@ static struct CommandList *cmd_special_alloc(struct ctlr_info *h)
static void cmd_free(struct ctlr_info *h, struct CommandList *c)
{
int i;
+ unsigned long flags;

i = c - h->cmd_pool;
+ spin_lock_irqsave(&h->lock, flags);
clear_bit(i & (BITS_PER_LONG - 1),
h->cmd_pool_bits + (i / BITS_PER_LONG));
h->nr_frees++;
+ spin_unlock_irqrestore(&h->lock, flags);
}

static void cmd_special_free(struct ctlr_info *h, struct CommandList *c)
@@ -3307,7 +3318,9 @@ static void __iomem *remap_pci_mem(ulong base, ulong size)
static void start_io(struct ctlr_info *h)
{
struct CommandList *c;
+ unsigned long flags;

+ spin_lock_irqsave(&h->lock, flags);
while (!list_empty(&h->reqQ)) {
c = list_entry(h->reqQ.next, struct CommandList, list);
/* can't do anything if fifo is full */
@@ -3320,12 +3333,23 @@ static void start_io(struct ctlr_info *h)
removeQ(c);
h->Qdepth--;

- /* Tell the controller execute command */
- h->access.submit_command(h, c);
-
/* Put job onto the completed Q */
addQ(&h->cmpQ, c);
+
+ /* Must increment commands_outstanding before unlocking
+ * and submitting to avoid race checking for fifo full
+ * condition.
+ */
+ h->commands_outstanding++;
+ if (h->commands_outstanding > h->max_outstanding)
+ h->max_outstanding = h->commands_outstanding;
+
+ /* Tell the controller execute command */
+ spin_unlock_irqrestore(&h->lock, flags);
+ h->access.submit_command(h, c);
+ spin_lock_irqsave(&h->lock, flags);
}
+ spin_unlock_irqrestore(&h->lock, flags);
}

static inline unsigned long get_next_completion(struct ctlr_info *h, u8 q)
@@ -3356,7 +3380,11 @@ static inline int bad_tag(struct ctlr_info *h, u32 tag_index,

static inline void finish_cmd(struct CommandList *c)
{
+ unsigned long flags;
+
+ spin_lock_irqsave(&c->h->lock, flags);
removeQ(c);
+ spin_unlock_irqrestore(&c->h->lock, flags);
if (likely(c->cmd_type == CMD_SCSI))
complete_scsi_command(c);
else if (c->cmd_type == CMD_IOCTL_PEND)
@@ -3403,14 +3431,18 @@ static inline void process_nonindexed_cmd(struct ctlr_info *h,
{
u32 tag;
struct CommandList *c = NULL;
+ unsigned long flags;

tag = hpsa_tag_discard_error_bits(h, raw_tag);
+ spin_lock_irqsave(&h->lock, flags);
list_for_each_entry(c, &h->cmpQ, list) {
if ((c->busaddr & 0xFFFFFFE0) == (tag & 0xFFFFFFE0)) {
+ spin_unlock_irqrestore(&h->lock, flags);
finish_cmd(c);
return;
}
}
+ spin_unlock_irqrestore(&h->lock, flags);
bad_tag(h, h->nr_cmds + 1, raw_tag);
}

@@ -3447,7 +3479,6 @@ static irqreturn_t hpsa_intx_discard_completions(int irq, void *queue)
{
struct ctlr_info *h = queue_to_hba(queue);
u8 q = *(u8 *) queue;
- unsigned long flags;
u32 raw_tag;

if (ignore_bogus_interrupt(h))
@@ -3455,47 +3486,39 @@ static irqreturn_t hpsa_intx_discard_completions(int irq, void *queue)

if (interrupt_not_for_us(h))
return IRQ_NONE;
- spin_lock_irqsave(&h->lock, flags);
h->last_intr_timestamp = get_jiffies_64();
while (interrupt_pending(h)) {
raw_tag = get_next_completion(h, q);
while (raw_tag != FIFO_EMPTY)
raw_tag = next_command(h, q);
}
- spin_unlock_irqrestore(&h->lock, flags);
return IRQ_HANDLED;
}

static irqreturn_t hpsa_msix_discard_completions(int irq, void *queue)
{
struct ctlr_info *h = queue_to_hba(queue);
- unsigned long flags;
u32 raw_tag;
u8 q = *(u8 *) queue;

if (ignore_bogus_interrupt(h))
return IRQ_NONE;

- spin_lock_irqsave(&h->lock, flags);
-
h->last_intr_timestamp = get_jiffies_64();
raw_tag = get_next_completion(h, q);
while (raw_tag != FIFO_EMPTY)
raw_tag = next_command(h, q);
- spin_unlock_irqrestore(&h->lock, flags);
return IRQ_HANDLED;
}

static irqreturn_t do_hpsa_intr_intx(int irq, void *queue)
{
struct ctlr_info *h = queue_to_hba((u8 *) queue);
- unsigned long flags;
u32 raw_tag;
u8 q = *(u8 *) queue;

if (interrupt_not_for_us(h))
return IRQ_NONE;
- spin_lock_irqsave(&h->lock, flags);
h->last_intr_timestamp = get_jiffies_64();
while (interrupt_pending(h)) {
raw_tag = get_next_completion(h, q);
@@ -3507,18 +3530,15 @@ static irqreturn_t do_hpsa_intr_intx(int irq, void *queue)
raw_tag = next_command(h, q);
}
}
- spin_unlock_irqrestore(&h->lock, flags);
return IRQ_HANDLED;
}

static irqreturn_t do_hpsa_intr_msi(int irq, void *queue)
{
struct ctlr_info *h = queue_to_hba(queue);
- unsigned long flags;
u32 raw_tag;
u8 q = *(u8 *) queue;

- spin_lock_irqsave(&h->lock, flags);
h->last_intr_timestamp = get_jiffies_64();
raw_tag = get_next_completion(h, q);
while (raw_tag != FIFO_EMPTY) {
@@ -3528,7 +3548,6 @@ static irqreturn_t do_hpsa_intr_msi(int irq, void *queue)
process_nonindexed_cmd(h, raw_tag);
raw_tag = next_command(h, q);
}
- spin_unlock_irqrestore(&h->lock, flags);
return IRQ_HANDLED;
}

diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 486a7c0..79c36aa 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -246,9 +246,6 @@ static void SA5_submit_command(struct ctlr_info *h,
c->Header.Tag.lower);
writel(c->busaddr, h->vaddr + SA5_REQUEST_PORT_OFFSET);
(void) readl(h->vaddr + SA5_SCRATCHPAD_OFFSET);
- h->commands_outstanding++;
- if (h->commands_outstanding > h->max_outstanding)
- h->max_outstanding = h->commands_outstanding;
}

/*
@@ -287,7 +284,7 @@ static void SA5_performant_intr_mask(struct ctlr_info *h, unsigned long val)
static unsigned long SA5_performant_completed(struct ctlr_info *h, u8 q)
{
struct reply_pool *rq = &h->reply_queue[q];
- unsigned long register_value = FIFO_EMPTY;
+ unsigned long flags, register_value = FIFO_EMPTY;

/* msi auto clears the interrupt pending bit. */
if (!(h->msi_vector || h->msix_vector)) {
@@ -305,7 +302,9 @@ static unsigned long SA5_performant_completed(struct ctlr_info *h, u8 q)
if ((rq->head[rq->current_entry] & 1) == rq->wraparound) {
register_value = rq->head[rq->current_entry];
rq->current_entry++;
+ spin_lock_irqsave(&h->lock, flags);
h->commands_outstanding--;
+ spin_unlock_irqrestore(&h->lock, flags);
} else {
register_value = FIFO_EMPTY;
}
@@ -338,9 +337,13 @@ static unsigned long SA5_completed(struct ctlr_info *h,
{
unsigned long register_value
= readl(h->vaddr + SA5_REPLY_PORT_OFFSET);
+ unsigned long flags;

- if (register_value != FIFO_EMPTY)
+ if (register_value != FIFO_EMPTY) {
+ spin_lock_irqsave(&h->lock, flags);
h->commands_outstanding--;
+ spin_unlock_irqrestore(&h->lock, flags);
+ }

#ifdef HPSA_DEBUG
if (register_value != FIFO_EMPTY)

2012-05-01 16:43:13

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 12/17] hpsa: use multiple reply queues

From: Matt Gates <[email protected]>

Smart Arrays can support multiple reply queues onto which command
completions may be deposited. It can help performance quite a bit
to arrange for command completions to be processed on the same CPU
from which they were submitted to increase the likelihood of cache
hits.

Signed-off-by: Matt Gates <[email protected]>
Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 181 ++++++++++++++++++++++++++++++++---------------
drivers/scsi/hpsa.h | 40 ++++++----
drivers/scsi/hpsa_cmd.h | 5 +
3 files changed, 153 insertions(+), 73 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 7bf0327..620e50a 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -172,7 +172,7 @@ static void check_ioctl_unit_attention(struct ctlr_info *h,
static void calc_bucket_map(int *bucket, int num_buckets,
int nsgs, int *bucket_map);
static __devinit void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h);
-static inline u32 next_command(struct ctlr_info *h);
+static inline u32 next_command(struct ctlr_info *h, u8 q);
static int __devinit hpsa_find_cfg_addrs(struct pci_dev *pdev,
void __iomem *vaddr, u32 *cfg_base_addr, u64 *cfg_base_addr_index,
u64 *cfg_offset);
@@ -529,24 +529,25 @@ static inline void addQ(struct list_head *list, struct CommandList *c)
list_add_tail(&c->list, list);
}

-static inline u32 next_command(struct ctlr_info *h)
+static inline u32 next_command(struct ctlr_info *h, u8 q)
{
u32 a;
+ struct reply_pool *rq = &h->reply_queue[q];

if (unlikely(!(h->transMethod & CFGTBL_Trans_Performant)))
- return h->access.command_completed(h);
+ return h->access.command_completed(h, q);

- if ((*(h->reply_pool_head) & 1) == (h->reply_pool_wraparound)) {
- a = *(h->reply_pool_head); /* Next cmd in ring buffer */
- (h->reply_pool_head)++;
+ if ((rq->head[rq->current_entry] & 1) == rq->wraparound) {
+ a = rq->head[rq->current_entry];
+ rq->current_entry++;
h->commands_outstanding--;
} else {
a = FIFO_EMPTY;
}
/* Check for wraparound */
- if (h->reply_pool_head == (h->reply_pool + h->max_commands)) {
- h->reply_pool_head = h->reply_pool;
- h->reply_pool_wraparound ^= 1;
+ if (rq->current_entry == h->max_commands) {
+ rq->current_entry = 0;
+ rq->wraparound ^= 1;
}
return a;
}
@@ -557,8 +558,12 @@ static inline u32 next_command(struct ctlr_info *h)
*/
static void set_performant_mode(struct ctlr_info *h, struct CommandList *c)
{
- if (likely(h->transMethod & CFGTBL_Trans_Performant))
+ if (likely(h->transMethod & CFGTBL_Trans_Performant)) {
c->busaddr |= 1 | (h->blockFetchTable[c->Header.SGList] << 1);
+ if (likely(h->msix_vector))
+ c->Header.ReplyQueue =
+ smp_processor_id() % h->nreply_queues;
+ }
}

static void enqueue_cmd_and_start_io(struct ctlr_info *h,
@@ -3323,9 +3328,9 @@ static void start_io(struct ctlr_info *h)
}
}

-static inline unsigned long get_next_completion(struct ctlr_info *h)
+static inline unsigned long get_next_completion(struct ctlr_info *h, u8 q)
{
- return h->access.command_completed(h);
+ return h->access.command_completed(h, q);
}

static inline bool interrupt_pending(struct ctlr_info *h)
@@ -3428,9 +3433,20 @@ static int ignore_bogus_interrupt(struct ctlr_info *h)
return 1;
}

-static irqreturn_t hpsa_intx_discard_completions(int irq, void *dev_id)
+/*
+ * Convert &h->q[x] (passed to interrupt handlers) back to h.
+ * Relies on (h-q[x] == x) being true for x such that
+ * 0 <= x < MAX_REPLY_QUEUES.
+ */
+static struct ctlr_info *queue_to_hba(u8 *queue)
{
- struct ctlr_info *h = dev_id;
+ return container_of((queue - *queue), struct ctlr_info, q[0]);
+}
+
+static irqreturn_t hpsa_intx_discard_completions(int irq, void *queue)
+{
+ struct ctlr_info *h = queue_to_hba(queue);
+ u8 q = *(u8 *) queue;
unsigned long flags;
u32 raw_tag;

@@ -3442,71 +3458,75 @@ static irqreturn_t hpsa_intx_discard_completions(int irq, void *dev_id)
spin_lock_irqsave(&h->lock, flags);
h->last_intr_timestamp = get_jiffies_64();
while (interrupt_pending(h)) {
- raw_tag = get_next_completion(h);
+ raw_tag = get_next_completion(h, q);
while (raw_tag != FIFO_EMPTY)
- raw_tag = next_command(h);
+ raw_tag = next_command(h, q);
}
spin_unlock_irqrestore(&h->lock, flags);
return IRQ_HANDLED;
}

-static irqreturn_t hpsa_msix_discard_completions(int irq, void *dev_id)
+static irqreturn_t hpsa_msix_discard_completions(int irq, void *queue)
{
- struct ctlr_info *h = dev_id;
+ struct ctlr_info *h = queue_to_hba(queue);
unsigned long flags;
u32 raw_tag;
+ u8 q = *(u8 *) queue;

if (ignore_bogus_interrupt(h))
return IRQ_NONE;

spin_lock_irqsave(&h->lock, flags);
+
h->last_intr_timestamp = get_jiffies_64();
- raw_tag = get_next_completion(h);
+ raw_tag = get_next_completion(h, q);
while (raw_tag != FIFO_EMPTY)
- raw_tag = next_command(h);
+ raw_tag = next_command(h, q);
spin_unlock_irqrestore(&h->lock, flags);
return IRQ_HANDLED;
}

-static irqreturn_t do_hpsa_intr_intx(int irq, void *dev_id)
+static irqreturn_t do_hpsa_intr_intx(int irq, void *queue)
{
- struct ctlr_info *h = dev_id;
+ struct ctlr_info *h = queue_to_hba((u8 *) queue);
unsigned long flags;
u32 raw_tag;
+ u8 q = *(u8 *) queue;

if (interrupt_not_for_us(h))
return IRQ_NONE;
spin_lock_irqsave(&h->lock, flags);
h->last_intr_timestamp = get_jiffies_64();
while (interrupt_pending(h)) {
- raw_tag = get_next_completion(h);
+ raw_tag = get_next_completion(h, q);
while (raw_tag != FIFO_EMPTY) {
if (likely(hpsa_tag_contains_index(raw_tag)))
process_indexed_cmd(h, raw_tag);
else
process_nonindexed_cmd(h, raw_tag);
- raw_tag = next_command(h);
+ raw_tag = next_command(h, q);
}
}
spin_unlock_irqrestore(&h->lock, flags);
return IRQ_HANDLED;
}

-static irqreturn_t do_hpsa_intr_msi(int irq, void *dev_id)
+static irqreturn_t do_hpsa_intr_msi(int irq, void *queue)
{
- struct ctlr_info *h = dev_id;
+ struct ctlr_info *h = queue_to_hba(queue);
unsigned long flags;
u32 raw_tag;
+ u8 q = *(u8 *) queue;

spin_lock_irqsave(&h->lock, flags);
h->last_intr_timestamp = get_jiffies_64();
- raw_tag = get_next_completion(h);
+ raw_tag = get_next_completion(h, q);
while (raw_tag != FIFO_EMPTY) {
if (likely(hpsa_tag_contains_index(raw_tag)))
process_indexed_cmd(h, raw_tag);
else
process_nonindexed_cmd(h, raw_tag);
- raw_tag = next_command(h);
+ raw_tag = next_command(h, q);
}
spin_unlock_irqrestore(&h->lock, flags);
return IRQ_HANDLED;
@@ -3942,10 +3962,13 @@ static int find_PCI_BAR_index(struct pci_dev *pdev, unsigned long pci_bar_addr)
static void __devinit hpsa_interrupt_mode(struct ctlr_info *h)
{
#ifdef CONFIG_PCI_MSI
- int err;
- struct msix_entry hpsa_msix_entries[4] = { {0, 0}, {0, 1},
- {0, 2}, {0, 3}
- };
+ int err, i;
+ struct msix_entry hpsa_msix_entries[MAX_REPLY_QUEUES];
+
+ for (i = 0; i < MAX_REPLY_QUEUES; i++) {
+ hpsa_msix_entries[i].vector = 0;
+ hpsa_msix_entries[i].entry = i;
+ }

/* Some boards advertise MSI but don't really support it */
if ((h->board_id == 0x40700E11) || (h->board_id == 0x40800E11) ||
@@ -3953,12 +3976,11 @@ static void __devinit hpsa_interrupt_mode(struct ctlr_info *h)
goto default_int_mode;
if (pci_find_capability(h->pdev, PCI_CAP_ID_MSIX)) {
dev_info(&h->pdev->dev, "MSIX\n");
- err = pci_enable_msix(h->pdev, hpsa_msix_entries, 4);
+ err = pci_enable_msix(h->pdev, hpsa_msix_entries,
+ MAX_REPLY_QUEUES);
if (!err) {
- h->intr[0] = hpsa_msix_entries[0].vector;
- h->intr[1] = hpsa_msix_entries[1].vector;
- h->intr[2] = hpsa_msix_entries[2].vector;
- h->intr[3] = hpsa_msix_entries[3].vector;
+ for (i = 0; i < MAX_REPLY_QUEUES; i++)
+ h->intr[i] = hpsa_msix_entries[i].vector;
h->msix_vector = 1;
return;
}
@@ -4375,14 +4397,33 @@ static int hpsa_request_irq(struct ctlr_info *h,
irqreturn_t (*msixhandler)(int, void *),
irqreturn_t (*intxhandler)(int, void *))
{
- int rc;
+ int rc, i;

- if (h->msix_vector || h->msi_vector)
- rc = request_irq(h->intr[h->intr_mode], msixhandler,
- 0, h->devname, h);
- else
- rc = request_irq(h->intr[h->intr_mode], intxhandler,
- IRQF_SHARED, h->devname, h);
+ /*
+ * initialize h->q[x] = x so that interrupt handlers know which
+ * queue to process.
+ */
+ for (i = 0; i < MAX_REPLY_QUEUES; i++)
+ h->q[i] = (u8) i;
+
+ if (h->intr_mode == PERF_MODE_INT && h->msix_vector) {
+ /* If performant mode and MSI-X, use multiple reply queues */
+ for (i = 0; i < MAX_REPLY_QUEUES; i++)
+ rc = request_irq(h->intr[i], msixhandler,
+ 0, h->devname,
+ &h->q[i]);
+ } else {
+ /* Use single reply pool */
+ if (h->msix_vector || h->msi_vector) {
+ rc = request_irq(h->intr[h->intr_mode],
+ msixhandler, 0, h->devname,
+ &h->q[h->intr_mode]);
+ } else {
+ rc = request_irq(h->intr[h->intr_mode],
+ intxhandler, IRQF_SHARED, h->devname,
+ &h->q[h->intr_mode]);
+ }
+ }
if (rc) {
dev_err(&h->pdev->dev, "unable to get irq %d for %s\n",
h->intr[h->intr_mode], h->devname);
@@ -4415,9 +4456,24 @@ static int __devinit hpsa_kdump_soft_reset(struct ctlr_info *h)
return 0;
}

+static void free_irqs(struct ctlr_info *h)
+{
+ int i;
+
+ if (!h->msix_vector || h->intr_mode != PERF_MODE_INT) {
+ /* Single reply queue, only one irq to free */
+ i = h->intr_mode;
+ free_irq(h->intr[i], &h->q[i]);
+ return;
+ }
+
+ for (i = 0; i < MAX_REPLY_QUEUES; i++)
+ free_irq(h->intr[i], &h->q[i]);
+}
+
static void hpsa_undo_allocations_after_kdump_soft_reset(struct ctlr_info *h)
{
- free_irq(h->intr[h->intr_mode], h);
+ free_irqs(h);
#ifdef CONFIG_PCI_MSI
if (h->msix_vector)
pci_disable_msix(h->pdev);
@@ -4685,7 +4741,7 @@ reinit_after_soft_reset:
spin_lock_irqsave(&h->lock, flags);
h->access.set_intr_mask(h, HPSA_INTR_OFF);
spin_unlock_irqrestore(&h->lock, flags);
- free_irq(h->intr[h->intr_mode], h);
+ free_irqs(h);
rc = hpsa_request_irq(h, hpsa_msix_discard_completions,
hpsa_intx_discard_completions);
if (rc) {
@@ -4735,7 +4791,7 @@ reinit_after_soft_reset:
clean4:
hpsa_free_sg_chain_blocks(h);
hpsa_free_cmd_pool(h);
- free_irq(h->intr[h->intr_mode], h);
+ free_irqs(h);
clean2:
clean1:
kfree(h);
@@ -4778,7 +4834,7 @@ static void hpsa_shutdown(struct pci_dev *pdev)
*/
hpsa_flush_cache(h);
h->access.set_intr_mask(h, HPSA_INTR_OFF);
- free_irq(h->intr[h->intr_mode], h);
+ free_irqs(h);
#ifdef CONFIG_PCI_MSI
if (h->msix_vector)
pci_disable_msix(h->pdev);
@@ -4918,11 +4974,8 @@ static __devinit void hpsa_enter_performant_mode(struct ctlr_info *h,
* 10 = 6 s/g entry or 24k
*/

- h->reply_pool_wraparound = 1; /* spec: init to 1 */
-
/* Controller spec: zero out this buffer. */
memset(h->reply_pool, 0, h->reply_pool_size);
- h->reply_pool_head = h->reply_pool;

bft[7] = SG_ENTRIES_IN_CMD + 4;
calc_bucket_map(bft, ARRAY_SIZE(bft),
@@ -4932,12 +4985,19 @@ static __devinit void hpsa_enter_performant_mode(struct ctlr_info *h,

/* size of controller ring buffer */
writel(h->max_commands, &h->transtable->RepQSize);
- writel(1, &h->transtable->RepQCount);
+ writel(h->nreply_queues, &h->transtable->RepQCount);
writel(0, &h->transtable->RepQCtrAddrLow32);
writel(0, &h->transtable->RepQCtrAddrHigh32);
- writel(h->reply_pool_dhandle, &h->transtable->RepQAddr0Low32);
- writel(0, &h->transtable->RepQAddr0High32);
- writel(CFGTBL_Trans_Performant | use_short_tags,
+
+ for (i = 0; i < h->nreply_queues; i++) {
+ writel(0, &h->transtable->RepQAddr[i].upper);
+ writel(h->reply_pool_dhandle +
+ (h->max_commands * sizeof(u64) * i),
+ &h->transtable->RepQAddr[i].lower);
+ }
+
+ writel(CFGTBL_Trans_Performant | use_short_tags |
+ CFGTBL_Trans_enable_directed_msix,
&(h->cfgtable->HostWrite.TransportRequest));
writel(CFGTBL_ChangeReq, h->vaddr + SA5_DOORBELL);
hpsa_wait_for_mode_change_ack(h);
@@ -4955,6 +5015,7 @@ static __devinit void hpsa_enter_performant_mode(struct ctlr_info *h,
static __devinit void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
{
u32 trans_support;
+ int i;

if (hpsa_simple_mode)
return;
@@ -4963,12 +5024,20 @@ static __devinit void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
if (!(trans_support & PERFORMANT_MODE))
return;

+ h->nreply_queues = h->msix_vector ? MAX_REPLY_QUEUES : 1;
hpsa_get_max_perf_mode_cmds(h);
/* Performant mode ring buffer and supporting data structures */
- h->reply_pool_size = h->max_commands * sizeof(u64);
+ h->reply_pool_size = h->max_commands * sizeof(u64) * h->nreply_queues;
h->reply_pool = pci_alloc_consistent(h->pdev, h->reply_pool_size,
&(h->reply_pool_dhandle));

+ for (i = 0; i < h->nreply_queues; i++) {
+ h->reply_queue[i].head = &h->reply_pool[h->max_commands * i];
+ h->reply_queue[i].size = h->max_commands;
+ h->reply_queue[i].wraparound = 1; /* spec: init to 1 */
+ h->reply_queue[i].current_entry = 0;
+ }
+
/* Need a block fetch table for performant mode */
h->blockFetchTable = kmalloc(((SG_ENTRIES_IN_CMD + 1) *
sizeof(u32)), GFP_KERNEL);
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index d8aa95c..486a7c0 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -34,7 +34,7 @@ struct access_method {
void (*set_intr_mask)(struct ctlr_info *h, unsigned long val);
unsigned long (*fifo_full)(struct ctlr_info *h);
bool (*intr_pending)(struct ctlr_info *h);
- unsigned long (*command_completed)(struct ctlr_info *h);
+ unsigned long (*command_completed)(struct ctlr_info *h, u8 q);
};

struct hpsa_scsi_dev_t {
@@ -48,6 +48,13 @@ struct hpsa_scsi_dev_t {
unsigned char raid_level; /* from inquiry page 0xC1 */
};

+struct reply_pool {
+ u64 *head;
+ size_t size;
+ u8 wraparound;
+ u32 current_entry;
+};
+
struct ctlr_info {
int ctlr;
char devname[8];
@@ -68,7 +75,7 @@ struct ctlr_info {
# define DOORBELL_INT 1
# define SIMPLE_MODE_INT 2
# define MEMQ_MODE_INT 3
- unsigned int intr[4];
+ unsigned int intr[MAX_REPLY_QUEUES];
unsigned int msix_vector;
unsigned int msi_vector;
int intr_mode; /* either PERF_MODE_INT or SIMPLE_MODE_INT */
@@ -111,13 +118,13 @@ struct ctlr_info {
unsigned long transMethod;

/*
- * Performant mode completion buffer
+ * Performant mode completion buffers
*/
u64 *reply_pool;
- dma_addr_t reply_pool_dhandle;
- u64 *reply_pool_head;
size_t reply_pool_size;
- unsigned char reply_pool_wraparound;
+ struct reply_pool reply_queue[MAX_REPLY_QUEUES];
+ u8 nreply_queues;
+ dma_addr_t reply_pool_dhandle;
u32 *blockFetchTable;
unsigned char *hba_inquiry_data;
u64 last_intr_timestamp;
@@ -125,6 +132,8 @@ struct ctlr_info {
u64 last_heartbeat_timestamp;
u32 lockup_detected;
struct list_head lockup_list;
+ /* Address of h->q[x] is passed to intr handler to know which queue */
+ u8 q[MAX_REPLY_QUEUES];
u32 TMFSupportFlags; /* cache what task mgmt funcs are supported. */
#define HPSATMF_BITS_SUPPORTED (1 << 0)
#define HPSATMF_PHYS_LUN_RESET (1 << 1)
@@ -275,8 +284,9 @@ static void SA5_performant_intr_mask(struct ctlr_info *h, unsigned long val)
}
}

-static unsigned long SA5_performant_completed(struct ctlr_info *h)
+static unsigned long SA5_performant_completed(struct ctlr_info *h, u8 q)
{
+ struct reply_pool *rq = &h->reply_queue[q];
unsigned long register_value = FIFO_EMPTY;

/* msi auto clears the interrupt pending bit. */
@@ -292,19 +302,18 @@ static unsigned long SA5_performant_completed(struct ctlr_info *h)
register_value = readl(h->vaddr + SA5_OUTDB_STATUS);
}

- if ((*(h->reply_pool_head) & 1) == (h->reply_pool_wraparound)) {
- register_value = *(h->reply_pool_head);
- (h->reply_pool_head)++;
+ if ((rq->head[rq->current_entry] & 1) == rq->wraparound) {
+ register_value = rq->head[rq->current_entry];
+ rq->current_entry++;
h->commands_outstanding--;
} else {
register_value = FIFO_EMPTY;
}
/* Check for wraparound */
- if (h->reply_pool_head == (h->reply_pool + h->max_commands)) {
- h->reply_pool_head = h->reply_pool;
- h->reply_pool_wraparound ^= 1;
+ if (rq->current_entry == h->max_commands) {
+ rq->current_entry = 0;
+ rq->wraparound ^= 1;
}
-
return register_value;
}

@@ -324,7 +333,8 @@ static unsigned long SA5_fifo_full(struct ctlr_info *h)
* returns value read from hardware.
* returns FIFO_EMPTY if there is nothing to read
*/
-static unsigned long SA5_completed(struct ctlr_info *h)
+static unsigned long SA5_completed(struct ctlr_info *h,
+ __attribute__((unused)) u8 q)
{
unsigned long register_value
= readl(h->vaddr + SA5_REPLY_PORT_OFFSET);
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index 14b56c9..43f1631 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -129,6 +129,7 @@
#define CFGTBL_Trans_Simple 0x00000002l
#define CFGTBL_Trans_Performant 0x00000004l
#define CFGTBL_Trans_use_short_tags 0x20000000l
+#define CFGTBL_Trans_enable_directed_msix (1 << 30)

#define CFGTBL_BusType_Ultra2 0x00000001l
#define CFGTBL_BusType_Ultra3 0x00000002l
@@ -380,8 +381,8 @@ struct TransTable_struct {
u32 RepQCount;
u32 RepQCtrAddrLow32;
u32 RepQCtrAddrHigh32;
- u32 RepQAddr0Low32;
- u32 RepQAddr0High32;
+#define MAX_REPLY_QUEUES 8
+ struct vals32 RepQAddr[MAX_REPLY_QUEUES];
};

struct hpsa_pci_info {

2012-05-01 16:43:31

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 14/17] hpsa: factor out hpsa_free_irqs_and_disable_msix

From: Stephen M. Cameron <[email protected]>

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 26 ++++++++++++++------------
1 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index cf2f60e..1351ddf 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -4490,15 +4490,23 @@ static void free_irqs(struct ctlr_info *h)
free_irq(h->intr[i], &h->q[i]);
}

-static void hpsa_undo_allocations_after_kdump_soft_reset(struct ctlr_info *h)
+static void hpsa_free_irqs_and_disable_msix(struct ctlr_info *h)
{
free_irqs(h);
#ifdef CONFIG_PCI_MSI
- if (h->msix_vector)
- pci_disable_msix(h->pdev);
- else if (h->msi_vector)
- pci_disable_msi(h->pdev);
+ if (h->msix_vector) {
+ if (h->pdev->msix_enabled)
+ pci_disable_msix(h->pdev);
+ } else if (h->msi_vector) {
+ if (h->pdev->msi_enabled)
+ pci_disable_msi(h->pdev);
+ }
#endif /* CONFIG_PCI_MSI */
+}
+
+static void hpsa_undo_allocations_after_kdump_soft_reset(struct ctlr_info *h)
+{
+ hpsa_free_irqs_and_disable_msix(h);
hpsa_free_sg_chain_blocks(h);
hpsa_free_cmd_pool(h);
kfree(h->blockFetchTable);
@@ -4853,13 +4861,7 @@ static void hpsa_shutdown(struct pci_dev *pdev)
*/
hpsa_flush_cache(h);
h->access.set_intr_mask(h, HPSA_INTR_OFF);
- free_irqs(h);
-#ifdef CONFIG_PCI_MSI
- if (h->msix_vector)
- pci_disable_msix(h->pdev);
- else if (h->msi_vector)
- pci_disable_msi(h->pdev);
-#endif /* CONFIG_PCI_MSI */
+ hpsa_free_irqs_and_disable_msix(h);
}

static void __devexit hpsa_free_device_info(struct ctlr_info *h)

2012-05-01 16:43:39

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 15/17] hpsa: add new RAID level "1(ADM)"

From: Mike Miller <[email protected]>

Signed-off-by: Mike Miller <[email protected]>
Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 1351ddf..84239d8 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -380,7 +380,7 @@ static inline int is_logical_dev_addr_mode(unsigned char scsi3addr[])
}

static const char *raid_label[] = { "0", "4", "1(1+0)", "5", "5+1", "ADG",
- "UNKNOWN"
+ "1(ADM)", "UNKNOWN"
};
#define RAID_UNKNOWN (ARRAY_SIZE(raid_label) - 1)

2012-05-01 16:43:46

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 16/17] hpsa: removed unused member maxQsinceinit

From: Stephen M. Cameron <[email protected]>

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.h | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 79c36aa..fb51ef7 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -85,7 +85,6 @@ struct ctlr_info {
struct list_head reqQ;
struct list_head cmpQ;
unsigned int Qdepth;
- unsigned int maxQsinceinit;
unsigned int maxSG;
spinlock_t lock;
int maxsgentries;

2012-05-01 16:44:01

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 17/17] hpsa: dial down lockup detection during firmware flash

From: Stephen M. Cameron <[email protected]>

Dial back the aggressiveness of the controller lockup detection thread.
Currently it will declare the controller to be locked up if it goes
for 10 seconds with no interrupts and no change in the heartbeat
register. Dial back this to 30 seconds with no heartbeat change, and
also snoop the ioctl path and if a firmware flash command is detected,
dial it back further to 4 minutes until the firmware flash command
completes. The reason for this is that during the firmware flash
operation, the controller apparently doesn't update the heartbeat
register as frequently as it is supposed to, and we can get a false
positive.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 39 ++++++++++++++++++++++++++++++++++-----
drivers/scsi/hpsa.h | 2 ++
drivers/scsi/hpsa_cmd.h | 1 +
3 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 84239d8..28a568c 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -569,12 +569,42 @@ static void set_performant_mode(struct ctlr_info *h, struct CommandList *c)
}
}

+static int is_firmware_flash_cmd(u8 *cdb)
+{
+ return cdb[0] == BMIC_WRITE && cdb[6] == BMIC_FLASH_FIRMWARE;
+}
+
+/*
+ * During firmware flash, the heartbeat register may not update as frequently
+ * as it should. So we dial down lockup detection during firmware flash. and
+ * dial it back up when firmware flash completes.
+ */
+#define HEARTBEAT_SAMPLE_INTERVAL_DURING_FLASH (240 * HZ)
+#define HEARTBEAT_SAMPLE_INTERVAL (30 * HZ)
+static void dial_down_lockup_detection_during_fw_flash(struct ctlr_info *h,
+ struct CommandList *c)
+{
+ if (!is_firmware_flash_cmd(c->Request.CDB))
+ return;
+ atomic_inc(&h->firmware_flash_in_progress);
+ h->heartbeat_sample_interval = HEARTBEAT_SAMPLE_INTERVAL_DURING_FLASH;
+}
+
+static void dial_up_lockup_detection_on_fw_flash_complete(struct ctlr_info *h,
+ struct CommandList *c)
+{
+ if (is_firmware_flash_cmd(c->Request.CDB) &&
+ atomic_dec_and_test(&h->firmware_flash_in_progress))
+ h->heartbeat_sample_interval = HEARTBEAT_SAMPLE_INTERVAL;
+}
+
static void enqueue_cmd_and_start_io(struct ctlr_info *h,
struct CommandList *c)
{
unsigned long flags;

set_performant_mode(h, c);
+ dial_down_lockup_detection_during_fw_flash(h, c);
spin_lock_irqsave(&h->lock, flags);
addQ(&h->reqQ, c);
h->Qdepth++;
@@ -3385,6 +3415,7 @@ static inline void finish_cmd(struct CommandList *c)
spin_lock_irqsave(&c->h->lock, flags);
removeQ(c);
spin_unlock_irqrestore(&c->h->lock, flags);
+ dial_up_lockup_detection_on_fw_flash_complete(c->h, c);
if (likely(c->cmd_type == CMD_SCSI))
complete_scsi_command(c);
else if (c->cmd_type == CMD_IOCTL_PEND)
@@ -4565,9 +4596,6 @@ static void controller_lockup_detected(struct ctlr_info *h)
spin_unlock_irqrestore(&h->lock, flags);
}

-#define HEARTBEAT_SAMPLE_INTERVAL (10 * HZ)
-#define HEARTBEAT_CHECK_MINIMUM_INTERVAL (HEARTBEAT_SAMPLE_INTERVAL / 2)
-
static void detect_controller_lockup(struct ctlr_info *h)
{
u64 now;
@@ -4578,7 +4606,7 @@ static void detect_controller_lockup(struct ctlr_info *h)
now = get_jiffies_64();
/* If we've received an interrupt recently, we're ok. */
if (time_after64(h->last_intr_timestamp +
- (HEARTBEAT_CHECK_MINIMUM_INTERVAL), now))
+ (h->heartbeat_sample_interval), now))
return;

/*
@@ -4587,7 +4615,7 @@ static void detect_controller_lockup(struct ctlr_info *h)
* otherwise don't care about signals in this thread.
*/
if (time_after64(h->last_heartbeat_timestamp +
- (HEARTBEAT_CHECK_MINIMUM_INTERVAL), now))
+ (h->heartbeat_sample_interval), now))
return;

/* If heartbeat has not changed since we last looked, we're not ok. */
@@ -4629,6 +4657,7 @@ static void add_ctlr_to_lockup_detector_list(struct ctlr_info *h)
{
unsigned long flags;

+ h->heartbeat_sample_interval = HEARTBEAT_SAMPLE_INTERVAL;
spin_lock_irqsave(&lockup_detector_lock, flags);
list_add_tail(&h->lockup_list, &hpsa_ctlr_list);
spin_unlock_irqrestore(&lockup_detector_lock, flags);
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index fb51ef7..9816479 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -129,6 +129,8 @@ struct ctlr_info {
u64 last_intr_timestamp;
u32 last_heartbeat;
u64 last_heartbeat_timestamp;
+ u32 heartbeat_sample_interval;
+ atomic_t firmware_flash_in_progress;
u32 lockup_detected;
struct list_head lockup_list;
/* Address of h->q[x] is passed to intr handler to know which queue */
diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
index 43f1631..a894f2e 100644
--- a/drivers/scsi/hpsa_cmd.h
+++ b/drivers/scsi/hpsa_cmd.h
@@ -186,6 +186,7 @@ struct SenseSubsystem_info {
#define BMIC_WRITE 0x27
#define BMIC_CACHE_FLUSH 0xc2
#define HPSA_CACHE_FLUSH 0x01 /* C2 was already being used by HPSA */
+#define BMIC_FLASH_FIRMWARE 0xF7

/* Command List Structure */
union SCSI3Addr {

2012-05-01 16:42:19

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 01/17] hpsa: call pci_disable_device on driver unload

From: Stephen M. Cameron <[email protected]>

As Jenx Axboe explained to me: "In earlier times (2.6.18 and pre, iirc), Linux
disabled IO and mem bars on pci_disable_device(). Now in newer kernel it does
not. And in the newer kernels you run into problems if you DON'T disable the
device on exit, since when it later loads the device is already in the enabled
state - and pci_enable_device() then does nothing. This typically screws
MSI/MSI-X." This is what the big scary comment that says pci_disable_device
does "something nasty" to smart arrays was evidently referring to.

If pci_disable_device is not called on driver rmmod, subsequently insmod'ing
the driver may in result in some cases fail to be able to receive interrupts,
esp. if other drivers are loaded between unloading and loading hpsa.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 10 ++--------
1 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 500e20d..1a6c319 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3987,10 +3987,7 @@ err_out_free_res:
iounmap(h->cfgtable);
if (h->vaddr)
iounmap(h->vaddr);
- /*
- * Deliberately omit pci_disable_device(): it does something nasty to
- * Smart Array controllers that pci_enable_device does not undo
- */
+ pci_disable_device(h->pdev);
pci_release_regions(h->pdev);
return err;
}
@@ -4529,10 +4526,7 @@ static void __devexit hpsa_remove_one(struct pci_dev *pdev)
kfree(h->cmd_pool_bits);
kfree(h->blockFetchTable);
kfree(h->hba_inquiry_data);
- /*
- * Deliberately omit pci_disable_device(): it does something nasty to
- * Smart Array controllers that pci_enable_device does not undo
- */
+ pci_disable_device(pdev);
pci_release_regions(pdev);
pci_set_drvdata(pdev, NULL);
kfree(h);

2012-05-01 16:47:09

by Stephen M. Cameron

[permalink] [raw]
Subject: [PATCH 02/17] hpsa: do not skip disabled devices

From: Stephen M. Cameron <[email protected]>

There was code to skip "disabled" devices which was intended to
skip devices disabled in the BIOS, but it really just checks to
see if the device can write to host memory, which this is disabled
by pci_disable_device on driver unload, so this check has the effect
of preventing subsequent load of the driver. And devices disabled in
the BIOS don't show up at all anyway, so this check never made any
sense to begin with, and should be removed.

Signed-off-by: Stephen M. Cameron <[email protected]>
---
drivers/scsi/hpsa.c | 13 -------------
1 files changed, 0 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 1a6c319..5119ce6 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3705,14 +3705,6 @@ static int __devinit hpsa_lookup_board_id(struct pci_dev *pdev, u32 *board_id)
return ARRAY_SIZE(products) - 1; /* generic unknown smart array */
}

-static inline bool hpsa_board_disabled(struct pci_dev *pdev)
-{
- u16 command;
-
- (void) pci_read_config_word(pdev, PCI_COMMAND, &command);
- return ((command & PCI_COMMAND_MEMORY) == 0);
-}
-
static int __devinit hpsa_pci_find_memory_BAR(struct pci_dev *pdev,
unsigned long *memory_bar)
{
@@ -3932,11 +3924,6 @@ static int __devinit hpsa_pci_init(struct ctlr_info *h)
h->product_name = products[prod_index].product_name;
h->access = *(products[prod_index].access);

- if (hpsa_board_disabled(h->pdev)) {
- dev_warn(&h->pdev->dev, "controller appears to be disabled\n");
- return -ENODEV;
- }
-
pci_disable_link_state(h->pdev, PCIE_LINK_STATE_L0S |
PCIE_LINK_STATE_L1 | PCIE_LINK_STATE_CLKPM);

2012-05-01 17:26:49

by Andi Shyti

[permalink] [raw]
Subject: Re: [PATCH 07/17] hpsa: do not give up retry of driver cmds after only 3 retries

Hi,

> --- a/drivers/scsi/hpsa.c
> +++ b/drivers/scsi/hpsa.c
> @@ -1380,17 +1380,24 @@ static void hpsa_scsi_do_simple_cmd_core_if_no_lockup(struct ctlr_info *h,
> }
> }
>
> +#define MAX_DRIVER_CMD_RETRIES 25
> static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
> struct CommandList *c, int data_direction)
> {
> - int retry_count = 0;
> + int backoff_time = 10, retry_count = 0;
>
> do {
> memset(c->err_info, 0, sizeof(*c->err_info));
> hpsa_scsi_do_simple_cmd_core(h, c);
> retry_count++;
> + if (retry_count > 3) {
> + msleep(backoff_time);

for 10ms isn't it better to avoid using msleep?

Andi

> + if (backoff_time < 1000)
> + backoff_time *= 2;
> + }
> } while ((check_for_unit_attention(h, c) ||
> - check_for_busy(h, c)) && retry_count <= 3);
> + check_for_busy(h, c)) &&
> + retry_count <= MAX_DRIVER_CMD_RETRIES);
> hpsa_pci_unmap(h->pdev, c, 1, data_direction);
> }

2012-05-01 18:20:18

by Stephen M. Cameron

[permalink] [raw]
Subject: Re: [PATCH 07/17] hpsa: do not give up retry of driver cmds after only 3 retries

On Tue, May 01, 2012 at 07:26:34PM +0200, Andi Shyti wrote:
> Hi,
>
> > --- a/drivers/scsi/hpsa.c
> > +++ b/drivers/scsi/hpsa.c
> > @@ -1380,17 +1380,24 @@ static void hpsa_scsi_do_simple_cmd_core_if_no_lockup(struct ctlr_info *h,
> > }
> > }
> >
> > +#define MAX_DRIVER_CMD_RETRIES 25
> > static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
> > struct CommandList *c, int data_direction)
> > {
> > - int retry_count = 0;
> > + int backoff_time = 10, retry_count = 0;
> >
> > do {
> > memset(c->err_info, 0, sizeof(*c->err_info));
> > hpsa_scsi_do_simple_cmd_core(h, c);
> > retry_count++;
> > + if (retry_count > 3) {
> > + msleep(backoff_time);
>
> for 10ms isn't it better to avoid using msleep?

[...]
> > + if (backoff_time < 1000)
> > + backoff_time *= 2;

Eh, maybe. from Documentation/timers-howto.txt

msleep(1~20) may not do what the caller intends, and
will often sleep longer (~20 ms actual sleep for any
value given in the 1~20ms range). In many cases this
is not the desired behavior.

Sleeping longer (~20ms instead of 10ms) in this instance is fine, as I don't
really care too much exactly how long it sleeps, and it backs off to up to
1280ms eventually anyway. The idea is, "wait a bit, and retry, and then if
that doesn't work, wait twice as long, and retry, etc." *exactly* how long
"a bit" is is not super important. I could change the initial back_off time
to 20 or 30 to satisfy the letter of the advice in Documentation/timers-howto.txt,
if doing so is important.

This is kind of a corner case of a corner case, I don't expect
things will ordinarily end up waiting that long, because normally
one of the 1st 3 retries will succeed. I just wanted to make it
a little more robust and not just give up immediately if the 3
initial retries don't succeed, the specific number of retries,
wait times, etc, I just made up. It still does eventually give up
though, and then probably doesn't do anything good after that
(same as current behavior, just somewhat less likely to get to
that point.) I'm not actually aware of any complaints of the
retries failing though (apart from the complaint that prompted
the patch prior to this one, that we didn't retry on getting
SAM_STAT_BUSY.)

-- steve

> > + }
> > } while ((check_for_unit_attention(h, c) ||
> > - check_for_busy(h, c)) && retry_count <= 3);
> > + check_for_busy(h, c)) &&
> > + retry_count <= MAX_DRIVER_CMD_RETRIES);
> > hpsa_pci_unmap(h->pdev, c, 1, data_direction);
> > }

2012-05-01 21:39:54

by Andi Shyti

[permalink] [raw]
Subject: Re: [PATCH 07/17] hpsa: do not give up retry of driver cmds after only 3 retries

Hi Steve,

Thanks for the walk-through, but still some doubts...

On Tue, May 01, 2012 at 01:20:11PM -0500, [email protected] wrote:
> On Tue, May 01, 2012 at 07:26:34PM +0200, Andi Shyti wrote:
> > >
> > > do {
> > > memset(c->err_info, 0, sizeof(*c->err_info));
> > > hpsa_scsi_do_simple_cmd_core(h, c);
> > > retry_count++;
> > > + if (retry_count > 3) {
> > > + msleep(backoff_time);
> >
> > for 10ms isn't it better to avoid using msleep?
>
> [...]
> > > + if (backoff_time < 1000)
> > > + backoff_time *= 2;
>
> Eh, maybe. from Documentation/timers-howto.txt
>
> msleep(1~20) may not do what the caller intends, and
> will often sleep longer (~20 ms actual sleep for any
> value given in the 1~20ms range). In many cases this
> is not the desired behavior.
>
> Sleeping longer (~20ms instead of 10ms) in this instance is fine, as I don't
> really care too much exactly how long it sleeps, and it backs off to up to
> 1280ms eventually anyway. The idea is, "wait a bit, and retry, and then if
> that doesn't work, wait twice as long, and retry, etc." *exactly* how long
> "a bit" is is not super important. I could change the initial back_off time
> to 20 or 30 to satisfy the letter of the advice in Documentation/timers-howto.txt,
> if doing so is important.

No, you're right, it should not really matter, but here in the
worst case you put the driver on sleep for almost 22 seconds,
that is a huge difference compared to the original
implementation.

> This is kind of a corner case of a corner case, I don't expect
> things will ordinarily end up waiting that long, because normally
> one of the 1st 3 retries will succeed. I just wanted to make it
> a little more robust and not just give up immediately if the 3
> initial retries don't succeed, the specific number of retries,
> wait times, etc, I just made up.

Premising that I don't know the device, therefore I could be
totally wrong, if you don't expect things to wait so long, why not
to decrease the MAX_DRIVER_CMD_RETRIES and sleep increasingly (as
you did) but for shorter period?

Andi

> It still does eventually give up
> though, and then probably doesn't do anything good after that
> (same as current behavior, just somewhat less likely to get to
> that point.) I'm not actually aware of any complaints of the
> retries failing though (apart from the complaint that prompted
> the patch prior to this one, that we didn't retry on getting
> SAM_STAT_BUSY.)
>
> -- steve
>

2012-05-02 16:30:15

by Stephen M. Cameron

[permalink] [raw]
Subject: Re: [PATCH 07/17] hpsa: do not give up retry of driver cmds after only 3 retries

On Tue, May 01, 2012 at 11:39:34PM +0200, Andi Shyti wrote:
> Hi Steve,
>
> Thanks for the walk-through, but still some doubts...
>
> On Tue, May 01, 2012 at 01:20:11PM -0500, [email protected] wrote:
> > On Tue, May 01, 2012 at 07:26:34PM +0200, Andi Shyti wrote:
> > > >
> > > > do {
> > > > memset(c->err_info, 0, sizeof(*c->err_info));
> > > > hpsa_scsi_do_simple_cmd_core(h, c);
> > > > retry_count++;
> > > > + if (retry_count > 3) {
> > > > + msleep(backoff_time);
> > >
> > > for 10ms isn't it better to avoid using msleep?
> >
> > [...]
> > > > + if (backoff_time < 1000)
> > > > + backoff_time *= 2;
> >
> > Eh, maybe. from Documentation/timers-howto.txt
> >
> > msleep(1~20) may not do what the caller intends, and
> > will often sleep longer (~20 ms actual sleep for any
> > value given in the 1~20ms range). In many cases this
> > is not the desired behavior.
> >
> > Sleeping longer (~20ms instead of 10ms) in this instance is fine, as I don't
> > really care too much exactly how long it sleeps, and it backs off to up to
> > 1280ms eventually anyway. The idea is, "wait a bit, and retry, and then if
> > that doesn't work, wait twice as long, and retry, etc." *exactly* how long
> > "a bit" is is not super important. I could change the initial back_off time
> > to 20 or 30 to satisfy the letter of the advice in Documentation/timers-howto.txt,
> > if doing so is important.
>
> No, you're right, it should not really matter, but here in the
> worst case you put the driver on sleep for almost 22 seconds,
> that is a huge difference compared to the original
> implementation.
>
> > This is kind of a corner case of a corner case, I don't expect
> > things will ordinarily end up waiting that long, because normally
> > one of the 1st 3 retries will succeed. I just wanted to make it
> > a little more robust and not just give up immediately if the 3
> > initial retries don't succeed, the specific number of retries,
> > wait times, etc, I just made up.
>
> Premising that I don't know the device, therefore I could be
> totally wrong, if you don't expect things to wait so long, why not
> to decrease the MAX_DRIVER_CMD_RETRIES and sleep increasingly (as
> you did) but for shorter period?

[...]

Yes the behavior is different, intentionally so.

Here's a patch to test it out. I add a sysfs attribute to the host to
tell it to simulate being busy for the specified number of seconds, and
add some debug code so we can see what's happening.

commit 37edc48530210ea49d38fdafb0ba11da8d8b6d53
Author: Stephen M. Cameron <[email protected]>
Date: Wed May 2 08:34:17 2012 -0500

hpsa: debug retry behavior change

This patch is just for testing.

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index cb9d619..c2a4128 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -255,6 +255,29 @@ static ssize_t host_store_rescan(struct device *dev,
return count;
}

+static ssize_t host_store_act_busy(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct ctlr_info *h;
+ struct Scsi_Host *shost = class_to_shost(dev);
+ int rc;
+ unsigned long flags;
+ u64 now, val;
+
+ h = shost_to_hba(shost);
+
+ rc = sscanf(buf, "%llu", &val);
+ if (rc != 1 || val > 30)
+ return count;
+ dev_warn(&h->pdev->dev, "Will act busy for %llu seconds.\n", val);
+ now = get_jiffies_64() + val * HZ;
+ spin_lock_irqsave(&h->act_busy_lock, flags);
+ h->act_busy = now;
+ spin_unlock_irqrestore(&h->act_busy_lock, flags);
+ return count;
+}
+
static ssize_t host_show_firmware_revision(struct device *dev,
struct device_attribute *attr, char *buf)
{
@@ -480,6 +503,8 @@ static DEVICE_ATTR(transport_mode, S_IRUGO,
host_show_transport_mode, NULL);
static DEVICE_ATTR(resettable, S_IRUGO,
host_show_resettable, NULL);
+static DEVICE_ATTR(act_busy, S_IWUSR,
+ NULL, host_store_act_busy);

static struct device_attribute *hpsa_sdev_attrs[] = {
&dev_attr_raid_level,
@@ -494,6 +519,7 @@ static struct device_attribute *hpsa_shost_attrs[] = {
&dev_attr_commands_outstanding,
&dev_attr_transport_mode,
&dev_attr_resettable,
+ &dev_attr_act_busy,
NULL,
};

@@ -1228,7 +1254,7 @@ static void complete_scsi_command(struct CommandList *cp)
* Pass it up to the upper layers...
*/
if (ei->ScsiStatus) {
- dev_warn(&h->pdev->dev, "cp %p has status 0x%x "
+ dev_dbg(&h->pdev->dev, "cp %p has status 0x%x "
"Sense: 0x%x, ASC: 0x%x, ASCQ: 0x%x, "
"Returning result: 0x%x\n",
cp, ei->ScsiStatus,
@@ -1389,6 +1415,15 @@ static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
do {
memset(c->err_info, 0, sizeof(*c->err_info));
hpsa_scsi_do_simple_cmd_core(h, c);
+ if (retry_count > 0) {
+ dev_warn(&h->pdev->dev,
+ "RETRYING(%d): %02x %02x %02x %02x %02x %02x %02x %02x %02x %02x %02x %02x %02x %02x %02x %02x\n",
+ retry_count,
+ c->Request.CDB[0], c->Request.CDB[1], c->Request.CDB[2], c->Request.CDB[3],
+ c->Request.CDB[4], c->Request.CDB[5], c->Request.CDB[6], c->Request.CDB[7],
+ c->Request.CDB[8], c->Request.CDB[9], c->Request.CDB[10], c->Request.CDB[11],
+ c->Request.CDB[12], c->Request.CDB[13], c->Request.CDB[14], c->Request.CDB[15]);
+ }
retry_count++;
if (retry_count > 3) {
msleep(backoff_time);
@@ -3063,9 +3098,33 @@ static inline int bad_tag(struct ctlr_info *h, u32 tag_index,
return 0;
}

+static void act_busy(struct CommandList *c)
+{
+ struct ctlr_info *h = c->h;
+ unsigned long flags;
+ u64 now;
+
+ now = get_jiffies_64();
+ spin_lock_irqsave(&h->act_busy_lock, flags);
+ if (!h->act_busy) {
+ spin_unlock_irqrestore(&h->act_busy_lock, flags);
+ return;
+ }
+
+ if (time_after64(now, h->act_busy)) {
+ h->act_busy = 0;
+ spin_unlock_irqrestore(&h->act_busy_lock, flags);
+ return;
+ }
+ spin_unlock_irqrestore(&h->act_busy_lock, flags);
+ c->err_info->CommandStatus = CMD_TARGET_STATUS;
+ c->err_info->ScsiStatus = SAM_STAT_BUSY;
+}
+
static inline void finish_cmd(struct CommandList *c, u32 raw_tag)
{
removeQ(c);
+ act_busy(c);
if (likely(c->cmd_type == CMD_SCSI))
complete_scsi_command(c);
else if (c->cmd_type == CMD_IOCTL_PEND)
@@ -4337,6 +4396,7 @@ reinit_after_soft_reset:
INIT_LIST_HEAD(&h->reqQ);
spin_lock_init(&h->lock);
spin_lock_init(&h->scan_lock);
+ spin_lock_init(&h->act_busy_lock);
rc = hpsa_pci_init(h);
if (rc != 0)
goto clean1;
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 48f7812..b381694 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -125,6 +125,8 @@ struct ctlr_info {
u64 last_heartbeat_timestamp;
u32 lockup_detected;
struct list_head lockup_list;
+ u64 act_busy;
+ spinlock_t act_busy_lock; /* to protect act_busy */
};
#define HPSA_ABORT_MSG 0
#define HPSA_DEVICE_RESET_MSG 1



Now, testing that out, I tell the host to act busy for 20 seconds
(not a likely scenario, but not inconceivable). Then, I tell it
to rescan for logical drives while it is busy. With the patch,
to retry 25 times over a longer period of time, here's what happens:

[root@localhost scameron]# find /sys -name 'act_busy'
/sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/scsi_host/host0/act_busy
/sys/devices/pci0000:00/0000:00:06.0/0000:05:00.0/host1/scsi_host/host1/act_busy
[root@localhost scameron]# find /sys -name 'rescan' | grep host0
/sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/scsi_host/host0/rescan
/sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/target0:0:0/0:0:0:0/rescan
/sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/target0:3:0/0:3:0:0/rescan
[root@localhost scameron]# echo 1 > /sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/scsi_host/host0/rescan
[root@localhost scameron]# tail -f /var/log/messages &
[1] 9862
[root@localhost scameron]# May 2 10:48:01 localhost avahi-daemon[9035]: Registering HINFO record with values 'X86_64'/'LINUX'.
May 2 10:48:02 localhost avahi-daemon[9035]: Withdrawing address record for 192.168.122.1 on virbr0.
May 2 10:48:02 localhost avahi-daemon[9035]: Withdrawing address record for 10.10.12.180 on eth0.
May 2 10:48:02 localhost avahi-daemon[9035]: Host name conflict, retrying with <linux-10>
May 2 10:48:02 localhost avahi-daemon[9035]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
May 2 10:48:02 localhost avahi-daemon[9035]: Registering new address record for fe80::218:feff:fe7d:3438 on eth0.*.
May 2 10:48:02 localhost avahi-daemon[9035]: Registering new address record for 10.10.12.180 on eth0.IPv4.
May 2 10:48:02 localhost avahi-daemon[9035]: Registering HINFO record with values 'X86_64'/'LINUX'.
May 2 10:48:03 localhost avahi-daemon[9035]: Server startup complete. Host name is linux-10.local. Local service cookie is 1386635384.
May 2 10:48:04 localhost avahi-daemon[9035]: Service "linux-10" (/services/ssh.service) successfully established.

[root@localhost scameron]# echo 20 > /sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/scsi_host/host0/act_busy
May 2 10:49:36 localhost kernel: hpsa 0000:0a:00.0: Will act busy for 20 seconds.
[root@localhost scameron]# echo 1 > /sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/scsi_host/host0/rescan
May 2 10:49:39 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:39 localhost kernel: hpsa 0000:0a:00.0: RETRYING(1): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:39 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:39 localhost kernel: hpsa 0000:0a:00.0: RETRYING(2): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:39 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:39 localhost kernel: hpsa 0000:0a:00.0: RETRYING(3): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:39 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:39 localhost kernel: hpsa 0000:0a:00.0: RETRYING(4): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:39 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:39 localhost kernel: hpsa 0000:0a:00.0: RETRYING(5): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:39 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:39 localhost kernel: hpsa 0000:0a:00.0: RETRYING(6): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:40 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:40 localhost kernel: hpsa 0000:0a:00.0: RETRYING(7): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:40 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:40 localhost kernel: hpsa 0000:0a:00.0: RETRYING(8): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:40 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:40 localhost kernel: hpsa 0000:0a:00.0: RETRYING(9): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:41 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:41 localhost kernel: hpsa 0000:0a:00.0: RETRYING(10): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:42 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:42 localhost kernel: hpsa 0000:0a:00.0: RETRYING(11): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:43 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:43 localhost kernel: hpsa 0000:0a:00.0: RETRYING(12): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:45 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:45 localhost kernel: hpsa 0000:0a:00.0: RETRYING(13): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:46 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:46 localhost kernel: hpsa 0000:0a:00.0: RETRYING(14): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:47 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:47 localhost kernel: hpsa 0000:0a:00.0: RETRYING(15): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:48 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:48 localhost kernel: hpsa 0000:0a:00.0: RETRYING(16): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:50 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:50 localhost kernel: hpsa 0000:0a:00.0: RETRYING(17): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:51 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:51 localhost kernel: hpsa 0000:0a:00.0: RETRYING(18): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:52 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:52 localhost kernel: hpsa 0000:0a:00.0: RETRYING(19): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:54 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:54 localhost kernel: hpsa 0000:0a:00.0: RETRYING(20): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:55 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:55 localhost kernel: hpsa 0000:0a:00.0: RETRYING(21): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:56 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:56 localhost kernel: hpsa 0000:0a:00.0: RETRYING(22): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 10:49:57 localhost kernel: hpsa 0000:0a:00.0: hpsa device busy
May 2 10:49:57 localhost kernel: hpsa 0000:0a:00.0: RETRYING(23): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
[root@localhost scameron]# echo 1 > /sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/scsi_host/host0/rescan
[root@localhost scameron]#

It eventually works.

Now, here is a patch which leaves the debug in, but goes back to the old behavior of just
retrying 3 times immediately then giving up:

commit 571c024f893831f0a23c5f3a49bdcffa9af1a38b
Author: Stephen M. Cameron <[email protected]>
Date: Wed May 2 10:51:28 2012 -0500

go back to old behavior (immediately give up after 3 quick retries)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index c2a4128..e31c14d 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -1426,6 +1426,7 @@ static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
}
retry_count++;
if (retry_count > 3) {
+ goto give_up;
msleep(backoff_time);
if (backoff_time < 1000)
backoff_time *= 2;
@@ -1433,6 +1434,7 @@ static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
} while ((check_for_unit_attention(h, c) ||
check_for_busy(h, c)) &&
retry_count <= MAX_DRIVER_CMD_RETRIES);
+give_up:
hpsa_pci_unmap(h->pdev, c, 1, data_direction);
}


Now, the behavior we get is this:

[root@localhost ~]# find /sys -name 'rescan' | grep host0 | grep scsi_host
/sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/scsi_host/host0/rescan
[root@localhost ~]# find /sys -name 'act_busy'
/sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/scsi_host/host0/act_busy
/sys/devices/pci0000:00/0000:00:06.0/0000:05:00.0/host1/scsi_host/host1/act_busy
[root@localhost ~]# echo 1 > /sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/scsi_host/host0/rescan
[root@localhost ~]# echo 20 > /sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/scsi_host/host0/act_busy
[root@localhost ~]# May 2 11:19:36 localhost kernel: hpsa 0000:0a:00.0: Will act busy for 20 seconds.
[root@localhost ~]# echo 1 > /sys/devices/pci0000:00/0000:00:01.0/0000:0a:00.0/host0/scsi_host/host0/rescan
[root@localhost ~]# May 2 11:19:54 localhost kernel: hpsa 0000:0a:00.0: hpsadevice busy
May 2 11:19:54 localhost kernel: hpsa 0000:0a:00.0: RETRYING(1): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 11:19:54 localhost kernel: hpsa 0000:0a:00.0: hpsadevice busy
May 2 11:19:54 localhost kernel: hpsa 0000:0a:00.0: RETRYING(2): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 11:19:54 localhost kernel: hpsa 0000:0a:00.0: hpsadevice busy
May 2 11:19:54 localhost kernel: hpsa 0000:0a:00.0: RETRYING(3): c3 00 00 00 00 00 00 00 40 08 00 00 00 00 00 00
May 2 11:19:54 localhost kernel: hpsa 0000:0a:00.0: cmd ffff88003766a000 has completed with errors
May 2 11:19:54 localhost kernel: hpsa 0000:0a:00.0: cmd ffff88003766a000 has SCSI Status = 8
May 2 11:19:54 localhost kernel: hpsa 0000:0a:00.0: report physical LUNs failed.

The rescan just doesn't work in the face of even a 1 second BUSY condition with
the old behavior.

Yeah, in the working case, the driver thread is sleeping for a bit. Note
that this is not in the main i/o path (the scsi mid layer will have to deal with
SAM_STAT_BUSY however it normally does) this retrying is only in paths involving driver
initiated commands, in this case, rescanning for new/changed/removed logical drives.

So yeah, the "echo 1 > /sys/blah/blah/rescan" process will sleep for up to ~20 seconds
in the event of some (presumably very rare) 20 second BUSY condition, but it will work in
cases the old code will not, and so it sleeps 20 secs, I don't think that's really a
problem, esp. compared to the alternative of just failing.

-- steve

2012-05-02 20:28:10

by Andi Shyti

[permalink] [raw]
Subject: Re: [PATCH 07/17] hpsa: do not give up retry of driver cmds after only 3 retries

On Wed, May 02, 2012 at 11:30:09AM -0500, [email protected] wrote:
> On Tue, May 01, 2012 at 11:39:34PM +0200, Andi Shyti wrote:
> > Hi Steve,
> >
> > Thanks for the walk-through, but still some doubts...
> >
> > On Tue, May 01, 2012 at 01:20:11PM -0500, [email protected] wrote:
> > > On Tue, May 01, 2012 at 07:26:34PM +0200, Andi Shyti wrote:
> > > > >
> > > > > do {
> > > > > memset(c->err_info, 0, sizeof(*c->err_info));
> > > > > hpsa_scsi_do_simple_cmd_core(h, c);
> > > > > retry_count++;
> > > > > + if (retry_count > 3) {
> > > > > + msleep(backoff_time);
> > > >
> > > > for 10ms isn't it better to avoid using msleep?
> > >
> > > [...]
> > > > > + if (backoff_time < 1000)
> > > > > + backoff_time *= 2;
> > >
> > > Eh, maybe. from Documentation/timers-howto.txt
> > >
> > > msleep(1~20) may not do what the caller intends, and
> > > will often sleep longer (~20 ms actual sleep for any
> > > value given in the 1~20ms range). In many cases this
> > > is not the desired behavior.
> > >
> > > Sleeping longer (~20ms instead of 10ms) in this instance is fine, as I don't
> > > really care too much exactly how long it sleeps, and it backs off to up to
> > > 1280ms eventually anyway. The idea is, "wait a bit, and retry, and then if
> > > that doesn't work, wait twice as long, and retry, etc." *exactly* how long
> > > "a bit" is is not super important. I could change the initial back_off time
> > > to 20 or 30 to satisfy the letter of the advice in Documentation/timers-howto.txt,
> > > if doing so is important.
> >
> > No, you're right, it should not really matter, but here in the
> > worst case you put the driver on sleep for almost 22 seconds,
> > that is a huge difference compared to the original
> > implementation.
> >
> > > This is kind of a corner case of a corner case, I don't expect
> > > things will ordinarily end up waiting that long, because normally
> > > one of the 1st 3 retries will succeed. I just wanted to make it
> > > a little more robust and not just give up immediately if the 3
> > > initial retries don't succeed, the specific number of retries,
> > > wait times, etc, I just made up.
> >
> > Premising that I don't know the device, therefore I could be
> > totally wrong, if you don't expect things to wait so long, why not
> > to decrease the MAX_DRIVER_CMD_RETRIES and sleep increasingly (as
> > you did) but for shorter period?
>
> [...]
>
> So yeah, the "echo 1 > /sys/blah/blah/rescan" process will sleep for up to ~20 seconds
> in the event of some (presumably very rare) 20 second BUSY condition, but it will work in
> cases the old code will not, and so it sleeps 20 secs, I don't think that's really a
> problem, esp. compared to the alternative of just failing.
>

Uhh... that was a good explanation :)
Thanks a lot! You convinced me :)

If you want you can add

Reviewed-by: Andi Shyti <[email protected]>