Recent updates to pHyp firmware and VIOS releases provide new infrastructure
for Subordinate Command Response Queues (Sub-CRQs), where each Sub-CRQ is a
channel backed by an actual hardware queue in the FC stack on the partner
VIOS. Sub-CRQs are registered with the firmware via hypercalls and then
negotiated with the VIOS via new Management Datagrams (MADs) for channel setup.
This initial implementation adds the necessary Sub-CRQ framework and implements
the new MADs for negotiating and assigning a set of Sub-CRQs to associated VIOS
HW backed channels. The event pool and locking still leverage the legacy
single queue implementation, and as such lock contention becomes problematic
as the number of queues increases. Even so, this initial work demonstrates a
1.2x increase in IOPS when configured with two HW queues despite the lock
contention.
Changes in v2:
* Patch 4: NULL'd scsi_scrq reference after deallocation [brking]
* Patch 6: Added switch case to handle XPORT event [brking]
* Patch 9: Fixed ibmvfc_event leak and double free [brking]
* Added support for the cancel command with MQ
* Added parameter toggles for MQ settings
Tyrel Datwyler (17):
ibmvfc: add vhost fields and defaults for MQ enablement
ibmvfc: define hcall wrapper for registering a Sub-CRQ
ibmvfc: add Subordinate CRQ definitions
ibmvfc: add alloc/dealloc routines for SCSI Sub-CRQ Channels
ibmvfc: add Sub-CRQ IRQ enable/disable routine
ibmvfc: add handlers to drain and complete Sub-CRQ responses
ibmvfc: define Sub-CRQ interrupt handler routine
ibmvfc: map/request irq and register Sub-CRQ interrupt handler
ibmvfc: implement channel enquiry and setup commands
ibmvfc: advertise client support for using hardware channels
ibmvfc: set and track hw queue in ibmvfc_event struct
ibmvfc: send commands down HW Sub-CRQ when channelized
ibmvfc: register Sub-CRQ handles with VIOS during channel setup
ibmvfc: add cancel mad initialization helper
ibmvfc: send Cancel MAD down each hw scsi channel
ibmvfc: enable MQ and set reasonable defaults
ibmvfc: provide modules parameters for MQ settings
drivers/scsi/ibmvscsi/ibmvfc.c | 706 +++++++++++++++++++++++++++++----
drivers/scsi/ibmvscsi/ibmvfc.h | 41 +-
2 files changed, 675 insertions(+), 72 deletions(-)
--
2.27.0
Extract the hwq id from a SCSI command and store it in the ibmvfc_event
structure to identify which Sub-CRQ to send the command down when
channels are being utilized.
Signed-off-by: Tyrel Datwyler <[email protected]>
Reviewed-by: Brian King <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 5 +++++
drivers/scsi/ibmvscsi/ibmvfc.h | 1 +
2 files changed, 6 insertions(+)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 0e6c9e55a221..4555775ea74b 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -1397,6 +1397,7 @@ static void ibmvfc_init_event(struct ibmvfc_event *evt,
evt->crq.format = format;
evt->done = done;
evt->eh_comp = NULL;
+ evt->hwq = 0;
}
/**
@@ -1748,6 +1749,8 @@ static int ibmvfc_queuecommand_lck(struct scsi_cmnd *cmnd,
struct ibmvfc_cmd *vfc_cmd;
struct ibmvfc_fcp_cmd_iu *iu;
struct ibmvfc_event *evt;
+ u32 tag_and_hwq = blk_mq_unique_tag(cmnd->request);
+ u16 hwq = blk_mq_unique_tag_to_hwq(tag_and_hwq);
int rc;
if (unlikely((rc = fc_remote_port_chkready(rport))) ||
@@ -1775,6 +1778,8 @@ static int ibmvfc_queuecommand_lck(struct scsi_cmnd *cmnd,
}
vfc_cmd->correlation = cpu_to_be64(evt);
+ if (vhost->using_channels)
+ evt->hwq = hwq % vhost->scsi_scrqs.active_queues;
if (likely(!(rc = ibmvfc_map_sg_data(cmnd, evt, vfc_cmd, vhost->dev))))
return ibmvfc_send_event(evt, vhost, 0);
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index dff26dbd912c..e0ffb0416223 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -781,6 +781,7 @@ struct ibmvfc_event {
struct completion comp;
struct completion *eh_comp;
struct timer_list timer;
+ u16 hwq;
};
/* a pool of event structs for use */
--
2.27.0
Turn on MQ by default and set sane values for the upper limit on hw
queues for the scsi host and the number of hw scsi channels to request
from the partner VIOS.
Signed-off-by: Tyrel Datwyler <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index e0ffb0416223..c327b9c3090e 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -41,9 +41,9 @@
#define IBMVFC_DEFAULT_LOG_LEVEL 2
#define IBMVFC_MAX_CDB_LEN 16
#define IBMVFC_CLS3_ERROR 0
-#define IBMVFC_MQ 0
-#define IBMVFC_SCSI_CHANNELS 0
-#define IBMVFC_SCSI_HW_QUEUES 1
+#define IBMVFC_MQ 1
+#define IBMVFC_SCSI_CHANNELS 8
+#define IBMVFC_SCSI_HW_QUEUES 16
#define IBMVFC_MIG_NO_SUB_TO_CRQ 0
#define IBMVFC_MIG_NO_N_TO_M 0
--
2.27.0
In general the client needs to send Cancel MADs and task management
commands down the same channel as the command(s) they are intended to
cancel or abort. The client assigns cancel keys per LUN, and thus must
send a Cancel down each channel on which commands were submitted for
that LUN. Further, the client must then wait for those cancel
completions prior to submitting a LUN RESET or ABORT TASK SET.
Allocate event pointers for each possible scsi channel and assign an
event for each channel that requires a cancel. Wait for the completion
of each submitted cancel.
Signed-off-by: Tyrel Datwyler <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 106 +++++++++++++++++++++------------
1 file changed, 68 insertions(+), 38 deletions(-)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 0b6284020f06..97e8eed04b01 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -2339,32 +2339,52 @@ static int ibmvfc_cancel_all(struct scsi_device *sdev, int type)
{
struct ibmvfc_host *vhost = shost_priv(sdev->host);
struct ibmvfc_event *evt, *found_evt;
- union ibmvfc_iu rsp;
- int rsp_rc = -EBUSY;
+ struct ibmvfc_event **evt_list;
+ union ibmvfc_iu *rsp;
+ int rsp_rc = 0;
unsigned long flags;
u16 status;
+ int num_hwq = 1;
+ int i;
+ int ret = 0;
ENTER;
spin_lock_irqsave(vhost->host->host_lock, flags);
- found_evt = NULL;
- list_for_each_entry(evt, &vhost->sent, queue) {
- if (evt->cmnd && evt->cmnd->device == sdev) {
- found_evt = evt;
- break;
+ if (vhost->using_channels && vhost->scsi_scrqs.active_queues)
+ num_hwq = vhost->scsi_scrqs.active_queues;
+
+ evt_list = kcalloc(num_hwq, sizeof(*evt_list), GFP_KERNEL);
+ rsp = kcalloc(num_hwq, sizeof(*rsp), GFP_KERNEL);
+
+ for (i = 0; i < num_hwq; i++) {
+ sdev_printk(KERN_INFO, sdev, "Cancelling outstanding commands on queue %d.\n", i);
+
+ found_evt = NULL;
+ list_for_each_entry(evt, &vhost->sent, queue) {
+ if (evt->cmnd && evt->cmnd->device == sdev && evt->hwq == i) {
+ found_evt = evt;
+ break;
+ }
}
- }
- if (!found_evt) {
- if (vhost->log_level > IBMVFC_DEFAULT_LOG_LEVEL)
- sdev_printk(KERN_INFO, sdev, "No events found to cancel\n");
- spin_unlock_irqrestore(vhost->host->host_lock, flags);
- return 0;
- }
+ if (!found_evt) {
+ if (vhost->log_level > IBMVFC_DEFAULT_LOG_LEVEL)
+ sdev_printk(KERN_INFO, sdev, "No events found to cancel on queue %d\n", i);
+ continue;
+ }
- if (vhost->logged_in) {
- evt = ibmvfc_init_tmf(vhost, sdev, type);
- evt->sync_iu = &rsp;
- rsp_rc = ibmvfc_send_event(evt, vhost, default_timeout);
+
+ if (vhost->logged_in) {
+ evt_list[i] = ibmvfc_init_tmf(vhost, sdev, type);
+ evt_list[i]->hwq = i;
+ evt_list[i]->sync_iu = &rsp[i];
+ rsp_rc = ibmvfc_send_event(evt_list[i], vhost, default_timeout);
+ if (rsp_rc)
+ break;
+ } else {
+ rsp_rc = -EBUSY;
+ break;
+ }
}
spin_unlock_irqrestore(vhost->host->host_lock, flags);
@@ -2374,32 +2394,42 @@ static int ibmvfc_cancel_all(struct scsi_device *sdev, int type)
/* If failure is received, the host adapter is most likely going
through reset, return success so the caller will wait for the command
being cancelled to get returned */
- return 0;
+ goto free_mem;
}
- sdev_printk(KERN_INFO, sdev, "Cancelling outstanding commands.\n");
-
- wait_for_completion(&evt->comp);
- status = be16_to_cpu(rsp.mad_common.status);
- spin_lock_irqsave(vhost->host->host_lock, flags);
- ibmvfc_free_event(evt);
- spin_unlock_irqrestore(vhost->host->host_lock, flags);
+ for (i = 0; i < num_hwq; i++) {
+ if (!evt_list[i])
+ continue;
- if (status != IBMVFC_MAD_SUCCESS) {
- sdev_printk(KERN_WARNING, sdev, "Cancel failed with rc=%x\n", status);
- switch (status) {
- case IBMVFC_MAD_DRIVER_FAILED:
- case IBMVFC_MAD_CRQ_ERROR:
- /* Host adapter most likely going through reset, return success to
- the caller will wait for the command being cancelled to get returned */
- return 0;
- default:
- return -EIO;
- };
+ wait_for_completion(&evt_list[i]->comp);
+ status = be16_to_cpu(rsp[i].mad_common.status);
+
+ if (status != IBMVFC_MAD_SUCCESS) {
+ sdev_printk(KERN_WARNING, sdev, "Cancel failed with rc=%x\n", status);
+ switch (status) {
+ case IBMVFC_MAD_DRIVER_FAILED:
+ case IBMVFC_MAD_CRQ_ERROR:
+ /* Host adapter most likely going through reset, return success so
+ the caller will wait for the command being cancelled to get returned */
+ goto free_mem;
+ default:
+ ret = -EIO;
+ goto free_mem;
+ };
+ }
}
sdev_printk(KERN_INFO, sdev, "Successfully cancelled outstanding commands\n");
- return 0;
+free_mem:
+ spin_lock_irqsave(vhost->host->host_lock, flags);
+ for (i = 0; i < num_hwq; i++)
+ if (evt_list[i])
+ ibmvfc_free_event(evt_list[i]);
+ spin_unlock_irqrestore(vhost->host->host_lock, flags);
+ kfree(evt_list);
+ kfree(rsp);
+
+ return ret;
}
/**
--
2.27.0
Add the various module parameter toggles for adjusting the MQ
characteristics at boot/load time, as well as a device attribute for
changing the number of client scsi channels requested.
Signed-off-by: Tyrel Datwyler <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 75 +++++++++++++++++++++++++++++-----
1 file changed, 65 insertions(+), 10 deletions(-)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 97e8eed04b01..bc7c2dcd902c 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -40,6 +40,12 @@ static unsigned int disc_threads = IBMVFC_MAX_DISC_THREADS;
static unsigned int ibmvfc_debug = IBMVFC_DEBUG;
static unsigned int log_level = IBMVFC_DEFAULT_LOG_LEVEL;
static unsigned int cls3_error = IBMVFC_CLS3_ERROR;
+static unsigned int mq_enabled = IBMVFC_MQ;
+static unsigned int nr_scsi_hw_queues = IBMVFC_SCSI_HW_QUEUES;
+static unsigned int nr_scsi_channels = IBMVFC_SCSI_CHANNELS;
+static unsigned int mig_channels_only = IBMVFC_MIG_NO_SUB_TO_CRQ;
+static unsigned int mig_no_less_channels = IBMVFC_MIG_NO_N_TO_M;
+
static LIST_HEAD(ibmvfc_head);
static DEFINE_SPINLOCK(ibmvfc_driver_lock);
static struct scsi_transport_template *ibmvfc_transport_template;
@@ -49,6 +55,22 @@ MODULE_AUTHOR("Brian King <[email protected]>");
MODULE_LICENSE("GPL");
MODULE_VERSION(IBMVFC_DRIVER_VERSION);
+module_param_named(mq, mq_enabled, uint, S_IRUGO);
+MODULE_PARM_DESC(mq, "Enable multiqueue support. "
+ "[Default=" __stringify(IBMVFC_MQ) "]");
+module_param_named(scsi_host_queues, nr_scsi_hw_queues, uint, S_IRUGO);
+MODULE_PARM_DESC(scsi_host_queues, "Number of SCSI Host submission queues. "
+ "[Default=" __stringify(IBMVFC_SCSI_HW_QUEUES) "]");
+module_param_named(scsi_hw_channels, nr_scsi_channels, uint, S_IRUGO);
+MODULE_PARM_DESC(scsi_hw_channels, "Number of hw scsi channels to request. "
+ "[Default=" __stringify(IBMVFC_SCSI_CHANNELS) "]");
+module_param_named(mig_channels_only, mig_channels_only, uint, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(mig_channels_only, "Prevent migration to non-channelized system. "
+ "[Default=" __stringify(IBMVFC_MIG_NO_SUB_TO_CRQ) "]");
+module_param_named(mig_no_less_channels, mig_no_less_channels, uint, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(mig_no_less_channels, "Prevent migration to system with less channels. "
+ "[Default=" __stringify(IBMVFC_MIG_NO_N_TO_M) "]");
+
module_param_named(init_timeout, init_timeout, uint, S_IRUGO | S_IWUSR);
MODULE_PARM_DESC(init_timeout, "Initialization timeout in seconds. "
"[Default=" __stringify(IBMVFC_INIT_TIMEOUT) "]");
@@ -823,7 +845,7 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
crq->cur = 0;
if (vhost->scsi_scrqs.scrqs) {
- for (i = 0; i < IBMVFC_SCSI_HW_QUEUES; i++) {
+ for (i = 0; i < nr_scsi_hw_queues; i++) {
scrq = &vhost->scsi_scrqs.scrqs[i];
memset(scrq->msgs, 0, PAGE_SIZE);
scrq->cur = 0;
@@ -3228,6 +3250,36 @@ static ssize_t ibmvfc_store_log_level(struct device *dev,
return strlen(buf);
}
+static ssize_t ibmvfc_show_scsi_channels(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct Scsi_Host *shost = class_to_shost(dev);
+ struct ibmvfc_host *vhost = shost_priv(shost);
+ unsigned long flags = 0;
+ int len;
+
+ spin_lock_irqsave(shost->host_lock, flags);
+ len = snprintf(buf, PAGE_SIZE, "%d\n", vhost->client_scsi_channels);
+ spin_unlock_irqrestore(shost->host_lock, flags);
+ return len;
+}
+
+static ssize_t ibmvfc_store_scsi_channels(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct Scsi_Host *shost = class_to_shost(dev);
+ struct ibmvfc_host *vhost = shost_priv(shost);
+ unsigned long flags = 0;
+ unsigned int channels;
+
+ spin_lock_irqsave(shost->host_lock, flags);
+ channels = simple_strtoul(buf, NULL, 10);
+ vhost->client_scsi_channels = min(channels, nr_scsi_hw_queues);
+ spin_unlock_irqrestore(shost->host_lock, flags);
+ return strlen(buf);
+}
+
static DEVICE_ATTR(partition_name, S_IRUGO, ibmvfc_show_host_partition_name, NULL);
static DEVICE_ATTR(device_name, S_IRUGO, ibmvfc_show_host_device_name, NULL);
static DEVICE_ATTR(port_loc_code, S_IRUGO, ibmvfc_show_host_loc_code, NULL);
@@ -3236,6 +3288,8 @@ static DEVICE_ATTR(npiv_version, S_IRUGO, ibmvfc_show_host_npiv_version, NULL);
static DEVICE_ATTR(capabilities, S_IRUGO, ibmvfc_show_host_capabilities, NULL);
static DEVICE_ATTR(log_level, S_IRUGO | S_IWUSR,
ibmvfc_show_log_level, ibmvfc_store_log_level);
+static DEVICE_ATTR(nr_scsi_channels, S_IRUGO | S_IWUSR,
+ ibmvfc_show_scsi_channels, ibmvfc_store_scsi_channels);
#ifdef CONFIG_SCSI_IBMVFC_TRACE
/**
@@ -3292,6 +3346,7 @@ static struct device_attribute *ibmvfc_attrs[] = {
&dev_attr_npiv_version,
&dev_attr_capabilities,
&dev_attr_log_level,
+ &dev_attr_nr_scsi_channels,
NULL
};
@@ -4676,9 +4731,9 @@ static void ibmvfc_channel_enquiry(struct ibmvfc_host *vhost)
mad->common.opcode = cpu_to_be32(IBMVFC_CHANNEL_ENQUIRY);
mad->common.length = cpu_to_be16(sizeof(*mad));
- if (IBMVFC_MIG_NO_SUB_TO_CRQ)
+ if (mig_channels_only)
mad->flags |= cpu_to_be32(IBMVFC_NO_CHANNELS_TO_CRQ_SUPPORT);
- if (IBMVFC_MIG_NO_N_TO_M)
+ if (mig_no_less_channels)
mad->flags |= cpu_to_be32(IBMVFC_NO_N_TO_M_CHANNELS_SUPPORT);
ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_INIT_WAIT);
@@ -5416,13 +5471,13 @@ static int ibmvfc_init_sub_crqs(struct ibmvfc_host *vhost)
ENTER;
- vhost->scsi_scrqs.scrqs = kcalloc(IBMVFC_SCSI_HW_QUEUES,
+ vhost->scsi_scrqs.scrqs = kcalloc(nr_scsi_hw_queues,
sizeof(*vhost->scsi_scrqs.scrqs),
GFP_KERNEL);
if (!vhost->scsi_scrqs.scrqs)
return -1;
- for (i = 0; i < IBMVFC_SCSI_HW_QUEUES; i++) {
+ for (i = 0; i < nr_scsi_hw_queues; i++) {
if (ibmvfc_register_scsi_channel(vhost, i)) {
for (j = i; j > 0; j--)
ibmvfc_deregister_scsi_channel(vhost, j - 1);
@@ -5446,7 +5501,7 @@ static void ibmvfc_release_sub_crqs(struct ibmvfc_host *vhost)
if (!vhost->scsi_scrqs.scrqs)
return;
- for (i = 0; i < IBMVFC_SCSI_HW_QUEUES; i++)
+ for (i = 0; i < nr_scsi_hw_queues; i++)
ibmvfc_deregister_scsi_channel(vhost, i);
kfree(vhost->scsi_scrqs.scrqs);
@@ -5658,13 +5713,13 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
}
shost->transportt = ibmvfc_transport_template;
- shost->can_queue = (max_requests / IBMVFC_SCSI_HW_QUEUES);
+ shost->can_queue = (max_requests / nr_scsi_hw_queues);
shost->max_lun = max_lun;
shost->max_id = max_targets;
shost->max_sectors = IBMVFC_MAX_SECTORS;
shost->max_cmd_len = IBMVFC_MAX_CDB_LEN;
shost->unique_id = shost->host_no;
- shost->nr_hw_queues = IBMVFC_SCSI_HW_QUEUES;
+ shost->nr_hw_queues = nr_scsi_hw_queues;
vhost = shost_priv(shost);
INIT_LIST_HEAD(&vhost->sent);
@@ -5677,8 +5732,8 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
vhost->log_level = log_level;
vhost->task_set = 1;
- vhost->mq_enabled = IBMVFC_MQ;
- vhost->client_scsi_channels = IBMVFC_SCSI_CHANNELS;
+ vhost->mq_enabled = mq_enabled;
+ vhost->client_scsi_channels = min(nr_scsi_hw_queues, nr_scsi_channels);
vhost->using_channels = 0;
vhost->do_enquiry = 1;
--
2.27.0
Add a helper routine for initializing a Cancel MAD. This will be useful
for a channelized client that needs to send a Cancel command down every
channel on which commands were sent for a particular LUN.
Signed-off-by: Tyrel Datwyler <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 67 ++++++++++++++++++++--------------
1 file changed, 39 insertions(+), 28 deletions(-)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index c1ac2acba5fd..0b6284020f06 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -2286,6 +2286,44 @@ static int ibmvfc_wait_for_ops(struct ibmvfc_host *vhost, void *device,
return SUCCESS;
}
+static struct ibmvfc_event *ibmvfc_init_tmf(struct ibmvfc_host *vhost,
+ struct scsi_device *sdev,
+ int type)
+{
+ struct scsi_target *starget = scsi_target(sdev);
+ struct fc_rport *rport = starget_to_rport(starget);
+ struct ibmvfc_event *evt;
+ struct ibmvfc_tmf *tmf;
+
+ evt = ibmvfc_get_event(vhost);
+ ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_MAD_FORMAT);
+
+ tmf = &evt->iu.tmf;
+ memset(tmf, 0, sizeof(*tmf));
+ if (ibmvfc_check_caps(vhost, IBMVFC_HANDLE_VF_WWPN)) {
+ tmf->common.version = cpu_to_be32(2);
+ tmf->target_wwpn = cpu_to_be64(rport->port_name);
+ } else {
+ tmf->common.version = cpu_to_be32(1);
+ }
+ tmf->common.opcode = cpu_to_be32(IBMVFC_TMF_MAD);
+ tmf->common.length = cpu_to_be16(sizeof(*tmf));
+ tmf->scsi_id = cpu_to_be64(rport->port_id);
+ int_to_scsilun(sdev->lun, &tmf->lun);
+ if (!ibmvfc_check_caps(vhost, IBMVFC_CAN_SUPPRESS_ABTS))
+ type &= ~IBMVFC_TMF_SUPPRESS_ABTS;
+ if (vhost->state == IBMVFC_ACTIVE)
+ tmf->flags = cpu_to_be32((type | IBMVFC_TMF_LUA_VALID));
+ else
+ tmf->flags = cpu_to_be32(((type & IBMVFC_TMF_SUPPRESS_ABTS) | IBMVFC_TMF_LUA_VALID));
+ tmf->cancel_key = cpu_to_be32((unsigned long)sdev->hostdata);
+ tmf->my_cancel_key = cpu_to_be32((unsigned long)starget->hostdata);
+
+ init_completion(&evt->comp);
+
+ return evt;
+}
+
/**
* ibmvfc_cancel_all - Cancel all outstanding commands to the device
* @sdev: scsi device to cancel commands
@@ -2300,9 +2338,6 @@ static int ibmvfc_wait_for_ops(struct ibmvfc_host *vhost, void *device,
static int ibmvfc_cancel_all(struct scsi_device *sdev, int type)
{
struct ibmvfc_host *vhost = shost_priv(sdev->host);
- struct scsi_target *starget = scsi_target(sdev);
- struct fc_rport *rport = starget_to_rport(starget);
- struct ibmvfc_tmf *tmf;
struct ibmvfc_event *evt, *found_evt;
union ibmvfc_iu rsp;
int rsp_rc = -EBUSY;
@@ -2327,32 +2362,8 @@ static int ibmvfc_cancel_all(struct scsi_device *sdev, int type)
}
if (vhost->logged_in) {
- evt = ibmvfc_get_event(vhost);
- ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_MAD_FORMAT);
-
- tmf = &evt->iu.tmf;
- memset(tmf, 0, sizeof(*tmf));
- if (ibmvfc_check_caps(vhost, IBMVFC_HANDLE_VF_WWPN)) {
- tmf->common.version = cpu_to_be32(2);
- tmf->target_wwpn = cpu_to_be64(rport->port_name);
- } else {
- tmf->common.version = cpu_to_be32(1);
- }
- tmf->common.opcode = cpu_to_be32(IBMVFC_TMF_MAD);
- tmf->common.length = cpu_to_be16(sizeof(*tmf));
- tmf->scsi_id = cpu_to_be64(rport->port_id);
- int_to_scsilun(sdev->lun, &tmf->lun);
- if (!ibmvfc_check_caps(vhost, IBMVFC_CAN_SUPPRESS_ABTS))
- type &= ~IBMVFC_TMF_SUPPRESS_ABTS;
- if (vhost->state == IBMVFC_ACTIVE)
- tmf->flags = cpu_to_be32((type | IBMVFC_TMF_LUA_VALID));
- else
- tmf->flags = cpu_to_be32(((type & IBMVFC_TMF_SUPPRESS_ABTS) | IBMVFC_TMF_LUA_VALID));
- tmf->cancel_key = cpu_to_be32((unsigned long)sdev->hostdata);
- tmf->my_cancel_key = cpu_to_be32((unsigned long)starget->hostdata);
-
+ evt = ibmvfc_init_tmf(vhost, sdev, type);
evt->sync_iu = &rsp;
- init_completion(&evt->comp);
rsp_rc = ibmvfc_send_event(evt, vhost, default_timeout);
}
--
2.27.0
If the ibmvfc client adapter requests channels it must submit a number
of Sub-CRQ handles matching the number of channels being requested. In
its response, the VIOS will overwrite this value with the actual number
of channel resources allocated, which may be less than what was
requested. The client must then store the VIOS Sub-CRQ handle for each
queue, since this handle is needed as a parameter to h_send_sub_crq().
Signed-off-by: Tyrel Datwyler <[email protected]>
Reviewed-by: Brian King <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 32 +++++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 3bb20bfdaf4b..c1ac2acba5fd 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -4509,15 +4509,35 @@ static void ibmvfc_discover_targets(struct ibmvfc_host *vhost)
static void ibmvfc_channel_setup_done(struct ibmvfc_event *evt)
{
struct ibmvfc_host *vhost = evt->vhost;
+ struct ibmvfc_channel_setup *setup = vhost->channel_setup_buf;
+ struct ibmvfc_scsi_channels *scrqs = &vhost->scsi_scrqs;
u32 mad_status = be16_to_cpu(evt->xfer_iu->channel_setup.common.status);
int level = IBMVFC_DEFAULT_LOG_LEVEL;
+ int flags, active_queues, i;
ibmvfc_free_event(evt);
switch (mad_status) {
case IBMVFC_MAD_SUCCESS:
ibmvfc_dbg(vhost, "Channel Setup succeeded\n");
+ flags = be32_to_cpu(setup->flags);
vhost->do_enquiry = 0;
+ active_queues = be32_to_cpu(setup->num_scsi_subq_channels);
+ scrqs->active_queues = active_queues;
+
+ if (flags & IBMVFC_CHANNELS_CANCELED) {
+ ibmvfc_dbg(vhost, "Channels Canceled\n");
+ vhost->using_channels = 0;
+ } else {
+ if (active_queues)
+ vhost->using_channels = 1;
+ for (i = 0; i < active_queues; i++)
+ scrqs->scrqs[i].vios_cookie =
+ be64_to_cpu(setup->channel_handles[i]);
+
+ ibmvfc_dbg(vhost, "Using %u channels\n",
+ vhost->scsi_scrqs.active_queues);
+ }
break;
case IBMVFC_MAD_FAILED:
level += ibmvfc_retry_host_init(vhost);
@@ -4541,9 +4561,19 @@ static void ibmvfc_channel_setup(struct ibmvfc_host *vhost)
struct ibmvfc_channel_setup_mad *mad;
struct ibmvfc_channel_setup *setup_buf = vhost->channel_setup_buf;
struct ibmvfc_event *evt = ibmvfc_get_event(vhost);
+ struct ibmvfc_scsi_channels *scrqs = &vhost->scsi_scrqs;
+ unsigned int num_channels =
+ min(vhost->client_scsi_channels, vhost->max_vios_scsi_channels);
+ int i;
memset(setup_buf, 0, sizeof(*setup_buf));
- setup_buf->flags = cpu_to_be32(IBMVFC_CANCEL_CHANNELS);
+ if (num_channels == 0)
+ setup_buf->flags = cpu_to_be32(IBMVFC_CANCEL_CHANNELS);
+ else {
+ setup_buf->num_scsi_subq_channels = cpu_to_be32(num_channels);
+ for (i = 0; i < num_channels; i++)
+ setup_buf->channel_handles[i] = cpu_to_be64(scrqs->scrqs[i].cookie);
+ }
ibmvfc_init_event(evt, ibmvfc_channel_setup_done, IBMVFC_MAD_FORMAT);
mad = &evt->iu.channel_setup;
--
2.27.0
When the client has negotiated the use of channels, all vfcFrames are
required to go down a Sub-CRQ channel; anything else is a protocol
violation. If the adapter state is channelized, submit vfcFrames to the
appropriate Sub-CRQ via the h_send_sub_crq() helper.
Signed-off-by: Tyrel Datwyler <[email protected]>
Reviewed-by: Brian King <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 32 +++++++++++++++++++++++++++-----
1 file changed, 27 insertions(+), 5 deletions(-)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 4555775ea74b..3bb20bfdaf4b 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -701,6 +701,15 @@ static int ibmvfc_send_crq(struct ibmvfc_host *vhost, u64 word1, u64 word2)
return plpar_hcall_norets(H_SEND_CRQ, vdev->unit_address, word1, word2);
}
+static int ibmvfc_send_sub_crq(struct ibmvfc_host *vhost, u64 cookie, u64 word1,
+ u64 word2, u64 word3, u64 word4)
+{
+ struct vio_dev *vdev = to_vio_dev(vhost->dev);
+
+ return plpar_hcall_norets(H_SEND_SUB_CRQ, vdev->unit_address, cookie,
+ word1, word2, word3, word4);
+}
+
/**
* ibmvfc_send_crq_init - Send a CRQ init message
* @vhost: ibmvfc host struct
@@ -1513,15 +1522,19 @@ static int ibmvfc_send_event(struct ibmvfc_event *evt,
struct ibmvfc_host *vhost, unsigned long timeout)
{
__be64 *crq_as_u64 = (__be64 *) &evt->crq;
+ int channel_cmd = 0;
int rc;
/* Copy the IU into the transfer area */
*evt->xfer_iu = evt->iu;
- if (evt->crq.format == IBMVFC_CMD_FORMAT)
+ if (evt->crq.format == IBMVFC_CMD_FORMAT) {
evt->xfer_iu->cmd.tag = cpu_to_be64((u64)evt);
- else if (evt->crq.format == IBMVFC_MAD_FORMAT)
+ channel_cmd = 1;
+ } else if (evt->crq.format == IBMVFC_MAD_FORMAT) {
evt->xfer_iu->mad_common.tag = cpu_to_be64((u64)evt);
- else
+ if (evt->xfer_iu->mad_common.opcode == IBMVFC_TMF_MAD)
+ channel_cmd = 1;
+ } else
BUG();
list_add_tail(&evt->queue, &vhost->sent);
@@ -1534,8 +1547,17 @@ static int ibmvfc_send_event(struct ibmvfc_event *evt,
mb();
- if ((rc = ibmvfc_send_crq(vhost, be64_to_cpu(crq_as_u64[0]),
- be64_to_cpu(crq_as_u64[1])))) {
+ if (vhost->using_channels && channel_cmd)
+ rc = ibmvfc_send_sub_crq(vhost,
+ vhost->scsi_scrqs.scrqs[evt->hwq].vios_cookie,
+ be64_to_cpu(crq_as_u64[0]),
+ be64_to_cpu(crq_as_u64[1]),
+ 0, 0);
+ else
+ rc = ibmvfc_send_crq(vhost, be64_to_cpu(crq_as_u64[0]),
+ be64_to_cpu(crq_as_u64[1]));
+
+ if (rc) {
list_del(&evt->queue);
del_timer(&evt->timer);
--
2.27.0
The logic for iterating over the Sub-CRQ responses is similar to that
of the primary CRQ. Add the necessary handlers for processing those
responses.
Signed-off-by: Tyrel Datwyler <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 77 ++++++++++++++++++++++++++++++++++
1 file changed, 77 insertions(+)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 97f00fefa809..e9da3f60c793 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -3381,6 +3381,83 @@ static int ibmvfc_toggle_scrq_irq(struct ibmvfc_sub_queue *scrq, int enable)
return rc;
}
+static void ibmvfc_handle_scrq(struct ibmvfc_crq *crq, struct ibmvfc_host *vhost)
+{
+ struct ibmvfc_event *evt = (struct ibmvfc_event *)be64_to_cpu(crq->ioba);
+ unsigned long flags;
+
+ switch (crq->valid) {
+ case IBMVFC_CRQ_CMD_RSP:
+ break;
+ case IBMVFC_CRQ_XPORT_EVENT:
+ return;
+ default:
+ dev_err(vhost->dev, "Got an invalid message type 0x%02x\n", crq->valid);
+ return;
+ }
+
+ /* The only kind of payload CRQs we should get are responses to
+ * things we send. Make sure this response is to something we
+ * actually sent
+ */
+ if (unlikely(!ibmvfc_valid_event(&vhost->pool, evt))) {
+ dev_err(vhost->dev, "Returned correlation_token 0x%08llx is invalid!\n",
+ crq->ioba);
+ return;
+ }
+
+ if (unlikely(atomic_read(&evt->free))) {
+ dev_err(vhost->dev, "Received duplicate correlation_token 0x%08llx!\n",
+ crq->ioba);
+ return;
+ }
+
+ spin_lock_irqsave(vhost->host->host_lock, flags);
+ del_timer(&evt->timer);
+ list_del(&evt->queue);
+ ibmvfc_trc_end(evt);
+ spin_unlock_irqrestore(vhost->host->host_lock, flags);
+ evt->done(evt);
+}
+
+static struct ibmvfc_crq *ibmvfc_next_scrq(struct ibmvfc_sub_queue *scrq)
+{
+ struct ibmvfc_crq *crq;
+
+ crq = &scrq->msgs[scrq->cur].crq;
+ if (crq->valid & 0x80) {
+ if (++scrq->cur == scrq->size)
+ scrq->cur = 0;
+ rmb();
+ } else
+ crq = NULL;
+
+ return crq;
+}
+
+static void ibmvfc_drain_sub_crq(struct ibmvfc_sub_queue *scrq)
+{
+ struct ibmvfc_crq *crq;
+ int done = 0;
+
+ while (!done) {
+ while ((crq = ibmvfc_next_scrq(scrq)) != NULL) {
+ ibmvfc_handle_scrq(crq, scrq->vhost);
+ crq->valid = 0;
+ wmb();
+ }
+
+ ibmvfc_toggle_scrq_irq(scrq, 1);
+ if ((crq = ibmvfc_next_scrq(scrq)) != NULL) {
+ ibmvfc_toggle_scrq_irq(scrq, 0);
+ ibmvfc_handle_scrq(crq, scrq->vhost);
+ crq->valid = 0;
+ wmb();
+ } else
+ done = 1;
+ }
+}
+
/**
* ibmvfc_init_tgt - Set the next init job step for the target
* @tgt: ibmvfc target struct
--
2.27.0
New NPIV_ENQUIRY_CHANNEL and NPIV_SETUP_CHANNEL management datagrams
(MADs) were defined in a previous patchset. If the client advertises a
desire to use channels and the partner VIOS is channel capable, then
immediately following NPIV Login the client must proceed with channel
enquiry to determine the maximum number of channels the VIOS is capable
of providing, and then register its Sub-CRQs with the VIOS via channel
setup. This handshaking should not be performed for subsequent NPIV
Logins unless the CRQ connection has been reset.
Implement these two new MADs and issue them following a successful NPIV
login where the VIOS has set the SUPPORT_CHANNELS capability bit in the
NPIV Login response.
Signed-off-by: Tyrel Datwyler <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 135 ++++++++++++++++++++++++++++++++-
drivers/scsi/ibmvscsi/ibmvfc.h | 3 +
2 files changed, 136 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 0336833a6950..bfd3340eb0b6 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -806,6 +806,8 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
spin_lock_irqsave(vhost->host->host_lock, flags);
vhost->state = IBMVFC_NO_CRQ;
vhost->logged_in = 0;
+ vhost->do_enquiry = 1;
+ vhost->using_channels = 0;
/* Clean out the queue */
memset(crq->msgs, 0, PAGE_SIZE);
@@ -4473,6 +4475,118 @@ static void ibmvfc_discover_targets(struct ibmvfc_host *vhost)
ibmvfc_link_down(vhost, IBMVFC_LINK_DEAD);
}
+static void ibmvfc_channel_setup_done(struct ibmvfc_event *evt)
+{
+ struct ibmvfc_host *vhost = evt->vhost;
+ u32 mad_status = be16_to_cpu(evt->xfer_iu->channel_setup.common.status);
+ int level = IBMVFC_DEFAULT_LOG_LEVEL;
+
+ ibmvfc_free_event(evt);
+
+ switch (mad_status) {
+ case IBMVFC_MAD_SUCCESS:
+ ibmvfc_dbg(vhost, "Channel Setup succeeded\n");
+ vhost->do_enquiry = 0;
+ break;
+ case IBMVFC_MAD_FAILED:
+ level += ibmvfc_retry_host_init(vhost);
+ ibmvfc_log(vhost, level, "Channel Setup failed\n");
+ fallthrough;
+ case IBMVFC_MAD_DRIVER_FAILED:
+ return;
+ default:
+ dev_err(vhost->dev, "Invalid Channel Setup response: 0x%x\n",
+ mad_status);
+ ibmvfc_link_down(vhost, IBMVFC_LINK_DEAD);
+ return;
+ }
+
+ ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_QUERY);
+ wake_up(&vhost->work_wait_q);
+}
+
+static void ibmvfc_channel_setup(struct ibmvfc_host *vhost)
+{
+ struct ibmvfc_channel_setup_mad *mad;
+ struct ibmvfc_channel_setup *setup_buf = vhost->channel_setup_buf;
+ struct ibmvfc_event *evt = ibmvfc_get_event(vhost);
+
+ memset(setup_buf, 0, sizeof(*setup_buf));
+ setup_buf->flags = cpu_to_be32(IBMVFC_CANCEL_CHANNELS);
+
+ ibmvfc_init_event(evt, ibmvfc_channel_setup_done, IBMVFC_MAD_FORMAT);
+ mad = &evt->iu.channel_setup;
+ memset(mad, 0, sizeof(*mad));
+ mad->common.version = cpu_to_be32(1);
+ mad->common.opcode = cpu_to_be32(IBMVFC_CHANNEL_SETUP);
+ mad->common.length = cpu_to_be16(sizeof(*mad));
+ mad->buffer.va = cpu_to_be64(vhost->channel_setup_dma);
+ mad->buffer.len = cpu_to_be32(sizeof(*vhost->channel_setup_buf));
+
+ ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_INIT_WAIT);
+
+ if (!ibmvfc_send_event(evt, vhost, default_timeout))
+ ibmvfc_dbg(vhost, "Sent channel setup\n");
+ else
+ ibmvfc_link_down(vhost, IBMVFC_LINK_DOWN);
+}
+
+static void ibmvfc_channel_enquiry_done(struct ibmvfc_event *evt)
+{
+ struct ibmvfc_host *vhost = evt->vhost;
+ struct ibmvfc_channel_enquiry *rsp = &evt->xfer_iu->channel_enquiry;
+ u32 mad_status = be16_to_cpu(rsp->common.status);
+ int level = IBMVFC_DEFAULT_LOG_LEVEL;
+
+ switch (mad_status) {
+ case IBMVFC_MAD_SUCCESS:
+ ibmvfc_dbg(vhost, "Channel Enquiry succeeded\n");
+ vhost->max_vios_scsi_channels = be32_to_cpu(rsp->num_scsi_subq_channels);
+ ibmvfc_free_event(evt);
+ break;
+ case IBMVFC_MAD_FAILED:
+ level += ibmvfc_retry_host_init(vhost);
+ ibmvfc_log(vhost, level, "Channel Enquiry failed\n");
+ fallthrough;
+ case IBMVFC_MAD_DRIVER_FAILED:
+ ibmvfc_free_event(evt);
+ return;
+ default:
+ dev_err(vhost->dev, "Invalid Channel Enquiry response: 0x%x\n",
+ mad_status);
+ ibmvfc_link_down(vhost, IBMVFC_LINK_DEAD);
+ ibmvfc_free_event(evt);
+ return;
+ }
+
+ ibmvfc_channel_setup(vhost);
+}
+
+static void ibmvfc_channel_enquiry(struct ibmvfc_host *vhost)
+{
+ struct ibmvfc_channel_enquiry *mad;
+ struct ibmvfc_event *evt = ibmvfc_get_event(vhost);
+
+ ibmvfc_init_event(evt, ibmvfc_channel_enquiry_done, IBMVFC_MAD_FORMAT);
+ mad = &evt->iu.channel_enquiry;
+ memset(mad, 0, sizeof(*mad));
+ mad->common.version = cpu_to_be32(1);
+ mad->common.opcode = cpu_to_be32(IBMVFC_CHANNEL_ENQUIRY);
+ mad->common.length = cpu_to_be16(sizeof(*mad));
+
+ if (IBMVFC_MIG_NO_SUB_TO_CRQ)
+ mad->flags |= cpu_to_be32(IBMVFC_NO_CHANNELS_TO_CRQ_SUPPORT);
+ if (IBMVFC_MIG_NO_N_TO_M)
+ mad->flags |= cpu_to_be32(IBMVFC_NO_N_TO_M_CHANNELS_SUPPORT);
+
+ ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_INIT_WAIT);
+
+ if (!ibmvfc_send_event(evt, vhost, default_timeout))
+ ibmvfc_dbg(vhost, "Sent channel enquiry\n");
+ else
+ ibmvfc_link_down(vhost, IBMVFC_LINK_DEAD);
+}
+
/**
* ibmvfc_npiv_login_done - Completion handler for NPIV Login
* @evt: ibmvfc event struct
@@ -4554,8 +4668,14 @@ static void ibmvfc_npiv_login_done(struct ibmvfc_event *evt)
vhost->host->can_queue = be32_to_cpu(rsp->max_cmds) - IBMVFC_NUM_INTERNAL_REQ;
vhost->host->max_sectors = npiv_max_sectors;
- ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_QUERY);
- wake_up(&vhost->work_wait_q);
+
+ if (ibmvfc_check_caps(vhost, IBMVFC_CAN_SUPPORT_CHANNELS) && vhost->do_enquiry) {
+ ibmvfc_channel_enquiry(vhost);
+ } else {
+ vhost->do_enquiry = 0;
+ ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_QUERY);
+ wake_up(&vhost->work_wait_q);
+ }
}
/**
@@ -5327,9 +5447,20 @@ static int ibmvfc_alloc_mem(struct ibmvfc_host *vhost)
goto free_trace;
}
+ vhost->channel_setup_buf = dma_alloc_coherent(dev, sizeof(*vhost->channel_setup_buf),
+ &vhost->channel_setup_dma,
+ GFP_KERNEL);
+
+ if (!vhost->channel_setup_buf) {
+ dev_err(dev, "Couldn't allocate Channel Setup buffer\n");
+ goto free_tgt_pool;
+ }
+
LEAVE;
return 0;
+free_tgt_pool:
+ mempool_destroy(vhost->tgt_pool);
free_trace:
kfree(vhost->trace);
free_disc_buffer:
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index 986ce4530382..dff26dbd912c 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -842,10 +842,13 @@ struct ibmvfc_host {
struct ibmvfc_npiv_login login_info;
union ibmvfc_npiv_login_data *login_buf;
dma_addr_t login_buf_dma;
+ struct ibmvfc_channel_setup *channel_setup_buf;
+ dma_addr_t channel_setup_dma;
int disc_buf_sz;
int log_level;
struct ibmvfc_discover_targets_entry *disc_buf;
struct mutex passthru_mutex;
+ int max_vios_scsi_channels;
int task_set;
int init_retries;
int discovery_threads;
--
2.27.0
Allocate a set of Sub-CRQs in advance. During channel setup the client
and VIOS negotiate the number of queues the VIOS supports and the number
that the client desires to request. It's possible that the final channel
resources allocated are fewer than requested, but the client is still
responsible for sending handles for every queue it is hoping for.
Also, provide deallocation cleanup routines.
Signed-off-by: Tyrel Datwyler <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 128 +++++++++++++++++++++++++++++++++
drivers/scsi/ibmvscsi/ibmvfc.h | 1 +
2 files changed, 129 insertions(+)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 64674054dbae..4860487c6779 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -793,6 +793,8 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
unsigned long flags;
struct vio_dev *vdev = to_vio_dev(vhost->dev);
struct ibmvfc_crq_queue *crq = &vhost->crq;
+ struct ibmvfc_sub_queue *scrq;
+ int i;
/* Close the CRQ */
do {
@@ -809,6 +811,14 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
memset(crq->msgs, 0, PAGE_SIZE);
crq->cur = 0;
+ if (vhost->scsi_scrqs.scrqs) {
+ for (i = 0; i < IBMVFC_SCSI_HW_QUEUES; i++) {
+ scrq = &vhost->scsi_scrqs.scrqs[i];
+ memset(scrq->msgs, 0, PAGE_SIZE);
+ scrq->cur = 0;
+ }
+ }
+
/* And re-open it again */
rc = plpar_hcall_norets(H_REG_CRQ, vdev->unit_address,
crq->msg_token, PAGE_SIZE);
@@ -4983,6 +4993,117 @@ static int ibmvfc_init_crq(struct ibmvfc_host *vhost)
return retrc;
}
+static int ibmvfc_register_scsi_channel(struct ibmvfc_host *vhost,
+ int index)
+{
+ struct device *dev = vhost->dev;
+ struct vio_dev *vdev = to_vio_dev(dev);
+ struct ibmvfc_sub_queue *scrq = &vhost->scsi_scrqs.scrqs[index];
+ int rc = -ENOMEM;
+
+ ENTER;
+
+ scrq->msgs = (struct ibmvfc_sub_crq *)get_zeroed_page(GFP_KERNEL);
+ if (!scrq->msgs)
+ return rc;
+
+ scrq->size = PAGE_SIZE / sizeof(*scrq->msgs);
+ scrq->msg_token = dma_map_single(dev, scrq->msgs, PAGE_SIZE,
+ DMA_BIDIRECTIONAL);
+
+ if (dma_mapping_error(dev, scrq->msg_token))
+ goto dma_map_failed;
+
+ rc = h_reg_sub_crq(vdev->unit_address, scrq->msg_token, PAGE_SIZE,
+ &scrq->cookie, &scrq->hw_irq);
+
+ if (rc) {
+ dev_warn(dev, "Error registering sub-crq: %d\n", rc);
+ dev_warn(dev, "Firmware may not support MQ\n");
+ goto reg_failed;
+ }
+
+ scrq->hwq_id = index;
+ scrq->vhost = vhost;
+
+ LEAVE;
+ return 0;
+
+reg_failed:
+ dma_unmap_single(dev, scrq->msg_token, PAGE_SIZE, DMA_BIDIRECTIONAL);
+dma_map_failed:
+ free_page((unsigned long)scrq->msgs);
+ LEAVE;
+ return rc;
+}
+
+static void ibmvfc_deregister_scsi_channel(struct ibmvfc_host *vhost, int index)
+{
+ struct device *dev = vhost->dev;
+ struct vio_dev *vdev = to_vio_dev(dev);
+ struct ibmvfc_sub_queue *scrq = &vhost->scsi_scrqs.scrqs[index];
+ long rc;
+
+ ENTER;
+
+ do {
+ rc = plpar_hcall_norets(H_FREE_SUB_CRQ, vdev->unit_address,
+ scrq->cookie);
+ } while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
+
+ if (rc)
+ dev_err(dev, "Failed to free sub-crq[%d]: rc=%ld\n", index, rc);
+
+ dma_unmap_single(dev, scrq->msg_token, PAGE_SIZE, DMA_BIDIRECTIONAL);
+ free_page((unsigned long)scrq->msgs);
+ LEAVE;
+}
+
+static int ibmvfc_init_sub_crqs(struct ibmvfc_host *vhost)
+{
+ int i, j;
+
+ ENTER;
+
+ vhost->scsi_scrqs.scrqs = kcalloc(IBMVFC_SCSI_HW_QUEUES,
+ sizeof(*vhost->scsi_scrqs.scrqs),
+ GFP_KERNEL);
+ if (!vhost->scsi_scrqs.scrqs)
+ return -1;
+
+ for (i = 0; i < IBMVFC_SCSI_HW_QUEUES; i++) {
+ if (ibmvfc_register_scsi_channel(vhost, i)) {
+ for (j = i; j > 0; j--)
+ ibmvfc_deregister_scsi_channel(vhost, j - 1);
+ kfree(vhost->scsi_scrqs.scrqs);
+ vhost->scsi_scrqs.scrqs = NULL;
+ vhost->scsi_scrqs.active_queues = 0;
+ LEAVE;
+ return -1;
+ }
+ }
+
+ LEAVE;
+ return 0;
+}
+
+static void ibmvfc_release_sub_crqs(struct ibmvfc_host *vhost)
+{
+ int i;
+
+ ENTER;
+ if (!vhost->scsi_scrqs.scrqs)
+ return;
+
+ for (i = 0; i < IBMVFC_SCSI_HW_QUEUES; i++)
+ ibmvfc_deregister_scsi_channel(vhost, i);
+
+ kfree(vhost->scsi_scrqs.scrqs);
+ vhost->scsi_scrqs.scrqs = NULL;
+ vhost->scsi_scrqs.active_queues = 0;
+ LEAVE;
+}
+
/**
* ibmvfc_free_mem - Free memory for vhost
* @vhost: ibmvfc host struct
@@ -5239,6 +5360,12 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
goto remove_shost;
}
+ if (vhost->mq_enabled) {
+ rc = ibmvfc_init_sub_crqs(vhost);
+ if (rc)
+ dev_warn(dev, "Failed to allocate Sub-CRQs. rc=%d\n", rc);
+ }
+
if (shost_to_fc_host(shost)->rqst_q)
blk_queue_max_segments(shost_to_fc_host(shost)->rqst_q, 1);
dev_set_drvdata(dev, vhost);
@@ -5296,6 +5423,7 @@ static int ibmvfc_remove(struct vio_dev *vdev)
ibmvfc_purge_requests(vhost, DID_ERROR);
spin_unlock_irqrestore(vhost->host->host_lock, flags);
ibmvfc_free_event_pool(vhost);
+ ibmvfc_release_sub_crqs(vhost);
ibmvfc_free_mem(vhost);
spin_lock(&ibmvfc_driver_lock);
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index b3cd35cbf067..986ce4530382 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -838,6 +838,7 @@ struct ibmvfc_host {
mempool_t *tgt_pool;
struct ibmvfc_crq_queue crq;
struct ibmvfc_async_crq_queue async_crq;
+ struct ibmvfc_scsi_channels scsi_scrqs;
struct ibmvfc_npiv_login login_info;
union ibmvfc_npiv_login_data *login_buf;
dma_addr_t login_buf_dma;
--
2.27.0
Simple handler that calls the Sub-CRQ drain routine directly.
Signed-off-by: Tyrel Datwyler <[email protected]>
Reviewed-by: Brian King <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index e9da3f60c793..a3e2d627c1ac 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -3458,6 +3458,16 @@ static void ibmvfc_drain_sub_crq(struct ibmvfc_sub_queue *scrq)
}
}
+static irqreturn_t ibmvfc_interrupt_scsi(int irq, void *scrq_instance)
+{
+ struct ibmvfc_sub_queue *scrq = (struct ibmvfc_sub_queue *)scrq_instance;
+
+ ibmvfc_toggle_scrq_irq(scrq, 0);
+ ibmvfc_drain_sub_crq(scrq);
+
+ return IRQ_HANDLED;
+}
+
/**
* ibmvfc_init_tgt - Set the next init job step for the target
* @tgt: ibmvfc target struct
--
2.27.0
Create an irq mapping for the hw_irq number provided by phyp firmware.
Request an irq assigned to our Sub-CRQ interrupt handler.
Signed-off-by: Tyrel Datwyler <[email protected]>
Reviewed-by: Brian King <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index a3e2d627c1ac..0336833a6950 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -5130,12 +5130,34 @@ static int ibmvfc_register_scsi_channel(struct ibmvfc_host *vhost,
goto reg_failed;
}
+ scrq->irq = irq_create_mapping(NULL, scrq->hw_irq);
+
+ if (!scrq->irq) {
+ rc = -EINVAL;
+ dev_err(dev, "Error mapping sub-crq[%d] irq\n", index);
+ goto irq_failed;
+ }
+
+ snprintf(scrq->name, sizeof(scrq->name), "ibmvfc-%x-scsi%d",
+ vdev->unit_address, index);
+ rc = request_irq(scrq->irq, ibmvfc_interrupt_scsi, 0, scrq->name, scrq);
+
+ if (rc) {
+ dev_err(dev, "Couldn't register sub-crq[%d] irq\n", index);
+ irq_dispose_mapping(scrq->irq);
+ goto irq_failed;
+ }
+
scrq->hwq_id = index;
scrq->vhost = vhost;
LEAVE;
return 0;
+irq_failed:
+ do {
+ plpar_hcall_norets(H_FREE_SUB_CRQ, vdev->unit_address, scrq->cookie);
+ } while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
reg_failed:
dma_unmap_single(dev, scrq->msg_token, PAGE_SIZE, DMA_BIDIRECTIONAL);
dma_map_failed:
--
2.27.0
Sub-CRQs are registered with firmware via a hypercall. Abstract that
interface into a simpler helper function.
Signed-off-by: Tyrel Datwyler <[email protected]>
Reviewed-by: Brian King <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index f1d677a7423d..64674054dbae 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -138,6 +138,20 @@ static void ibmvfc_tgt_move_login(struct ibmvfc_target *);
static const char *unknown_error = "unknown error";
+static long h_reg_sub_crq(unsigned long unit_address, unsigned long ioba,
+ unsigned long length, unsigned long *cookie,
+ unsigned long *irq)
+{
+ unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
+ long rc;
+
+ rc = plpar_hcall(H_REG_SUB_CRQ, retbuf, unit_address, ioba, length);
+ *cookie = retbuf[0];
+ *irq = retbuf[1];
+
+ return rc;
+}
+
static int ibmvfc_check_caps(struct ibmvfc_host *vhost, unsigned long cap_flags)
{
u64 host_caps = be64_to_cpu(vhost->login_buf->resp.capabilities);
--
2.27.0
Previous patches have plumbed the necessary Sub-CRQ interface and
channel negotiation MADs for fully channelized hardware queues.
Advertise client support via the NPIV Login capability
IBMVFC_CAN_USE_CHANNELS when the client has MQ enabled via
vhost->mq_enabled, or when channels were already in use during a
subsequent NPIV Login. The latter is required because channel support is
only renegotiated after a CRQ pair is broken. Simple NPIV Logout/Logins
require the client to continue to advertise the channel capability until
the CRQ pair between the client and VIOS is broken.
Signed-off-by: Tyrel Datwyler <[email protected]>
Reviewed-by: Brian King <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index bfd3340eb0b6..0e6c9e55a221 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -1282,6 +1282,10 @@ static void ibmvfc_set_login_info(struct ibmvfc_host *vhost)
login_info->max_cmds = cpu_to_be32(max_requests + IBMVFC_NUM_INTERNAL_REQ);
login_info->capabilities = cpu_to_be64(IBMVFC_CAN_MIGRATE | IBMVFC_CAN_SEND_VF_WWPN);
+
+ if (vhost->mq_enabled || vhost->using_channels)
+ login_info->capabilities |= cpu_to_be64(IBMVFC_CAN_USE_CHANNELS);
+
login_info->async.va = cpu_to_be64(vhost->async_crq.msg_token);
login_info->async.len = cpu_to_be32(vhost->async_crq.size * sizeof(*vhost->async_crq.msgs));
strncpy(login_info->partition_name, vhost->partition_name, IBMVFC_MAX_NAME);
--
2.27.0
Introduce several new vhost fields for managing MQ state of the adapter
as well as initial defaults for MQ enablement.
Signed-off-by: Tyrel Datwyler <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 9 ++++++++-
drivers/scsi/ibmvscsi/ibmvfc.h | 13 +++++++++++--
2 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 42e4d35e0d35..f1d677a7423d 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -5161,12 +5161,13 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
}
shost->transportt = ibmvfc_transport_template;
- shost->can_queue = max_requests;
+ shost->can_queue = (max_requests / IBMVFC_SCSI_HW_QUEUES);
shost->max_lun = max_lun;
shost->max_id = max_targets;
shost->max_sectors = IBMVFC_MAX_SECTORS;
shost->max_cmd_len = IBMVFC_MAX_CDB_LEN;
shost->unique_id = shost->host_no;
+ shost->nr_hw_queues = IBMVFC_SCSI_HW_QUEUES;
vhost = shost_priv(shost);
INIT_LIST_HEAD(&vhost->sent);
@@ -5178,6 +5179,12 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
vhost->partition_number = -1;
vhost->log_level = log_level;
vhost->task_set = 1;
+
+ vhost->mq_enabled = IBMVFC_MQ;
+ vhost->client_scsi_channels = IBMVFC_SCSI_CHANNELS;
+ vhost->using_channels = 0;
+ vhost->do_enquiry = 1;
+
strcpy(vhost->partition_name, "UNKNOWN");
init_waitqueue_head(&vhost->work_wait_q);
init_waitqueue_head(&vhost->init_wait_q);
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index 9d58cfd774d3..e095daada70e 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -41,16 +41,21 @@
#define IBMVFC_DEFAULT_LOG_LEVEL 2
#define IBMVFC_MAX_CDB_LEN 16
#define IBMVFC_CLS3_ERROR 0
+#define IBMVFC_MQ 0
+#define IBMVFC_SCSI_CHANNELS 0
+#define IBMVFC_SCSI_HW_QUEUES 1
+#define IBMVFC_MIG_NO_SUB_TO_CRQ 0
+#define IBMVFC_MIG_NO_N_TO_M 0
/*
* Ensure we have resources for ERP and initialization:
- * 1 for ERP
* 1 for initialization
* 1 for NPIV Logout
* 2 for BSG passthru
* 2 for each discovery thread
+ * 1 ERP for each possible HW Queue
*/
-#define IBMVFC_NUM_INTERNAL_REQ (1 + 1 + 1 + 2 + (disc_threads * 2))
+#define IBMVFC_NUM_INTERNAL_REQ (1 + 1 + 2 + (disc_threads * 2) + IBMVFC_SCSI_HW_QUEUES)
#define IBMVFC_MAD_SUCCESS 0x00
#define IBMVFC_MAD_NOT_SUPPORTED 0xF1
@@ -826,6 +831,10 @@ struct ibmvfc_host {
int delay_init;
int scan_complete;
int logged_in;
+ int mq_enabled;
+ int using_channels;
+ int do_enquiry;
+ int client_scsi_channels;
int aborting_passthru;
int events_to_log;
#define IBMVFC_AE_LINKUP 0x0001
--
2.27.0
Each Sub-CRQ has its own interrupt. A hypercall is required to toggle
the IRQ state. Provide the necessary mechanism via a helper function.
Signed-off-by: Tyrel Datwyler <[email protected]>
Reviewed-by: Brian King <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 4860487c6779..97f00fefa809 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -3361,6 +3361,26 @@ static void ibmvfc_tasklet(void *data)
spin_unlock_irqrestore(vhost->host->host_lock, flags);
}
+static int ibmvfc_toggle_scrq_irq(struct ibmvfc_sub_queue *scrq, int enable)
+{
+ struct device *dev = scrq->vhost->dev;
+ struct vio_dev *vdev = to_vio_dev(dev);
+ unsigned long rc;
+ int irq_action = H_ENABLE_VIO_INTERRUPT;
+
+ if (!enable)
+ irq_action = H_DISABLE_VIO_INTERRUPT;
+
+ rc = plpar_hcall_norets(H_VIOCTL, vdev->unit_address, irq_action,
+ scrq->hw_irq, 0, 0);
+
+ if (rc)
+ dev_err(dev, "Couldn't %s sub-crq[%lu] irq. rc=%ld\n",
+ enable ? "enable" : "disable", scrq->hwq_id, rc);
+
+ return rc;
+}
+
/**
* ibmvfc_init_tgt - Set the next init job step for the target
* @tgt: ibmvfc target struct
--
2.27.0
Subordinate Command Response Queues (Sub CRQ) are used in conjunction
with the primary CRQ when more than one queue is needed by the virtual
IO adapter. Recent phyp firmware versions support Sub-CRQs with ibmvfc
adapters. This feature is a prerequisite for supporting multiple
hardware backed submission queues in the vfc adapter.
The Sub-CRQ command element differs from the standard CRQ element in
that it is 32 bytes long as opposed to 16 bytes for the latter. Despite
the extra 16 bytes, the ibmvfc protocol will initially use the original
CRQ command element mapped to the first 16 bytes of the Sub-CRQ element.
Add definitions for the Sub CRQ command element and queue.
Signed-off-by: Tyrel Datwyler <[email protected]>
Reviewed-by: Brian King <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.h | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index e095daada70e..b3cd35cbf067 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -656,6 +656,29 @@ struct ibmvfc_crq_queue {
dma_addr_t msg_token;
};
+struct ibmvfc_sub_crq {
+ struct ibmvfc_crq crq;
+ __be64 reserved[2];
+} __packed __aligned(8);
+
+struct ibmvfc_sub_queue {
+ struct ibmvfc_sub_crq *msgs;
+ dma_addr_t msg_token;
+ int size, cur;
+ struct ibmvfc_host *vhost;
+ unsigned long cookie;
+ unsigned long vios_cookie;
+ unsigned long hw_irq;
+ unsigned long irq;
+ unsigned long hwq_id;
+ char name[32];
+};
+
+struct ibmvfc_scsi_channels {
+ struct ibmvfc_sub_queue *scrqs;
+ unsigned int active_queues;
+};
+
enum ibmvfc_ae_link_state {
IBMVFC_AE_LS_LINK_UP = 0x01,
IBMVFC_AE_LS_LINK_BOUNCED = 0x02,
--
2.27.0
On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
> Introduce several new vhost fields for managing MQ state of the adapter
> as well as initial defaults for MQ enablement.
>
> Signed-off-by: Tyrel Datwyler <[email protected]>
> ---
> drivers/scsi/ibmvscsi/ibmvfc.c | 9 ++++++++-
> drivers/scsi/ibmvscsi/ibmvfc.h | 13 +++++++++++--
> 2 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
> index 42e4d35e0d35..f1d677a7423d 100644
> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
> @@ -5161,12 +5161,13 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
> }
>
> shost->transportt = ibmvfc_transport_template;
> - shost->can_queue = max_requests;
> + shost->can_queue = (max_requests / IBMVFC_SCSI_HW_QUEUES);
This doesn't look right. can_queue is the SCSI host queue depth, not the MQ queue depth.
> shost->max_lun = max_lun;
> shost->max_id = max_targets;
> shost->max_sectors = IBMVFC_MAX_SECTORS;
> shost->max_cmd_len = IBMVFC_MAX_CDB_LEN;
> shost->unique_id = shost->host_no;
> + shost->nr_hw_queues = IBMVFC_SCSI_HW_QUEUES;
>
> vhost = shost_priv(shost);
> INIT_LIST_HEAD(&vhost->sent);
--
Brian King
Power Linux I/O
IBM Linux Technology Center
On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
> +static int ibmvfc_register_scsi_channel(struct ibmvfc_host *vhost,
> + int index)
> +{
> + struct device *dev = vhost->dev;
> + struct vio_dev *vdev = to_vio_dev(dev);
> + struct ibmvfc_sub_queue *scrq = &vhost->scsi_scrqs.scrqs[index];
> + int rc = -ENOMEM;
> +
> + ENTER;
> +
> + scrq->msgs = (struct ibmvfc_sub_crq *)get_zeroed_page(GFP_KERNEL);
> + if (!scrq->msgs)
> + return rc;
> +
> + scrq->size = PAGE_SIZE / sizeof(*scrq->msgs);
> + scrq->msg_token = dma_map_single(dev, scrq->msgs, PAGE_SIZE,
> + DMA_BIDIRECTIONAL);
> +
> + if (dma_mapping_error(dev, scrq->msg_token))
> + goto dma_map_failed;
> +
> + rc = h_reg_sub_crq(vdev->unit_address, scrq->msg_token, PAGE_SIZE,
> + &scrq->cookie, &scrq->hw_irq);
> +
> + if (rc) {
> + dev_warn(dev, "Error registering sub-crq: %d\n", rc);
> + dev_warn(dev, "Firmware may not support MQ\n");
Will this now get logged everywhere this new driver runs if the firmware
does not support sub CRQs? Is there something better that could be done
here to only log this for a true error and not just because a new driver
is running with an older firmware release?
> + goto reg_failed;
> + }
> +
> + scrq->hwq_id = index;
> + scrq->vhost = vhost;
> +
> + LEAVE;
> + return 0;
> +
> +reg_failed:
> + dma_unmap_single(dev, scrq->msg_token, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +dma_map_failed:
> + free_page((unsigned long)scrq->msgs);
> + LEAVE;
> + return rc;
> +}
> +
--
Brian King
Power Linux I/O
IBM Linux Technology Center
On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
> The logic for iterating over the Sub-CRQ responses is similar to that
> of the primary CRQ. Add the necessary handlers for processing those
> responses.
>
> Signed-off-by: Tyrel Datwyler <[email protected]>
> ---
> drivers/scsi/ibmvscsi/ibmvfc.c | 77 ++++++++++++++++++++++++++++++++++
> 1 file changed, 77 insertions(+)
>
> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
> index 97f00fefa809..e9da3f60c793 100644
> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
> @@ -3381,6 +3381,83 @@ static int ibmvfc_toggle_scrq_irq(struct ibmvfc_sub_queue *scrq, int enable)
> return rc;
> }
>
> +static void ibmvfc_handle_scrq(struct ibmvfc_crq *crq, struct ibmvfc_host *vhost)
> +{
> + struct ibmvfc_event *evt = (struct ibmvfc_event *)be64_to_cpu(crq->ioba);
> + unsigned long flags;
> +
> + switch (crq->valid) {
> + case IBMVFC_CRQ_CMD_RSP:
> + break;
> + case IBMVFC_CRQ_XPORT_EVENT:
> + return;
> + default:
> + dev_err(vhost->dev, "Got an invalid message type 0x%02x\n", crq->valid);
> + return;
> + }
> +
> + /* The only kind of payload CRQs we should get are responses to
> + * things we send. Make sure this response is to something we
> + * actually sent
> + */
> + if (unlikely(!ibmvfc_valid_event(&vhost->pool, evt))) {
> + dev_err(vhost->dev, "Returned correlation_token 0x%08llx is invalid!\n",
> + crq->ioba);
> + return;
> + }
> +
> + if (unlikely(atomic_read(&evt->free))) {
> + dev_err(vhost->dev, "Received duplicate correlation_token 0x%08llx!\n",
> + crq->ioba);
> + return;
> + }
> +
> + spin_lock_irqsave(vhost->host->host_lock, flags);
> + del_timer(&evt->timer);
> + list_del(&evt->queue);
> + ibmvfc_trc_end(evt);
> + spin_unlock_irqrestore(vhost->host->host_lock, flags);
> + evt->done(evt);
> +}
> +
> +static struct ibmvfc_crq *ibmvfc_next_scrq(struct ibmvfc_sub_queue *scrq)
> +{
> + struct ibmvfc_crq *crq;
> +
> + crq = &scrq->msgs[scrq->cur].crq;
> + if (crq->valid & 0x80) {
> + if (++scrq->cur == scrq->size)
You are incrementing the cur pointer without any locks held. Although
unlikely, could you also be in ibmvfc_reset_crq in another thread?
If so, you'd have a subtle race condition here where the cur pointer could
be read, then ibmvfc_reset_crq writes it to zero, then this thread
writes it to a non zero value, which would then cause you to be out of
sync with the VIOS as to where the cur pointer is.
> + scrq->cur = 0;
> + rmb();
> + } else
> + crq = NULL;
> +
> + return crq;
> +}
> +
--
Brian King
Power Linux I/O
IBM Linux Technology Center
Reviewed-by: Brian King <[email protected]>
--
Brian King
Power Linux I/O
IBM Linux Technology Center
On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
> +static void ibmvfc_handle_scrq(struct ibmvfc_crq *crq, struct ibmvfc_host *vhost)
> +{
> + struct ibmvfc_event *evt = (struct ibmvfc_event *)be64_to_cpu(crq->ioba);
> + unsigned long flags;
> +
> + switch (crq->valid) {
> + case IBMVFC_CRQ_CMD_RSP:
> + break;
> + case IBMVFC_CRQ_XPORT_EVENT:
> + return;
> + default:
> + dev_err(vhost->dev, "Got an invalid message type 0x%02x\n", crq->valid);
> + return;
> + }
> +
> + /* The only kind of payload CRQs we should get are responses to
> + * things we send. Make sure this response is to something we
> + * actually sent
> + */
> + if (unlikely(!ibmvfc_valid_event(&vhost->pool, evt))) {
> + dev_err(vhost->dev, "Returned correlation_token 0x%08llx is invalid!\n",
> + crq->ioba);
> + return;
> + }
> +
> + if (unlikely(atomic_read(&evt->free))) {
> + dev_err(vhost->dev, "Received duplicate correlation_token 0x%08llx!\n",
> + crq->ioba);
> + return;
> + }
> +
> + spin_lock_irqsave(vhost->host->host_lock, flags);
> + del_timer(&evt->timer);
> + list_del(&evt->queue);
> + ibmvfc_trc_end(evt);
Another thought here... If you are going through ibmvfc_purge_requests at the same time
as this code, you could check the free bit above, then have ibmvfc_purge_requests
put the event on the free queue and call scsi_done, then you come down and get the host
lock here, remove the command from the free list, and call the done function again,
which could result in a double completion to the scsi layer.
I think you need to grab the host lock before you check the free bit to avoid this race.
> + spin_unlock_irqrestore(vhost->host->host_lock, flags);
> + evt->done(evt);
> +}
> +
--
Brian King
Power Linux I/O
IBM Linux Technology Center
Reviewed-by: Brian King <[email protected]>
--
Brian King
Power Linux I/O
IBM Linux Technology Center
On 12/2/20 7:14 AM, Brian King wrote:
> On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
>> Introduce several new vhost fields for managing MQ state of the adapter
>> as well as initial defaults for MQ enablement.
>>
>> Signed-off-by: Tyrel Datwyler <[email protected]>
>> ---
>> drivers/scsi/ibmvscsi/ibmvfc.c | 9 ++++++++-
>> drivers/scsi/ibmvscsi/ibmvfc.h | 13 +++++++++++--
>> 2 files changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
>> index 42e4d35e0d35..f1d677a7423d 100644
>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
>> @@ -5161,12 +5161,13 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
>> }
>>
>> shost->transportt = ibmvfc_transport_template;
>> - shost->can_queue = max_requests;
>> + shost->can_queue = (max_requests / IBMVFC_SCSI_HW_QUEUES);
>
> This doesn't look right. can_queue is the SCSI host queue depth, not the MQ queue depth.
Our max_requests is the total number of commands allowed across all
queues. From what I understand, can_queue is the total number of commands
in flight allowed for each hw queue.
/*
* In scsi-mq mode, the number of hardware queues supported by the LLD.
*
* Note: it is assumed that each hardware queue has a queue depth of
* can_queue. In other words, the total queue depth per host
* is nr_hw_queues * can_queue. However, for when host_tagset is set,
* the total queue depth is can_queue.
*/
We currently don't use the host wide shared tagset.
-Tyrel
>
>> shost->max_lun = max_lun;
>> shost->max_id = max_targets;
>> shost->max_sectors = IBMVFC_MAX_SECTORS;
>> shost->max_cmd_len = IBMVFC_MAX_CDB_LEN;
>> shost->unique_id = shost->host_no;
>> + shost->nr_hw_queues = IBMVFC_SCSI_HW_QUEUES;
>>
>> vhost = shost_priv(shost);
>> INIT_LIST_HEAD(&vhost->sent);
>
>
>
On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
> In general the client needs to send Cancel MADs and task management
> commands down the same channel as the command(s) intended to cancel or
> abort. The client assigns cancel keys per LUN and thus must send a
> Cancel down each channel commands were submitted for that LUN. Further,
> the client then must wait for those cancel completions prior to
> submitting a LUN RESET or ABORT TASK SET.
>
> Allocate event pointers for each possible scsi channel and assign an
> event for each channel that requires a cancel. Wait for the completion
> of each submitted cancel.
>
> Signed-off-by: Tyrel Datwyler <[email protected]>
> ---
> drivers/scsi/ibmvscsi/ibmvfc.c | 106 +++++++++++++++++++++------------
> 1 file changed, 68 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
> index 0b6284020f06..97e8eed04b01 100644
> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
> @@ -2339,32 +2339,52 @@ static int ibmvfc_cancel_all(struct scsi_device *sdev, int type)
> {
> struct ibmvfc_host *vhost = shost_priv(sdev->host);
> struct ibmvfc_event *evt, *found_evt;
> - union ibmvfc_iu rsp;
> - int rsp_rc = -EBUSY;
> + struct ibmvfc_event **evt_list;
> + union ibmvfc_iu *rsp;
> + int rsp_rc = 0;
> unsigned long flags;
> u16 status;
> + int num_hwq = 1;
> + int i;
> + int ret = 0;
>
> ENTER;
> spin_lock_irqsave(vhost->host->host_lock, flags);
> - found_evt = NULL;
> - list_for_each_entry(evt, &vhost->sent, queue) {
> - if (evt->cmnd && evt->cmnd->device == sdev) {
> - found_evt = evt;
> - break;
> + if (vhost->using_channels && vhost->scsi_scrqs.active_queues)
> + num_hwq = vhost->scsi_scrqs.active_queues;
> +
> + evt_list = kcalloc(num_hwq, sizeof(*evt_list), GFP_KERNEL);
> + rsp = kcalloc(num_hwq, sizeof(*rsp), GFP_KERNEL);
Can't this just go on the stack? We don't want to be allocating memory
during error recovery. Or, alternatively, you could put this in the
vhost structure and protect it with a mutex. We only have enough events
to single thread these anyway.
> +
> + for (i = 0; i < num_hwq; i++) {
> + sdev_printk(KERN_INFO, sdev, "Cancelling outstanding commands on queue %d.\n", i);
Prior to this patch, if there was nothing outstanding to the device and cancel_all was called,
no messages would get printed. This is changing that behavior. Is that intentional? Additionally,
it looks like this will get a lot more verbose, logging a message for each hw queue, regardless
of whether there was anything outstanding. Perhaps you want to move this down to after the check
for !found_evt?
> +
> + found_evt = NULL;
> + list_for_each_entry(evt, &vhost->sent, queue) {
> + if (evt->cmnd && evt->cmnd->device == sdev && evt->hwq == i) {
> + found_evt = evt;
> + break;
> + }
> }
> - }
>
--
Brian King
Power Linux I/O
IBM Linux Technology Center
Reviewed-by: Brian King <[email protected]>
--
Brian King
Power Linux I/O
IBM Linux Technology Center
On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
> +module_param_named(mig_channels_only, mig_channels_only, uint, S_IRUGO | S_IWUSR);
> +MODULE_PARM_DESC(mig_channels_only, "Prevent migration to non-channelized system. "
> + "[Default=" __stringify(IBMVFC_MIG_NO_SUB_TO_CRQ) "]");
> +module_param_named(mig_no_less_channels, mig_no_less_channels, uint, S_IRUGO | S_IWUSR);
> +MODULE_PARM_DESC(mig_no_less_channels, "Prevent migration to system with less channels. "
> + "[Default=" __stringify(IBMVFC_MIG_NO_N_TO_M) "]");
Both of these are writeable, but it doesn't look like you do any re-negotiation
with the VIOS for these changed settings to take effect if someone changes
them at runtime.
> +
> module_param_named(init_timeout, init_timeout, uint, S_IRUGO | S_IWUSR);
> MODULE_PARM_DESC(init_timeout, "Initialization timeout in seconds. "
> "[Default=" __stringify(IBMVFC_INIT_TIMEOUT) "]");
> @@ -3228,6 +3250,36 @@ static ssize_t ibmvfc_store_log_level(struct device *dev,
> return strlen(buf);
> }
>
> +static ssize_t ibmvfc_show_scsi_channels(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct Scsi_Host *shost = class_to_shost(dev);
> + struct ibmvfc_host *vhost = shost_priv(shost);
> + unsigned long flags = 0;
> + int len;
> +
> + spin_lock_irqsave(shost->host_lock, flags);
> + len = snprintf(buf, PAGE_SIZE, "%d\n", vhost->client_scsi_channels);
> + spin_unlock_irqrestore(shost->host_lock, flags);
> + return len;
> +}
> +
> +static ssize_t ibmvfc_store_scsi_channels(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + struct Scsi_Host *shost = class_to_shost(dev);
> + struct ibmvfc_host *vhost = shost_priv(shost);
> + unsigned long flags = 0;
> + unsigned int channels;
> +
> + spin_lock_irqsave(shost->host_lock, flags);
> + channels = simple_strtoul(buf, NULL, 10);
> + vhost->client_scsi_channels = min(channels, nr_scsi_hw_queues);
Don't we need to do a LIP here for this new setting to go into effect?
> + spin_unlock_irqrestore(shost->host_lock, flags);
> + return strlen(buf);
> +}
> +
> static DEVICE_ATTR(partition_name, S_IRUGO, ibmvfc_show_host_partition_name, NULL);
> static DEVICE_ATTR(device_name, S_IRUGO, ibmvfc_show_host_device_name, NULL);
> static DEVICE_ATTR(port_loc_code, S_IRUGO, ibmvfc_show_host_loc_code, NULL);
--
Brian King
Power Linux I/O
IBM Linux Technology Center
On 12/2/20 10:27 AM, Brian King wrote:
> On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
>> In general the client needs to send Cancel MADs and task management
>> commands down the same channel as the command(s) intended to cancel or
>> abort. The client assigns cancel keys per LUN and thus must send a
>> Cancel down each channel commands were submitted for that LUN. Further,
>> the client then must wait for those cancel completions prior to
>> submitting a LUN RESET or ABORT TASK SET.
>>
>> Allocate event pointers for each possible scsi channel and assign an
>> event for each channel that requires a cancel. Wait for completion of each
>> submitted cancel.
>>
>> Signed-off-by: Tyrel Datwyler <[email protected]>
>> ---
>> drivers/scsi/ibmvscsi/ibmvfc.c | 106 +++++++++++++++++++++------------
>> 1 file changed, 68 insertions(+), 38 deletions(-)
>>
>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
>> index 0b6284020f06..97e8eed04b01 100644
>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
>> @@ -2339,32 +2339,52 @@ static int ibmvfc_cancel_all(struct scsi_device *sdev, int type)
>> {
>> struct ibmvfc_host *vhost = shost_priv(sdev->host);
>> struct ibmvfc_event *evt, *found_evt;
>> - union ibmvfc_iu rsp;
>> - int rsp_rc = -EBUSY;
>> + struct ibmvfc_event **evt_list;
>> + union ibmvfc_iu *rsp;
>> + int rsp_rc = 0;
>> unsigned long flags;
>> u16 status;
>> + int num_hwq = 1;
>> + int i;
>> + int ret = 0;
>>
>> ENTER;
>> spin_lock_irqsave(vhost->host->host_lock, flags);
>> - found_evt = NULL;
>> - list_for_each_entry(evt, &vhost->sent, queue) {
>> - if (evt->cmnd && evt->cmnd->device == sdev) {
>> - found_evt = evt;
>> - break;
>> + if (vhost->using_channels && vhost->scsi_scrqs.active_queues)
>> + num_hwq = vhost->scsi_scrqs.active_queues;
>> +
>> + evt_list = kcalloc(num_hwq, sizeof(*evt_list), GFP_KERNEL);
>> + rsp = kcalloc(num_hwq, sizeof(*rsp), GFP_KERNEL);
>
> Can't this just go on the stack? We don't want to be allocating memory
> during error recovery. Or, alternatively, you could put this in the
> vhost structure and protect it with a mutex. We only have enough events
> to single thread these anyway.
Yes, this could just go on the stack.
>
>> +
>> + for (i = 0; i < num_hwq; i++) {
>> + sdev_printk(KERN_INFO, sdev, "Cancelling outstanding commands on queue %d.\n", i);
>
> Prior to this patch, if there was nothing outstanding to the device and cancel_all was called,
> no messages would get printed. This is changing that behavior. Is that intentional? Additionally,
> it looks like this will get a lot more verbose, logging a message for each hw queue, regardless
> of whether there was anything outstanding. Perhaps you want to move this down to after the check
> for !found_evt?
It would actually print "no commands found to cancel". I think it's fair to make
it less verbose, or at least make the per-queue messages dbg output.
-Tyrel
>
>> +
>> + found_evt = NULL;
>> + list_for_each_entry(evt, &vhost->sent, queue) {
>> + if (evt->cmnd && evt->cmnd->device == sdev && evt->hwq == i) {
>> + found_evt = evt;
>> + break;
>> + }
>> }
>> - }
>>
>
>
>
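The stack-based alternative agreed on above can be sketched in userspace C. This is a hedged illustration only: MAX_HW_QUEUES, fake_rsp, and cancel_all_sketch are invented names standing in for the driver's structures, not the actual ibmvfc code.

```c
/* Userspace sketch: bounding the per-queue arrays at a compile-time
 * maximum so the cancel path needs no heap allocation during error
 * recovery. All names here are illustrative, not the driver's. */
#include <string.h>

#define MAX_HW_QUEUES 16          /* assumed upper bound on hw queues */

struct fake_rsp { int status; };  /* stand-in for union ibmvfc_iu */

static int cancel_all_sketch(int num_hwq)
{
	/* Fixed-size stack arrays: nothing here can fail mid-recovery. */
	void *evt_list[MAX_HW_QUEUES];
	struct fake_rsp rsp[MAX_HW_QUEUES];

	if (num_hwq > MAX_HW_QUEUES)
		return -1;

	memset(evt_list, 0, sizeof(evt_list));
	memset(rsp, 0, sizeof(rsp));

	/* One cancel per hw queue would be issued here using rsp[i]. */
	return 0;
}
```

Since num_hwq is bounded by the number of configured hw queues, the fixed arrays cost at most a few hundred bytes of stack, which is why this avoids the GFP_KERNEL allocations in the error path.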
On 12/2/20 10:40 AM, Brian King wrote:
> On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
>> +module_param_named(mig_channels_only, mig_channels_only, uint, S_IRUGO | S_IWUSR);
>> +MODULE_PARM_DESC(mig_channels_only, "Prevent migration to non-channelized system. "
>> + "[Default=" __stringify(IBMVFC_MIG_NO_SUB_TO_CRQ) "]");
>> +module_param_named(mig_no_less_channels, mig_no_less_channels, uint, S_IRUGO | S_IWUSR);
>> +MODULE_PARM_DESC(mig_no_less_channels, "Prevent migration to system with less channels. "
>> + "[Default=" __stringify(IBMVFC_MIG_NO_N_TO_M) "]");
>
> Both of these are writeable, but it doesn't look like you do any re-negotiation
> with the VIOS for these changed settings to take effect if someone changes
> them at runtime.
For some reason I convinced myself that these could just be changed on the fly,
but yes, for them to actually take effect we need to re-negotiate the channel setup.
>
>> +
>> module_param_named(init_timeout, init_timeout, uint, S_IRUGO | S_IWUSR);
>> MODULE_PARM_DESC(init_timeout, "Initialization timeout in seconds. "
>> "[Default=" __stringify(IBMVFC_INIT_TIMEOUT) "]");
>
>> @@ -3228,6 +3250,36 @@ static ssize_t ibmvfc_store_log_level(struct device *dev,
>> return strlen(buf);
>> }
>>
>> +static ssize_t ibmvfc_show_scsi_channels(struct device *dev,
>> + struct device_attribute *attr, char *buf)
>> +{
>> + struct Scsi_Host *shost = class_to_shost(dev);
>> + struct ibmvfc_host *vhost = shost_priv(shost);
>> + unsigned long flags = 0;
>> + int len;
>> +
>> + spin_lock_irqsave(shost->host_lock, flags);
>> + len = snprintf(buf, PAGE_SIZE, "%d\n", vhost->client_scsi_channels);
>> + spin_unlock_irqrestore(shost->host_lock, flags);
>> + return len;
>> +}
>> +
>> +static ssize_t ibmvfc_store_scsi_channels(struct device *dev,
>> + struct device_attribute *attr,
>> + const char *buf, size_t count)
>> +{
>> + struct Scsi_Host *shost = class_to_shost(dev);
>> + struct ibmvfc_host *vhost = shost_priv(shost);
>> + unsigned long flags = 0;
>> + unsigned int channels;
>> +
>> + spin_lock_irqsave(shost->host_lock, flags);
>> + channels = simple_strtoul(buf, NULL, 10);
>> + vhost->client_scsi_channels = min(channels, nr_scsi_hw_queues);
>
> Don't we need to do a LIP here for this new setting to go into effect?
Actually, we need a hard reset to break the CRQ Pair. A LIP will only do an NPIV
Logout/Login, which keeps the existing channel setup.
-Tyrel
>
>> + spin_unlock_irqrestore(shost->host_lock, flags);
>> + return strlen(buf);
>> +}
>> +
>> static DEVICE_ATTR(partition_name, S_IRUGO, ibmvfc_show_host_partition_name, NULL);
>> static DEVICE_ATTR(device_name, S_IRUGO, ibmvfc_show_host_device_name, NULL);
>> static DEVICE_ATTR(port_loc_code, S_IRUGO, ibmvfc_show_host_loc_code, NULL);
>
>
>
On 12/2/20 7:25 AM, Brian King wrote:
> On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
>> +static int ibmvfc_register_scsi_channel(struct ibmvfc_host *vhost,
>> + int index)
>> +{
>> + struct device *dev = vhost->dev;
>> + struct vio_dev *vdev = to_vio_dev(dev);
>> + struct ibmvfc_sub_queue *scrq = &vhost->scsi_scrqs.scrqs[index];
>> + int rc = -ENOMEM;
>> +
>> + ENTER;
>> +
>> + scrq->msgs = (struct ibmvfc_sub_crq *)get_zeroed_page(GFP_KERNEL);
>> + if (!scrq->msgs)
>> + return rc;
>> +
>> + scrq->size = PAGE_SIZE / sizeof(*scrq->msgs);
>> + scrq->msg_token = dma_map_single(dev, scrq->msgs, PAGE_SIZE,
>> + DMA_BIDIRECTIONAL);
>> +
>> + if (dma_mapping_error(dev, scrq->msg_token))
>> + goto dma_map_failed;
>> +
>> + rc = h_reg_sub_crq(vdev->unit_address, scrq->msg_token, PAGE_SIZE,
>> + &scrq->cookie, &scrq->hw_irq);
>> +
>> + if (rc) {
>> + dev_warn(dev, "Error registering sub-crq: %d\n", rc);
>> + dev_warn(dev, "Firmware may not support MQ\n");
>
> Will this now get logged everywhere this new driver runs if the firmware
> does not support sub CRQs? Is there something better that could be done
> here to only log this for a true error and not just because a new driver
> is running with an older firmware release?
I suppose we can guess that if the rc is H_PARAMETER, sub-crqs are probably not
supported, and do a dev_warn_once() for the no-MQ-support message. H_PARAMETER
could mean other things though, so we still need to log the failure in my opinion.
-Tyrel
>
>> + goto reg_failed;
>> + }
>> +
>> + scrq->hwq_id = index;
>> + scrq->vhost = vhost;
>> +
>> + LEAVE;
>> + return 0;
>> +
>> +reg_failed:
>> + dma_unmap_single(dev, scrq->msg_token, PAGE_SIZE, DMA_BIDIRECTIONAL);
>> +dma_map_failed:
>> + free_page((unsigned long)scrq->msgs);
>> + LEAVE;
>> + return rc;
>> +}
>> +
>
>
>
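The warn-once idea from the reply above could look roughly like this userspace analogue. The H_PARAMETER value and the helper name are assumptions for illustration; this is not the kernel's dev_warn_once() machinery.

```c
/* Sketch: always log a sub-crq registration failure, but print the
 * "may not support MQ" hint only once, on the first H_PARAMETER-style
 * failure. Illustrative only. */
#include <stdio.h>
#include <stdbool.h>

#define H_PARAMETER (-4)  /* assumed value for this sketch */

/* Returns how many messages were logged for this failure. */
static int handle_reg_failure(int rc)
{
	static bool warned_no_mq;
	int msgs = 1;

	fprintf(stderr, "Error registering sub-crq: %d\n", rc);

	/* H_PARAMETER most likely means missing sub-CRQ support, so the
	 * hint is emitted only the first time, not once per queue. */
	if (rc == H_PARAMETER && !warned_no_mq) {
		warned_no_mq = true;
		fprintf(stderr, "Firmware may not support MQ\n");
		msgs++;
	}
	return msgs;
}
```

Called once per queue on old firmware, a 16-queue setup would log sixteen failure lines but only one "may not support MQ" hint.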
On 12/2/20 7:46 AM, Brian King wrote:
> On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
>> The logic for iterating over the Sub-CRQ responses is similar to that
>> of the primary CRQ. Add the necessary handlers for processing those
>> responses.
>>
>> Signed-off-by: Tyrel Datwyler <[email protected]>
>> ---
>> drivers/scsi/ibmvscsi/ibmvfc.c | 77 ++++++++++++++++++++++++++++++++++
>> 1 file changed, 77 insertions(+)
>>
>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
>> index 97f00fefa809..e9da3f60c793 100644
>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
>> @@ -3381,6 +3381,83 @@ static int ibmvfc_toggle_scrq_irq(struct ibmvfc_sub_queue *scrq, int enable)
>> return rc;
>> }
>>
>> +static void ibmvfc_handle_scrq(struct ibmvfc_crq *crq, struct ibmvfc_host *vhost)
>> +{
>> + struct ibmvfc_event *evt = (struct ibmvfc_event *)be64_to_cpu(crq->ioba);
>> + unsigned long flags;
>> +
>> + switch (crq->valid) {
>> + case IBMVFC_CRQ_CMD_RSP:
>> + break;
>> + case IBMVFC_CRQ_XPORT_EVENT:
>> + return;
>> + default:
>> + dev_err(vhost->dev, "Got and invalid message type 0x%02x\n", crq->valid);
>> + return;
>> + }
>> +
>> + /* The only kind of payload CRQs we should get are responses to
>> + * things we send. Make sure this response is to something we
>> + * actually sent
>> + */
>> + if (unlikely(!ibmvfc_valid_event(&vhost->pool, evt))) {
>> + dev_err(vhost->dev, "Returned correlation_token 0x%08llx is invalid!\n",
>> + crq->ioba);
>> + return;
>> + }
>> +
>> + if (unlikely(atomic_read(&evt->free))) {
>> + dev_err(vhost->dev, "Received duplicate correlation_token 0x%08llx!\n",
>> + crq->ioba);
>> + return;
>> + }
>> +
>> + spin_lock_irqsave(vhost->host->host_lock, flags);
>> + del_timer(&evt->timer);
>> + list_del(&evt->queue);
>> + ibmvfc_trc_end(evt);
>> + spin_unlock_irqrestore(vhost->host->host_lock, flags);
>> + evt->done(evt);
>> +}
>> +
>> +static struct ibmvfc_crq *ibmvfc_next_scrq(struct ibmvfc_sub_queue *scrq)
>> +{
>> + struct ibmvfc_crq *crq;
>> +
>> + crq = &scrq->msgs[scrq->cur].crq;
>> + if (crq->valid & 0x80) {
>> + if (++scrq->cur == scrq->size)
>
> You are incrementing the cur pointer without any locks held. Although
> unlikely, could you also be in ibmvfc_reset_crq in another thread?
> If so, you'd have a subtle race condition here where the cur pointer could
> be read, then ibmvfc_reset_crq writes it to zero, then this thread
> writes it to a non zero value, which would then cause you to be out of
> sync with the VIOS as to where the cur pointer is.
Oof, yeah I was previously holding the lock the whole time, but switched it up
once I realized I can't complete a scsi command with the lock held, and got a
little too loose with it.
-Tyrel
>
>> + scrq->cur = 0;
>> + rmb();
>> + } else
>> + crq = NULL;
>> +
>> + return crq;
>> +}
>> +
>
>
>
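The cursor race described above can be illustrated with a small userspace analogue. The names, queue size, and the pthread mutex are stand-ins (the driver would use the host lock); entries are consumed on read purely for this sketch's sake.

```c
/* Sketch of the fix being discussed: both the consumer that advances the
 * cursor and the reset path must take the same lock, otherwise the
 * read-modify-write of cur races with the reset writing it to zero. */
#include <pthread.h>
#include <stddef.h>

#define QUEUE_SIZE 8

struct sub_queue {
	unsigned char valid[QUEUE_SIZE]; /* stands in for crq->valid & 0x80 */
	size_t cur;
	pthread_mutex_t lock;
};

/* Return the index of the next valid entry (consuming it), or -1.
 * The cursor advance happens entirely under the lock. */
static int next_entry(struct sub_queue *q)
{
	int idx = -1;

	pthread_mutex_lock(&q->lock);
	if (q->valid[q->cur]) {
		idx = (int)q->cur;
		q->valid[q->cur] = 0;     /* consumed, for this sketch */
		if (++q->cur == QUEUE_SIZE)
			q->cur = 0;
	}
	pthread_mutex_unlock(&q->lock);
	return idx;
}

/* Reset path (the ibmvfc_reset_crq analogue) takes the same lock. */
static void reset_queue(struct sub_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->cur = 0;
	pthread_mutex_unlock(&q->lock);
}
```

Completing the command itself (the evt->done() analogue) would still happen after dropping the lock, which is the constraint that motivated releasing it early in the first place.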
On 12/2/20 11:27 AM, Tyrel Datwyler wrote:
> On 12/2/20 7:14 AM, Brian King wrote:
>> On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
>>> Introduce several new vhost fields for managing MQ state of the adapter
>>> as well as initial defaults for MQ enablement.
>>>
>>> Signed-off-by: Tyrel Datwyler <[email protected]>
>>> ---
>>> drivers/scsi/ibmvscsi/ibmvfc.c | 9 ++++++++-
>>> drivers/scsi/ibmvscsi/ibmvfc.h | 13 +++++++++++--
>>> 2 files changed, 19 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
>>> index 42e4d35e0d35..f1d677a7423d 100644
>>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
>>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
>>> @@ -5161,12 +5161,13 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
>>> }
>>>
>>> shost->transportt = ibmvfc_transport_template;
>>> - shost->can_queue = max_requests;
>>> + shost->can_queue = (max_requests / IBMVFC_SCSI_HW_QUEUES);
>>
>> This doesn't look right. can_queue is the SCSI host queue depth, not the MQ queue depth.
>
> Our max_requests is the total number of commands allowed across all queues. From
> what I understand, can_queue is the total number of commands in flight allowed
> for each hw queue.
>
> /*
> * In scsi-mq mode, the number of hardware queues supported by the LLD.
> *
> * Note: it is assumed that each hardware queue has a queue depth of
> * can_queue. In other words, the total queue depth per host
> * is nr_hw_queues * can_queue. However, for when host_tagset is set,
> * the total queue depth is can_queue.
> */
>
> We currently don't use the host wide shared tagset.
Ok. I missed that bit... In that case, note that we allocate by default only 100
event structs. If we slice that across IBMVFC_SCSI_HW_QUEUES (16) queues, then
we end up with only about 6 commands that can be outstanding per queue,
which is going to really hurt performance... I'd suggest bumping up
IBMVFC_MAX_REQUESTS_DEFAULT from 100 to 1000 as a starting point.
Thanks,
Brian
--
Brian King
Power Linux I/O
IBM Linux Technology Center
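The arithmetic behind the concern above, as a quick check (the values are the ones quoted in the thread):

```c
/* Without host_tagset, can_queue is a per-hw-queue depth, so dividing the
 * event pool across the hw queues leaves very few slots for each one. */
static int per_queue_depth(int max_requests, int nr_hw_queues)
{
	return max_requests / nr_hw_queues;
}
```

per_queue_depth(100, 16) gives only 6 outstanding commands per queue, while the suggested bump to 1000 raises that to 62.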
On 12/4/20 3:26 PM, Brian King wrote:
> On 12/2/20 11:27 AM, Tyrel Datwyler wrote:
>> On 12/2/20 7:14 AM, Brian King wrote:
>>> On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
>>>> Introduce several new vhost fields for managing MQ state of the adapter
>>>> as well as initial defaults for MQ enablement.
>>>>
>>>> Signed-off-by: Tyrel Datwyler <[email protected]>
>>>> ---
>>>> drivers/scsi/ibmvscsi/ibmvfc.c | 9 ++++++++-
>>>> drivers/scsi/ibmvscsi/ibmvfc.h | 13 +++++++++++--
>>>> 2 files changed, 19 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
>>>> index 42e4d35e0d35..f1d677a7423d 100644
>>>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
>>>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
>>>> @@ -5161,12 +5161,13 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
>>>> }
>>>>
>>>> shost->transportt = ibmvfc_transport_template;
>>>> - shost->can_queue = max_requests;
>>>> + shost->can_queue = (max_requests / IBMVFC_SCSI_HW_QUEUES);
>>>
>>> This doesn't look right. can_queue is the SCSI host queue depth, not the MQ queue depth.
>>
>> Our max_requests is the total number of commands allowed across all queues. From
>> what I understand, can_queue is the total number of commands in flight allowed
>> for each hw queue.
>>
>> /*
>> * In scsi-mq mode, the number of hardware queues supported by the LLD.
>> *
>> * Note: it is assumed that each hardware queue has a queue depth of
>> * can_queue. In other words, the total queue depth per host
>> * is nr_hw_queues * can_queue. However, for when host_tagset is set,
>> * the total queue depth is can_queue.
>> */
>>
>> We currently don't use the host wide shared tagset.
>
> Ok. I missed that bit... In that case, note that we allocate by default only 100
> event structs. If we slice that across IBMVFC_SCSI_HW_QUEUES (16) queues, then
> we end up with only about 6 commands that can be outstanding per queue,
> which is going to really hurt performance... I'd suggest bumping up
> IBMVFC_MAX_REQUESTS_DEFAULT from 100 to 1000 as a starting point.
>
Before doing that I'd rather use the host-wide shared tagset.
Increasing the number of requests will increase the memory footprint of
the driver (as each request will be statically allocated).
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
On 12/7/20 3:56 AM, Hannes Reinecke wrote:
> On 12/4/20 3:26 PM, Brian King wrote:
>> On 12/2/20 11:27 AM, Tyrel Datwyler wrote:
>>> On 12/2/20 7:14 AM, Brian King wrote:
>>>> On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
>>>>> Introduce several new vhost fields for managing MQ state of the adapter
>>>>> as well as initial defaults for MQ enablement.
>>>>>
>>>>> Signed-off-by: Tyrel Datwyler <[email protected]>
>>>>> ---
>>>>> drivers/scsi/ibmvscsi/ibmvfc.c | 9 ++++++++-
>>>>> drivers/scsi/ibmvscsi/ibmvfc.h | 13 +++++++++++--
>>>>> 2 files changed, 19 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
>>>>> index 42e4d35e0d35..f1d677a7423d 100644
>>>>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
>>>>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
>>>>> @@ -5161,12 +5161,13 @@ static int ibmvfc_probe(struct vio_dev *vdev, const
>>>>> struct vio_device_id *id)
>>>>> }
>>>>> shost->transportt = ibmvfc_transport_template;
>>>>> - shost->can_queue = max_requests;
>>>>> + shost->can_queue = (max_requests / IBMVFC_SCSI_HW_QUEUES);
>>>>
>>>> This doesn't look right. can_queue is the SCSI host queue depth, not the MQ
>>>> queue depth.
>>>
>>> Our max_requests is the total number of commands allowed across all queues. From
>>> what I understand, can_queue is the total number of commands in flight allowed
>>> for each hw queue.
>>>
>>> /*
>>> * In scsi-mq mode, the number of hardware queues supported by the LLD.
>>> *
>>> * Note: it is assumed that each hardware queue has a queue depth of
>>> * can_queue. In other words, the total queue depth per host
>>> * is nr_hw_queues * can_queue. However, for when host_tagset is set,
>>> * the total queue depth is can_queue.
>>> */
>>>
>>> We currently don't use the host wide shared tagset.
>>
>> Ok. I missed that bit... In that case, note that we allocate by default only 100
>> event structs. If we slice that across IBMVFC_SCSI_HW_QUEUES (16) queues, then
>> we end up with only about 6 commands that can be outstanding per queue,
>> which is going to really hurt performance... I'd suggest bumping up
>> IBMVFC_MAX_REQUESTS_DEFAULT from 100 to 1000 as a starting point.
>>
> Before doing that I'd rather use the host-wide shared tagset.
> Increasing the number of requests will increase the memory footprint of the
> driver (as each request will be statically allocated).
>
In the case where we use host-wide tags, how do I determine the queue depth per
hardware queue? Is it hypothetically can_queue or is it (can_queue /
nr_hw_queues)? We want to allocate an event pool per-queue which made sense
without host-wide tags since the queue depth per hw queue is exactly can_queue.
-Tyrel
On 08/12/2020 22:37, Tyrel Datwyler wrote:
> On 12/7/20 3:56 AM, Hannes Reinecke wrote:
>> On 12/4/20 3:26 PM, Brian King wrote:
>>> On 12/2/20 11:27 AM, Tyrel Datwyler wrote:
>>>> On 12/2/20 7:14 AM, Brian King wrote:
>>>>> On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
>>>>>> Introduce several new vhost fields for managing MQ state of the adapter
>>>>>> as well as initial defaults for MQ enablement.
>>>>>>
>>>>>> Signed-off-by: Tyrel Datwyler <[email protected]>
>>>>>> ---
>>>>>> drivers/scsi/ibmvscsi/ibmvfc.c | 9 ++++++++-
>>>>>> drivers/scsi/ibmvscsi/ibmvfc.h | 13 +++++++++++--
>>>>>> 2 files changed, 19 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
>>>>>> index 42e4d35e0d35..f1d677a7423d 100644
>>>>>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
>>>>>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
>>>>>> @@ -5161,12 +5161,13 @@ static int ibmvfc_probe(struct vio_dev *vdev, const
>>>>>> struct vio_device_id *id)
>>>>>> }
>>>>>> shost->transportt = ibmvfc_transport_template;
>>>>>> - shost->can_queue = max_requests;
>>>>>> + shost->can_queue = (max_requests / IBMVFC_SCSI_HW_QUEUES);
>>>>>
>>>>> This doesn't look right. can_queue is the SCSI host queue depth, not the MQ
>>>>> queue depth.
>>>>
>>>> Our max_requests is the total number of commands allowed across all queues. From
>>>> what I understand, can_queue is the total number of commands in flight allowed
>>>> for each hw queue.
>>>>
>>>> /*
>>>> * In scsi-mq mode, the number of hardware queues supported by the LLD.
>>>> *
>>>> * Note: it is assumed that each hardware queue has a queue depth of
>>>> * can_queue. In other words, the total queue depth per host
>>>> * is nr_hw_queues * can_queue. However, for when host_tagset is set,
>>>> * the total queue depth is can_queue.
>>>> */
>>>>
>>>> We currently don't use the host wide shared tagset.
>>>
>>> Ok. I missed that bit... In that case, note that we allocate by default only 100
>>> event structs. If we slice that across IBMVFC_SCSI_HW_QUEUES (16) queues, then
>>> we end up with only about 6 commands that can be outstanding per queue,
>>> which is going to really hurt performance... I'd suggest bumping up
>>> IBMVFC_MAX_REQUESTS_DEFAULT from 100 to 1000 as a starting point.
>>>
>> Before doing that I'd rather use the host-wide shared tagset.
>> Increasing the number of requests will increase the memory footprint of the
>> driver (as each request will be statically allocated).
Exposing HW queues increases the memory footprint, as we allocate the static
requests per HW queue ctx regardless of whether the hostwide shared tagset is
enabled or not. This could probably be improved.
>>
>
> In the case where we use host-wide tags, how do I determine the queue depth per
> hardware queue? Is it hypothetically can_queue or is it (can_queue /
> nr_hw_queues)? We want to allocate an event pool per-queue which made sense
> without host-wide tags since the queue depth per hw queue is exactly can_queue.
>
Generally the hw queue depth should be the same as can_queue, and this applies
when hostwide shared tags are enabled as well.
We do this for hisi_sas: the host can queue max 4096 commands over all
queues, so we set .can_queue = 4096*, set HW queue depth = 4096, and set
.host_tagset = 1.
* we need to reserve some commands for internal IO, so this is reduced a
little
Thanks,
John
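The two tagging models John describes differ only in how the total depth is computed; a minimal sketch (function name is illustrative, the rule follows the scsi_host.h comment quoted earlier in the thread):

```c
/* Total host queue depth under the two models discussed: without
 * host_tagset the block layer assumes nr_hw_queues * can_queue tags;
 * with host_tagset the can_queue tags are shared across all queues. */
#include <stdbool.h>

static int total_depth(int can_queue, int nr_hw_queues, bool host_tagset)
{
	return host_tagset ? can_queue : can_queue * nr_hw_queues;
}
```

With the hisi_sas-style setup above (.can_queue = 4096, .host_tagset = 1), the total stays 4096 shared tags, while the non-shared model with a sliced depth of 6 across 16 queues yields only 96 tags host-wide.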