Changes to the locking pattern protecting the event lists and handling of scsi
command completion introduced a race where an ouststanding command that EH is
waiting ifor to complete is no longer identifiable by being on the sent list, but
instead as a command that is not on the free list. This is a result of moving
commands to be completed off the sent list to a private list to be completed
outside the list lock.
Second, during MQ enablement the ibmvfc_wait_for_ops helper used during EH to
ensure commands were properely completed failed to be converted to check for
commands on the sub-queues isntead of the primary CRQ.
Tyrel Datwyler (2):
ibmvfc: fix potential race in ibmvfc_wait_for_ops
ibmvfc: make ibmvfc_wait_for_ops MQ aware
drivers/scsi/ibmvscsi/ibmvfc.c | 67 +++++++++++++++++++++++++++-------
1 file changed, 54 insertions(+), 13 deletions(-)
--
2.27.0
For various EH activities the ibmvfc driver uses ibmvfc_wait_for_ops()
to wait for the completion of commands that match a given criteria be it
cancel key, or specific LUN. With recent changes commands are completed
outside the lock in bulk by removing them from the sent list and adding
them to a private completion list. This introduces a potential race in
ibmvfc_wait_for_ops() since the criteria for a command to be oustsanding
is no longer simply being on the sent list, but instead not being on the
free list.
Avoid this race by scanning the entire command event pool and checking
that any matching command that ibmvfc needs to wait on is not already on
the free list.
Fixes: 1f4a4a19508d ("scsi: ibmvfc: Complete commands outside the host/queue lock")
Signed-off-by: Tyrel Datwyler <[email protected]>
---
drivers/scsi/ibmvscsi/ibmvfc.c | 42 ++++++++++++++++++++++++++--------
1 file changed, 33 insertions(+), 9 deletions(-)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index f140eafc0d6a..5414b465a92f 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -2371,6 +2371,24 @@ static int ibmvfc_match_lun(struct ibmvfc_event *evt, void *device)
return 0;
}
+/**
+ * ibmvfc_event_is_free - Check if event is free or not
+ * @evt: ibmvfc event struct
+ *
+ * Returns:
+ * true / false
+ **/
+static bool ibmvfc_event_is_free(struct ibmvfc_event *evt)
+{
+ struct ibmvfc_event *loop_evt;
+
+ list_for_each_entry(loop_evt, &evt->queue->free, queue_list)
+ if (loop_evt == evt)
+ return true;
+
+ return false;
+}
+
/**
* ibmvfc_wait_for_ops - Wait for ops to complete
* @vhost: ibmvfc host struct
@@ -2385,7 +2403,7 @@ static int ibmvfc_wait_for_ops(struct ibmvfc_host *vhost, void *device,
{
struct ibmvfc_event *evt;
DECLARE_COMPLETION_ONSTACK(comp);
- int wait;
+ int wait, i;
unsigned long flags;
signed long timeout = IBMVFC_ABORT_WAIT_TIMEOUT * HZ;
@@ -2393,10 +2411,13 @@ static int ibmvfc_wait_for_ops(struct ibmvfc_host *vhost, void *device,
do {
wait = 0;
spin_lock_irqsave(&vhost->crq.l_lock, flags);
- list_for_each_entry(evt, &vhost->crq.sent, queue_list) {
- if (match(evt, device)) {
- evt->eh_comp = ∁
- wait++;
+ for (i = 0; i < vhost->crq.evt_pool.size; i++) {
+ evt = &vhost->crq.evt_pool.events[i];
+ if (!ibmvfc_event_is_free(evt)) {
+ if (match(evt, device)) {
+ evt->eh_comp = ∁
+ wait++;
+ }
}
}
spin_unlock_irqrestore(&vhost->crq.l_lock, flags);
@@ -2407,10 +2428,13 @@ static int ibmvfc_wait_for_ops(struct ibmvfc_host *vhost, void *device,
if (!timeout) {
wait = 0;
spin_lock_irqsave(&vhost->crq.l_lock, flags);
- list_for_each_entry(evt, &vhost->crq.sent, queue_list) {
- if (match(evt, device)) {
- evt->eh_comp = NULL;
- wait++;
+ for (i = 0; i < vhost->crq.evt_pool.size; i++) {
+ evt = &vhost->crq.evt_pool.events[i];
+ if (!ibmvfc_event_is_free(evt)) {
+ if (match(evt, device)) {
+ evt->eh_comp = NULL;
+ wait++;
+ }
}
}
spin_unlock_irqrestore(&vhost->crq.l_lock, flags);
--
2.27.0
Reviewed-by: Brian King <[email protected]>
--
Brian King
Power Linux I/O
IBM Linux Technology Center
On Fri, 19 Mar 2021 14:50:27 -0600, Tyrel Datwyler wrote:
> Changes to the locking pattern protecting the event lists and handling of scsi
> command completion introduced a race where an ouststanding command that EH is
> waiting ifor to complete is no longer identifiable by being on the sent list, but
> instead as a command that is not on the free list. This is a result of moving
> commands to be completed off the sent list to a private list to be completed
> outside the list lock.
>
> [...]
Applied to 5.12/scsi-fixes, thanks!
[1/2] ibmvfc: fix potential race in ibmvfc_wait_for_ops
https://git.kernel.org/mkp/scsi/c/8b1c9b202549
[2/2] ibmvfc: make ibmvfc_wait_for_ops MQ aware
https://git.kernel.org/mkp/scsi/c/62fc2661482b
--
Martin K. Petersen Oracle Linux Engineering