As &qedi_percpu->p_work_lock is acquired by hard irq qedi_msix_handler(),
other acquisition of the same lock under process context should disable
irq, otherwise deadlock could happen if the irq preempt the execution
while the lock is held in process context on the same CPU.
qedi_cpu_offline() is one such function acquires the lock on process
context.
[Deadlock Scenario]
qedi_cpu_offline()
->spin_lock(&p->p_work_lock)
<irq>
->qedi_msix_handler()
->edi_process_completions()
->spin_lock_irqsave(&p->p_work_lock, flags); (deadlock here)
This flaw was found by an experimental static analysis tool I am
developing for irq-related deadlock.
The tentative patch fix the potential deadlock by spin_lock_irqsave()
under process context.
Signed-off-by: Chengfeng Ye <[email protected]>
---
drivers/scsi/qedi/qedi_main.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
index 450522b204d6..77a56a136678 100644
--- a/drivers/scsi/qedi/qedi_main.c
+++ b/drivers/scsi/qedi/qedi_main.c
@@ -1976,8 +1976,9 @@ static int qedi_cpu_offline(unsigned int cpu)
struct qedi_percpu_s *p = this_cpu_ptr(&qedi_percpu);
struct qedi_work *work, *tmp;
struct task_struct *thread;
+ unsigned long flags;
- spin_lock_bh(&p->p_work_lock);
+ spin_lock_irqsave(&p->p_work_lock, flags);
thread = p->iothread;
p->iothread = NULL;
@@ -1988,7 +1989,7 @@ static int qedi_cpu_offline(unsigned int cpu)
kfree(work);
}
- spin_unlock_bh(&p->p_work_lock);
+ spin_unlock_irqrestore(&p->p_work_lock, flags);
if (thread)
kthread_stop(thread);
return 0;
--
2.17.1
> -----Original Message-----
> From: Chengfeng Ye <[email protected]>
> Sent: Wednesday, July 26, 2023 6:27 PM
> To: Nilesh Javali <[email protected]>; Manish Rangankar
> <[email protected]>; GR-QLogic-Storage-Upstream <GR-QLogic-
> [email protected]>; [email protected];
> [email protected]
> Cc: [email protected]; [email protected]; Chengfeng Ye
> <[email protected]>
> Subject: [EXT] [PATCH] scsi: qedi: Fix potential deadlock on &qedi_percpu-
> >p_work_lock
>
> External Email
>
> ----------------------------------------------------------------------
> As &qedi_percpu->p_work_lock is acquired by hard irq qedi_msix_handler(),
> other acquisition of the same lock under process context should disable irq,
> otherwise deadlock could happen if the irq preempt the execution while the
> lock is held in process context on the same CPU.
>
> qedi_cpu_offline() is one such function acquires the lock on process context.
>
> [Deadlock Scenario]
> qedi_cpu_offline()
> ->spin_lock(&p->p_work_lock)
> <irq>
> ->qedi_msix_handler()
> ->edi_process_completions()
> ->spin_lock_irqsave(&p->p_work_lock, flags); (deadlock here)
>
> This flaw was found by an experimental static analysis tool I am developing
> for irq-related deadlock.
>
> The tentative patch fix the potential deadlock by spin_lock_irqsave() under
> process context.
>
> Signed-off-by: Chengfeng Ye <[email protected]>
> ---
> drivers/scsi/qedi/qedi_main.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
> index 450522b204d6..77a56a136678 100644
> --- a/drivers/scsi/qedi/qedi_main.c
> +++ b/drivers/scsi/qedi/qedi_main.c
> @@ -1976,8 +1976,9 @@ static int qedi_cpu_offline(unsigned int cpu)
> struct qedi_percpu_s *p = this_cpu_ptr(&qedi_percpu);
> struct qedi_work *work, *tmp;
> struct task_struct *thread;
> + unsigned long flags;
>
> - spin_lock_bh(&p->p_work_lock);
> + spin_lock_irqsave(&p->p_work_lock, flags);
> thread = p->iothread;
> p->iothread = NULL;
>
> @@ -1988,7 +1989,7 @@ static int qedi_cpu_offline(unsigned int cpu)
> kfree(work);
> }
>
> - spin_unlock_bh(&p->p_work_lock);
> + spin_unlock_irqrestore(&p->p_work_lock, flags);
> if (thread)
> kthread_stop(thread);
> return 0;
> --
> 2.17.1
Thanks for the patch,
Acked-by: Manish Rangankar <[email protected]>
On Wed, 26 Jul 2023 12:56:55 +0000, Chengfeng Ye wrote:
> As &qedi_percpu->p_work_lock is acquired by hard irq qedi_msix_handler(),
> other acquisition of the same lock under process context should disable
> irq, otherwise deadlock could happen if the irq preempt the execution
> while the lock is held in process context on the same CPU.
>
> qedi_cpu_offline() is one such function acquires the lock on process
> context.
>
> [...]
Applied to 6.5/scsi-fixes, thanks!
[1/1] scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lock
https://git.kernel.org/mkp/scsi/c/dd64f8058719
--
Martin K. Petersen Oracle Linux Engineering
Thanks for the review.
Best,
Chengfeng