2023-07-26 17:58:30

by Chengfeng Ye

Subject: [PATCH] scsi: lpfc: Fix potential deadlock on &phba->hbalock

As &phba->hbalock is acquired in hardirq context, for example by
lpfc_sli_intr_handler(), process-context code that acquires
&phba->hbalock should disable irqs. Otherwise a deadlock can occur if
the irq preempts execution while the lock is held in process context
on the same CPU.

Most lock acquisition sites disable irqs, but the callback
lpfc_cmpl_els_uvem() acquires the lock without explicitly disabling
irqs. The outer caller of this callback does not appear to disable
irqs either.

[Deadlock Scenario]
lpfc_cmpl_els_uvem()
  -> spin_lock(&phba->hbalock)
     <irq>
       -> lpfc_sli_intr_handler()
         -> spin_lock(&phba->hbalock); (deadlock here)

This flaw was found by an experimental static analysis tool I am
developing for irq-related deadlocks.

The patch fixes the potential deadlock by using spin_lock_irqsave(),
just like the other call sites.

Signed-off-by: Chengfeng Ye <[email protected]>
---
drivers/scsi/lpfc/lpfc_els.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 2bad9954c355..9667b4937b3a 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -12398,6 +12398,7 @@ lpfc_cmpl_els_uvem(struct lpfc_hba *phba, struct lpfc_iocbq *icmdiocb,
 	u32 ulp_word4 = get_job_word4(phba, rspiocb);
 	struct lpfc_dmabuf *dmabuf = icmdiocb->cmd_dmabuf;
 	struct lpfc_vmid *vmid;
+	unsigned long flags;
 
 	vmid = vmid_context->vmp;
 	if (!ndlp || ndlp->nlp_state != NLP_STE_UNMAPPED_NODE)
@@ -12419,11 +12420,11 @@ lpfc_cmpl_els_uvem(struct lpfc_hba *phba, struct lpfc_iocbq *icmdiocb,
 				 ulp_status, ulp_word4);
 		goto out;
 	}
-	spin_lock(&phba->hbalock);
+	spin_lock_irqsave(&phba->hbalock, flags);
 	/* Set IN USE flag */
 	vport->vmid_flag |= LPFC_VMID_IN_USE;
 	phba->pport->vmid_flag |= LPFC_VMID_IN_USE;
-	spin_unlock(&phba->hbalock);
+	spin_unlock_irqrestore(&phba->hbalock, flags);
 
 	if (vmid_context->instantiated) {
 		write_lock(&vport->vmid_lock);
--
2.17.1
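
For readers unfamiliar with this class of bug, the scenario described in
the patch can be sketched in userspace. This is a toy single-CPU model,
not kernel code: every toy_* name is hypothetical, the "lock" and the
"irq state" are plain booleans, and the saved flags are a bool rather
than the kernel's unsigned long. It only illustrates why taking the lock
with interrupts enabled is dangerous when a handler on the same CPU wants
the same lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model (hypothetical names, not the kernel API). */
struct toy_cpu {
	bool irqs_enabled;	/* can an interrupt be delivered right now? */
	bool lock_held;		/* stands in for phba->hbalock on this CPU */
};

/* Models spin_lock(): takes the lock, leaves the irq state untouched. */
static void toy_spin_lock(struct toy_cpu *cpu)
{
	cpu->lock_held = true;
}

static void toy_spin_unlock(struct toy_cpu *cpu)
{
	cpu->lock_held = false;
}

/* Models spin_lock_irqsave(): save irq state, disable irqs, take the lock. */
static bool toy_spin_lock_irqsave(struct toy_cpu *cpu)
{
	bool flags = cpu->irqs_enabled;

	cpu->irqs_enabled = false;
	cpu->lock_held = true;
	return flags;
}

/* Models spin_unlock_irqrestore(): drop the lock, restore saved irq state. */
static void toy_spin_unlock_irqrestore(struct toy_cpu *cpu, bool flags)
{
	cpu->lock_held = false;
	cpu->irqs_enabled = flags;
}

/*
 * Models an interrupt whose handler wants the same lock. Returns true if
 * the handler would spin forever: it runs on top of code that holds the
 * lock and can never release it until the handler returns.
 */
static bool toy_irq_would_deadlock(const struct toy_cpu *cpu)
{
	if (!cpu->irqs_enabled)
		return false;	/* irq stays pending until irqs are re-enabled */
	return cpu->lock_held;
}

/* Walks through both locking variants and checks the model's verdict. */
static void toy_demo(void)
{
	struct toy_cpu cpu = { .irqs_enabled = true, .lock_held = false };

	toy_spin_lock(&cpu);
	assert(toy_irq_would_deadlock(&cpu));	/* plain lock: irq can hit */
	toy_spin_unlock(&cpu);

	bool flags = toy_spin_lock_irqsave(&cpu);
	assert(!toy_irq_would_deadlock(&cpu));	/* irqs off: handler waits */
	toy_spin_unlock_irqrestore(&cpu, flags);
	assert(cpu.irqs_enabled && !cpu.lock_held);
}
```

Calling toy_demo() from a small main() exercises both variants. The model
says nothing about whether the interrupt handler in question can actually
run on the same hardware as the process-context path.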



2023-07-26 22:38:18

by Justin Tee

Subject: Re: [PATCH] scsi: lpfc: Fix potential deadlock on &phba->hbalock

Hi Chengfeng,

lpfc_cmpl_els_uvem is for the VMID feature that could only ever be
called on an SLI4 type HBA.
lpfc_sli_intr_handler can only ever be called on an SLI3 type HBA.

So, the deadlock being referred to can never happen.

Thanks,
Justin

--
This electronic communication and the information and any files transmitted
with it, or attached to it, are confidential and are intended solely for
the use of the individual or entity to whom it is addressed and may contain
information that is confidential, legally privileged, protected by privacy
laws, or otherwise restricted from disclosure to anyone else. If you are
not the intended recipient or the person responsible for delivering the
e-mail to the intended recipient, you are hereby notified that any use,
copying, distributing, dissemination, forwarding, printing, or copying of
this e-mail is strictly prohibited. If you received this e-mail in error,
please return the e-mail to the sender, delete it from your computer, and
destroy any printed copy of it.



2023-07-27 06:33:54

by Chengfeng Ye

Subject: Re: [PATCH] scsi: lpfc: Fix potential deadlock on &phba->hbalock

Hi Justin,

Thanks very much for the reply. It was my oversight not to have noticed
that, and I apologize.

I inspected my tool's bug report again and found that
lpfc_sli4_intr_handler() also acquires that lock:

lpfc_sli4_intr_handler()
  -> lpfc_sli4_hba_intr_handler()
    -> spin_lock_irqsave(&phba->hbalock, iflag);

It seems this ISR is called on an SLI4 type HBA. Taking this one into
account, could there be a deadlock problem?

Thanks again,
Chengfeng

2023-07-27 19:30:45

by Justin Tee

Subject: Re: [PATCH] scsi: lpfc: Fix potential deadlock on &phba->hbalock

Hi Chengfeng,

That’s still an unlikely scenario:

	/* Check device state for handling interrupt */
	if (unlikely(lpfc_intr_state_check(phba))) {
		/* Check again for link_state with lock held */
		spin_lock_irqsave(&phba->hbalock, iflag);
		if (phba->link_state < LPFC_LINK_DOWN)
			/* Flush, clear interrupt, and rearm the EQ */
			lpfc_sli4_eqcq_flush(phba, fpeq);
		spin_unlock_irqrestore(&phba->hbalock, iflag);
		return IRQ_NONE;
	}

In order to enter that if statement and obtain the hbalock, the PCI
channel has to be offline or the HBA’s link must not be in an
initialized state. If either of those were true, lpfc_cmpl_els_uvem
would never get called to begin with.

Thanks,
Justin
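
The argument above can be restated as a small userspace sketch. It is a
model of the reasoning only, not driver code: the toy_* names and the
three link states are hypothetical, and the second predicate simply
encodes the reply's claim that lpfc_cmpl_els_uvem never runs while the
PCI channel is offline or the link is uninitialized. Under that
assumption no modelled state lets both paths contend for hbalock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical link states; only their ordering matters in this sketch. */
enum toy_link_state {
	TOY_LINK_UNINIT = 0,	/* stands in for any state below "link down" */
	TOY_LINK_DOWN = 1,
	TOY_LINK_UP = 2,
};

struct toy_hba {
	bool pci_channel_offline;
	enum toy_link_state link_state;
};

/* Models the quoted ISR branch: hbalock is taken only if the check fails. */
static bool toy_isr_takes_hbalock(const struct toy_hba *hba)
{
	return hba->pci_channel_offline || hba->link_state < TOY_LINK_DOWN;
}

/* Assumption taken from the reply: the completion cannot run in those states. */
static bool toy_uvem_cmpl_can_run(const struct toy_hba *hba)
{
	return !hba->pci_channel_offline && hba->link_state >= TOY_LINK_DOWN;
}

/* Counts modelled states in which both paths could want hbalock at once. */
static int toy_count_overlaps(void)
{
	int overlaps = 0;

	for (int offline = 0; offline <= 1; offline++) {
		for (int ls = TOY_LINK_UNINIT; ls <= TOY_LINK_UP; ls++) {
			struct toy_hba hba = {
				.pci_channel_offline = (offline != 0),
				.link_state = (enum toy_link_state)ls,
			};

			if (toy_isr_takes_hbalock(&hba) &&
			    toy_uvem_cmpl_can_run(&hba))
				overlaps++;
		}
	}
	return overlaps;
}
```

A caller can check the claim with assert(toy_count_overlaps() == 0). The
result holds by construction, so the sketch is only as strong as the
assumption in toy_uvem_cmpl_can_run(); it does not cover a state change
that happens between the completion starting and the interrupt arriving.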

