2021-07-15 13:44:02

by Jia-Ju Bai

Subject: [BUG] scsi: lpfc: possible ABBA deadlock

Hello,

I found a possible ABBA deadlock in the lpfc driver in Linux 5.10:

In lpfc_nvmet_unsol_fcp_issue_abort():
3502: spin_lock_irqsave(&ctxp->ctxlock, flags);
3504: spin_lock(&phba->sli4_hba.abts_nvmet_buf_list_lock);

In lpfc_sli4_nvmet_xri_aborted():
1787: spin_lock(&phba->sli4_hba.abts_nvmet_buf_list_lock);
1794: spin_lock(&ctxp->ctxlock);

When lpfc_nvmet_unsol_fcp_issue_abort() and
lpfc_sli4_nvmet_xri_aborted() are concurrently executed, the deadlock
can occur.

I am not quite sure whether this possible deadlock is real and how to
fix it if it is real.
Any feedback would be appreciated, thanks :)


Best wishes,
Jia-Ju Bai


2021-07-24 20:20:13

by James Smart

Subject: Re: [BUG] scsi: lpfc: possible ABBA deadlock

On 7/15/2021 3:37 AM, Jia-Ju Bai wrote:
> Hello,
>
> I found a possible ABBA deadlock in the lpfc driver in Linux 5.10:
>
> In lpfc_nvmet_unsol_fcp_issue_abort():
> 3502: spin_lock_irqsave(&ctxp->ctxlock, flags);
> 3504: spin_lock(&phba->sli4_hba.abts_nvmet_buf_list_lock);
>
> In lpfc_sli4_nvmet_xri_aborted():
> 1787: spin_lock(&phba->sli4_hba.abts_nvmet_buf_list_lock);
> 1794: spin_lock(&ctxp->ctxlock);
>
> When lpfc_nvmet_unsol_fcp_issue_abort() and
> lpfc_sli4_nvmet_xri_aborted() are concurrently executed, the deadlock
> can occur.
>
> I am not quite sure whether this possible deadlock is real and how to
> fix it if it is real.
> Any feedback would be appreciated, thanks :)
>
>
> Best wishes,
> Jia-Ju Bai

Jia-Ju,

It's a valid issue, but rather unlikely to occur in practice. We've put
together a fix and are testing it. Will post when ready.

-- james
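
For readers unfamiliar with the pattern discussed in this thread, here is a minimal userspace sketch of the usual remedy for an ABBA inversion: every path takes the two locks in the same order. This is not the lpfc fix (the thread does not show it). The names list_lock, ctx_lock, abort_path(), xri_aborted_path() and run_ordered_demo() are hypothetical stand-ins, and pthread mutexes are used in place of kernel spinlocks.

```c
#include <pthread.h>

/* Hypothetical stand-ins for abts_nvmet_buf_list_lock and ctxp->ctxlock. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t ctx_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

/*
 * The ABBA hazard reported above has this shape:
 *   path 1: lock(ctx_lock);  lock(list_lock);
 *   path 2: lock(list_lock); lock(ctx_lock);
 * If each path holds its first lock while waiting for the second,
 * neither can make progress.
 *
 * Here both paths go through one helper that always takes
 * list_lock first and ctx_lock second, so no inversion is possible.
 */
static void locked_update(void)
{
    pthread_mutex_lock(&list_lock);
    pthread_mutex_lock(&ctx_lock);
    shared_counter++;
    pthread_mutex_unlock(&ctx_lock);
    pthread_mutex_unlock(&list_lock);
}

/* Stand-in for the abort-issuing path. */
static void *abort_path(void *arg)
{
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++)
        locked_update();
    return NULL;
}

/* Stand-in for the XRI-aborted completion path. */
static void *xri_aborted_path(void *arg)
{
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++)
        locked_update();
    return NULL;
}

/* Runs both paths concurrently; returns the final counter, or -1 on error. */
long run_ordered_demo(long iters)
{
    pthread_t a, b;

    shared_counter = 0;
    if (pthread_create(&a, NULL, abort_path, &iters) != 0)
        return -1;
    if (pthread_create(&b, NULL, xri_aborted_path, &iters) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

Calling run_ordered_demo(1000) runs both threads to completion because neither can hold one lock while waiting on the other in the opposite order. In the kernel, other remedies (dropping one lock before taking the other, or restructuring so only one lock is needed) may fit better, depending on what each lock protects.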