2023-07-29 19:18:50

by Chengfeng Ye

Subject: [PATCH v2] dmaengine: plx_dma: Fix potential deadlock on &plxdev->ring_lock

As plx_dma_process_desc() is invoked both by the tasklet
plx_dma_desc_task() in softirq context and by the plx_dma_tx_status()
callback in process context, the acquisition of &plxdev->ring_lock
inside plx_dma_process_desc() must disable bottom halves, otherwise a
deadlock can occur if the tasklet softirq preempts the process-context
code while the lock is held on the same CPU.

Possible deadlock scenario:
plx_dma_tx_status()
  -> plx_dma_process_desc()
  -> spin_lock(&plxdev->ring_lock)
      <tasklet softirq>
      -> plx_dma_desc_task()
      -> plx_dma_process_desc()
      -> spin_lock(&plxdev->ring_lock) (deadlock here)
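
Reduced to a minimal sketch (illustrative only, not the driver's
actual code), the fixed pattern looks like this; spin_lock_bh() is
also permitted in softirq context, so the helper shared by both
callers can use it unconditionally:

  #include <linux/spinlock.h>
  #include <linux/interrupt.h>

  static DEFINE_SPINLOCK(ring_lock);

  /* Called from both the tasklet and process context. */
  static void process_desc(void)
  {
          /*
           * _bh keeps the tasklet from preempting a process-context
           * holder on this CPU, and is also legal from within the
           * tasklet itself.
           */
          spin_lock_bh(&ring_lock);
          /* ... walk the descriptor ring ... */
          spin_unlock_bh(&ring_lock);
  }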

This flaw was found by an experimental static analysis tool I am
developing for IRQ-related deadlocks.

The lock was changed from spin_lock_bh() to spin_lock() by a previous
patch for performance reasons, but that change unintentionally
introduced this potential deadlock.

This patch reverts to spin_lock_bh() to fix the potential deadlock.

Fixes: 1d05a0bdb420 ("dmaengine: plx_dma: Move spin_lock_bh() to spin_lock()")
Signed-off-by: Chengfeng Ye <[email protected]>

Changes in v2:
- Consistently use spin_lock_bh() on &plxdev->ring_lock instead of
spin_lock_irqsave().
---
drivers/dma/plx_dma.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/plx_dma.c b/drivers/dma/plx_dma.c
index 34b6416c3287..7693c067a1aa 100644
--- a/drivers/dma/plx_dma.c
+++ b/drivers/dma/plx_dma.c
@@ -137,7 +137,7 @@ static void plx_dma_process_desc(struct plx_dma_dev *plxdev)
struct plx_dma_desc *desc;
u32 flags;

- spin_lock(&plxdev->ring_lock);
+ spin_lock_bh(&plxdev->ring_lock);

while (plxdev->tail != plxdev->head) {
desc = plx_dma_get_desc(plxdev, plxdev->tail);
@@ -165,7 +165,7 @@ static void plx_dma_process_desc(struct plx_dma_dev *plxdev)
plxdev->tail++;
}

- spin_unlock(&plxdev->ring_lock);
+ spin_unlock_bh(&plxdev->ring_lock);
}

static void plx_dma_abort_desc(struct plx_dma_dev *plxdev)
--
2.17.1



2023-07-31 01:19:48

by Logan Gunthorpe

Subject: Re: [PATCH v2] dmaengine: plx_dma: Fix potential deadlock on &plxdev->ring_lock



On 7/29/23 11:59, Chengfeng Ye wrote:
> As plx_dma_process_desc() is invoked both by the tasklet
> plx_dma_desc_task() in softirq context and by the plx_dma_tx_status()
> callback in process context, the acquisition of &plxdev->ring_lock
> inside plx_dma_process_desc() must disable bottom halves, otherwise a
> deadlock can occur if the tasklet softirq preempts the process-context
> code while the lock is held on the same CPU.
>
> Possible deadlock scenario:
> plx_dma_tx_status()
>   -> plx_dma_process_desc()
>   -> spin_lock(&plxdev->ring_lock)
>       <tasklet softirq>
>       -> plx_dma_desc_task()
>       -> plx_dma_process_desc()
>       -> spin_lock(&plxdev->ring_lock) (deadlock here)
>
> This flaw was found by an experimental static analysis tool I am
> developing for IRQ-related deadlocks.
>
> The lock was changed from spin_lock_bh() to spin_lock() by a previous
> patch for performance reasons, but that change unintentionally
> introduced this potential deadlock.
>
> This patch reverts to spin_lock_bh() to fix the potential deadlock.
>
> Fixes: 1d05a0bdb420 ("dmaengine: plx_dma: Move spin_lock_bh() to spin_lock()")
> Signed-off-by: Chengfeng Ye <[email protected]>
>

Reviewed-by: Logan Gunthorpe <[email protected]>

Thanks!

Logan

2023-08-29 13:08:25

by Eric Schwarz

Subject: Re: [PATCH v2] dmaengine: plx_dma: Fix potential deadlock on &plxdev->ring_lock

Hello Chengfeng,

On 29.08.2023 at 05:10, Chengfeng Ye wrote:
> Hi Eric,
>
> Thank you for your interest in it.

Thanks for getting back to me.

> For a dynamic detection solution, the answer is yes.
> Lockdep, enabled via CONFIG_PROVE_LOCKING, is able to detect such
> deadlocks. The problem is that detection requires the right input and
> thread interleaving so that both lock paths are actually exercised at
> runtime; otherwise the bug stays buried and goes undetected.
>
> For static analysis, I think the answer is no. Smatch, like the static
> deadlock detectors in CBMC[1] and Infer[2], is designed to reason about
> thread interaction but not about interrupts; handling interrupts
> requires new algorithms, which I am working on.
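
For reference, the kind of minimal pattern lockdep catches at runtime
(a hypothetical sketch, not the plx_dma code) is a lock taken both
inside a tasklet and, with BHs still enabled, in process context;
lockdep reports an inconsistent lock state once both acquisitions have
actually executed:

  #include <linux/module.h>
  #include <linux/spinlock.h>
  #include <linux/interrupt.h>

  static DEFINE_SPINLOCK(demo_lock);
  static struct tasklet_struct demo_tasklet;

  static void demo_tasklet_fn(struct tasklet_struct *t)
  {
          spin_lock(&demo_lock);          /* softirq-context acquisition */
          spin_unlock(&demo_lock);
  }

  static int __init demo_init(void)
  {
          tasklet_setup(&demo_tasklet, demo_tasklet_fn);
          tasklet_schedule(&demo_tasklet);

          spin_lock(&demo_lock);          /* process context, BHs enabled */
          spin_unlock(&demo_lock);
          return 0;
  }

  static void __exit demo_exit(void)
  {
          tasklet_kill(&demo_tasklet);
  }

  module_init(demo_init);
  module_exit(demo_exit);
  MODULE_LICENSE("GPL");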

Will you publish your work later on, e.g. on GitHub?
Actually, it might even make sense to integrate your work into
scripts/checkpatch.pl of the Linux kernel (or the like).
Basically, if a patch fails the locking checks, it should not be
committed anyway.
IMHO the quality standard one can expect from the code should always
be the same, so adding it to a mandatory check procedure (a script
that must be executed before committing patches) and/or to the "0-DAY
CI Kernel Test Service" [5] would definitely be worth a thought.

> Besides, may I ask a question: I sent some patches[3][4] weeks ago
> but have not yet received a reply. Will reviewers check the patches
> later, or should I ping them again?

There is never a guarantee of who will review your patch on the
mailing list, or when; it is a best-effort system run mainly by
volunteers.
Just give people a bit of time, especially since it is currently
holiday season.
You may ping the subsystem maintainer once some time has passed, since
he is responsible for shepherding the patches.
BTW, I think you already pinged indirectly with your e-mail.

> [1] http://www.cprover.org/deadlock-detection/
> [2] https://github.com/facebook/infer
> [3] https://lore.kernel.org/lkml/[email protected]/
> [4] https://lore.kernel.org/lkml/[email protected]/

[5] https://github.com/intel/lkp-tests/wiki

Cheers
Eric