2022-09-20 06:12:12

by Steffen Klassert

[permalink] [raw]
Subject: Re: [PATCH] padata: fix lockdep warning in padata serialization

On Mon, Sep 19, 2022 at 09:47:11PM -0400, Daniel Jordan wrote:
> On Tue, Sep 20, 2022 at 08:39:08AM +0800, [email protected] wrote:
> > From: Edward Adam Davis <[email protected]>
> >
> > On Mon, 19 Sep 2022 11:12:48 -0400, Daniel Jordan wrote:
> > > Hi Edward,
> > >
> > > On Mon, Sep 19, 2022 at 09:05:55AM +0800, [email protected] wrote:
> > > > From: Edward Adam Davis <[email protected]>
> > > >
> > > > Parallelized object serialization uses spin_unlock for unlocking a spin lock
> > > > that was previously locked with spin_lock.
> > >
> > > There's nothing unusual about that, though?
> > >
> > > > This caused the following lockdep warning about an inconsistent lock
> > > > state:
> > > >
> > > > inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> > >
> > > Neither HARDIRQ-ON-W nor IN-HARDIRQ-W appear in the syzbot report, did
> > > you mean SOFTIRQ-ON-W and IN-SOFTIRQ-W?
> > Yes, I want say: inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> > >
> > > > We must use spin_lock_irqsave, because it is possible to trigger tipc
> > > > from an irq handler.
> > >
> > > A softirq handler, not a hardirq handler. I'd suggest using
> > > spin_lock_bh() instead of _irqsave in your patch.
> > I think _irqsave better than _bh, it can save the irq context, but _bh not,
> > and in tipc call trace contain SOFTIRQ-ON-W and IN-SOFTIRQ-W.
>
> _irqsave saving the context is about handling nested hardirq disables.
> It's not needed here since we don't need to care about disabling
> hardirq.
>
> _bh is for disabling softirq, a different context from hardirq. We want
> _bh here since the deadlock happens when a CPU takes the lock in both
> task and softirq context. padata uses _bh lock variants because it can
> be called in softirq context but not hardirq. Let's be consistent and
> do it in this case too.

padata_do_serial is called with BHs off, so using spin_lock_bh should not
fix anything here. I guess the problem is that we call padata_find_next
after we enabled the BHs in padata_reorder.


2022-09-20 14:16:06

by Daniel Jordan

[permalink] [raw]
Subject: Re: [PATCH] padata: fix lockdep warning in padata serialization

Hi Steffen,

On Tue, Sep 20, 2022 at 07:54:43AM +0200, Steffen Klassert wrote:
> On Mon, Sep 19, 2022 at 09:47:11PM -0400, Daniel Jordan wrote:
> > On Tue, Sep 20, 2022 at 08:39:08AM +0800, [email protected] wrote:
> > > From: Edward Adam Davis <[email protected]>
> > >
> > > On Mon, 19 Sep 2022 11:12:48 -0400, Daniel Jordan wrote:
> > > > Hi Edward,
> > > >
> > > > On Mon, Sep 19, 2022 at 09:05:55AM +0800, [email protected] wrote:
> > > > > From: Edward Adam Davis <[email protected]>
> > > > >
> > > > > Parallelized object serialization uses spin_unlock for unlocking a spin lock
> > > > > that was previously locked with spin_lock.
> > > >
> > > > There's nothing unusual about that, though?
> > > >
> > > > > This caused the following lockdep warning about an inconsistent lock
> > > > > state:
> > > > >
> > > > > inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> > > >
> > > > Neither HARDIRQ-ON-W nor IN-HARDIRQ-W appear in the syzbot report, did
> > > > you mean SOFTIRQ-ON-W and IN-SOFTIRQ-W?
> > > Yes, I want say: inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> > > >
> > > > > We must use spin_lock_irqsave, because it is possible to trigger tipc
> > > > > from an irq handler.
> > > >
> > > > A softirq handler, not a hardirq handler. I'd suggest using
> > > > spin_lock_bh() instead of _irqsave in your patch.
> > > I think _irqsave better than _bh, it can save the irq context, but _bh not,
> > > and in tipc call trace contain SOFTIRQ-ON-W and IN-SOFTIRQ-W.
> >
> > _irqsave saving the context is about handling nested hardirq disables.
> > It's not needed here since we don't need to care about disabling
> > hardirq.
> >
> > _bh is for disabling softirq, a different context from hardirq. We want
> > _bh here since the deadlock happens when a CPU takes the lock in both
> > task and softirq context. padata uses _bh lock variants because it can
> > be called in softirq context but not hardirq. Let's be consistent and
> > do it in this case too.
>
> padata_do_serial is called with BHs off, so using spin_lock_bh should not
> fix anything here. I guess the problem is that we call padata_find_next
> after we enabled the BHs in padata_reorder.

Yeah, padata_do_serial can be called with BHs off, like in the tipc
stack, but there are also cases where BHs can be on, like lockdep said
here:

{SOFTIRQ-ON-W} state was registered at:
...
padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
...

Line 392 is in _do_serial, not _reorder or _find_next.

2022-09-21 07:44:33

by Steffen Klassert

[permalink] [raw]
Subject: Re: [PATCH] padata: fix lockdep warning in padata serialization

On Tue, Sep 20, 2022 at 10:10:57AM -0400, Daniel Jordan wrote:
> Hi Steffen,
>
> On Tue, Sep 20, 2022 at 07:54:43AM +0200, Steffen Klassert wrote:
> > On Mon, Sep 19, 2022 at 09:47:11PM -0400, Daniel Jordan wrote:
> > > On Tue, Sep 20, 2022 at 08:39:08AM +0800, [email protected] wrote:
> > > > From: Edward Adam Davis <[email protected]>
> > > >
> > > > On Mon, 19 Sep 2022 11:12:48 -0400, Daniel Jordan wrote:
> > > > > Hi Edward,
> > > > >
> > > > > On Mon, Sep 19, 2022 at 09:05:55AM +0800, [email protected] wrote:
> > > > > > From: Edward Adam Davis <[email protected]>
> > > > > >
> > > > > > Parallelized object serialization uses spin_unlock for unlocking a spin lock
> > > > > > that was previously locked with spin_lock.
> > > > >
> > > > > There's nothing unusual about that, though?
> > > > >
> > > > > > This caused the following lockdep warning about an inconsistent lock
> > > > > > state:
> > > > > >
> > > > > > inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> > > > >
> > > > > Neither HARDIRQ-ON-W nor IN-HARDIRQ-W appear in the syzbot report, did
> > > > > you mean SOFTIRQ-ON-W and IN-SOFTIRQ-W?
> > > > Yes, I want say: inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> > > > >
> > > > > > We must use spin_lock_irqsave, because it is possible to trigger tipc
> > > > > > from an irq handler.
> > > > >
> > > > > A softirq handler, not a hardirq handler. I'd suggest using
> > > > > spin_lock_bh() instead of _irqsave in your patch.
> > > > I think _irqsave better than _bh, it can save the irq context, but _bh not,
> > > > and in tipc call trace contain SOFTIRQ-ON-W and IN-SOFTIRQ-W.
> > >
> > > _irqsave saving the context is about handling nested hardirq disables.
> > > It's not needed here since we don't need to care about disabling
> > > hardirq.
> > >
> > > _bh is for disabling softirq, a different context from hardirq. We want
> > > _bh here since the deadlock happens when a CPU takes the lock in both
> > > task and softirq context. padata uses _bh lock variants because it can
> > > be called in softirq context but not hardirq. Let's be consistent and
> > > do it in this case too.
> >
> > padata_do_serial is called with BHs off, so using spin_lock_bh should not
> > fix anything here. I guess the problem is that we call padata_find_next
> > after we enabled the BHs in padata_reorder.
>
> Yeah, padata_do_serial can be called with BHs off, like in the tipc
> stack, but there are also cases where BHs can be on, like lockdep said
> here:

padata_do_serial was designed to run with BHs off, it is a bug if it
runs with BHs on. But I don't see a case where this can happen. The
only user of padata_do_serial is pcrypt in its serialization callbacks
(pcrypt_aead_enc, pcrypt_aead_dec) and the async crypto callback
pcrypt_aead_done. pcrypt_aead_enc and pcrypt_aead_dec are issued via
the padata_serial_worker with the padata->serial call. BHs are
off here. The crypto callback also runs with BHs off.

What do I miss here?

2022-09-21 18:58:00

by Daniel Jordan

[permalink] [raw]
Subject: Re: [PATCH] padata: fix lockdep warning in padata serialization

On Wed, Sep 21, 2022 at 09:36:16AM +0200, Steffen Klassert wrote:
> On Tue, Sep 20, 2022 at 10:10:57AM -0400, Daniel Jordan wrote:
> > Yeah, padata_do_serial can be called with BHs off, like in the tipc
> > stack, but there are also cases where BHs can be on, like lockdep said
> > here:
>
> padata_do_serial was designed to run with BHs off, it is a bug if it
> runs with BHs on. But I don't see a case where this can happen. The
> only user of padata_do_serial is pcrypt in its serialization callbacks
> (pcrypt_aead_enc, pcrypt_aead_dec) and the async crypto callback
> pcrypt_aead_done. pcrypt_aead_enc and pcrypt_aead_dec are issued via
> the padata_serial_worker with the padata->serial call. BHs are
> off here. The crypto callback also runs with BHs off.
>
> What do I miss here?

Ugh.. this newer, buggy part of padata_do_parallel:

/* Maximum works limit exceeded, run in the current task. */
padata->parallel(padata);

This skips the usual path in padata_parallel_worker, which disables BHs.
They should be left off in the above case too.

What about this?

---8<---

Subject: [PATCH] padata: always leave BHs disabled when running ->parallel()

A deadlock can happen when an overloaded system runs ->parallel() in the
context of the current task:

padata_do_parallel
->parallel()
pcrypt_aead_enc/dec
padata_do_serial
spin_lock(&reorder->lock) // BHs still enabled
<interrupt>
...
__do_softirq
...
padata_do_serial
spin_lock(&reorder->lock)

It's a bug for BHs to be on in _do_serial as Steffen points out, so
ensure they're off in the "current task" case like they are in
padata_parallel_worker to avoid this situation.

Reported-by: [email protected]
Fixes: 4611ce224688 ("padata: allocate work structures for parallel jobs from a pool")
Signed-off-by: Daniel Jordan <[email protected]>
---
kernel/padata.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index e5819bb8bd1d..97f51e0c1776 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -207,14 +207,16 @@ int padata_do_parallel(struct padata_shell *ps,
pw = padata_work_alloc();
spin_unlock(&padata_works_lock);

+ if (!pw) {
+ /* Maximum works limit exceeded, run in the current task. */
+ padata->parallel(padata);
+ }
+
rcu_read_unlock_bh();

if (pw) {
padata_work_init(pw, padata_parallel_worker, padata, 0);
queue_work(pinst->parallel_wq, &pw->pw_work);
- } else {
- /* Maximum works limit exceeded, run in the current task. */
- padata->parallel(padata);
}

return 0;
--
2.37.2

2022-09-22 11:03:05

by Steffen Klassert

[permalink] [raw]
Subject: Re: [PATCH] padata: fix lockdep warning in padata serialization

On Wed, Sep 21, 2022 at 02:51:38PM -0400, Daniel Jordan wrote:
> On Wed, Sep 21, 2022 at 09:36:16AM +0200, Steffen Klassert wrote:
> > On Tue, Sep 20, 2022 at 10:10:57AM -0400, Daniel Jordan wrote:
> > > Yeah, padata_do_serial can be called with BHs off, like in the tipc
> > > stack, but there are also cases where BHs can be on, like lockdep said
> > > here:
> >
> > padata_do_serial was designed to run with BHs off, it is a bug if it
> > runs with BHs on. But I don't see a case where this can happen. The
> > only user of padata_do_serial is pcrypt in its serialization callbacks
> > (pcrypt_aead_enc, pcrypt_aead_dec) and the async crypto callback
> > pcrypt_aead_done. pcrypt_aead_enc and pcrypt_aead_dec are issued via
> > the padata_serial_worker with the padata->serial call. BHs are
> > off here. The crypto callback also runs with BHs off.
> >
> > What do I miss here?
>
> Ugh.. this newer, buggy part of padata_do_parallel:
>
> /* Maximum works limit exceeded, run in the current task. */
> padata->parallel(padata);

Oh well...

> This skips the usual path in padata_parallel_worker, which disables BHs.
> They should be left off in the above case too.
>
> What about this?
>
> ---8<---
>
> Subject: [PATCH] padata: always leave BHs disabled when running ->parallel()
>
> A deadlock can happen when an overloaded system runs ->parallel() in the
> context of the current task:
>
> padata_do_parallel
> ->parallel()
> pcrypt_aead_enc/dec
> padata_do_serial
> spin_lock(&reorder->lock) // BHs still enabled
> <interrupt>
> ...
> __do_softirq
> ...
> padata_do_serial
> spin_lock(&reorder->lock)
>
> It's a bug for BHs to be on in _do_serial as Steffen points out, so
> ensure they're off in the "current task" case like they are in
> padata_parallel_worker to avoid this situation.
>
> Reported-by: [email protected]
> Fixes: 4611ce224688 ("padata: allocate work structures for parallel jobs from a pool")
> Signed-off-by: Daniel Jordan <[email protected]>

Yes, that makes sense.

Acked-by: Steffen Klassert <[email protected]>

But we also should look at the call to padata_find_next where BHs are
on. padata_find_next takes the same lock as padata_do_serial, so this
might be a candidate for a deadlock too.

2022-09-23 02:25:02

by Daniel Jordan

[permalink] [raw]
Subject: Re: [PATCH] padata: fix lockdep warning in padata serialization

On Thu, Sep 22, 2022 at 12:55:37PM +0200, Steffen Klassert wrote:
> On Wed, Sep 21, 2022 at 02:51:38PM -0400, Daniel Jordan wrote:
> > On Wed, Sep 21, 2022 at 09:36:16AM +0200, Steffen Klassert wrote:
> > > On Tue, Sep 20, 2022 at 10:10:57AM -0400, Daniel Jordan wrote:
> > > > Yeah, padata_do_serial can be called with BHs off, like in the tipc
> > > > stack, but there are also cases where BHs can be on, like lockdep said
> > > > here:
> > >
> > > padata_do_serial was designed to run with BHs off, it is a bug if it
> > > runs with BHs on. But I don't see a case where this can happen. The
> > > only user of padata_do_serial is pcrypt in its serialization callbacks
> > > (pcrypt_aead_enc, pcrypt_aead_dec) and the async crypto callback
> > > pcrypt_aead_done. pcrypt_aead_enc and pcrypt_aead_dec are issued via
> > > the padata_serial_worker with the padata->serial call. BHs are
> > > off here. The crypto callback also runs with BHs off.
> > >
> > > What do I miss here?
> >
> > Ugh.. this newer, buggy part of padata_do_parallel:
> >
> > /* Maximum works limit exceeded, run in the current task. */
> > padata->parallel(padata);
>
> Oh well...
>
> > This skips the usual path in padata_parallel_worker, which disables BHs.
> > They should be left off in the above case too.
> >
> > What about this?
> >
> > ---8<---
> >
> > Subject: [PATCH] padata: always leave BHs disabled when running ->parallel()
> >
> > A deadlock can happen when an overloaded system runs ->parallel() in the
> > context of the current task:
> >
> > padata_do_parallel
> > ->parallel()
> > pcrypt_aead_enc/dec
> > padata_do_serial
> > spin_lock(&reorder->lock) // BHs still enabled
> > <interrupt>
> > ...
> > __do_softirq
> > ...
> > padata_do_serial
> > spin_lock(&reorder->lock)
> >
> > It's a bug for BHs to be on in _do_serial as Steffen points out, so
> > ensure they're off in the "current task" case like they are in
> > padata_parallel_worker to avoid this situation.
> >
> > Reported-by: [email protected]
> > Fixes: 4611ce224688 ("padata: allocate work structures for parallel jobs from a pool")
> > Signed-off-by: Daniel Jordan <[email protected]>
>
> Yes, that makes sense.
>
> Acked-by: Steffen Klassert <[email protected]>

Thanks.

> But we also should look at the call to padata_find_next where BHs are
> on. padata_find_next takes the same lock as padata_do_serial, so this
> might be a candidate for a deadlock too.

Yeah, that seems broken, it's now on my list of things to fix. Probably
worth staring at the rest of the locking for a bit too.