2022-09-19 15:14:29

by Daniel Jordan

[permalink] [raw]
Subject: Re: [PATCH] padata: fix lockdep warning in padata serialization

Hi Edward,

On Mon, Sep 19, 2022 at 09:05:55AM +0800, [email protected] wrote:
> From: Edward Adam Davis <[email protected]>
>
> Parallelized object serialization uses spin_unlock for unlocking a spin lock
> that was previously locked with spin_lock.

There's nothing unusual about that, though?

> This caused the following lockdep warning about an inconsistent lock
> state:
>
> inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.

Neither HARDIRQ-ON-W nor IN-HARDIRQ-W appear in the syzbot report, did
you mean SOFTIRQ-ON-W and IN-SOFTIRQ-W?

> We must use spin_lock_irqsave, because it is possible to trigger tipc
> from an irq handler.

A softirq handler, not a hardirq handler. I'd suggest using
spin_lock_bh() instead of _irqsave in your patch.


A Fixes tag would be helpful for stable and folks backporting this fix
to understand what kernel versions are affected.

> WARNING: inconsistent lock state
> 6.0.0-rc5-syzkaller-00025-g3245cb65fd91 #0 Not tainted
> --------------------------------
> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> syz-executor.2/27685 [HC0[0]:SC1[1]:HE1:SE0] takes:
> ffffe8ffffc7d280 (&pd_list->lock){+.?.}-{2:2}, at: spin_lock include/linux/spinlock.h:349 [inline]
> ffffe8ffffc7d280 (&pd_list->lock){+.?.}-{2:2}, at: padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
> {SOFTIRQ-ON-W} state was registered at:
> lock_acquire kernel/locking/lockdep.c:5666 [inline]
> lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
> __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
> _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
> spin_lock include/linux/spinlock.h:349 [inline]
> padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
> pcrypt_aead_enc+0x57/0x70 crypto/pcrypt.c:89
> padata_do_parallel+0x87b/0xa10 kernel/padata.c:217
> pcrypt_aead_encrypt+0x39f/0x4d0 crypto/pcrypt.c:117
> crypto_aead_encrypt+0xaa/0xf0 crypto/aead.c:94
> tls_do_encryption net/tls/tls_sw.c:529 [inline]
> tls_push_record+0x13e8/0x3260 net/tls/tls_sw.c:762
> bpf_exec_tx_verdict+0xd82/0x11a0 net/tls/tls_sw.c:802
> tls_sw_sendmsg+0xa62/0x1820 net/tls/tls_sw.c:1014
> inet6_sendmsg+0x99/0xe0 net/ipv6/af_inet6.c:653
> sock_sendmsg_nosec net/socket.c:714 [inline]
> sock_sendmsg+0xcf/0x120 net/socket.c:734
> sock_write_iter+0x291/0x3d0 net/socket.c:1108
> call_write_iter include/linux/fs.h:2187 [inline]
> new_sync_write fs/read_write.c:491 [inline]
> vfs_write+0x9e9/0xdd0 fs/read_write.c:578
> ksys_write+0x1e8/0x250 fs/read_write.c:631
> do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
> __do_fast_syscall_32+0x65/0xf0 arch/x86/entry/common.c:178
> do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:203
> entry_SYSENTER_compat_after_hwframe+0x70/0x82
> irq event stamp: 740
> hardirqs last enabled at (740): [<ffffffff814919a0>] __local_bh_enable_ip+0xa0/0x120 kernel/softirq.c:401
> hardirqs last disabled at (739): [<ffffffff814919c3>] __local_bh_enable_ip+0xc3/0x120 kernel/softirq.c:378
> softirqs last enabled at (0): [<ffffffff8146f02e>] copy_process+0x213e/0x7090 kernel/fork.c:2202
> softirqs last disabled at (717): [<ffffffff81491843>] invoke_softirq kernel/softirq.c:445 [inline]
> softirqs last disabled at (717): [<ffffffff81491843>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&pd_list->lock);
> <Interrupt>
> lock(&pd_list->lock);
>
> *** DEADLOCK ***
>
> 4 locks held by syz-executor.2/27685:
> #0: ffff8880445f0460 (sb_writers#3){.+.+}-{0:0}, at: open_last_lookups fs/namei.c:3470 [inline]
> #0: ffff8880445f0460 (sb_writers#3){.+.+}-{0:0}, at: path_openat+0x2613/0x28f0 fs/namei.c:3688
> #1: ffff8880465111a0 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: inode_lock_shared include/linux/fs.h:766 [inline]
> #1: ffff8880465111a0 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: open_last_lookups fs/namei.c:3480 [inline]
> #1: ffff8880465111a0 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: path_openat+0x1514/0x28f0 fs/namei.c:3688
> #2: ffffc900001e0d70 ((&d->timer)){+.-.}-{0:0}, at: lockdep_copy_map include/linux/lockdep.h:31 [inline]
> #2: ffffc900001e0d70 ((&d->timer)){+.-.}-{0:0}, at: call_timer_fn+0xd5/0x6b0 kernel/time/timer.c:1464
> #3: ffffffff8bf89400 (rcu_read_lock){....}-{1:2}, at: buf_msg net/tipc/msg.h:202 [inline]
> #3: ffffffff8bf89400 (rcu_read_lock){....}-{1:2}, at: tipc_bearer_xmit_skb+0x8c/0x410 net/tipc/bearer.c:550
>
> stack backtrace:
> CPU: 1 PID: 27685 Comm: syz-executor.2 Not tainted 6.0.0-rc5-syzkaller-00025-g3245cb65fd91 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
> Call Trace:
> <IRQ>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
> print_usage_bug kernel/locking/lockdep.c:3961 [inline]
> valid_state kernel/locking/lockdep.c:3973 [inline]
> mark_lock_irq kernel/locking/lockdep.c:4176 [inline]
> mark_lock.part.0.cold+0x18/0xd8 kernel/locking/lockdep.c:4632
> mark_lock kernel/locking/lockdep.c:4596 [inline]
> mark_usage kernel/locking/lockdep.c:4527 [inline]
> __lock_acquire+0x11d9/0x56d0 kernel/locking/lockdep.c:5007
> lock_acquire kernel/locking/lockdep.c:5666 [inline]
> lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
> __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
> _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
> spin_lock include/linux/spinlock.h:349 [inline]
> padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
> pcrypt_aead_enc+0x57/0x70 crypto/pcrypt.c:89
> padata_do_parallel+0x87b/0xa10 kernel/padata.c:217
> pcrypt_aead_encrypt+0x39f/0x4d0 crypto/pcrypt.c:117
> crypto_aead_encrypt+0xaa/0xf0 crypto/aead.c:94
> tipc_aead_encrypt net/tipc/crypto.c:821 [inline]
> tipc_crypto_xmit+0xf7a/0x2af0 net/tipc/crypto.c:1756
> tipc_bearer_xmit_skb+0x1ed/0x410 net/tipc/bearer.c:557
> tipc_disc_timeout+0x75e/0xcb0 net/tipc/discover.c:335
> call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474
> expire_timers kernel/time/timer.c:1519 [inline]
> __run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790
> __run_timers kernel/time/timer.c:1768 [inline]
> run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803
> __do_softirq+0x1d3/0x9c6 kernel/softirq.c:571
> invoke_softirq kernel/softirq.c:445 [inline]
> __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
> irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
> sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1106
> </IRQ>

The changelog doesn't explain the problem or why the proposed solution
fixes it.

If I can read these splats right, it seems lockdep is complaining about
how a task can take the reorder lock when softirqs are enabled
(SOFTIRQ-ON-W) as in the tls_push_record() stack, but also when it's in
softirq context (IN-SOFTIRQ-W), as in the tipc_disc_timeout() stack. So
it should be enough to disable softirq here.

> Signed-off-by: Edward Adam Davis <[email protected]>
> Reported-by: [email protected]
> ---
> kernel/padata.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/padata.c b/kernel/padata.c
> index e5819bb8bd1d..38c7b17da796 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -388,14 +388,15 @@ void padata_do_serial(struct padata_priv *padata)
> int hashed_cpu = padata_cpu_hash(pd, padata->seq_nr);
> struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu);
> struct padata_priv *cur;
> + unsigned long flags;
>
> - spin_lock(&reorder->lock);
> + spin_lock_irqsave(&reorder->lock, flags);
> /* Sort in ascending order of sequence number. */
> list_for_each_entry_reverse(cur, &reorder->list, list)
> if (cur->seq_nr < padata->seq_nr)
> break;
> list_add(&padata->list, &cur->list);
> - spin_unlock(&reorder->lock);
> + spin_unlock_irqrestore(&reorder->lock, flags);
>
> /*
> * Ensure the addition to the reorder list is ordered correctly
> --
> 2.37.2
>


2022-09-19 15:35:38

by Daniel Jordan

[permalink] [raw]
Subject: Re: [PATCH] padata: fix lockdep warning in padata serialization

On Mon, Sep 19, 2022 at 11:12:48AM -0400, Daniel Jordan wrote:
> Hi Edward,
>
> On Mon, Sep 19, 2022 at 09:05:55AM +0800, [email protected] wrote:
> > From: Edward Adam Davis <[email protected]>
> >
> > Parallelized object serialization uses spin_unlock for unlocking a spin lock
> > that was previously locked with spin_lock.
>
> There's nothing unusual about that, though?
>
> > This caused the following lockdep warning about an inconsistent lock
> > state:
> >
> > inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
>
> Neither HARDIRQ-ON-W nor IN-HARDIRQ-W appear in the syzbot report, did
> you mean SOFTIRQ-ON-W and IN-SOFTIRQ-W?
>
> > We must use spin_lock_irqsave, because it is possible to trigger tipc
> > from an irq handler.
>
> A softirq handler, not a hardirq handler. I'd suggest using
> spin_lock_bh() instead of _irqsave in your patch.
>
>
> A Fixes tag would be helpful for stable and folks backporting this fix
> to understand what kernel versions are affected.
>
> > WARNING: inconsistent lock state
> > 6.0.0-rc5-syzkaller-00025-g3245cb65fd91 #0 Not tainted
> > --------------------------------
> > inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> > syz-executor.2/27685 [HC0[0]:SC1[1]:HE1:SE0] takes:
> > ffffe8ffffc7d280 (&pd_list->lock){+.?.}-{2:2}, at: spin_lock include/linux/spinlock.h:349 [inline]
> > ffffe8ffffc7d280 (&pd_list->lock){+.?.}-{2:2}, at: padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
> > {SOFTIRQ-ON-W} state was registered at:
> > lock_acquire kernel/locking/lockdep.c:5666 [inline]
> > lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
> > __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
> > _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
> > spin_lock include/linux/spinlock.h:349 [inline]
> > padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
> > pcrypt_aead_enc+0x57/0x70 crypto/pcrypt.c:89
> > padata_do_parallel+0x87b/0xa10 kernel/padata.c:217
> > pcrypt_aead_encrypt+0x39f/0x4d0 crypto/pcrypt.c:117
> > crypto_aead_encrypt+0xaa/0xf0 crypto/aead.c:94
> > tls_do_encryption net/tls/tls_sw.c:529 [inline]
> > tls_push_record+0x13e8/0x3260 net/tls/tls_sw.c:762
> > bpf_exec_tx_verdict+0xd82/0x11a0 net/tls/tls_sw.c:802
> > tls_sw_sendmsg+0xa62/0x1820 net/tls/tls_sw.c:1014
> > inet6_sendmsg+0x99/0xe0 net/ipv6/af_inet6.c:653
> > sock_sendmsg_nosec net/socket.c:714 [inline]
> > sock_sendmsg+0xcf/0x120 net/socket.c:734
> > sock_write_iter+0x291/0x3d0 net/socket.c:1108
> > call_write_iter include/linux/fs.h:2187 [inline]
> > new_sync_write fs/read_write.c:491 [inline]
> > vfs_write+0x9e9/0xdd0 fs/read_write.c:578
> > ksys_write+0x1e8/0x250 fs/read_write.c:631
> > do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
> > __do_fast_syscall_32+0x65/0xf0 arch/x86/entry/common.c:178
> > do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:203
> > entry_SYSENTER_compat_after_hwframe+0x70/0x82
> > irq event stamp: 740
> > hardirqs last enabled at (740): [<ffffffff814919a0>] __local_bh_enable_ip+0xa0/0x120 kernel/softirq.c:401
> > hardirqs last disabled at (739): [<ffffffff814919c3>] __local_bh_enable_ip+0xc3/0x120 kernel/softirq.c:378
> > softirqs last enabled at (0): [<ffffffff8146f02e>] copy_process+0x213e/0x7090 kernel/fork.c:2202
> > softirqs last disabled at (717): [<ffffffff81491843>] invoke_softirq kernel/softirq.c:445 [inline]
> > softirqs last disabled at (717): [<ffffffff81491843>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
> >
> > other info that might help us debug this:
> > Possible unsafe locking scenario:
> >
> > CPU0
> > ----
> > lock(&pd_list->lock);
> > <Interrupt>
> > lock(&pd_list->lock);
> >
> > *** DEADLOCK ***
> >
> > 4 locks held by syz-executor.2/27685:
> > #0: ffff8880445f0460 (sb_writers#3){.+.+}-{0:0}, at: open_last_lookups fs/namei.c:3470 [inline]
> > #0: ffff8880445f0460 (sb_writers#3){.+.+}-{0:0}, at: path_openat+0x2613/0x28f0 fs/namei.c:3688
> > #1: ffff8880465111a0 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: inode_lock_shared include/linux/fs.h:766 [inline]
> > #1: ffff8880465111a0 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: open_last_lookups fs/namei.c:3480 [inline]
> > #1: ffff8880465111a0 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: path_openat+0x1514/0x28f0 fs/namei.c:3688
> > #2: ffffc900001e0d70 ((&d->timer)){+.-.}-{0:0}, at: lockdep_copy_map include/linux/lockdep.h:31 [inline]
> > #2: ffffc900001e0d70 ((&d->timer)){+.-.}-{0:0}, at: call_timer_fn+0xd5/0x6b0 kernel/time/timer.c:1464
> > #3: ffffffff8bf89400 (rcu_read_lock){....}-{1:2}, at: buf_msg net/tipc/msg.h:202 [inline]
> > #3: ffffffff8bf89400 (rcu_read_lock){....}-{1:2}, at: tipc_bearer_xmit_skb+0x8c/0x410 net/tipc/bearer.c:550
> >
> > stack backtrace:
> > CPU: 1 PID: 27685 Comm: syz-executor.2 Not tainted 6.0.0-rc5-syzkaller-00025-g3245cb65fd91 #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
> > Call Trace:
> > <IRQ>
> > __dump_stack lib/dump_stack.c:88 [inline]
> > dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
> > print_usage_bug kernel/locking/lockdep.c:3961 [inline]
> > valid_state kernel/locking/lockdep.c:3973 [inline]
> > mark_lock_irq kernel/locking/lockdep.c:4176 [inline]
> > mark_lock.part.0.cold+0x18/0xd8 kernel/locking/lockdep.c:4632
> > mark_lock kernel/locking/lockdep.c:4596 [inline]
> > mark_usage kernel/locking/lockdep.c:4527 [inline]
> > __lock_acquire+0x11d9/0x56d0 kernel/locking/lockdep.c:5007
> > lock_acquire kernel/locking/lockdep.c:5666 [inline]
> > lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
> > __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
> > _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
> > spin_lock include/linux/spinlock.h:349 [inline]
> > padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
> > pcrypt_aead_enc+0x57/0x70 crypto/pcrypt.c:89
> > padata_do_parallel+0x87b/0xa10 kernel/padata.c:217
> > pcrypt_aead_encrypt+0x39f/0x4d0 crypto/pcrypt.c:117
> > crypto_aead_encrypt+0xaa/0xf0 crypto/aead.c:94
> > tipc_aead_encrypt net/tipc/crypto.c:821 [inline]
> > tipc_crypto_xmit+0xf7a/0x2af0 net/tipc/crypto.c:1756
> > tipc_bearer_xmit_skb+0x1ed/0x410 net/tipc/bearer.c:557
> > tipc_disc_timeout+0x75e/0xcb0 net/tipc/discover.c:335
> > call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474
> > expire_timers kernel/time/timer.c:1519 [inline]
> > __run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790
> > __run_timers kernel/time/timer.c:1768 [inline]
> > run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803
> > __do_softirq+0x1d3/0x9c6 kernel/softirq.c:571
> > invoke_softirq kernel/softirq.c:445 [inline]
> > __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
> > irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
> > sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1106
> > </IRQ>
>
> The changelog doesn't explain the problem or why the proposed solution
> fixes it.
>
> If I can read these splats right, it seems lockdep is complaining about
> how a task can take the reorder lock when softirqs are enabled
> (SOFTIRQ-ON-W) as in the tls_push_record() stack, but also when it's in
> softirq context (IN-SOFTIRQ-W), as in the tipc_disc_timeout() stack. So
> it should be enough to disable softirq here.

...to avoid a deadlock. Helpful to include the impact to the user too.

>
> > Signed-off-by: Edward Adam Davis <[email protected]>
> > Reported-by: [email protected]
> > ---
> > kernel/padata.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/padata.c b/kernel/padata.c
> > index e5819bb8bd1d..38c7b17da796 100644
> > --- a/kernel/padata.c
> > +++ b/kernel/padata.c
> > @@ -388,14 +388,15 @@ void padata_do_serial(struct padata_priv *padata)
> > int hashed_cpu = padata_cpu_hash(pd, padata->seq_nr);
> > struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu);
> > struct padata_priv *cur;
> > + unsigned long flags;
> >
> > - spin_lock(&reorder->lock);
> > + spin_lock_irqsave(&reorder->lock, flags);
> > /* Sort in ascending order of sequence number. */
> > list_for_each_entry_reverse(cur, &reorder->list, list)
> > if (cur->seq_nr < padata->seq_nr)
> > break;
> > list_add(&padata->list, &cur->list);
> > - spin_unlock(&reorder->lock);
> > + spin_unlock_irqrestore(&reorder->lock, flags);
> >
> > /*
> > * Ensure the addition to the reorder list is ordered correctly
> > --
> > 2.37.2
> >