2018-10-11 08:17:12

by Dmitry Vyukov

Subject: net/tipc: recursive locking in tipc_link_reset

Hi,

I am getting the following error while booting the latest kernel on
bb2d8f2f61047cbde08b78ec03e4ebdb01ee5434 (Oct 10). Config is attached.

Since this happens during boot, it makes LOCKDEP completely
unusable: no other locking issues can be discovered, and all new
bugs being introduced into the kernel are masked.
Please fix asap.
Thanks


WARNING: possible recursive locking detected
4.19.0-rc7+ #14 Not tainted
--------------------------------------------
swapper/0/1 is trying to acquire lock:
00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
include/linux/spinlock.h:334 [inline]
00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850

but task is already holding lock:
00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
include/linux/spinlock.h:334 [inline]
00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&(&list->lock)->rlock#4);
lock(&(&list->lock)->rlock#4);

*** DEADLOCK ***

May be due to missing lock nesting notation

2 locks held by swapper/0/1:
#0: 00000000f7539d34 (pernet_ops_rwsem){+.+.}, at:
register_pernet_subsys+0x19/0x40 net/core/net_namespace.c:1051
#1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
spin_lock_bh include/linux/spinlock.h:334 [inline]
#1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849

stack backtrace:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7+ #14
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1af/0x295 lib/dump_stack.c:113
print_deadlock_bug kernel/locking/lockdep.c:1759 [inline]
check_deadlock kernel/locking/lockdep.c:1803 [inline]
validate_chain kernel/locking/lockdep.c:2399 [inline]
__lock_acquire+0xf1e/0x3c60 kernel/locking/lockdep.c:3411
lock_acquire+0x1db/0x520 kernel/locking/lockdep.c:3900
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
_raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
spin_lock_bh include/linux/spinlock.h:334 [inline]
tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
tipc_link_bc_create+0xb5/0x1f0 net/tipc/link.c:526
tipc_bcast_init+0x59b/0xab0 net/tipc/bcast.c:521
tipc_init_net+0x472/0x610 net/tipc/core.c:82
ops_init+0xf7/0x520 net/core/net_namespace.c:129
__register_pernet_operations net/core/net_namespace.c:940 [inline]
register_pernet_operations+0x453/0xac0 net/core/net_namespace.c:1011
register_pernet_subsys+0x28/0x40 net/core/net_namespace.c:1052
tipc_init+0x83/0x104 net/tipc/core.c:140
do_one_initcall+0x109/0x70a init/main.c:885
do_initcall_level init/main.c:953 [inline]
do_initcalls init/main.c:961 [inline]
do_basic_setup init/main.c:979 [inline]
kernel_init_freeable+0x4bd/0x57f init/main.c:1144
kernel_init+0x13/0x180 init/main.c:1063
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413


Attachments:
.config (141.95 kB)

2018-10-11 08:00:45

by Dmitry Vyukov

Subject: Re: net/tipc: recursive locking in tipc_link_reset

On Thu, Oct 11, 2018 at 9:55 AM, Dmitry Vyukov <[email protected]> wrote:
> Hi,
>
> I am getting the following error while booting the latest kernel on
> bb2d8f2f61047cbde08b78ec03e4ebdb01ee5434 (Oct 10). Config is attached.
>
> Since this happens during boot, it makes LOCKDEP completely
> unusable: no other locking issues can be discovered, and all new
> bugs being introduced into the kernel are masked.
> Please fix asap.
> Thanks

Removed the parthasarathy.bhuvaragan address from CC because it bounces.
The warning is very likely caused by:

commit 3f32d0be6c16b902b687453c962d17eea5b8ea19
Author: Parthasarathy Bhuvaragan
Date: Tue Sep 25 22:09:10 2018 +0200

tipc: lock wakeup & inputq at tipc_link_reset()


> WARNING: possible recursive locking detected
> 4.19.0-rc7+ #14 Not tainted
> --------------------------------------------
> swapper/0/1 is trying to acquire lock:
> 00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
> include/linux/spinlock.h:334 [inline]
> 00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at:
> tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
>
> but task is already holding lock:
> 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
> include/linux/spinlock.h:334 [inline]
> 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
> tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&(&list->lock)->rlock#4);
> lock(&(&list->lock)->rlock#4);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 2 locks held by swapper/0/1:
> #0: 00000000f7539d34 (pernet_ops_rwsem){+.+.}, at:
> register_pernet_subsys+0x19/0x40 net/core/net_namespace.c:1051
> #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
> spin_lock_bh include/linux/spinlock.h:334 [inline]
> #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
> tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
>
> stack backtrace:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7+ #14
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x1af/0x295 lib/dump_stack.c:113
> print_deadlock_bug kernel/locking/lockdep.c:1759 [inline]
> check_deadlock kernel/locking/lockdep.c:1803 [inline]
> validate_chain kernel/locking/lockdep.c:2399 [inline]
> __lock_acquire+0xf1e/0x3c60 kernel/locking/lockdep.c:3411
> lock_acquire+0x1db/0x520 kernel/locking/lockdep.c:3900
> __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
> _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
> spin_lock_bh include/linux/spinlock.h:334 [inline]
> tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
> tipc_link_bc_create+0xb5/0x1f0 net/tipc/link.c:526
> tipc_bcast_init+0x59b/0xab0 net/tipc/bcast.c:521
> tipc_init_net+0x472/0x610 net/tipc/core.c:82
> ops_init+0xf7/0x520 net/core/net_namespace.c:129
> __register_pernet_operations net/core/net_namespace.c:940 [inline]
> register_pernet_operations+0x453/0xac0 net/core/net_namespace.c:1011
> register_pernet_subsys+0x28/0x40 net/core/net_namespace.c:1052
> tipc_init+0x83/0x104 net/tipc/core.c:140
> do_one_initcall+0x109/0x70a init/main.c:885
> do_initcall_level init/main.c:953 [inline]
> do_initcalls init/main.c:961 [inline]
> do_basic_setup init/main.c:979 [inline]
> kernel_init_freeable+0x4bd/0x57f init/main.c:1144
> kernel_init+0x13/0x180 init/main.c:1063
> ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413
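
Editor's sketch: the report above shows two different lock addresses (00000000dcfc0fc8 and 00000000cbb9b036) under one class name, `&(&list->lock)->rlock#4`. Lockdep validates lock classes rather than individual lock instances, so taking a second lock of a class that is already held is reported unless the nesting is annotated (in the kernel, for example with `spin_lock_nested()`). The userspace model below illustrates only that one rule. Every name in it (`model_lock`, `model_acquire`, the three scenario functions) is hypothetical, the mapping of the two locks to `wakeupq` and `inputq` is an assumption based on the commit subject, and it is not the patch discussed in this thread.

```c
/* Userspace model of lockdep's same-class check. NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

/* One class per lock-init call site; many lock instances share it. */
struct lock_class { const char *name; };

/* Hypothetical stand-in for a spinlock: only its class matters here. */
struct model_lock { const struct lock_class *class; };

struct held_entry {
    const struct lock_class *class;
    unsigned int subclass;
};

static struct held_entry held[MAX_HELD];
static size_t held_depth;

/*
 * Returns false (modelling a "possible recursive locking" report) when a
 * lock of the same class and subclass is already held. The address of the
 * lock instance is never compared.
 */
static bool model_acquire(const struct model_lock *lock, unsigned int subclass)
{
    for (size_t i = 0; i < held_depth; i++)
        if (held[i].class == lock->class && held[i].subclass == subclass)
            return false;
    assert(held_depth < MAX_HELD);
    held[held_depth].class = lock->class;
    held[held_depth].subclass = subclass;
    held_depth++;
    return true;
}

/* Drops the most recently acquired lock from the held stack. */
static void model_release_last(void)
{
    if (held_depth > 0)
        held_depth--;
}

/* Assumed class shared by every queue lock set up at one init site. */
static const struct lock_class skb_list_class = { "&(&list->lock)->rlock" };

/* Two distinct queues locked back to back without annotation: reported. */
static bool nested_same_subclass(void)
{
    struct model_lock wakeupq = { &skb_list_class };
    struct model_lock inputq = { &skb_list_class };
    bool ok;

    held_depth = 0;
    ok = model_acquire(&wakeupq, 0) && model_acquire(&inputq, 0);
    held_depth = 0;
    return ok;
}

/* Same nesting, but the inner lock uses subclass 1: accepted. */
static bool nested_with_annotation(void)
{
    struct model_lock wakeupq = { &skb_list_class };
    struct model_lock inputq = { &skb_list_class };
    bool ok;

    held_depth = 0;
    ok = model_acquire(&wakeupq, 0) && model_acquire(&inputq, 1);
    held_depth = 0;
    return ok;
}

/* No nesting at all: each lock is dropped before the next is taken. */
static bool lock_sequentially(void)
{
    struct model_lock wakeupq = { &skb_list_class };
    struct model_lock inputq = { &skb_list_class };
    bool ok;

    held_depth = 0;
    ok = model_acquire(&wakeupq, 0);
    model_release_last();
    ok = ok && model_acquire(&inputq, 0);
    model_release_last();
    return ok;
}
```

In this model `nested_same_subclass()` is the only scenario that fails, which matches the "May be due to missing lock nesting notation" hint in the report: the two queues are separate objects, but nothing tells the checker that holding both at once is intended. Real lockdep tracks far more (dependency chains, IRQ state), so this is a reading aid rather than a description of its implementation.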

2018-10-11 11:23:50

by Jon Maloy

Subject: RE: net/tipc: recursive locking in tipc_link_reset

Hi Dmitry,
Yes, we are aware of this; the kernel test robot warned us about it a few days ago.
I am looking into it.

///jon

> -----Original Message-----
> From: Dmitry Vyukov <[email protected]>
> Sent: October 11, 2018 3:55 AM
> To: [email protected]; Jon Maloy
> <[email protected]>; David Miller <[email protected]>; Ying Xue
> <[email protected]>; netdev <[email protected]>;
> tipc-[email protected]; LKML <[email protected]>
> Subject: net/tipc: recursive locking in tipc_link_reset
>
> Hi,
>
> I am getting the following error while booting the latest kernel on
> bb2d8f2f61047cbde08b78ec03e4ebdb01ee5434 (Oct 10). Config is attached.
>
> Since this happens during boot, it makes LOCKDEP completely unusable:
> no other locking issues can be discovered, and all new bugs being
> introduced into the kernel are masked.
> Please fix asap.
> Thanks
>
>
> WARNING: possible recursive locking detected
> 4.19.0-rc7+ #14 Not tainted
> --------------------------------------------
> swapper/0/1 is trying to acquire lock:
> 00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
> include/linux/spinlock.h:334 [inline]
> 00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at:
> tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
>
> but task is already holding lock:
> 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
> include/linux/spinlock.h:334 [inline]
> 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
> tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&(&list->lock)->rlock#4);
> lock(&(&list->lock)->rlock#4);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 2 locks held by swapper/0/1:
> #0: 00000000f7539d34 (pernet_ops_rwsem){+.+.}, at:
> register_pernet_subsys+0x19/0x40 net/core/net_namespace.c:1051
> #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
> spin_lock_bh include/linux/spinlock.h:334 [inline]
> #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
> tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
>
> stack backtrace:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7+ #14
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x1af/0x295 lib/dump_stack.c:113
> print_deadlock_bug kernel/locking/lockdep.c:1759 [inline]
> check_deadlock kernel/locking/lockdep.c:1803 [inline]
> validate_chain kernel/locking/lockdep.c:2399 [inline]
> __lock_acquire+0xf1e/0x3c60 kernel/locking/lockdep.c:3411
> lock_acquire+0x1db/0x520 kernel/locking/lockdep.c:3900
> __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
> _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
> spin_lock_bh include/linux/spinlock.h:334 [inline]
> tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
> tipc_link_bc_create+0xb5/0x1f0 net/tipc/link.c:526
> tipc_bcast_init+0x59b/0xab0 net/tipc/bcast.c:521
> tipc_init_net+0x472/0x610 net/tipc/core.c:82
> ops_init+0xf7/0x520 net/core/net_namespace.c:129
> __register_pernet_operations net/core/net_namespace.c:940 [inline]
> register_pernet_operations+0x453/0xac0 net/core/net_namespace.c:1011
> register_pernet_subsys+0x28/0x40 net/core/net_namespace.c:1052
> tipc_init+0x83/0x104 net/tipc/core.c:140
> do_one_initcall+0x109/0x70a init/main.c:885
> do_initcall_level init/main.c:953 [inline]
> do_initcalls init/main.c:961 [inline]
> do_basic_setup init/main.c:979 [inline]
> kernel_init_freeable+0x4bd/0x57f init/main.c:1144
> kernel_init+0x13/0x180 init/main.c:1063
> ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413

2018-10-11 12:21:26

by Ying Xue

Subject: Re: net/tipc: recursive locking in tipc_link_reset

On 10/11/2018 03:59 PM, Dmitry Vyukov wrote:
> On Thu, Oct 11, 2018 at 9:55 AM, Dmitry Vyukov <[email protected]> wrote:
>> Hi,
>>
>> I am getting the following error while booting the latest kernel on
>> bb2d8f2f61047cbde08b78ec03e4ebdb01ee5434 (Oct 10). Config is attached.
>>
>> Since this happens during boot, it makes LOCKDEP completely
>> unusable: no other locking issues can be discovered, and all new
>> bugs being introduced into the kernel are masked.
>> Please fix asap.
>> Thanks
> Removed the parthasarathy.bhuvaragan address from CC because it bounces.
> The warning is very likely caused by:
>
> commit 3f32d0be6c16b902b687453c962d17eea5b8ea19
> Author: Parthasarathy Bhuvaragan
> Date: Tue Sep 25 22:09:10 2018 +0200
>
> tipc: lock wakeup & inputq at tipc_link_reset()
>
>

Dmitry, I agree with you. The complaint should be caused by the commit
above. Please try to verify the patch:
https://patchwork.ozlabs.org/patch/982447.

Thanks,
Ying

2018-10-11 12:21:49

by Dmitry Vyukov

Subject: Re: net/tipc: recursive locking in tipc_link_reset

On Thu, Oct 11, 2018 at 2:03 PM, Ying Xue <[email protected]> wrote:
>>> Hi,
>>>
>>> I am getting the following error while booting the latest kernel on
>>> bb2d8f2f61047cbde08b78ec03e4ebdb01ee5434 (Oct 10). Config is attached.
>>>
>>> Since this happens during boot, it makes LOCKDEP completely
>>> unusable: no other locking issues can be discovered, and all new
>>> bugs being introduced into the kernel are masked.
>>> Please fix asap.
>>> Thanks
>> Removed the parthasarathy.bhuvaragan address from CC because it bounces.
>> The warning is very likely caused by:
>>
>> commit 3f32d0be6c16b902b687453c962d17eea5b8ea19
>> Author: Parthasarathy Bhuvaragan
>> Date: Tue Sep 25 22:09:10 2018 +0200
>>
>> tipc: lock wakeup & inputq at tipc_link_reset()
>>
>>
>
> Dmitry, I agree with you. The complaint should be caused by the commit
> above. Please try to verify the patch:
> https://patchwork.ozlabs.org/patch/982447.


I trust you for testing ;)

Thanks for the quick fix!

2018-10-11 12:22:40

by Ying Xue

Subject: Re: net/tipc: recursive locking in tipc_link_reset

Jon, please help to review the patch:
https://patchwork.ozlabs.org/patch/982447.

Thanks,
Ying

On 10/11/2018 06:55 PM, Jon Maloy wrote:
> Hi Dmitry,
> Yes, we are aware of this; the kernel test robot warned us about it a few days ago.
> I am looking into it.
>
> ///jon