2016-11-26 17:04:43

by Dmitry Vyukov

Subject: net: deadlock on genl_mutex

Hello,

The following program triggers deadlock warnings on genl_mutex:

https://gist.githubusercontent.com/dvyukov/65e33d053e507d2ab0bf6ae83d989585/raw/b3c640ec58e894b50bcbf255c471406466cfa5d0/gistfile1.txt

On commit 16ae16c6e5616c084168740990fc508bda6655d4 (Nov 24).

BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
in_atomic(): 1, irqs_disabled(): 0, pid: 32289, name: syz-executor
CPU: 0 PID: 32289 Comm: syz-executor Not tainted 4.9.0-rc5+ #54
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff88003ec06420 ffffffff834c2e39 ffffffff00000000 1ffff10007d80c17
ffffed0007d80c0f 0000000041b58ab3 ffffffff89575550 ffffffff834c2b4b
ffffffff8baab1a0 dffffc0000000000 0000000000000000 ffff880068f794e0
Call Trace:
<IRQ>
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff834c2e39>] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
[<ffffffff814b6ac3>] ___might_sleep+0x483/0x660 kernel/sched/core.c:7761
[<ffffffff814b6d3a>] __might_sleep+0x9a/0x1a0 kernel/sched/core.c:7720
[<ffffffff88139aaa>] mutex_lock_nested+0x1ea/0xf20 kernel/locking/mutex.c:620
[< inline >] genl_lock net/netlink/genetlink.c:31
[<ffffffff86cb5a11>] genl_lock_done+0x71/0xd0 net/netlink/genetlink.c:531
[<ffffffff86ca5458>] netlink_sock_destruct+0xf8/0x400 net/netlink/af_netlink.c:331
[<ffffffff86a7b234>] __sk_destruct+0xf4/0x7f0 net/core/sock.c:1423
[<ffffffff86a87d6c>] sk_destruct+0x4c/0x80 net/core/sock.c:1453
[<ffffffff86a87dfc>] __sk_free+0x5c/0x230 net/core/sock.c:1461
[<ffffffff86a87ff8>] sk_free+0x28/0x30 net/core/sock.c:1472
[< inline >] sock_put include/net/sock.h:1591
[<ffffffff86ca6cd1>] deferred_put_nlk_sk+0x31/0x40 net/netlink/af_netlink.c:652
[< inline >] __rcu_reclaim kernel/rcu/rcu.h:118
[<ffffffff815cbc9d>] rcu_do_batch.isra.70+0x9ed/0xe20 kernel/rcu/tree.c:2776
[< inline >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
[< inline >] __rcu_process_callbacks kernel/rcu/tree.c:3007
[<ffffffff815cc55c>] rcu_process_callbacks+0x48c/0xd70 kernel/rcu/tree.c:3024
[<ffffffff8814d53b>] __do_softirq+0x32b/0xca8 kernel/softirq.c:284
[< inline >] invoke_softirq kernel/softirq.c:364
[<ffffffff8141a941>] irq_exit+0x1d1/0x210 kernel/softirq.c:405
[< inline >] exiting_irq arch/x86/include/asm/apic.h:659
[<ffffffff8814ca30>] smp_apic_timer_interrupt+0x80/0xa0 arch/x86/kernel/apic/apic.c:960
[<ffffffff8814badc>] apic_timer_interrupt+0x8c/0xa0 arch/x86/entry/entry_64.S:489
<EOI>
[<ffffffff8155c987>] ? lock_is_held+0x247/0x310
[<ffffffff814b6bde>] ___might_sleep+0x59e/0x660 kernel/sched/core.c:7729
[<ffffffff814b6d3a>] __might_sleep+0x9a/0x1a0 kernel/sched/core.c:7720
[<ffffffff88142d08>] down_read+0x78/0x160 kernel/locking/rwsem.c:21
[< inline >] anon_vma_lock_read include/linux/rmap.h:127
[<ffffffff81968295>] validate_mm+0xe5/0x880 mm/mmap.c:347
[<ffffffff8196bf0b>] vma_link+0x11b/0x180 mm/mmap.c:605
[<ffffffff81977f46>] mmap_region+0x1076/0x1880 mm/mmap.c:1692
[<ffffffff81978e4f>] do_mmap+0x6ff/0xe80 mm/mmap.c:1450
[< inline >] do_mmap_pgoff include/linux/mm.h:2039
[<ffffffff818fd527>] vm_mmap_pgoff+0x1b7/0x210 mm/util.c:305
[< inline >] SYSC_mmap_pgoff mm/mmap.c:1500
[<ffffffff8196f961>] SyS_mmap_pgoff+0x231/0x5e0 mm/mmap.c:1458
[< inline >] SYSC_mmap arch/x86/kernel/sys_x86_64.c:95
[<ffffffff8124bf4b>] SyS_mmap+0x1b/0x30 arch/x86/kernel/sys_x86_64.c:86
[<ffffffff88149dc5>] entry_SYSCALL_64_fastpath+0x23/0xc6

=================================
[ INFO: inconsistent lock state ]
4.9.0-rc5+ #54 Tainted: G W
---------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
syz-executor/32289 [HC0[0]:SC1[1]:HE1:SE0] takes:
([ 287.580014] genl_mutex
[< inline >] genl_lock net/netlink/genetlink.c:31
[<ffffffff86cb5a11>] genl_lock_done+0x71/0xd0 net/netlink/genetlink.c:531
{SOFTIRQ-ON-W} state was registered at:
[ 287.580014] [< inline >] mark_irqflags kernel/locking/lockdep.c:2938
[ 287.580014] [<ffffffff81567ad7>] __lock_acquire+0x6e7/0x3380 kernel/locking/lockdep.c:3292
[ 287.580014] [<ffffffff8156b642>] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3746
[ 287.580014] [< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[ 287.580014] [<ffffffff88139aff>] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 287.580014] [< inline >] genl_lock net/netlink/genetlink.c:31
[ 287.580014] [< inline >] genl_lock_all net/netlink/genetlink.c:52
[ 287.580014] [<ffffffff86cba52e>] __genl_register_family+0x2ce/0x1870 net/netlink/genetlink.c:374
[ 287.580014] [< inline >] _genl_register_family_with_ops_grps include/net/genetlink.h:173
[ 287.580014] [<ffffffff8ab90c02>] genl_init+0x11d/0x185 net/netlink/genetlink.c:1084
[ 287.580014] [<ffffffff8100244b>] do_one_initcall+0xfb/0x3f0 init/main.c:778
[ 287.580014] [< inline >] do_initcall_level init/main.c:844
[ 287.580014] [< inline >] do_initcalls init/main.c:852
[ 287.580014] [< inline >] do_basic_setup init/main.c:870
[ 287.580014] [<ffffffff8aa3d03d>] kernel_init_freeable+0x5c4/0x69e init/main.c:1017
[ 287.580014] [<ffffffff88129c88>] kernel_init+0x18/0x180 init/main.c:943
[ 287.580014] [<ffffffff8814a05a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433

[ 78.258919] [ INFO: inconsistent lock state ]
[ 78.258919] 4.9.0-rc5+ #54 Tainted: G W
[ 78.258919] ---------------------------------
[ 78.258919] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 78.258919] syz-fuzzer/5211 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 78.258919] (genl_mutex){+.?.+.}, at:
[ 78.258919] [<ffffffff86cb5a11>] genl_lock_done+0x71/0xd0
[ 78.258919] {SOFTIRQ-ON-W} state was registered at:
[ 78.258919] [<ffffffff81567ad7>] __lock_acquire+0x6e7/0x3380
[ 78.258919] [<ffffffff8156b642>] lock_acquire+0x2a2/0x790
[ 78.258919] [<ffffffff88139aff>] mutex_lock_nested+0x23f/0xf20
[ 78.258919] [<ffffffff86cba52e>] __genl_register_family+0x2ce/0x1870
[ 78.258919] [<ffffffff8ab90c02>] genl_init+0x11d/0x185
[ 78.258919] [<ffffffff8100244b>] do_one_initcall+0xfb/0x3f0
[ 78.258919] [<ffffffff8aa3d03d>] kernel_init_freeable+0x5c4/0x69e
[ 78.258919] [<ffffffff88129c88>] kernel_init+0x18/0x180
[ 78.258919] [<ffffffff8814a05a>] ret_from_fork+0x2a/0x40
[ 78.258919] irq event stamp: 149484
[ 78.258919] hardirqs last enabled at (149484): [<ffffffff8814a7df>] restore_regs_and_iret+0x0/0x1d
[ 78.258919] hardirqs last disabled at (149483): [<ffffffff8814bad7>] apic_timer_interrupt+0x87/0xa0
[ 78.258919] softirqs last enabled at (149302): [<ffffffff8814da39>] __do_softirq+0x829/0xca8
[ 78.258919] softirqs last disabled at (149437): [<ffffffff8141a941>] irq_exit+0x1d1/0x210

[ 78.258919]
[ 78.258919] other info that might help us debug this:
[ 78.258919] Possible unsafe locking scenario:
[ 78.258919]
[ 78.258919] CPU0
[ 78.258919] ----
[ 78.258919] lock(genl_mutex);
[ 78.258919] <Interrupt>
[ 78.258919] lock(genl_mutex);
[ 78.258919]
[ 78.258919] *** DEADLOCK ***
[ 78.258919]
[ 78.258919] 1 lock held by syz-fuzzer/5211:
[ 78.258919] #0: (rcu_callback){......}, at: [<ffffffff815cbc43>] rcu_do_batch.isra.70+0x993/0xe20
[ 78.258919]
[ 78.258919] stack backtrace:

CPU: 0 PID: 32289 Comm: syz-executor Tainted: G W 4.9.0-rc5+ #54
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff88003ec05db8 ffffffff834c2e39 ffffffff00000000 1ffff10007d80b4a
ffffed0007d80b42 0000000041b58ab3 ffffffff89575550 ffffffff834c2b4b
ffff88003948a340 ffff88003ec22cc0 ffff8800384dd280 0000000041b58ab3
Call Trace:
<IRQ>
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff834c2e39>] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
[<ffffffff815648df>] print_usage_bug+0x3ef/0x450 kernel/locking/lockdep.c:2388
[< inline >] valid_state kernel/locking/lockdep.c:2401
[< inline >] mark_lock_irq kernel/locking/lockdep.c:2599
[<ffffffff81565870>] mark_lock+0xf30/0x1410 kernel/locking/lockdep.c:3062
[< inline >] mark_irqflags kernel/locking/lockdep.c:2920
[<ffffffff8156811e>] __lock_acquire+0xd2e/0x3380 kernel/locking/lockdep.c:3292
[<ffffffff8156b642>] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3746
[< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[<ffffffff88139aff>] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[< inline >] genl_lock net/netlink/genetlink.c:31
[<ffffffff86cb5a11>] genl_lock_done+0x71/0xd0 net/netlink/genetlink.c:531
[<ffffffff86ca5458>] netlink_sock_destruct+0xf8/0x400 net/netlink/af_netlink.c:331
[<ffffffff86a7b234>] __sk_destruct+0xf4/0x7f0 net/core/sock.c:1423
[<ffffffff86a87d6c>] sk_destruct+0x4c/0x80 net/core/sock.c:1453
[<ffffffff86a87dfc>] __sk_free+0x5c/0x230 net/core/sock.c:1461
[<ffffffff86a87ff8>] sk_free+0x28/0x30 net/core/sock.c:1472
[< inline >] sock_put include/net/sock.h:1591
[<ffffffff86ca6cd1>] deferred_put_nlk_sk+0x31/0x40 net/netlink/af_netlink.c:652
[< inline >] __rcu_reclaim kernel/rcu/rcu.h:118
[<ffffffff815cbc9d>] rcu_do_batch.isra.70+0x9ed/0xe20 kernel/rcu/tree.c:2776
[< inline >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
[< inline >] __rcu_process_callbacks kernel/rcu/tree.c:3007
[<ffffffff815cc55c>] rcu_process_callbacks+0x48c/0xd70 kernel/rcu/tree.c:3024
[<ffffffff8814d53b>] __do_softirq+0x32b/0xca8 kernel/softirq.c:284
[< inline >] invoke_softirq kernel/softirq.c:364
[<ffffffff8141a941>] irq_exit+0x1d1/0x210 kernel/softirq.c:405
[< inline >] exiting_irq arch/x86/include/asm/apic.h:659
[<ffffffff8814ca30>] smp_apic_timer_interrupt+0x80/0xa0 arch/x86/kernel/apic/apic.c:960
[<ffffffff8814badc>] apic_timer_interrupt+0x8c/0xa0 arch/x86/entry/entry_64.S:489
<EOI>
[<ffffffff8155c987>] ? lock_is_held+0x247/0x310
[<ffffffff814b6bde>] ___might_sleep+0x59e/0x660 kernel/sched/core.c:7729
[<ffffffff814b6d3a>] __might_sleep+0x9a/0x1a0 kernel/sched/core.c:7720
[<ffffffff88142d08>] down_read+0x78/0x160 kernel/locking/rwsem.c:21
[< inline >] anon_vma_lock_read include/linux/rmap.h:127
[<ffffffff81968295>] validate_mm+0xe5/0x880 mm/mmap.c:347
[<ffffffff8196bf0b>] vma_link+0x11b/0x180 mm/mmap.c:605
[<ffffffff81977f46>] mmap_region+0x1076/0x1880 mm/mmap.c:1692
[<ffffffff81978e4f>] do_mmap+0x6ff/0xe80 mm/mmap.c:1450
[< inline >] do_mmap_pgoff include/linux/mm.h:2039
[<ffffffff818fd527>] vm_mmap_pgoff+0x1b7/0x210 mm/util.c:305
[< inline >] SYSC_mmap_pgoff mm/mmap.c:1500
[<ffffffff8196f961>] SyS_mmap_pgoff+0x231/0x5e0 mm/mmap.c:1458
[< inline >] SYSC_mmap arch/x86/kernel/sys_x86_64.c:95
[<ffffffff8124bf4b>] SyS_mmap+0x1b/0x30 arch/x86/kernel/sys_x86_64.c:86
[<ffffffff88149dc5>] entry_SYSCALL_64_fastpath+0x23/0xc6
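
As a reading aid for the "inconsistent lock state" report above, here is a toy Python model (not kernel code) of the kind of usage-state check that lockdep appears to be applying. The names LockClass and acquire are made up for this sketch, and the model only covers the two softirq write states that show up in the log; the real lockdep tracks many more states and contexts.

```python
from dataclasses import dataclass, field


@dataclass
class LockClass:
    """Hypothetical stand-in for a lockdep lock class."""
    name: str
    usage: set = field(default_factory=set)


# The two states seen in the report; each one conflicts with the other.
CONFLICTS = {
    "IN-SOFTIRQ-W": "SOFTIRQ-ON-W",
    "SOFTIRQ-ON-W": "IN-SOFTIRQ-W",
}


def acquire(lock, in_softirq, softirqs_enabled):
    """Record one acquisition and return a warning string if the new
    usage state conflicts with a state already recorded, else None."""
    if in_softirq:
        state = "IN-SOFTIRQ-W"
    elif softirqs_enabled:
        state = "SOFTIRQ-ON-W"
    else:
        # Process context with softirqs disabled sets neither bit
        # in this simplified model.
        return None

    other = CONFLICTS[state]
    warning = None
    if other in lock.usage:
        warning = f"inconsistent {{{other}}} -> {{{state}}} usage."
    lock.usage.add(state)
    return warning


if __name__ == "__main__":
    genl = LockClass("genl_mutex")
    # genl_init() path: process context, softirqs enabled.
    print(acquire(genl, in_softirq=False, softirqs_enabled=True))
    # genl_lock_done() reached from an RCU callback running in softirq.
    print(acquire(genl, in_softirq=True, softirqs_enabled=False))
```

The first call corresponds to the registration stack through genl_init, the second to genl_lock_done being reached from rcu_process_callbacks. Independently of this check, a mutex may sleep, which is why the trace also starts with the "sleeping function called from invalid context" BUG.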


2016-11-26 17:12:44

by Eric Dumazet

Subject: Re: net: deadlock on genl_mutex

On Sat, Nov 26, 2016 at 9:04 AM, Dmitry Vyukov wrote:
> Hello,
>
> The following program triggers deadlock warnings on genl_mutex:
>
> https://gist.githubusercontent.com/dvyukov/65e33d053e507d2ab0bf6ae83d989585/raw/b3c640ec58e894b50bcbf255c471406466cfa5d0/gistfile1.txt
>
> On commit 16ae16c6e5616c084168740990fc508bda6655d4 (Nov 24).


Issue was reported yesterday and is under investigation.


http://marc.info/?l=linux-netdev&m=148014004331663&w=2


Thanks !

Subject: Re: net: deadlock on genl_mutex

>
> Issue was reported yesterday and is under investigation.
>
>
> http://marc.info/?l=linux-netdev&m=148014004331663&w=2
>
>
> Thanks !

Hi Dmitry

Can you try the patch below with your reproducer? I haven't seen similar
crashes reported after this (or even with Eric's patch).

https://patchwork.ozlabs.org/patch/699937/

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project

2016-11-29 06:06:08

by Eric Dumazet

Subject: Re: net: deadlock on genl_mutex

On Mon, 2016-11-28 at 22:59 -0700, [email protected] wrote:
> >
> > Issue was reported yesterday and is under investigation.
> >
> >
> > http://marc.info/?l=linux-netdev&m=148014004331663&w=2
> >
> >
> > Thanks !
>
> Hi Dmitry
>
> Can you try the patch below with your reproducer? I haven't seen similar
> crashes reported after this (or even with Eric's patch).
>
> https://patchwork.ozlabs.org/patch/699937/

Yeah, I will post my patch on top of this one.



2016-12-08 16:16:31

by Dmitry Vyukov

Subject: Re: net: deadlock on genl_mutex

On Tue, Nov 29, 2016 at 6:59 AM, <[email protected]> wrote:
>>
>> Issue was reported yesterday and is under investigation.
>>
>>
>> http://marc.info/?l=linux-netdev&m=148014004331663&w=2
>>
>>
>> Thanks !
>
>
> Hi Dmitry
>
> Can you try the patch below with your reproducer? I haven't seen similar
> crashes reported after this (or even with Eric's patch).

I've synced to 318c8932ddec5c1c26a4af0f3c053784841c598e (Dec 7) and do
_not_ see this report happening anymore.
Thanks.

2016-12-08 17:16:58

by Dmitry Vyukov

Subject: Re: net: deadlock on genl_mutex

On Thu, Dec 8, 2016 at 5:16 PM, Dmitry Vyukov wrote:
> On Tue, Nov 29, 2016 at 6:59 AM, <[email protected]> wrote:
>>>
>>> Issue was reported yesterday and is under investigation.
>>>
>>>
>>> http://marc.info/?l=linux-netdev&m=148014004331663&w=2
>>>
>>>
>>> Thanks !
>>
>>
>> Hi Dmitry
>>
>> Can you try the patch below with your reproducer? I haven't seen similar
>> crashes reported after this (or even with Eric's patch).
>
> I've synced to 318c8932ddec5c1c26a4af0f3c053784841c598e (Dec 7) and do
> _not_ see this report happening anymore.
> Thanks.


But now I am seeing "possible deadlock" warnings involving genl_lock:

[ INFO: possible circular locking dependency detected ]
4.9.0-rc8+ #77 Not tainted
-------------------------------------------------------
syz-executor7/18794 is trying to acquire lock:
(rtnl_mutex){+.+.+.}, at: [<ffffffff86b4682c>] rtnl_lock+0x1c/0x20 net/core/rtnetlink.c:70
but task is already holding lock:
(genl_mutex){+.+.+.}, at: [< inline >] genl_lock net/netlink/genetlink.c:31
(genl_mutex){+.+.+.}, at: [<ffffffff86cc27c9>] genl_rcv_msg+0x209/0x260 net/netlink/genetlink.c:658
which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

[ 315.403815] [< inline >] validate_chain kernel/locking/lockdep.c:2265
[ 315.403815] [<ffffffff81569576>] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 315.403815] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
[ 315.403815] [< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[ 315.403815] [<ffffffff88195bcf>] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 315.403815] [< inline >] genl_lock net/netlink/genetlink.c:31
[ 315.403815] [<ffffffff86cc0c26>] genl_lock_dumpit+0x46/0xa0 net/netlink/genetlink.c:518
[ 315.403815] [<ffffffff86cb33ac>] netlink_dump+0x57c/0xd70 net/netlink/af_netlink.c:2127
[ 315.403815] [<ffffffff86cb7b6a>] __netlink_dump_start+0x4ea/0x760 net/netlink/af_netlink.c:2217
[ 315.403815] [<ffffffff86cc2319>] genl_family_rcv_msg+0xdc9/0x1070 net/netlink/genetlink.c:586
[ 315.403815] [<ffffffff86cc2770>] genl_rcv_msg+0x1b0/0x260 net/netlink/genetlink.c:660
[ 315.403815] [<ffffffff86cc034c>] netlink_rcv_skb+0x2bc/0x3a0 net/netlink/af_netlink.c:2298
[ 315.403815] [<ffffffff86cc153d>] genl_rcv+0x2d/0x40 net/netlink/genetlink.c:671
[ 315.403815] [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1231
[ 315.403815] [<ffffffff86cbeb6a>] netlink_unicast+0x51a/0x740 net/netlink/af_netlink.c:1257
[ 315.403815] [<ffffffff86cbf834>] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1803
[ 315.403815] [< inline >] sock_sendmsg_nosec net/socket.c:621
[ 315.403815] [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110 net/socket.c:631
[ 315.403815] [<ffffffff86a764fb>] sock_write_iter+0x32b/0x620 net/socket.c:829
[ 315.403815] [< inline >] new_sync_write fs/read_write.c:499
[ 315.403815] [<ffffffff81a701ae>] __vfs_write+0x4fe/0x830 fs/read_write.c:512
[ 315.403815] [<ffffffff81a71c55>] vfs_write+0x175/0x4e0 fs/read_write.c:560
[ 315.403815] [< inline >] SYSC_write fs/read_write.c:607
[ 315.403815] [<ffffffff81a760e0>] SyS_write+0x100/0x240 fs/read_write.c:599
[ 315.403815] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

[ 315.403815] [< inline >] validate_chain kernel/locking/lockdep.c:2265
[ 315.403815] [<ffffffff81569576>] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 315.403815] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
[ 315.403815] [< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[ 315.403815] [<ffffffff88195bcf>] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 315.403815] [<ffffffff86cb7779>] __netlink_dump_start+0xf9/0x760 net/netlink/af_netlink.c:2187
[ 315.403815] [< inline >] netlink_dump_start include/linux/netlink.h:165
[ 315.403815] [<ffffffff86d14d48>] ctnetlink_stat_ct_cpu+0x198/0x1e0 net/netfilter/nf_conntrack_netlink.c:2045
[ 315.403815] [<ffffffff86cd313e>] nfnetlink_rcv_msg+0x9be/0xd60 net/netfilter/nfnetlink.c:212
[ 315.403815] [<ffffffff86cc034c>] netlink_rcv_skb+0x2bc/0x3a0 net/netlink/af_netlink.c:2298
[ 315.403815] [<ffffffff86cd1b71>] nfnetlink_rcv+0x7e1/0x10d0 net/netfilter/nfnetlink.c:474
[ 315.403815] [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1231
[ 315.403815] [<ffffffff86cbeb6a>] netlink_unicast+0x51a/0x740 net/netlink/af_netlink.c:1257
[ 315.403815] [<ffffffff86cbf834>] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1803
[ 315.403815] [< inline >] sock_sendmsg_nosec net/socket.c:621
[ 315.403815] [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110 net/socket.c:631
[ 315.403815] [<ffffffff86a764fb>] sock_write_iter+0x32b/0x620 net/socket.c:829
[ 315.403815] [< inline >] new_sync_write fs/read_write.c:499
[ 315.403815] [<ffffffff81a701ae>] __vfs_write+0x4fe/0x830 fs/read_write.c:512
[ 315.403815] [<ffffffff81a71c55>] vfs_write+0x175/0x4e0 fs/read_write.c:560
[ 315.403815] [< inline >] SYSC_write fs/read_write.c:607
[ 315.403815] [<ffffffff81a760e0>] SyS_write+0x100/0x240 fs/read_write.c:599
[ 315.403815] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

[ 315.403815] [< inline >] validate_chain kernel/locking/lockdep.c:2265
[ 315.403815] [<ffffffff81569576>] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 315.403815] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
[ 315.403815] [< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[ 315.403815] [<ffffffff88195bcf>] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 315.403815] [<ffffffff86cd083d>] nfnl_lock+0x2d/0x30 net/netfilter/nfnetlink.c:61
[ 315.403815] [<ffffffff86d7c5b1>] nf_tables_netdev_event+0x1f1/0x720 net/netfilter/nf_tables_netdev.c:122
[ 315.403815] [<ffffffff8149095a>] notifier_call_chain+0x14a/0x2f0 kernel/notifier.c:93
[ 315.403815] [< inline >] __raw_notifier_call_chain kernel/notifier.c:394
[ 315.403815] [<ffffffff81490b82>] raw_notifier_call_chain+0x32/0x40 kernel/notifier.c:401
[ 315.403815] [<ffffffff86ae4af6>] call_netdevice_notifiers_info+0x56/0x90 net/core/dev.c:1645
[ 315.403815] [< inline >] call_netdevice_notifiers net/core/dev.c:1661
[ 315.403815] [<ffffffff86af898d>] rollback_registered_many+0x73d/0xba0 net/core/dev.c:6759
[ 315.403815] [<ffffffff86af8e9e>] rollback_registered+0xae/0x100 net/core/dev.c:6800
[ 315.403815] [<ffffffff86af8f76>] unregister_netdevice_queue+0x86/0x140 net/core/dev.c:7787
[ 315.403815] [< inline >] unregister_netdevice include/linux/netdevice.h:2455
[ 315.403815] [<ffffffff84912be6>] __tun_detach+0xc66/0xea0 drivers/net/tun.c:567
[ 315.808015] [< inline >] tun_detach drivers/net/tun.c:578
[ 315.808015] [<ffffffff84912e69>] tun_chr_close+0x49/0x60 drivers/net/tun.c:2350
[ 315.808015] [<ffffffff81a77f7e>] __fput+0x34e/0x910 fs/file_table.c:208
[ 315.808015] [<ffffffff81a785ca>] ____fput+0x1a/0x20 fs/file_table.c:244
[ 315.808015] [<ffffffff81483c20>] task_work_run+0x1a0/0x280 kernel/task_work.c:116
[ 315.808015] [< inline >] exit_task_work include/linux/task_work.h:21
[ 315.808015] [<ffffffff814129e2>] do_exit+0x1842/0x2650 kernel/exit.c:828
[ 315.808015] [<ffffffff814139ae>] do_group_exit+0x14e/0x420 kernel/exit.c:932
[ 315.808015] [<ffffffff81442b43>] get_signal+0x663/0x1880 kernel/signal.c:2307
[ 315.808015] [<ffffffff81239b45>] do_signal+0xc5/0x2190 arch/x86/kernel/signal.c:807
[ 315.808015] [<ffffffff8100666a>] exit_to_usermode_loop+0x1ea/0x2d0 arch/x86/entry/common.c:156
[ 315.808015] [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
[ 315.808015] [<ffffffff81009693>] syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
[ 315.808015] [<ffffffff881a6026>] entry_SYSCALL_64_fastpath+0xc4/0xc6

[ 315.808015] [< inline >] check_prev_add kernel/locking/lockdep.c:1828
[ 315.808015] [<ffffffff8156309b>] check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
[ 315.808015] [< inline >] validate_chain kernel/locking/lockdep.c:2265
[ 315.808015] [<ffffffff81569576>] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 315.808015] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
[ 315.808015] [< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[ 315.808015] [<ffffffff88195bcf>] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 315.808015] [<ffffffff86b4682c>] rtnl_lock+0x1c/0x20 net/core/rtnetlink.c:70
[ 315.808015] [<ffffffff87b5cdf9>] nl80211_pre_doit+0x309/0x5b0 net/wireless/nl80211.c:11750
[ 315.808015] [<ffffffff86cc1cd0>] genl_family_rcv_msg+0x780/0x1070 net/netlink/genetlink.c:631
[ 315.808015] [<ffffffff86cc2770>] genl_rcv_msg+0x1b0/0x260 net/netlink/genetlink.c:660
[ 315.808015] [<ffffffff86cc034c>] netlink_rcv_skb+0x2bc/0x3a0 net/netlink/af_netlink.c:2298
[ 315.808015] [<ffffffff86cc153d>] genl_rcv+0x2d/0x40 net/netlink/genetlink.c:671
[ 315.808015] [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1231
[ 315.808015] [<ffffffff86cbeb6a>] netlink_unicast+0x51a/0x740 net/netlink/af_netlink.c:1257
[ 315.808015] [<ffffffff86cbf834>] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1803
[ 315.808015] [< inline >] sock_sendmsg_nosec net/socket.c:621
[ 315.808015] [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110 net/socket.c:631
[ 315.808015] [<ffffffff86a764fb>] sock_write_iter+0x32b/0x620 net/socket.c:829
[ 315.808015] [<ffffffff81a6f9a3>] do_iter_readv_writev+0x363/0x670 fs/read_write.c:695
[ 315.808015] [<ffffffff81a723f1>] do_readv_writev+0x431/0x9b0 fs/read_write.c:872
[ 315.808015] [<ffffffff81a72f2c>] vfs_writev+0x8c/0xc0 fs/read_write.c:911
[ 315.808015] [<ffffffff81a73075>] do_writev+0x115/0x2d0 fs/read_write.c:944
[ 315.808015] [< inline >] SYSC_writev fs/read_write.c:1017
[ 315.808015] [<ffffffff81a7682c>] SyS_writev+0x2c/0x40 fs/read_write.c:1014
[ 315.808015] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

other info that might help us debug this:

Chain exists of:
Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(genl_mutex);
                               lock(nlk->cb_mutex);
                               lock(genl_mutex);
  lock(rtnl_mutex);

*** DEADLOCK ***

2 locks held by syz-executor7/18794:
#0: (cb_lock){++++++}, at: [<ffffffff86cc152e>] genl_rcv+0x1e/0x40
net/netlink/genetlink.c:670
#1: (genl_mutex){+.+.+.}, at: [< inline >] genl_lock
net/netlink/genetlink.c:31
#1: (genl_mutex){+.+.+.}, at: [<ffffffff86cc27c9>]
genl_rcv_msg+0x209/0x260 net/netlink/genetlink.c:658

stack backtrace:
CPU: 0 PID: 18794 Comm: syz-executor7 Not tainted 4.9.0-rc8+ #77
Hardware name: Google Google/Google, BIOS Google 01/01/2011
ffff88004add6468 ffffffff834c44f9 ffffffff00000000 1ffff100095bac20
ffffed00095bac18 0000000041b58ab3 ffffffff895816f0 ffffffff834c420b
0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff834c44f9>] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
[<ffffffff81560cb0>] print_circular_bug+0x310/0x3c0
kernel/locking/lockdep.c:1202
[< inline >] check_prev_add kernel/locking/lockdep.c:1828
[<ffffffff8156309b>] check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
[< inline >] validate_chain kernel/locking/lockdep.c:2265
[<ffffffff81569576>] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[<ffffffff8156b672>] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
[< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[<ffffffff88195bcf>] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[<ffffffff86b4682c>] rtnl_lock+0x1c/0x20 net/core/rtnetlink.c:70
[<ffffffff87b5cdf9>] nl80211_pre_doit+0x309/0x5b0 net/wireless/nl80211.c:11750
[<ffffffff86cc1cd0>] genl_family_rcv_msg+0x780/0x1070
net/netlink/genetlink.c:631
[<ffffffff86cc2770>] genl_rcv_msg+0x1b0/0x260 net/netlink/genetlink.c:660
[<ffffffff86cc034c>] netlink_rcv_skb+0x2bc/0x3a0 net/netlink/af_netlink.c:2298
[<ffffffff86cc153d>] genl_rcv+0x2d/0x40 net/netlink/genetlink.c:671
[< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1231
[<ffffffff86cbeb6a>] netlink_unicast+0x51a/0x740 net/netlink/af_netlink.c:1257
[<ffffffff86cbf834>] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1803
[< inline >] sock_sendmsg_nosec net/socket.c:621
[<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110 net/socket.c:631
[<ffffffff86a764fb>] sock_write_iter+0x32b/0x620 net/socket.c:829
[<ffffffff81a6f9a3>] do_iter_readv_writev+0x363/0x670 fs/read_write.c:695
[<ffffffff81a723f1>] do_readv_writev+0x431/0x9b0 fs/read_write.c:872
[<ffffffff81a72f2c>] vfs_writev+0x8c/0xc0 fs/read_write.c:911
[<ffffffff81a73075>] do_writev+0x115/0x2d0 fs/read_write.c:944
[< inline >] SYSC_writev fs/read_write.c:1017
[<ffffffff81a7682c>] SyS_writev+0x2c/0x40 fs/read_write.c:1014
[<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

2016-12-08 18:02:43

by Dmitry Vyukov

Subject: Re: net: deadlock on genl_mutex

On Thu, Dec 8, 2016 at 6:16 PM, Dmitry Vyukov wrote:
> On Thu, Dec 8, 2016 at 5:16 PM, Dmitry Vyukov wrote:
>> On Tue, Nov 29, 2016 at 6:59 AM, <[email protected]> wrote:
>>>>
>>>> Issue was reported yesterday and is under investigation.
>>>>
>>>>
>>>> http://marc.info/?l=linux-netdev&m=148014004331663&w=2
>>>>
>>>>
>>>> Thanks !
>>>
>>>
>>> Hi Dmitry
>>>
>>> Can you try the patch below with your reproducer? I haven't seen similar
>>> crashes reported after this (or even with Eric's patch).
>>
>> I've synced to 318c8932ddec5c1c26a4af0f3c053784841c598e (Dec 7) and do
>> _not_ see this report happening anymore.
>> Thanks.
>
>
> But now I am seeing "possible deadlock" warnings involving genl_lock:
>
> [ INFO: possible circular locking dependency detected ]
> 4.9.0-rc8+ #77 Not tainted
> -------------------------------------------------------
> syz-executor7/18794 is trying to acquire lock:
> (rtnl_mutex){+.+.+.}, at: [<ffffffff86b4682c>] rtnl_lock+0x1c/0x20
> net/core/rtnetlink.c:70
> but task is already holding lock:
> (genl_mutex){+.+.+.}, at: [< inline >] genl_lock
> net/netlink/genetlink.c:31
> (genl_mutex){+.+.+.}, at: [<ffffffff86cc27c9>]
> genl_rcv_msg+0x209/0x260 net/netlink/genetlink.c:658
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> [ 315.403815] [< inline >] validate_chain
> kernel/locking/lockdep.c:2265
> [ 315.403815] [<ffffffff81569576>]
> __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
> [ 315.403815] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
> kernel/locking/lockdep.c:3749
> [ 315.403815] [< inline >] __mutex_lock_common
> kernel/locking/mutex.c:521
> [ 315.403815] [<ffffffff88195bcf>]
> mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
> [ 315.403815] [< inline >] genl_lock net/netlink/genetlink.c:31
> [ 315.403815] [<ffffffff86cc0c26>] genl_lock_dumpit+0x46/0xa0
> net/netlink/genetlink.c:518
> [ 315.403815] [<ffffffff86cb33ac>] netlink_dump+0x57c/0xd70
> net/netlink/af_netlink.c:2127
> [ 315.403815] [<ffffffff86cb7b6a>]
> __netlink_dump_start+0x4ea/0x760 net/netlink/af_netlink.c:2217
> [ 315.403815] [<ffffffff86cc2319>]
> genl_family_rcv_msg+0xdc9/0x1070 net/netlink/genetlink.c:586
> [ 315.403815] [<ffffffff86cc2770>] genl_rcv_msg+0x1b0/0x260
> net/netlink/genetlink.c:660
> [ 315.403815] [<ffffffff86cc034c>] netlink_rcv_skb+0x2bc/0x3a0
> net/netlink/af_netlink.c:2298
> [ 315.403815] [<ffffffff86cc153d>] genl_rcv+0x2d/0x40
> net/netlink/genetlink.c:671
> [ 315.403815] [< inline >] netlink_unicast_kernel
> net/netlink/af_netlink.c:1231
> [ 315.403815] [<ffffffff86cbeb6a>] netlink_unicast+0x51a/0x740
> net/netlink/af_netlink.c:1257
> [ 315.403815] [<ffffffff86cbf834>] netlink_sendmsg+0xaa4/0xe50
> net/netlink/af_netlink.c:1803
> [ 315.403815] [< inline >] sock_sendmsg_nosec net/socket.c:621
> [ 315.403815] [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110
> net/socket.c:631
> [ 315.403815] [<ffffffff86a764fb>] sock_write_iter+0x32b/0x620
> net/socket.c:829
> [ 315.403815] [< inline >] new_sync_write fs/read_write.c:499
> [ 315.403815] [<ffffffff81a701ae>] __vfs_write+0x4fe/0x830
> fs/read_write.c:512
> [ 315.403815] [<ffffffff81a71c55>] vfs_write+0x175/0x4e0
> fs/read_write.c:560
> [ 315.403815] [< inline >] SYSC_write fs/read_write.c:607
> [ 315.403815] [<ffffffff81a760e0>] SyS_write+0x100/0x240
> fs/read_write.c:599
> [ 315.403815] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6
>
> [ 315.403815] [< inline >] validate_chain
> kernel/locking/lockdep.c:2265
> [ 315.403815] [<ffffffff81569576>]
> __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
> [ 315.403815] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
> kernel/locking/lockdep.c:3749
> [ 315.403815] [< inline >] __mutex_lock_common
> kernel/locking/mutex.c:521
> [ 315.403815] [<ffffffff88195bcf>]
> mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
> [ 315.403815] [<ffffffff86cb7779>]
> __netlink_dump_start+0xf9/0x760 net/netlink/af_netlink.c:2187
> [ 315.403815] [< inline >] netlink_dump_start
> include/linux/netlink.h:165
> [ 315.403815] [<ffffffff86d14d48>]
> ctnetlink_stat_ct_cpu+0x198/0x1e0
> net/netfilter/nf_conntrack_netlink.c:2045
> [ 315.403815] [<ffffffff86cd313e>]
> nfnetlink_rcv_msg+0x9be/0xd60 net/netfilter/nfnetlink.c:212
> [ 315.403815] [<ffffffff86cc034c>] netlink_rcv_skb+0x2bc/0x3a0
> net/netlink/af_netlink.c:2298
> [ 315.403815] [<ffffffff86cd1b71>] nfnetlink_rcv+0x7e1/0x10d0
> net/netfilter/nfnetlink.c:474
> [ 315.403815] [< inline >] netlink_unicast_kernel
> net/netlink/af_netlink.c:1231
> [ 315.403815] [<ffffffff86cbeb6a>] netlink_unicast+0x51a/0x740
> net/netlink/af_netlink.c:1257
> [ 315.403815] [<ffffffff86cbf834>] netlink_sendmsg+0xaa4/0xe50
> net/netlink/af_netlink.c:1803
> [ 315.403815] [< inline >] sock_sendmsg_nosec net/socket.c:621
> [ 315.403815] [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110
> net/socket.c:631
> [ 315.403815] [<ffffffff86a764fb>] sock_write_iter+0x32b/0x620
> net/socket.c:829
> [ 315.403815] [< inline >] new_sync_write fs/read_write.c:499
> [ 315.403815] [<ffffffff81a701ae>] __vfs_write+0x4fe/0x830
> fs/read_write.c:512
> [ 315.403815] [<ffffffff81a71c55>] vfs_write+0x175/0x4e0
> fs/read_write.c:560
> [ 315.403815] [< inline >] SYSC_write fs/read_write.c:607
> [ 315.403815] [<ffffffff81a760e0>] SyS_write+0x100/0x240
> fs/read_write.c:599
> [ 315.403815] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6
>
> [ 315.403815] [< inline >] validate_chain
> kernel/locking/lockdep.c:2265
> [ 315.403815] [<ffffffff81569576>]
> __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
> [ 315.403815] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
> kernel/locking/lockdep.c:3749
> [ 315.403815] [< inline >] __mutex_lock_common
> kernel/locking/mutex.c:521
> [ 315.403815] [<ffffffff88195bcf>]
> mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
> [ 315.403815] [<ffffffff86cd083d>] nfnl_lock+0x2d/0x30
> net/netfilter/nfnetlink.c:61
> [ 315.403815] [<ffffffff86d7c5b1>]
> nf_tables_netdev_event+0x1f1/0x720
> net/netfilter/nf_tables_netdev.c:122
> [ 315.403815] [<ffffffff8149095a>]
> notifier_call_chain+0x14a/0x2f0 kernel/notifier.c:93
> [ 315.403815] [< inline >] __raw_notifier_call_chain
> kernel/notifier.c:394
> [ 315.403815] [<ffffffff81490b82>]
> raw_notifier_call_chain+0x32/0x40 kernel/notifier.c:401
> [ 315.403815] [<ffffffff86ae4af6>]
> call_netdevice_notifiers_info+0x56/0x90 net/core/dev.c:1645
> [ 315.403815] [< inline >] call_netdevice_notifiers
> net/core/dev.c:1661
> [ 315.403815] [<ffffffff86af898d>]
> rollback_registered_many+0x73d/0xba0 net/core/dev.c:6759
> [ 315.403815] [<ffffffff86af8e9e>]
> rollback_registered+0xae/0x100 net/core/dev.c:6800
> [ 315.403815] [<ffffffff86af8f76>]
> unregister_netdevice_queue+0x86/0x140 net/core/dev.c:7787
> [ 315.403815] [< inline >] unregister_netdevice
> include/linux/netdevice.h:2455
> [ 315.403815] [<ffffffff84912be6>] __tun_detach+0xc66/0xea0
> drivers/net/tun.c:567
> [ 315.808015] [< inline >] tun_detach drivers/net/tun.c:578
> [ 315.808015] [<ffffffff84912e69>] tun_chr_close+0x49/0x60
> drivers/net/tun.c:2350
> [ 315.808015] [<ffffffff81a77f7e>] __fput+0x34e/0x910
> fs/file_table.c:208
> [ 315.808015] [<ffffffff81a785ca>] ____fput+0x1a/0x20
> fs/file_table.c:244
> [ 315.808015] [<ffffffff81483c20>] task_work_run+0x1a0/0x280
> kernel/task_work.c:116
> [ 315.808015] [< inline >] exit_task_work
> include/linux/task_work.h:21
> [ 315.808015] [<ffffffff814129e2>] do_exit+0x1842/0x2650
> kernel/exit.c:828
> [ 315.808015] [<ffffffff814139ae>] do_group_exit+0x14e/0x420
> kernel/exit.c:932
> [ 315.808015] [<ffffffff81442b43>] get_signal+0x663/0x1880
> kernel/signal.c:2307
> [ 315.808015] [<ffffffff81239b45>] do_signal+0xc5/0x2190
> arch/x86/kernel/signal.c:807
> [ 315.808015] [<ffffffff8100666a>]
> exit_to_usermode_loop+0x1ea/0x2d0 arch/x86/entry/common.c:156
> [ 315.808015] [< inline >] prepare_exit_to_usermode
> arch/x86/entry/common.c:190
> [ 315.808015] [<ffffffff81009693>]
> syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
> [ 315.808015] [<ffffffff881a6026>] entry_SYSCALL_64_fastpath+0xc4/0xc6
>
> [ 315.808015] [< inline >] check_prev_add
> kernel/locking/lockdep.c:1828
> [ 315.808015] [<ffffffff8156309b>]
> check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
> [ 315.808015] [< inline >] validate_chain
> kernel/locking/lockdep.c:2265
> [ 315.808015] [<ffffffff81569576>]
> __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
> [ 315.808015] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
> kernel/locking/lockdep.c:3749
> [ 315.808015] [< inline >] __mutex_lock_common
> kernel/locking/mutex.c:521
> [ 315.808015] [<ffffffff88195bcf>]
> mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
> [ 315.808015] [<ffffffff86b4682c>] rtnl_lock+0x1c/0x20
> net/core/rtnetlink.c:70
> [ 315.808015] [<ffffffff87b5cdf9>]
> nl80211_pre_doit+0x309/0x5b0 net/wireless/nl80211.c:11750
> [ 315.808015] [<ffffffff86cc1cd0>]
> genl_family_rcv_msg+0x780/0x1070 net/netlink/genetlink.c:631
> [ 315.808015] [<ffffffff86cc2770>] genl_rcv_msg+0x1b0/0x260
> net/netlink/genetlink.c:660
> [ 315.808015] [<ffffffff86cc034c>] netlink_rcv_skb+0x2bc/0x3a0
> net/netlink/af_netlink.c:2298
> [ 315.808015] [<ffffffff86cc153d>] genl_rcv+0x2d/0x40
> net/netlink/genetlink.c:671
> [ 315.808015] [< inline >] netlink_unicast_kernel
> net/netlink/af_netlink.c:1231
> [ 315.808015] [<ffffffff86cbeb6a>] netlink_unicast+0x51a/0x740
> net/netlink/af_netlink.c:1257
> [ 315.808015] [<ffffffff86cbf834>] netlink_sendmsg+0xaa4/0xe50
> net/netlink/af_netlink.c:1803
> [ 315.808015] [< inline >] sock_sendmsg_nosec net/socket.c:621
> [ 315.808015] [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110
> net/socket.c:631
> [ 315.808015] [<ffffffff86a764fb>] sock_write_iter+0x32b/0x620
> net/socket.c:829
> [ 315.808015] [<ffffffff81a6f9a3>]
> do_iter_readv_writev+0x363/0x670 fs/read_write.c:695
> [ 315.808015] [<ffffffff81a723f1>] do_readv_writev+0x431/0x9b0
> fs/read_write.c:872
> [ 315.808015] [<ffffffff81a72f2c>] vfs_writev+0x8c/0xc0
> fs/read_write.c:911
> [ 315.808015] [<ffffffff81a73075>] do_writev+0x115/0x2d0
> fs/read_write.c:944
> [ 315.808015] [< inline >] SYSC_writev fs/read_write.c:1017
> [ 315.808015] [<ffffffff81a7682c>] SyS_writev+0x2c/0x40
> fs/read_write.c:1014
> [ 315.808015] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6
>
> other info that might help us debug this:
>
> Chain exists of:
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(genl_mutex);
>                                lock(nlk->cb_mutex);
>                                lock(genl_mutex);
>   lock(rtnl_mutex);
>
> *** DEADLOCK ***
>
> 2 locks held by syz-executor7/18794:
> #0: (cb_lock){++++++}, at: [<ffffffff86cc152e>] genl_rcv+0x1e/0x40
> net/netlink/genetlink.c:670
> #1: (genl_mutex){+.+.+.}, at: [< inline >] genl_lock
> net/netlink/genetlink.c:31
> #1: (genl_mutex){+.+.+.}, at: [<ffffffff86cc27c9>]
> genl_rcv_msg+0x209/0x260 net/netlink/genetlink.c:658
>
> stack backtrace:
> CPU: 0 PID: 18794 Comm: syz-executor7 Not tainted 4.9.0-rc8+ #77
> Hardware name: Google Google/Google, BIOS Google 01/01/2011
> ffff88004add6468 ffffffff834c44f9 ffffffff00000000 1ffff100095bac20
> ffffed00095bac18 0000000041b58ab3 ffffffff895816f0 ffffffff834c420b
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> Call Trace:
> [< inline >] __dump_stack lib/dump_stack.c:15
> [<ffffffff834c44f9>] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
> [<ffffffff81560cb0>] print_circular_bug+0x310/0x3c0
> kernel/locking/lockdep.c:1202
> [< inline >] check_prev_add kernel/locking/lockdep.c:1828
> [<ffffffff8156309b>] check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
> [< inline >] validate_chain kernel/locking/lockdep.c:2265
> [<ffffffff81569576>] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
> [<ffffffff8156b672>] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
> [< inline >] __mutex_lock_common kernel/locking/mutex.c:521
> [<ffffffff88195bcf>] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
> [<ffffffff86b4682c>] rtnl_lock+0x1c/0x20 net/core/rtnetlink.c:70
> [<ffffffff87b5cdf9>] nl80211_pre_doit+0x309/0x5b0 net/wireless/nl80211.c:11750
> [<ffffffff86cc1cd0>] genl_family_rcv_msg+0x780/0x1070
> net/netlink/genetlink.c:631
> [<ffffffff86cc2770>] genl_rcv_msg+0x1b0/0x260 net/netlink/genetlink.c:660
> [<ffffffff86cc034c>] netlink_rcv_skb+0x2bc/0x3a0 net/netlink/af_netlink.c:2298
> [<ffffffff86cc153d>] genl_rcv+0x2d/0x40 net/netlink/genetlink.c:671
> [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1231
> [<ffffffff86cbeb6a>] netlink_unicast+0x51a/0x740 net/netlink/af_netlink.c:1257
> [<ffffffff86cbf834>] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1803
> [< inline >] sock_sendmsg_nosec net/socket.c:621
> [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110 net/socket.c:631
> [<ffffffff86a764fb>] sock_write_iter+0x32b/0x620 net/socket.c:829
> [<ffffffff81a6f9a3>] do_iter_readv_writev+0x363/0x670 fs/read_write.c:695
> [<ffffffff81a723f1>] do_readv_writev+0x431/0x9b0 fs/read_write.c:872
> [<ffffffff81a72f2c>] vfs_writev+0x8c/0xc0 fs/read_write.c:911
> [<ffffffff81a73075>] do_writev+0x115/0x2d0 fs/read_write.c:944
> [< inline >] SYSC_writev fs/read_write.c:1017
> [<ffffffff81a7682c>] SyS_writev+0x2c/0x40 fs/read_write.c:1014
> [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6



Probably a related one:

[ INFO: possible circular locking dependency detected ]
4.9.0-rc8+ #77 Not tainted
-------------------------------------------------------
syz-executor5/5777 is trying to acquire lock:
(genl_mutex){+.+.+.}, at: [< inline >] genl_lock
net/netlink/genetlink.c:31
(genl_mutex){+.+.+.}, at: [<ffffffff86cc0c26>]
genl_lock_dumpit+0x46/0xa0 net/netlink/genetlink.c:518
but task is already holding lock:
(nlk->cb_mutex){+.+.+.}, at: [<ffffffff86cb2f08>]
netlink_dump+0xd8/0xd70 net/netlink/af_netlink.c:2084
which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

[ 158.966653] [< inline >] validate_chain
kernel/locking/lockdep.c:2265
[ 158.966653] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 158.966653] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
[ 158.966653] [< inline >] __mutex_lock_common
kernel/locking/mutex.c:521
[ 158.966653] [<ffffffff88195bcf>]
mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 158.966653] [<ffffffff86cb7779>]
__netlink_dump_start+0xf9/0x760 net/netlink/af_netlink.c:2187
[ 158.966653] [< inline >] netlink_dump_start
include/linux/netlink.h:165
[ 158.966653] [<ffffffff86d1395f>]
ctnetlink_get_ct_unconfirmed+0x17f/0x220
net/netfilter/nf_conntrack_netlink.c:1369
[ 158.966653] [<ffffffff86cd313e>]
nfnetlink_rcv_msg+0x9be/0xd60 net/netfilter/nfnetlink.c:212
[ 158.966653] [<ffffffff86cc034c>] netlink_rcv_skb+0x2bc/0x3a0
net/netlink/af_netlink.c:2298
[ 158.966653] [<ffffffff86cd1b71>] nfnetlink_rcv+0x7e1/0x10d0
net/netfilter/nfnetlink.c:474
[ 158.966653] [< inline >] netlink_unicast_kernel
net/netlink/af_netlink.c:1231
[ 158.966653] [<ffffffff86cbeb6a>] netlink_unicast+0x51a/0x740
net/netlink/af_netlink.c:1257
[ 158.966653] [<ffffffff86cbf834>] netlink_sendmsg+0xaa4/0xe50
net/netlink/af_netlink.c:1803
[ 158.966653] [< inline >] sock_sendmsg_nosec net/socket.c:621
[ 158.966653] [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110
net/socket.c:631
[ 158.966653] [<ffffffff86a764fb>] sock_write_iter+0x32b/0x620
net/socket.c:829
[ 158.966653] [< inline >] new_sync_write fs/read_write.c:499
[ 158.966653] [<ffffffff81a701ae>] __vfs_write+0x4fe/0x830
fs/read_write.c:512
[ 158.966653] [<ffffffff81a71c55>] vfs_write+0x175/0x4e0
fs/read_write.c:560
[ 158.966653] [< inline >] SYSC_write fs/read_write.c:607
[ 158.966653] [<ffffffff81a760e0>] SyS_write+0x100/0x240
fs/read_write.c:599
[ 158.966653] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

[ 158.966653] [< inline >] validate_chain
kernel/locking/lockdep.c:2265
[ 158.966653] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 158.966653] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
[ 158.966653] [< inline >] __mutex_lock_common
kernel/locking/mutex.c:521
[ 158.966653] [<ffffffff88195bcf>]
mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 158.966653] [<ffffffff86cd083d>] nfnl_lock+0x2d/0x30
net/netfilter/nfnetlink.c:61
[ 158.966653] [<ffffffff86d7c5b1>]
nf_tables_netdev_event+0x1f1/0x720
net/netfilter/nf_tables_netdev.c:122
[ 158.966653] [<ffffffff8149095a>]
notifier_call_chain+0x14a/0x2f0 kernel/notifier.c:93
[ 158.966653] [< inline >] __raw_notifier_call_chain
kernel/notifier.c:394
[ 158.966653] [<ffffffff81490b82>]
raw_notifier_call_chain+0x32/0x40 kernel/notifier.c:401
[ 158.966653] [<ffffffff86ae4af6>]
call_netdevice_notifiers_info+0x56/0x90 net/core/dev.c:1645
[ 158.966653] [< inline >] call_netdevice_notifiers
net/core/dev.c:1661
[ 158.966653] [<ffffffff86af898d>]
rollback_registered_many+0x73d/0xba0 net/core/dev.c:6759
[ 158.966653] [<ffffffff86af8e9e>]
rollback_registered+0xae/0x100 net/core/dev.c:6800
[ 158.966653] [<ffffffff86af8f76>]
unregister_netdevice_queue+0x86/0x140 net/core/dev.c:7787
[ 158.966653] [< inline >] unregister_netdevice
include/linux/netdevice.h:2455
[ 158.966653] [<ffffffff84912be6>] __tun_detach+0xc66/0xea0
drivers/net/tun.c:567
[ 158.966653] [< inline >] tun_detach drivers/net/tun.c:578
[ 158.966653] [<ffffffff84912e69>] tun_chr_close+0x49/0x60
drivers/net/tun.c:2350
[ 158.966653] [<ffffffff81a77f7e>] __fput+0x34e/0x910
fs/file_table.c:208
[ 158.966653] [<ffffffff81a785ca>] ____fput+0x1a/0x20
fs/file_table.c:244
[ 158.966653] [<ffffffff81483c20>] task_work_run+0x1a0/0x280
kernel/task_work.c:116
[ 158.966653] [< inline >] exit_task_work
include/linux/task_work.h:21
[ 158.966653] [<ffffffff814129e2>] do_exit+0x1842/0x2650
kernel/exit.c:828
[ 158.966653] [<ffffffff814139ae>] do_group_exit+0x14e/0x420
kernel/exit.c:932
[ 159.308048] [<ffffffff81442b43>] get_signal+0x663/0x1880
kernel/signal.c:2307
[ 159.308048] [<ffffffff81239b45>] do_signal+0xc5/0x2190
arch/x86/kernel/signal.c:807
[ 159.308048] [<ffffffff8100666a>]
exit_to_usermode_loop+0x1ea/0x2d0 arch/x86/entry/common.c:156
[ 159.308048] [< inline >] prepare_exit_to_usermode
arch/x86/entry/common.c:190
[ 159.308048] [<ffffffff81009693>]
syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
[ 159.308048] [<ffffffff881a6026>] entry_SYSCALL_64_fastpath+0xc4/0xc6

[ 159.308048] [< inline >] validate_chain
kernel/locking/lockdep.c:2265
[ 159.308048] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 159.308048] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
[ 159.308048] [< inline >] __mutex_lock_common
kernel/locking/mutex.c:521
[ 159.308048] [<ffffffff88195bcf>]
mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 159.308048] [<ffffffff86b4682c>] rtnl_lock+0x1c/0x20
net/core/rtnetlink.c:70
[ 159.308048] [<ffffffff87b5cdf9>]
nl80211_pre_doit+0x309/0x5b0 net/wireless/nl80211.c:11750
[ 159.308048] [<ffffffff86cc1cd0>]
genl_family_rcv_msg+0x780/0x1070 net/netlink/genetlink.c:631
[ 159.308048] [<ffffffff86cc2770>] genl_rcv_msg+0x1b0/0x260
net/netlink/genetlink.c:660
[ 159.308048] [<ffffffff86cc034c>] netlink_rcv_skb+0x2bc/0x3a0
net/netlink/af_netlink.c:2298
[ 159.308048] [<ffffffff86cc153d>] genl_rcv+0x2d/0x40
net/netlink/genetlink.c:671
[ 159.308048] [< inline >] netlink_unicast_kernel
net/netlink/af_netlink.c:1231
[ 159.308048] [<ffffffff86cbeb6a>] netlink_unicast+0x51a/0x740
net/netlink/af_netlink.c:1257
[ 159.308048] [<ffffffff86cbf834>] netlink_sendmsg+0xaa4/0xe50
net/netlink/af_netlink.c:1803
[ 159.308048] [< inline >] sock_sendmsg_nosec net/socket.c:621
[ 159.308048] [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110
net/socket.c:631
[ 159.308048] [<ffffffff86a764fb>] sock_write_iter+0x32b/0x620
net/socket.c:829
[ 159.308048] [<ffffffff81a6f9a3>]
do_iter_readv_writev+0x363/0x670 fs/read_write.c:695
[ 159.308048] [<ffffffff81a723f1>] do_readv_writev+0x431/0x9b0
fs/read_write.c:872
[ 159.308048] [<ffffffff81a72f2c>] vfs_writev+0x8c/0xc0
fs/read_write.c:911
[ 159.308048] [<ffffffff81a73075>] do_writev+0x115/0x2d0
fs/read_write.c:944
[ 159.308048] [< inline >] SYSC_writev fs/read_write.c:1017
[ 159.308048] [<ffffffff81a7682c>] SyS_writev+0x2c/0x40
fs/read_write.c:1014
[ 159.308048] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

[ 159.308048] [< inline >] check_prev_add
kernel/locking/lockdep.c:1828
[ 159.308048] [<ffffffff8156309b>]
check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
[ 159.308048] [< inline >] validate_chain
kernel/locking/lockdep.c:2265
[ 159.308048] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 159.308048] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
[ 159.308048] [< inline >] __mutex_lock_common
kernel/locking/mutex.c:521
[ 159.308048] [<ffffffff88195bcf>]
mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 159.308048] [< inline >] genl_lock net/netlink/genetlink.c:31
[ 159.308048] [<ffffffff86cc0c26>] genl_lock_dumpit+0x46/0xa0
net/netlink/genetlink.c:518
[ 159.308048] [<ffffffff86cb33ac>] netlink_dump+0x57c/0xd70
net/netlink/af_netlink.c:2127
[ 159.308048] [<ffffffff86cb7b6a>]
__netlink_dump_start+0x4ea/0x760 net/netlink/af_netlink.c:2217
[ 159.308048] [<ffffffff86cc2319>]
genl_family_rcv_msg+0xdc9/0x1070 net/netlink/genetlink.c:586
[ 159.308048] [<ffffffff86cc2770>] genl_rcv_msg+0x1b0/0x260
net/netlink/genetlink.c:660
[ 159.308048] [<ffffffff86cc034c>] netlink_rcv_skb+0x2bc/0x3a0
net/netlink/af_netlink.c:2298
[ 159.308048] [<ffffffff86cc153d>] genl_rcv+0x2d/0x40
net/netlink/genetlink.c:671
[ 159.308048] [< inline >] netlink_unicast_kernel
net/netlink/af_netlink.c:1231
[ 159.308048] [<ffffffff86cbeb6a>] netlink_unicast+0x51a/0x740
net/netlink/af_netlink.c:1257
[ 159.308048] [<ffffffff86cbf834>] netlink_sendmsg+0xaa4/0xe50
net/netlink/af_netlink.c:1803
[ 159.308048] [< inline >] sock_sendmsg_nosec net/socket.c:621
[ 159.308048] [<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110
net/socket.c:631
[ 159.308048] [<ffffffff86a764fb>] sock_write_iter+0x32b/0x620
net/socket.c:829
[ 159.308048] [< inline >] new_sync_write fs/read_write.c:499
[ 159.308048] [<ffffffff81a701ae>] __vfs_write+0x4fe/0x830
fs/read_write.c:512
[ 159.308048] [<ffffffff81a71c55>] vfs_write+0x175/0x4e0
fs/read_write.c:560
[ 159.308048] [< inline >] SYSC_write fs/read_write.c:607
[ 159.308048] [<ffffffff81a760e0>] SyS_write+0x100/0x240
fs/read_write.c:599
[ 159.308048] [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

other info that might help us debug this:

Chain exists of:
Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(nlk->cb_mutex);
                               lock(&table[i].mutex);
                               lock(nlk->cb_mutex);
  lock(genl_mutex);

*** DEADLOCK ***

2 locks held by syz-executor5/5777:
#0: (cb_lock){++++++}, at: [<ffffffff86cc152e>] genl_rcv+0x1e/0x40
net/netlink/genetlink.c:670
#1: (nlk->cb_mutex){+.+.+.}, at: [<ffffffff86cb2f08>]
netlink_dump+0xd8/0xd70 net/netlink/af_netlink.c:2084

stack backtrace:
CPU: 1 PID: 5777 Comm: syz-executor5 Not tainted 4.9.0-rc8+ #77
Hardware name: Google Google/Google, BIOS Google 01/01/2011
ffff88005fe363e8 ffffffff834c44f9 ffffffff00000001 1ffff1000bfc6c10
ffffed000bfc6c08 0000000041b58ab3 ffffffff895816f0 ffffffff834c420b
0000000000000000 0000000000000000 0000000000000000 dffffc0000000000
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff834c44f9>] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
[<ffffffff81560cb0>] print_circular_bug+0x310/0x3c0
kernel/locking/lockdep.c:1202
[< inline >] check_prev_add kernel/locking/lockdep.c:1828
[<ffffffff8156309b>] check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
[< inline >] validate_chain kernel/locking/lockdep.c:2265
[<ffffffff81569576>] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[<ffffffff8156b672>] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
[< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[<ffffffff88195bcf>] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[< inline >] genl_lock net/netlink/genetlink.c:31
[<ffffffff86cc0c26>] genl_lock_dumpit+0x46/0xa0 net/netlink/genetlink.c:518
[<ffffffff86cb33ac>] netlink_dump+0x57c/0xd70 net/netlink/af_netlink.c:2127
[<ffffffff86cb7b6a>] __netlink_dump_start+0x4ea/0x760
net/netlink/af_netlink.c:2217
[<ffffffff86cc2319>] genl_family_rcv_msg+0xdc9/0x1070
net/netlink/genetlink.c:586
[<ffffffff86cc2770>] genl_rcv_msg+0x1b0/0x260 net/netlink/genetlink.c:660
[<ffffffff86cc034c>] netlink_rcv_skb+0x2bc/0x3a0 net/netlink/af_netlink.c:2298
[<ffffffff86cc153d>] genl_rcv+0x2d/0x40 net/netlink/genetlink.c:671
[< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1231
[<ffffffff86cbeb6a>] netlink_unicast+0x51a/0x740 net/netlink/af_netlink.c:1257
[<ffffffff86cbf834>] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1803
[< inline >] sock_sendmsg_nosec net/socket.c:621
[<ffffffff86a7618f>] sock_sendmsg+0xcf/0x110 net/socket.c:631
[<ffffffff86a764fb>] sock_write_iter+0x32b/0x620 net/socket.c:829
[< inline >] new_sync_write fs/read_write.c:499
[<ffffffff81a701ae>] __vfs_write+0x4fe/0x830 fs/read_write.c:512
[<ffffffff81a71c55>] vfs_write+0x175/0x4e0 fs/read_write.c:560
[< inline >] SYSC_write fs/read_write.c:607
[<ffffffff81a760e0>] SyS_write+0x100/0x240 fs/read_write.c:599
[<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6

2016-12-09 00:14:13

by Cong Wang

Subject: Re: net: deadlock on genl_mutex

On Thu, Dec 8, 2016 at 10:02 AM, Dmitry Vyukov wrote:
> Chain exists of:
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(nlk->cb_mutex);
>                                lock(&table[i].mutex);
>                                lock(nlk->cb_mutex);
>   lock(genl_mutex);

Similar to the unix bindlock report, this one looks like a false positive to me too.

2016-12-09 00:32:33

by Cong Wang

Subject: Re: net: deadlock on genl_mutex

On Thu, Dec 8, 2016 at 9:16 AM, Dmitry Vyukov wrote:
> Chain exists of:
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(genl_mutex);
>                                lock(nlk->cb_mutex);
>                                lock(genl_mutex);
>   lock(rtnl_mutex);
>
> *** DEADLOCK ***

This one looks legitimate, because nlk->cb_mutex could be rtnl_mutex.
Let me think about it.
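
To make the scenario above concrete, here is a minimal user-space sketch (not kernel code) of the two-CPU interleaving, under the assumption stated above that nlk->cb_mutex is the same lock as rtnl_mutex. The Python locks, the barriers, and the `abba_demo` helper are all hypothetical stand-ins introduced for illustration; timeouts are used so the demo reports the stall instead of hanging.

```python
import threading

def abba_demo(timeout=0.2):
    """Two threads take two locks in opposite order.

    Returns a dict saying whether each thread obtained its second lock.
    """
    genl_mutex = threading.Lock()   # stand-in for genl_mutex
    rtnl_mutex = threading.Lock()   # stand-in for rtnl_mutex
    cb_mutex = rtnl_mutex           # assumption: nlk->cb_mutex aliases rtnl_mutex
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()      # each thread now holds its first lock
            ok = second.acquire(timeout=timeout)
            got_second[name] = ok
            both_tried_second.wait()    # keep the first lock until the peer has tried too
            if ok:
                second.release()

    # CPU0: lock(genl_mutex) then lock(rtnl_mutex)
    cpu0 = threading.Thread(target=worker, args=("CPU0", genl_mutex, rtnl_mutex))
    # CPU1: lock(nlk->cb_mutex) then lock(genl_mutex)
    cpu1 = threading.Thread(target=worker, args=("CPU1", cb_mutex, genl_mutex))
    cpu0.start()
    cpu1.start()
    cpu0.join()
    cpu1.join()
    return got_second

if __name__ == "__main__":
    print(abba_demo())
```

In this sketch neither thread gets its second lock while the other still holds its first one. With real mutexes and no timeout, that is the point where both sides would block forever.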

2016-12-09 05:09:10

by Cong Wang

Subject: Re: net: deadlock on genl_mutex

On Thu, Dec 8, 2016 at 4:32 PM, Cong Wang wrote:
> On Thu, Dec 8, 2016 at 9:16 AM, Dmitry Vyukov wrote:
>> Chain exists of:
>> Possible unsafe locking scenario:
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(genl_mutex);
>>                                lock(nlk->cb_mutex);
>>                                lock(genl_mutex);
>>   lock(rtnl_mutex);
>>
>> *** DEADLOCK ***
>
> This one looks legitimate, because nlk->cb_mutex could be rtnl_mutex.
> Let me think about it.

Never mind. Actually both reports in this thread are legitimate.

I know what happened now, the lock chain is so long, 4 locks are involved
to form a chain!!!

Let me think about how to break the chain.
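
For reference, the four-lock chain can be written down as a "held, then acquired" graph and checked for a cycle. This is only a sketch: the edges below are my reading of the lockdep stacks posted in this thread, not something stated by the tool in this form, and `find_lock_cycle` is a hypothetical helper written for illustration.

```python
def find_lock_cycle(edges):
    """Return one cycle in a 'held -> then acquired' lock-order graph, or None."""
    graph = {}
    for held, acquired in edges:
        graph.setdefault(held, []).append(acquired)
        graph.setdefault(acquired, [])

    visiting = []   # current depth-first path
    done = set()    # nodes fully explored without finding a cycle

    def dfs(node):
        if node in visiting:
            # Back edge: the path from the earlier visit to here is a cycle.
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in graph[node]:
            cycle = dfs(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = dfs(start)
        if cycle:
            return cycle
    return None

# Lock-order edges as read from the traces in this thread (an interpretation).
EDGES = [
    ("genl_mutex", "rtnl_mutex"),         # genl_rcv_msg -> nl80211_pre_doit -> rtnl_lock
    ("rtnl_mutex", "table[i].mutex"),     # unregister_netdevice -> nf_tables_netdev_event -> nfnl_lock
    ("table[i].mutex", "nlk->cb_mutex"),  # nfnetlink_rcv_msg -> __netlink_dump_start
    ("nlk->cb_mutex", "genl_mutex"),      # netlink_dump -> genl_lock_dumpit -> genl_lock
]

if __name__ == "__main__":
    print(find_lock_cycle(EDGES))
```

Dropping any one of the four edges makes `find_lock_cycle` return None, which is one way to think about where the chain could be broken.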

2016-12-11 09:41:20

by Dmitry Vyukov

Subject: Re: net: deadlock on genl_mutex

On Fri, Dec 9, 2016 at 6:08 AM, Cong Wang wrote:
> On Thu, Dec 8, 2016 at 4:32 PM, Cong Wang wrote:
>> On Thu, Dec 8, 2016 at 9:16 AM, Dmitry Vyukov wrote:
>>> Chain exists of:
>>> Possible unsafe locking scenario:
>>>
>>>        CPU0                    CPU1
>>>        ----                    ----
>>>   lock(genl_mutex);
>>>                                lock(nlk->cb_mutex);
>>>                                lock(genl_mutex);
>>>   lock(rtnl_mutex);
>>>
>>> *** DEADLOCK ***
>>
>> This one looks legitimate, because nlk->cb_mutex could be rtnl_mutex.
>> Let me think about it.
>
> Never mind. Actually both reports in this thread are legitimate.
>
> I know what happened now, the lock chain is so long, 4 locks are involved
> to form a chain!!!
>
> Let me think about how to break the chain.



This seems to be a related one, now on nfnl_lock:



[ INFO: possible circular locking dependency detected ]
4.9.0-rc8+ #82 Not tainted
-------------------------------------------------------
syz-executor3/10151 is trying to acquire lock:
(&table[i].mutex){+.+.+.}, at: [<ffffffff86c96f1d>]
nfnl_lock+0x2d/0x30 net/netfilter/nfnetlink.c:61
but task is already holding lock:
(rtnl_mutex){+.+.+.}, at: [<ffffffff86b0cf0c>] rtnl_lock+0x1c/0x20
net/core/rtnetlink.c:70
which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

[ 231.942041] [< inline >] validate_chain
kernel/locking/lockdep.c:2265
[ 231.942041] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
floppy0: disk absent or changed during operation
floppy0: disk absent or changed during operation
[ 231.950342] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
[ 231.950342] [< inline >] __mutex_lock_common
kernel/locking/mutex.c:521
[ 231.950342] [<ffffffff8815c2bf>]
mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 231.950342] [<ffffffff86b0cf0c>] rtnl_lock+0x1c/0x20
net/core/rtnetlink.c:70
[ 231.950342] [<ffffffff87b234e9>]
nl80211_pre_doit+0x309/0x5b0 net/wireless/nl80211.c:11750
[ 231.950342] [<ffffffff86c883b0>]
genl_family_rcv_msg+0x780/0x1070 net/netlink/genetlink.c:631
[ 231.950342] [<ffffffff86c88e50>] genl_rcv_msg+0x1b0/0x260
net/netlink/genetlink.c:660
[ 231.950342] [<ffffffff86c86a2c>] netlink_rcv_skb+0x2bc/0x3a0
net/netlink/af_netlink.c:2298
[ 231.950342] [<ffffffff86c87c1d>] genl_rcv+0x2d/0x40
net/netlink/genetlink.c:671
[ 231.950342] [< inline >] netlink_unicast_kernel
net/netlink/af_netlink.c:1231
[ 231.950342] [<ffffffff86c8524a>] netlink_unicast+0x51a/0x740
net/netlink/af_netlink.c:1257
[ 231.950342] [<ffffffff86c85f14>] netlink_sendmsg+0xaa4/0xe50
net/netlink/af_netlink.c:1803
[ 231.950342] [< inline >] sock_sendmsg_nosec net/socket.c:621
[ 231.950342] [<ffffffff86a3c86f>] sock_sendmsg+0xcf/0x110
net/socket.c:631
[ 231.950342] [<ffffffff86a3cbdb>] sock_write_iter+0x32b/0x620
net/socket.c:829
[ 231.950342] [< inline >] new_sync_write fs/read_write.c:499
[ 231.950342] [<ffffffff81a7021e>] __vfs_write+0x4fe/0x830
fs/read_write.c:512
[ 231.950342] [<ffffffff81a71cc5>] vfs_write+0x175/0x4e0
fs/read_write.c:560
[ 231.950342] [< inline >] SYSC_write fs/read_write.c:607
[ 231.950342] [<ffffffff81a76150>] SyS_write+0x100/0x240
fs/read_write.c:599
[ 231.950342] [<ffffffff8816c685>] entry_SYSCALL_64_fastpath+0x23/0xc6

[ 231.950342] [< inline >] validate_chain
kernel/locking/lockdep.c:2265
[ 231.950342] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 231.950342] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
[ 231.950342] [< inline >] __mutex_lock_common
kernel/locking/mutex.c:521
[ 231.950342] [<ffffffff8815c2bf>]
mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 231.950342] [< inline >] genl_lock net/netlink/genetlink.c:31
[ 231.950342] [<ffffffff86c87306>] genl_lock_dumpit+0x46/0xa0
net/netlink/genetlink.c:518
[ 231.950342] [<ffffffff86c79a8c>] netlink_dump+0x57c/0xd70
net/netlink/af_netlink.c:2127
[ 231.950342] [<ffffffff86c7e24a>]
__netlink_dump_start+0x4ea/0x760 net/netlink/af_netlink.c:2217
[ 231.950342] [<ffffffff86c889f9>]
genl_family_rcv_msg+0xdc9/0x1070 net/netlink/genetlink.c:586
[ 231.950342] [<ffffffff86c88e50>] genl_rcv_msg+0x1b0/0x260
net/netlink/genetlink.c:660
[ 231.950342] [<ffffffff86c86a2c>] netlink_rcv_skb+0x2bc/0x3a0
net/netlink/af_netlink.c:2298
[ 231.950342] [<ffffffff86c87c1d>] genl_rcv+0x2d/0x40
net/netlink/genetlink.c:671
[ 231.950342] [< inline >] netlink_unicast_kernel
net/netlink/af_netlink.c:1231
[ 231.950342] [<ffffffff86c8524a>] netlink_unicast+0x51a/0x740
net/netlink/af_netlink.c:1257
[ 231.950342] [<ffffffff86c85f14>] netlink_sendmsg+0xaa4/0xe50
net/netlink/af_netlink.c:1803
[ 231.950342] [< inline >] sock_sendmsg_nosec net/socket.c:621
[ 231.950342] [<ffffffff86a3c86f>] sock_sendmsg+0xcf/0x110
net/socket.c:631
[ 231.950342] [<ffffffff86a3cbdb>] sock_write_iter+0x32b/0x620
net/socket.c:829
[ 231.950342] [<ffffffff81a6fa13>]
do_iter_readv_writev+0x363/0x670 fs/read_write.c:695
[ 231.950342] [<ffffffff81a72461>] do_readv_writev+0x431/0x9b0
fs/read_write.c:872
[ 231.950342] [<ffffffff81a72f9c>] vfs_writev+0x8c/0xc0
fs/read_write.c:911
[ 231.950342] [<ffffffff81a730e5>] do_writev+0x115/0x2d0
fs/read_write.c:944
[ 231.950342] [< inline >] SYSC_writev fs/read_write.c:1017
[ 231.950342] [<ffffffff81a7689c>] SyS_writev+0x2c/0x40
fs/read_write.c:1014
[ 231.950342] [<ffffffff8816c685>] entry_SYSCALL_64_fastpath+0x23/0xc6

[ 231.950342] [< inline >] validate_chain
kernel/locking/lockdep.c:2265
[ 231.950342] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 231.950342] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
[ 231.950342] [< inline >] __mutex_lock_common
kernel/locking/mutex.c:521
[ 231.950342] [<ffffffff8815c2bf>]
mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 231.950342] [<ffffffff86c7de59>]
__netlink_dump_start+0xf9/0x760 net/netlink/af_netlink.c:2187
[ 231.950342] [< inline >] netlink_dump_start
include/linux/netlink.h:165
[ 231.950342] [<ffffffff86d9d964>] ip_set_dump+0x204/0x2b0
net/netfilter/ipset/ip_set_core.c:1447
[ 231.950342] [<ffffffff86c9981e>]
nfnetlink_rcv_msg+0x9be/0xd60 net/netfilter/nfnetlink.c:212
[ 231.950342] [<ffffffff86c86a2c>] netlink_rcv_skb+0x2bc/0x3a0
net/netlink/af_netlink.c:2298
[ 231.950342] [<ffffffff86c98251>] nfnetlink_rcv+0x7e1/0x10d0
net/netfilter/nfnetlink.c:474
[ 231.950342] [< inline >] netlink_unicast_kernel
net/netlink/af_netlink.c:1231
[ 231.950342] [<ffffffff86c8524a>] netlink_unicast+0x51a/0x740
net/netlink/af_netlink.c:1257
[ 231.950342] [<ffffffff86c85f14>] netlink_sendmsg+0xaa4/0xe50
net/netlink/af_netlink.c:1803
[ 231.950342] [< inline >] sock_sendmsg_nosec net/socket.c:621
[ 231.950342] [<ffffffff86a3c86f>] sock_sendmsg+0xcf/0x110
net/socket.c:631
[ 231.950342] [<ffffffff86a3cbdb>] sock_write_iter+0x32b/0x620
net/socket.c:829
[ 231.950342] [< inline >] new_sync_write fs/read_write.c:499
[ 231.950342] [<ffffffff81a7021e>] __vfs_write+0x4fe/0x830
fs/read_write.c:512
[ 231.950342] [<ffffffff81a71cc5>] vfs_write+0x175/0x4e0
fs/read_write.c:560
[ 231.950342] [< inline >] SYSC_write fs/read_write.c:607
[ 231.950342] [<ffffffff81a76150>] SyS_write+0x100/0x240
fs/read_write.c:599
[ 231.950342] [<ffffffff8816c685>] entry_SYSCALL_64_fastpath+0x23/0xc6

[ 231.950342] [< inline >] check_prev_add
kernel/locking/lockdep.c:1828
[ 231.950342] [<ffffffff8156309b>]
check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
[ 231.950342] [< inline >] validate_chain
kernel/locking/lockdep.c:2265
[ 231.950342] [<ffffffff81569576>]
__lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 231.950342] [<ffffffff8156b672>] lock_acquire+0x2a2/0x790
kernel/locking/lockdep.c:3749
[ 231.950342] [< inline >] __mutex_lock_common
kernel/locking/mutex.c:521
[ 231.950342] [<ffffffff8815c2bf>]
mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 231.950342] [<ffffffff86c96f1d>] nfnl_lock+0x2d/0x30
net/netfilter/nfnetlink.c:61
[ 231.950342] [<ffffffff86d42c91>]
nf_tables_netdev_event+0x1f1/0x720
net/netfilter/nf_tables_netdev.c:122
[ 231.950342] [<ffffffff8149095a>]
notifier_call_chain+0x14a/0x2f0 kernel/notifier.c:93
[ 231.950342] [< inline >] __raw_notifier_call_chain
kernel/notifier.c:394
[ 231.950342] [<ffffffff81490b82>]
raw_notifier_call_chain+0x32/0x40 kernel/notifier.c:401
[ 231.950342] [<ffffffff86aab1d6>]
call_netdevice_notifiers_info+0x56/0x90 net/core/dev.c:1645
[ 231.950342] [< inline >] call_netdevice_notifiers
net/core/dev.c:1661
[ 231.950342] [<ffffffff86abf06d>]
rollback_registered_many+0x73d/0xba0 net/core/dev.c:6759
[ 231.950342] [<ffffffff86abf57e>]
rollback_registered+0xae/0x100 net/core/dev.c:6800
[ 231.950342] [<ffffffff86abf656>]
unregister_netdevice_queue+0x86/0x140 net/core/dev.c:7787
[ 231.950342] [< inline >] unregister_netdevice
include/linux/netdevice.h:2455
[ 231.950342] [<ffffffff848d9296>] __tun_detach+0xc66/0xea0
drivers/net/tun.c:567
[ 231.950342] [< inline >] tun_detach drivers/net/tun.c:578
[ 231.950342] [<ffffffff848d9519>] tun_chr_close+0x49/0x60
drivers/net/tun.c:2350
[ 231.950342] [<ffffffff81a77fee>] __fput+0x34e/0x910
fs/file_table.c:208
[ 231.950342] [<ffffffff81a7863a>] ____fput+0x1a/0x20
fs/file_table.c:244
[ 231.950342] [<ffffffff81483c20>] task_work_run+0x1a0/0x280
kernel/task_work.c:116
[ 231.950342] [< inline >] exit_task_work
include/linux/task_work.h:21
[ 231.950342] [<ffffffff814129e2>] do_exit+0x1842/0x2650
kernel/exit.c:828
[ 231.950342] [<ffffffff814139ae>] do_group_exit+0x14e/0x420
kernel/exit.c:932
[ 231.950342] [<ffffffff81442b43>] get_signal+0x663/0x1880
kernel/signal.c:2307
[ 231.950342] [<ffffffff81239b45>] do_signal+0xc5/0x2190
arch/x86/kernel/signal.c:807
[ 231.950342] [<ffffffff8100666a>]
exit_to_usermode_loop+0x1ea/0x2d0 arch/x86/entry/common.c:156
[ 231.950342] [< inline >] prepare_exit_to_usermode
arch/x86/entry/common.c:190
[ 231.950342] [<ffffffff81009693>]
syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
[ 231.950342] [<ffffffff8816c726>] entry_SYSCALL_64_fastpath+0xc4/0xc6

other info that might help us debug this:

Chain exists of:
Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(rtnl_mutex);
                               lock(genl_mutex);
                               lock(rtnl_mutex);
  lock(&table[i].mutex);

*** DEADLOCK ***

1 lock held by syz-executor3/10151:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff86b0cf0c>]
rtnl_lock+0x1c/0x20 net/core/rtnetlink.c:70

stack backtrace:
CPU: 2 PID: 10151 Comm: syz-executor3 Not tainted 4.9.0-rc8+ #82
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff8800311057f8 ffffffff8348fc59 ffffffff00000002 1ffff10006220a92
ffffed0006220a8a 0000000041b58ab3 ffffffff8957cf18 ffffffff8348f96b
ffffffff894eb258 ffffffff81564970 ffffffff8b565c30 ffffffff8b8e5020
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff8348fc59>] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
[<ffffffff81560cb0>] print_circular_bug+0x310/0x3c0
kernel/locking/lockdep.c:1202
[< inline >] check_prev_add kernel/locking/lockdep.c:1828
[<ffffffff8156309b>] check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
[< inline >] validate_chain kernel/locking/lockdep.c:2265
[<ffffffff81569576>] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[<ffffffff8156b672>] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
[< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[<ffffffff8815c2bf>] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[<ffffffff86c96f1d>] nfnl_lock+0x2d/0x30 net/netfilter/nfnetlink.c:61
[<ffffffff86d42c91>] nf_tables_netdev_event+0x1f1/0x720
net/netfilter/nf_tables_netdev.c:122
[<ffffffff8149095a>] notifier_call_chain+0x14a/0x2f0 kernel/notifier.c:93
[< inline >] __raw_notifier_call_chain kernel/notifier.c:394
[<ffffffff81490b82>] raw_notifier_call_chain+0x32/0x40 kernel/notifier.c:401
[<ffffffff86aab1d6>] call_netdevice_notifiers_info+0x56/0x90
net/core/dev.c:1645
[< inline >] call_netdevice_notifiers net/core/dev.c:1661
[<ffffffff86abf06d>] rollback_registered_many+0x73d/0xba0 net/core/dev.c:6759
[<ffffffff86abf57e>] rollback_registered+0xae/0x100 net/core/dev.c:6800
[<ffffffff86abf656>] unregister_netdevice_queue+0x86/0x140 net/core/dev.c:7787
[< inline >] unregister_netdevice include/linux/netdevice.h:2455
[<ffffffff848d9296>] __tun_detach+0xc66/0xea0 drivers/net/tun.c:567
[< inline >] tun_detach drivers/net/tun.c:578
[<ffffffff848d9519>] tun_chr_close+0x49/0x60 drivers/net/tun.c:2350
[<ffffffff81a77fee>] __fput+0x34e/0x910 fs/file_table.c:208
[<ffffffff81a7863a>] ____fput+0x1a/0x20 fs/file_table.c:244
[<ffffffff81483c20>] task_work_run+0x1a0/0x280 kernel/task_work.c:116
[< inline >] exit_task_work include/linux/task_work.h:21
[<ffffffff814129e2>] do_exit+0x1842/0x2650 kernel/exit.c:828
[<ffffffff814139ae>] do_group_exit+0x14e/0x420 kernel/exit.c:932
[<ffffffff81442b43>] get_signal+0x663/0x1880 kernel/signal.c:2307
[<ffffffff81239b45>] do_signal+0xc5/0x2190 arch/x86/kernel/signal.c:807
[<ffffffff8100666a>] exit_to_usermode_loop+0x1ea/0x2d0
arch/x86/entry/common.c:156
[< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
[<ffffffff81009693>] syscall_return_slowpath+0x4d3/0x570
arch/x86/entry/common.c:259
[<ffffffff8816c726>] entry_SYSCALL_64_fastpath+0xc4/0xc6
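
For readers trying to follow the four-lock chain in the report above, here is a rough sketch of the ordering it describes, written as a small "held -> acquired" graph with a depth-first search for a cycle. This is only an illustration and not how lockdep is implemented. The edge list is my reading of the four stack groups in the dependency chain, and the names LOCK_ORDER and find_cycle are made up for this example.

```python
from typing import Dict, List, Optional

# Edges read "lock already held -> lock being acquired".
# Assumption: the holder in each edge is inferred from the call paths
# in the report, not stated explicitly by the log.
LOCK_ORDER: Dict[str, List[str]] = {
    # genl_rcv_msg -> nl80211_pre_doit -> rtnl_lock
    "genl_mutex": ["rtnl_mutex"],
    # netlink_dump -> genl_lock_dumpit -> genl_lock
    "nlk->cb_mutex": ["genl_mutex"],
    # nfnetlink_rcv_msg -> ip_set_dump -> __netlink_dump_start
    "&table[i].mutex": ["nlk->cb_mutex"],
    # unregister_netdevice -> nf_tables_netdev_event -> nfnl_lock
    "rtnl_mutex": ["&table[i].mutex"],
}


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of locks (first == last), or None."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: the ordering loops back.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in graph:
        if color.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None


if __name__ == "__main__":
    cycle = find_cycle(LOCK_ORDER)
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Dropping any one of the four edges makes find_cycle return None, which is one way to think about where the chain could be broken.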