Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932818AbcLHRQ6 (ORCPT ); Thu, 8 Dec 2016 12:16:58 -0500
Received: from mail-wm0-f44.google.com ([74.125.82.44]:36705 "EHLO
        mail-wm0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753767AbcLHRQ4 (ORCPT );
        Thu, 8 Dec 2016 12:16:56 -0500
MIME-Version: 1.0
In-Reply-To:
References: <0227d7e83cc5ac0a192d1ba0fee61413@codeaurora.org>
From: Dmitry Vyukov
Date: Thu, 8 Dec 2016 18:16:32 +0100
Message-ID:
Subject: Re: net: deadlock on genl_mutex
To: syzkaller
Cc: Eric Dumazet, David Miller, Matti Vaittinen, Tycho Andersen,
        Cong Wang, Florian Westphal, stephen hemminger, Tom Herbert,
        netdev, LKML, Richard Guy Briggs, netdev-owner@vger.kernel.org
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 14030
Lines: 302

On Thu, Dec 8, 2016 at 5:16 PM, Dmitry Vyukov wrote:
> On Tue, Nov 29, 2016 at 6:59 AM, wrote:
>>>
>>> Issue was reported yesterday and is under investigation.
>>>
>>>
>>> http://marc.info/?l=linux-netdev&m=148014004331663&w=2
>>>
>>>
>>> Thanks !
>>
>>
>> Hi Dmitry
>>
>> Can you try the patch below with your reproducer? I haven't seen similar
>> crashes reported after this (or even with Eric's patch).
>
> I've synced to 318c8932ddec5c1c26a4af0f3c053784841c598e (Dec 7) and do
> _not_ see this report happening anymore.
> Thanks.

But now I am seeing "possible deadlock" warnings involving genl_lock:

[ INFO: possible circular locking dependency detected ]
4.9.0-rc8+ #77 Not tainted
-------------------------------------------------------
syz-executor7/18794 is trying to acquire lock:
 (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x1c/0x20 net/core/rtnetlink.c:70

but task is already holding lock:
 (genl_mutex){+.+.+.}, at: [< inline >] genl_lock net/netlink/genetlink.c:31
 (genl_mutex){+.+.+.}, at: [] genl_rcv_msg+0x209/0x260 net/netlink/genetlink.c:658

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

[ 315.403815] [< inline >] validate_chain kernel/locking/lockdep.c:2265
[ 315.403815] [] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 315.403815] [] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
[ 315.403815] [< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[ 315.403815] [] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 315.403815] [< inline >] genl_lock net/netlink/genetlink.c:31
[ 315.403815] [] genl_lock_dumpit+0x46/0xa0 net/netlink/genetlink.c:518
[ 315.403815] [] netlink_dump+0x57c/0xd70 net/netlink/af_netlink.c:2127
[ 315.403815] [] __netlink_dump_start+0x4ea/0x760 net/netlink/af_netlink.c:2217
[ 315.403815] [] genl_family_rcv_msg+0xdc9/0x1070 net/netlink/genetlink.c:586
[ 315.403815] [] genl_rcv_msg+0x1b0/0x260 net/netlink/genetlink.c:660
[ 315.403815] [] netlink_rcv_skb+0x2bc/0x3a0 net/netlink/af_netlink.c:2298
[ 315.403815] [] genl_rcv+0x2d/0x40 net/netlink/genetlink.c:671
[ 315.403815] [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1231
[ 315.403815] [] netlink_unicast+0x51a/0x740 net/netlink/af_netlink.c:1257
[ 315.403815] [] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1803
[ 315.403815] [< inline >] sock_sendmsg_nosec net/socket.c:621
[ 315.403815] [] sock_sendmsg+0xcf/0x110 net/socket.c:631
[ 315.403815] [] sock_write_iter+0x32b/0x620 net/socket.c:829
[ 315.403815] [< inline >] new_sync_write fs/read_write.c:499
[ 315.403815] [] __vfs_write+0x4fe/0x830 fs/read_write.c:512
[ 315.403815] [] vfs_write+0x175/0x4e0 fs/read_write.c:560
[ 315.403815] [< inline >] SYSC_write fs/read_write.c:607
[ 315.403815] [] SyS_write+0x100/0x240 fs/read_write.c:599
[ 315.403815] [] entry_SYSCALL_64_fastpath+0x23/0xc6

[ 315.403815] [< inline >] validate_chain kernel/locking/lockdep.c:2265
[ 315.403815] [] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 315.403815] [] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
[ 315.403815] [< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[ 315.403815] [] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 315.403815] [] __netlink_dump_start+0xf9/0x760 net/netlink/af_netlink.c:2187
[ 315.403815] [< inline >] netlink_dump_start include/linux/netlink.h:165
[ 315.403815] [] ctnetlink_stat_ct_cpu+0x198/0x1e0 net/netfilter/nf_conntrack_netlink.c:2045
[ 315.403815] [] nfnetlink_rcv_msg+0x9be/0xd60 net/netfilter/nfnetlink.c:212
[ 315.403815] [] netlink_rcv_skb+0x2bc/0x3a0 net/netlink/af_netlink.c:2298
[ 315.403815] [] nfnetlink_rcv+0x7e1/0x10d0 net/netfilter/nfnetlink.c:474
[ 315.403815] [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1231
[ 315.403815] [] netlink_unicast+0x51a/0x740 net/netlink/af_netlink.c:1257
[ 315.403815] [] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1803
[ 315.403815] [< inline >] sock_sendmsg_nosec net/socket.c:621
[ 315.403815] [] sock_sendmsg+0xcf/0x110 net/socket.c:631
[ 315.403815] [] sock_write_iter+0x32b/0x620 net/socket.c:829
[ 315.403815] [< inline >] new_sync_write fs/read_write.c:499
[ 315.403815] [] __vfs_write+0x4fe/0x830 fs/read_write.c:512
[ 315.403815] [] vfs_write+0x175/0x4e0 fs/read_write.c:560
[ 315.403815] [< inline >] SYSC_write fs/read_write.c:607
[ 315.403815] [] SyS_write+0x100/0x240 fs/read_write.c:599
[ 315.403815] [] entry_SYSCALL_64_fastpath+0x23/0xc6

[ 315.403815] [< inline >] validate_chain kernel/locking/lockdep.c:2265
[ 315.403815] [] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 315.403815] [] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
[ 315.403815] [< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[ 315.403815] [] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 315.403815] [] nfnl_lock+0x2d/0x30 net/netfilter/nfnetlink.c:61
[ 315.403815] [] nf_tables_netdev_event+0x1f1/0x720 net/netfilter/nf_tables_netdev.c:122
[ 315.403815] [] notifier_call_chain+0x14a/0x2f0 kernel/notifier.c:93
[ 315.403815] [< inline >] __raw_notifier_call_chain kernel/notifier.c:394
[ 315.403815] [] raw_notifier_call_chain+0x32/0x40 kernel/notifier.c:401
[ 315.403815] [] call_netdevice_notifiers_info+0x56/0x90 net/core/dev.c:1645
[ 315.403815] [< inline >] call_netdevice_notifiers net/core/dev.c:1661
[ 315.403815] [] rollback_registered_many+0x73d/0xba0 net/core/dev.c:6759
[ 315.403815] [] rollback_registered+0xae/0x100 net/core/dev.c:6800
[ 315.403815] [] unregister_netdevice_queue+0x86/0x140 net/core/dev.c:7787
[ 315.403815] [< inline >] unregister_netdevice include/linux/netdevice.h:2455
[ 315.403815] [] __tun_detach+0xc66/0xea0 drivers/net/tun.c:567
[ 315.808015] [< inline >] tun_detach drivers/net/tun.c:578
[ 315.808015] [] tun_chr_close+0x49/0x60 drivers/net/tun.c:2350
[ 315.808015] [] __fput+0x34e/0x910 fs/file_table.c:208
[ 315.808015] [] ____fput+0x1a/0x20 fs/file_table.c:244
[ 315.808015] [] task_work_run+0x1a0/0x280 kernel/task_work.c:116
[ 315.808015] [< inline >] exit_task_work include/linux/task_work.h:21
[ 315.808015] [] do_exit+0x1842/0x2650 kernel/exit.c:828
[ 315.808015] [] do_group_exit+0x14e/0x420 kernel/exit.c:932
[ 315.808015] [] get_signal+0x663/0x1880 kernel/signal.c:2307
[ 315.808015] [] do_signal+0xc5/0x2190 arch/x86/kernel/signal.c:807
[ 315.808015] [] exit_to_usermode_loop+0x1ea/0x2d0 arch/x86/entry/common.c:156
[ 315.808015] [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
[ 315.808015] [] syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
[ 315.808015] [] entry_SYSCALL_64_fastpath+0xc4/0xc6

[ 315.808015] [< inline >] check_prev_add kernel/locking/lockdep.c:1828
[ 315.808015] [] check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
[ 315.808015] [< inline >] validate_chain kernel/locking/lockdep.c:2265
[ 315.808015] [] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
[ 315.808015] [] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
[ 315.808015] [< inline >] __mutex_lock_common kernel/locking/mutex.c:521
[ 315.808015] [] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
[ 315.808015] [] rtnl_lock+0x1c/0x20 net/core/rtnetlink.c:70
[ 315.808015] [] nl80211_pre_doit+0x309/0x5b0 net/wireless/nl80211.c:11750
[ 315.808015] [] genl_family_rcv_msg+0x780/0x1070 net/netlink/genetlink.c:631
[ 315.808015] [] genl_rcv_msg+0x1b0/0x260 net/netlink/genetlink.c:660
[ 315.808015] [] netlink_rcv_skb+0x2bc/0x3a0 net/netlink/af_netlink.c:2298
[ 315.808015] [] genl_rcv+0x2d/0x40 net/netlink/genetlink.c:671
[ 315.808015] [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1231
[ 315.808015] [] netlink_unicast+0x51a/0x740 net/netlink/af_netlink.c:1257
[ 315.808015] [] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1803
[ 315.808015] [< inline >] sock_sendmsg_nosec net/socket.c:621
[ 315.808015] [] sock_sendmsg+0xcf/0x110 net/socket.c:631
[ 315.808015] [] sock_write_iter+0x32b/0x620 net/socket.c:829
[ 315.808015] [] do_iter_readv_writev+0x363/0x670 fs/read_write.c:695
[ 315.808015] [] do_readv_writev+0x431/0x9b0 fs/read_write.c:872
[ 315.808015] [] vfs_writev+0x8c/0xc0 fs/read_write.c:911
[ 315.808015] [] do_writev+0x115/0x2d0 fs/read_write.c:944
[ 315.808015] [< inline >] SYSC_writev fs/read_write.c:1017
[ 315.808015] [] SyS_writev+0x2c/0x40 fs/read_write.c:1014
[ 315.808015] [] entry_SYSCALL_64_fastpath+0x23/0xc6

other info that might help us debug this:

Chain exists of:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(genl_mutex);
                               lock(nlk->cb_mutex);
                               lock(genl_mutex);
  lock(rtnl_mutex);

 *** DEADLOCK ***

2 locks held by syz-executor7/18794:
 #0:  (cb_lock){++++++}, at: [] genl_rcv+0x1e/0x40 net/netlink/genetlink.c:670
 #1:  (genl_mutex){+.+.+.}, at: [< inline >] genl_lock net/netlink/genetlink.c:31
 #1:  (genl_mutex){+.+.+.}, at: [] genl_rcv_msg+0x209/0x260 net/netlink/genetlink.c:658

stack backtrace:
CPU: 0 PID: 18794 Comm: syz-executor7 Not tainted 4.9.0-rc8+ #77
Hardware name: Google Google/Google, BIOS Google 01/01/2011
 ffff88004add6468 ffffffff834c44f9 ffffffff00000000 1ffff100095bac20
 ffffed00095bac18 0000000041b58ab3 ffffffff895816f0 ffffffff834c420b
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
 [< inline >] __dump_stack lib/dump_stack.c:15
 [] dump_stack+0x2ee/0x3f5 lib/dump_stack.c:51
 [] print_circular_bug+0x310/0x3c0 kernel/locking/lockdep.c:1202
 [< inline >] check_prev_add kernel/locking/lockdep.c:1828
 [] check_prevs_add+0xaab/0x1c20 kernel/locking/lockdep.c:1938
 [< inline >] validate_chain kernel/locking/lockdep.c:2265
 [] __lock_acquire+0x2156/0x3380 kernel/locking/lockdep.c:3338
 [] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3749
 [< inline >] __mutex_lock_common kernel/locking/mutex.c:521
 [] mutex_lock_nested+0x23f/0xf20 kernel/locking/mutex.c:621
 [] rtnl_lock+0x1c/0x20 net/core/rtnetlink.c:70
 [] nl80211_pre_doit+0x309/0x5b0 net/wireless/nl80211.c:11750
 [] genl_family_rcv_msg+0x780/0x1070 net/netlink/genetlink.c:631
 [] genl_rcv_msg+0x1b0/0x260 net/netlink/genetlink.c:660
 [] netlink_rcv_skb+0x2bc/0x3a0 net/netlink/af_netlink.c:2298
 [] genl_rcv+0x2d/0x40 net/netlink/genetlink.c:671
 [< inline >] netlink_unicast_kernel net/netlink/af_netlink.c:1231
 [] netlink_unicast+0x51a/0x740 net/netlink/af_netlink.c:1257
 [] netlink_sendmsg+0xaa4/0xe50 net/netlink/af_netlink.c:1803
 [< inline >] sock_sendmsg_nosec net/socket.c:621
 [] sock_sendmsg+0xcf/0x110 net/socket.c:631
 [] sock_write_iter+0x32b/0x620 net/socket.c:829
 [] do_iter_readv_writev+0x363/0x670 fs/read_write.c:695
 [] do_readv_writev+0x431/0x9b0 fs/read_write.c:872
 [] vfs_writev+0x8c/0xc0 fs/read_write.c:911
 [] do_writev+0x115/0x2d0 fs/read_write.c:944
 [< inline >] SYSC_writev fs/read_write.c:1017
 [] SyS_writev+0x2c/0x40 fs/read_write.c:1014
 [] entry_SYSCALL_64_fastpath+0x23/0xc6
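
For anyone trying to follow the four stacks above, here is a rough sketch
(Python, not kernel code) of how I read the dependency chain. Each edge says
"the second lock was acquired while the first was held". The edges are my
interpretation of the stacks, not something the report states directly, and
"nfnl_mutex" is only my label for whatever nfnl_lock() takes. The assumption
that rtnl_mutex is held across the netdevice notifier call is also mine.
lockdep reasons about lock classes, so a cycle here explains why the warning
fires; it does not by itself prove that a real deadlock can happen.

```python
from typing import Dict, List, Optional

# "A": ["B"] means: B was acquired while A was already held.
# Edges are read off the four stacks in the report (my interpretation).
HELD_THEN_TAKEN: Dict[str, List[str]] = {
    "genl_mutex": ["rtnl_mutex"],      # genl_rcv_msg -> nl80211_pre_doit -> rtnl_lock
    "rtnl_mutex": ["nfnl_mutex"],      # netdev notifier -> nf_tables_netdev_event -> nfnl_lock
    "nfnl_mutex": ["nlk->cb_mutex"],   # nfnetlink_rcv_msg -> ctnetlink_stat_ct_cpu -> __netlink_dump_start
    "nlk->cb_mutex": ["genl_mutex"],   # netlink_dump -> genl_lock_dumpit -> genl_lock
}


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of lock names (first == last), or None."""
    visiting: List[str] = []   # current depth-first path
    done = set()               # nodes fully explored without finding a cycle

    def dfs(node: str) -> Optional[List[str]]:
        if node in visiting:
            # Back edge: the path from the first visit of node closes a cycle.
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in graph.get(node, []):
            cycle = dfs(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = dfs(start)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    cycle = find_cycle(HELD_THEN_TAKEN)
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Removing any one edge from the table (for example, not taking rtnl_lock
under genl_mutex) makes find_cycle() return None, which is the kind of
change a fix would have to make to the real lock ordering.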