Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1750819Ab3HSFFV (ORCPT ); Mon, 19 Aug 2013 01:05:21 -0400
Received: from mail-pd0-f179.google.com ([209.85.192.179]:46948 "EHLO
	mail-pd0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750787Ab3HSFFR (ORCPT );
	Mon, 19 Aug 2013 01:05:17 -0400
Date: Sun, 18 Aug 2013 22:04:57 -0700 (PDT)
From: Hugh Dickins
X-X-Sender: hugh@eggly.anvils
To: Johannes Berg
cc: Linus Torvalds, Greg KH, "David S. Miller", Andrei Otcheretianski,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	stable@vger.kernel.org
Subject: 3.11-rc6 genetlink locking fix offends lockdep
Message-ID:
User-Agent: Alpine 2.00 (LNX 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 6481
Lines: 116

3.11-rc6's commit 58ad436fcf49 ("genetlink: fix family dump race")
gives me the lockdep trace below at startup.  I think it needs to be
reverted until you can refine it.

And it has already gone into today's stable review series, as
04/12 for 3.0.92, 26/34 for 3.4.59, 18/45 for 3.10.8:
I raise an objection to those.

Hugh

[    4.004286] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[    4.105671] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[    4.106123] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[    4.110096]
[    4.110113] ======================================================
[    4.110146] [ INFO: possible circular locking dependency detected ]
[    4.110180] 3.11.0-rc6 #1 Not tainted
[    4.110201] -------------------------------------------------------
[    4.110234] NetworkManager/358 is trying to acquire lock:
[    4.110262]  (genl_mutex){+.+.+.}, at: [] genl_lock+0x12/0x14
[    4.110315]
[    4.110315] but task is already holding lock:
[    4.110346]  (nlk->cb_mutex){+.+.+.}, at: [] netlink_dump+0x1c/0x1d7
[    4.110400]
[    4.110400] which lock already depends on the new lock.
[    4.110400]
[    4.110442]
[    4.110442] the existing dependency chain (in reverse order) is:
[    4.110482]
[    4.110482] -> #1 (nlk->cb_mutex){+.+.+.}:
[    4.110517]        [] __lock_acquire+0x865/0x956
[    4.110555]        [] lock_acquire+0x57/0x6d
[    4.110589]        [] mutex_lock_nested+0x5e/0x345
[    4.110627]        [] __netlink_dump_start+0xae/0x14e
[    4.110665]        [] genl_rcv_msg+0xf4/0x252
[    4.110699]        [] netlink_rcv_skb+0x3e/0x8c
[    4.110734]        [] genl_rcv+0x24/0x34
[    4.110766]        [] netlink_unicast+0xed/0x17a
[    4.110801]        [] netlink_sendmsg+0x2fb/0x345
[    4.110838]        [] sock_sendmsg+0x79/0x8e
[    4.110871]        [] ___sys_sendmsg+0x231/0x2be
[    4.110907]        [] __sys_sendmsg+0x3d/0x5e
[    4.110942]        [] SyS_sendmsg+0xd/0x19
[    4.110975]        [] system_call_fastpath+0x16/0x1b
[    4.111012]
[    4.111012] -> #0 (genl_mutex){+.+.+.}:
[    4.111047]        [] validate_chain.isra.21+0x836/0xe8e
[    4.111086]        [] __lock_acquire+0x865/0x956
[    4.111122]        [] lock_acquire+0x57/0x6d
[    4.111157]        [] mutex_lock_nested+0x5e/0x345
[    4.111193]        [] genl_lock+0x12/0x14
[    4.111226]        [] ctrl_dumpfamily+0x31/0xfa
[    4.111260]        [] netlink_dump+0x88/0x1d7
[    4.111295]        [] netlink_recvmsg+0x1b1/0x2d1
[    4.111331]        [] sock_recvmsg+0x83/0x98
[    4.111365]        [] ___sys_recvmsg+0x15d/0x207
[    4.111400]        [] __sys_recvmsg+0x3d/0x5e
[    4.111434]        [] SyS_recvmsg+0xd/0x19
[    4.111467]        [] system_call_fastpath+0x16/0x1b
[    4.111504]
[    4.111504] other info that might help us debug this:
[    4.111504]
[    4.111545]  Possible unsafe locking scenario:
[    4.111545]
[    4.111577]        CPU0                    CPU1
[    4.111601]        ----                    ----
[    4.111625]   lock(nlk->cb_mutex);
[    4.112865]                                lock(genl_mutex);
[    4.114216]                                lock(nlk->cb_mutex);
[    4.115315]   lock(genl_mutex);
[    4.116500]
[    4.116500]  *** DEADLOCK ***
[    4.116500]
[    4.119670] 1 lock held by NetworkManager/358:
[    4.120906]  #0:  (nlk->cb_mutex){+.+.+.}, at: [] netlink_dump+0x1c/0x1d7
[    4.122196]
[    4.122196] stack backtrace:
[    4.124533] CPU: 0 PID: 358 Comm: NetworkManager Not tainted 3.11.0-rc6 #1
[    4.125779] Hardware name: LENOVO 4174EH1/4174EH1, BIOS 8CET51WW (1.31 ) 11/29/2011
[    4.126979]  ffffffff81d0a0f0 ffff88022b91d8c8 ffffffff8157cf80 0000000000000006
[    4.128274]  ffffffff81cc8750 ffff88022b91d918 ffffffff8157a898 ffff88022d798080
[    4.129472]  ffff88022d798080 ffff88022d798080 ffff88022d798750 ffff88022d798080
[    4.130645] Call Trace:
[    4.131801]  [] dump_stack+0x4f/0x84
[    4.132817]  [] print_circular_bug+0x2ad/0x2be
[    4.133839]  [] validate_chain.isra.21+0x836/0xe8e
[    4.134821]  [] ? sock_def_write_space+0x1b5/0x1b5
[    4.135800]  [] __lock_acquire+0x865/0x956
[    4.136842]  [] ? mark_held_locks+0xce/0xfa
[    4.137828]  [] ? genl_lock+0x12/0x14
[    4.138876]  [] lock_acquire+0x57/0x6d
[    4.139856]  [] ? genl_lock+0x12/0x14
[    4.141027]  [] mutex_lock_nested+0x5e/0x345
[    4.142194]  [] ? genl_lock+0x12/0x14
[    4.143219]  [] ? __kmalloc_node_track_caller+0x26/0x2d
[    4.144340]  [] genl_lock+0x12/0x14
[    4.145387]  [] ctrl_dumpfamily+0x31/0xfa
[    4.146387]  [] ? __alloc_skb+0x97/0x1a0
[    4.147454]  [] netlink_dump+0x88/0x1d7
[    4.148448]  [] netlink_recvmsg+0x1b1/0x2d1
[    4.149475]  [] sock_recvmsg+0x83/0x98
[    4.150494]  [] ? might_fault+0x52/0xa2
[    4.151471]  [] ___sys_recvmsg+0x15d/0x207
[    4.152516]  [] ? __lock_acquire+0x865/0x956
[    4.153501]  [] ? fget_light+0x35c/0x377
[    4.154550]  [] ? fget_light+0x164/0x377
[    4.155521]  [] __sys_recvmsg+0x3d/0x5e
[    4.156568]  [] ? sock_def_write_space+0x1b5/0x1b5
[    4.157552]  [] SyS_recvmsg+0xd/0x19
[    4.158607]  [] system_call_fastpath+0x16/0x1b
[    4.160507] iwlwifi 0000:03:00.0: L1 Enabled; Disabling L0S
[    4.160709] iwlwifi 0000:03:00.0: Radio type=0x0-0x3-0x1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
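
For readers less used to lockdep reports, the inversion in the trace above can be modelled in user space. The following is only a toy sketch in Python, not kernel code and not how lockdep is actually implemented: the class LockOrderGraph and its acquire() method are hypothetical names invented for this illustration. It records "B was taken while A was held" as an edge A -> B and flags an acquisition that would close a cycle. The first ordering (nlk->cb_mutex taken while genl_mutex is held) is what the report's "already depends on the new lock" wording implies for the sendmsg path; the second is the recvmsg path through ctrl_dumpfamily() shown in chain #0.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker (hypothetical, for illustration only).

    An edge A -> B means "lock B was acquired while lock A was held".
    """

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search over the recorded orderings.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record 'new taken while held'; return True if that closes a cycle."""
        inversion = self._reachable(new, held)
        self.edges[held].add(new)
        return inversion


graph = LockOrderGraph()

# sendmsg path (chain #1): cb_mutex taken with genl_mutex held.
first = graph.acquire("genl_mutex", "nlk->cb_mutex")

# recvmsg path (chain #0): netlink_dump() holds cb_mutex, then
# ctrl_dumpfamily() calls genl_lock().
second = graph.acquire("nlk->cb_mutex", "genl_mutex")

print(first, second)
```

The second call reports the inversion because genl_mutex -> nlk->cb_mutex is already on record, which is the same AB-BA pattern the "Possible unsafe locking scenario" table spells out for CPU0 and CPU1.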