Return-Path:
MIME-Version: 1.0
In-Reply-To: <1327704123-31313-1-git-send-email-andre.guedes@openbossa.org>
References: <1327704123-31313-1-git-send-email-andre.guedes@openbossa.org>
Date: Mon, 30 Jan 2012 10:26:04 -0200
Message-ID:
Subject: Re: [PATCH 1/2] Bluetooth: Fix potential deadlock
From: Ulisses Furquim
To: Andre Guedes
Cc: linux-bluetooth@vger.kernel.org, Vinicius Costa Gomes
Content-Type: text/plain; charset=ISO-8859-1
Sender: linux-bluetooth-owner@vger.kernel.org
List-ID:

Hi Andre,

On Fri, Jan 27, 2012 at 8:42 PM, Andre Guedes wrote:
> We don't need to use the _sync variant in hci_conn_hold and
> hci_conn_put to cancel conn->disc_work delayed work. This way
> we avoid potential deadlocks like this one reported by lockdep.
>
> ======================================================
> [ INFO: possible circular locking dependency detected ]
> 3.2.0+ #1 Not tainted
> -------------------------------------------------------
> kworker/u:1/17 is trying to acquire lock:
>  (&hdev->lock){+.+.+.}, at: [] hci_conn_timeout+0x62/0x158 [bluetooth]
>
> but task is already holding lock:
>  ((&(&conn->disc_work)->work)){+.+...}, at: [] process_one_work+0x11a/0x2bf
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 ((&(&conn->disc_work)->work)){+.+...}:
>       [] lock_acquire+0x8a/0xa7
>       [] wait_on_work+0x3d/0xaa
>       [] __cancel_work_timer+0xac/0xef
>       [] cancel_delayed_work_sync+0xd/0xf
>       [] smp_chan_create+0xde/0xe6 [bluetooth]
>       [] smp_conn_security+0xa3/0x12d [bluetooth]
>       [] l2cap_connect_cfm+0x237/0x2e8 [bluetooth]
>       [] hci_proto_connect_cfm+0x2d/0x6f [bluetooth]
>       [] hci_event_packet+0x29d1/0x2d60 [bluetooth]
>       [] hci_rx_work+0xd0/0x2e1 [bluetooth]
>       [] process_one_work+0x178/0x2bf
>       [] worker_thread+0xce/0x152
>       [] kthread+0x95/0x9d
>       [] kernel_thread_helper+0x4/0x10
>
> -> #1 (slock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+...}:
>       [] lock_acquire+0x8a/0xa7
>       [] _raw_spin_lock_bh+0x36/0x6a
>       [] lock_sock_nested+0x24/0x7f
>       [] lock_sock+0xb/0xd [bluetooth]
>       [] l2cap_chan_connect+0xa9/0x26f [bluetooth]
>       [] l2cap_sock_connect+0xb3/0xff [bluetooth]
>       [] sys_connect+0x69/0x8a
>       [] system_call_fastpath+0x16/0x1b
>
> -> #0 (&hdev->lock){+.+.+.}:
>       [] __lock_acquire+0xa80/0xd74
>       [] lock_acquire+0x8a/0xa7
>       [] __mutex_lock_common+0x48/0x38e
>       [] mutex_lock_nested+0x2a/0x31
>       [] hci_conn_timeout+0x62/0x158 [bluetooth]
>       [] process_one_work+0x178/0x2bf
>       [] worker_thread+0xce/0x152
>       [] kthread+0x95/0x9d
>       [] kernel_thread_helper+0x4/0x10
>
> other info that might help us debug this:
>
> Chain exists of:
>  &hdev->lock --> slock-AF_BLUETOOTH-BTPROTO_L2CAP --> (&(&conn->disc_work)->work)
>
>  Possible unsafe locking scenario:
>
>       CPU0                    CPU1
>       ----                    ----
>  lock((&(&conn->disc_work)->work));
>                               lock(slock-AF_BLUETOOTH-BTPROTO_L2CAP);
>                               lock((&(&conn->disc_work)->work));
>  lock(&hdev->lock);
>
>  *** DEADLOCK ***
>
> 2 locks held by kworker/u:1/17:
>  #0:  (hdev->name){.+.+.+}, at: [] process_one_work+0x11a/0x2bf
>  #1:  ((&(&conn->disc_work)->work)){+.+...}, at: [] process_one_work+0x11a/0x2bf
>
> stack backtrace:
> Pid: 17, comm: kworker/u:1 Not tainted 3.2.0+ #1
> Call Trace:
>  [] print_circular_bug+0x1f8/0x209
>  [] __lock_acquire+0xa80/0xd74
>  [] ? arch_local_irq_restore+0x6/0xd
>  [] ? vprintk+0x3f9/0x41e
>  [] lock_acquire+0x8a/0xa7
>  [] ? hci_conn_timeout+0x62/0x158 [bluetooth]
>  [] __mutex_lock_common+0x48/0x38e
>  [] ? hci_conn_timeout+0x62/0x158 [bluetooth]
>  [] ? __dynamic_pr_debug+0x6d/0x6f
>  [] ? hci_conn_timeout+0x62/0x158 [bluetooth]
>  [] ? trace_hardirqs_off+0xd/0xf
>  [] mutex_lock_nested+0x2a/0x31
>  [] hci_conn_timeout+0x62/0x158 [bluetooth]
>  [] process_one_work+0x178/0x2bf
>  [] ? process_one_work+0x11a/0x2bf
>  [] ? lock_acquired+0x1d0/0x1df
>  [] ? hci_acl_disconn+0x65/0x65 [bluetooth]
>  [] worker_thread+0xce/0x152
>  [] ? finish_task_switch+0x45/0xc5
>  [] ? manage_workers.isra.25+0x16a/0x16a
>  [] kthread+0x95/0x9d
>  [] kernel_thread_helper+0x4/0x10
>  [] ? retint_restore_args+0x13/0x13
>  [] ? __init_kthread_worker+0x55/0x55
>  [] ? gs_change+0x13/0x13
>
> Signed-off-by: Andre Guedes
> Signed-off-by: Vinicius Costa Gomes
> ---
>  include/net/bluetooth/hci_core.h |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
> index 25f449f..896d9e4 100644
> --- a/include/net/bluetooth/hci_core.h
> +++ b/include/net/bluetooth/hci_core.h
> @@ -572,7 +572,7 @@ void hci_conn_put_device(struct hci_conn *conn);
>  static inline void hci_conn_hold(struct hci_conn *conn)
>  {
>         atomic_inc(&conn->refcnt);
> -       cancel_delayed_work_sync(&conn->disc_work);
> +       cancel_delayed_work(&conn->disc_work);
>  }
>
>  static inline void hci_conn_put(struct hci_conn *conn)
> @@ -591,7 +591,7 @@ static inline void hci_conn_put(struct hci_conn *conn)
>                 } else {
>                         timeo = msecs_to_jiffies(10);
>                 }
> -               cancel_delayed_work_sync(&conn->disc_work);
> +               cancel_delayed_work(&conn->disc_work);
>                 queue_delayed_work(conn->hdev->workqueue,
>                                         &conn->disc_work, timeo);
>         }

Looks correct to me. I hope we're finishing cleaning these deadlocks
caused by delayed work manipulation. :-/

Best regards,

-- 
Ulisses Furquim
ProFUSION embedded systems
http://profusion.mobi
Mobile: +55 19 9250 0942
Skype: ulissesffs
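
As a rough illustration of the cycle in the quoted lockdep report, here is a minimal userspace sketch in C. It is not kernel code and not how lockdep is implemented; it only models the idea of recording "acquired while held" edges and refusing an edge that would close a cycle. The enum values and the helpers dep_reset(), reachable(), and dep_add() are hypothetical names chosen for this example; the three lock classes are named after the ones in the report.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after those in the lockdep report. */
enum lock_class { HDEV_LOCK, L2CAP_SLOCK, DISC_WORK, NR_CLASSES };

/* dep[a][b] is true when class b was acquired while class a was held. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Forget all recorded dependencies. */
void dep_reset(void)
{
    memset(dep, 0, sizeof(dep));
}

/* Depth-first search: is `to` reachable from `from` over recorded edges? */
bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_CLASSES; next++) {
        if (dep[from][next] && !seen[next] &&
            reachable((enum lock_class)next, to, seen))
            return true;
    }
    return false;
}

/*
 * Record that `acquired` was taken while `held` was held.
 * Returns false, without recording the edge, if `held` is already
 * reachable from `acquired`, i.e. the new edge would close a cycle.
 */
bool dep_add(enum lock_class held, enum lock_class acquired)
{
    bool seen[NR_CLASSES] = { false };

    if (reachable(acquired, held, seen))
        return false;
    dep[held][acquired] = true;
    return true;
}
```

In this model, the chain from the report corresponds to dep_add(HDEV_LOCK, L2CAP_SLOCK) (l2cap_chan_connect), then dep_add(L2CAP_SLOCK, DISC_WORK) (cancel_delayed_work_sync waiting on the work), after which dep_add(DISC_WORK, HDEV_LOCK) (hci_conn_timeout taking hdev->lock from inside the work) is rejected. Assuming the non-_sync cancel does not wait for the running work, the second edge is never recorded and the last call succeeds.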