Return-path:
Received: from bu3sch.de ([62.75.166.246]:38447 "EHLO vs166246.vserver.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1759937AbZJGTrB (ORCPT ); Wed, 7 Oct 2009 15:47:01 -0400
From: Michael Buesch
To: Larry Finger
Subject: Re: [PATCH] b43: Fix locking problem when stopping rfkill polling
Date: Wed, 7 Oct 2009 21:46:16 +0200
Cc: "John W. Linville", linux-wireless@vger.kernel.org
References: <4accae5d.BgSJpcmlvg+W5PGM%Larry.Finger@lwfinger.net> <20091007190106.GB22394@tuxdriver.com> <4ACCEBE8.8010803@lwfinger.net>
In-Reply-To: <4ACCEBE8.8010803@lwfinger.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Message-Id: <200910072146.18699.mb@bu3sch.de>
Sender: linux-wireless-owner@vger.kernel.org
List-ID:

On Wednesday 07 October 2009 21:28:40 Larry Finger wrote:
> John W. Linville wrote:
> > On Wed, Oct 07, 2009 at 10:06:05AM -0500, Larry Finger wrote:
> >> In commit 26e5ab35b4c7b1d4cb487a11084520aed9a8d05e entitled "b43: Fix PPC
> >> crash in rfkill polling on unload", the call to stop polling should not have
> >> been placed inside the wl->mutex. The result was incorrect locking messages.
> >>
> >> Signed-off-by: Larry Finger
> >> ---
> >>
> >> John,
> >>
> >> I had not intended for the previous patch to be applied as I was waiting for
> >> the Bugzilla OP to test. He promised to do that today. In any case, that patch
> >> introduced a locking problem that needs to be fixed.
> >>
> >> Why do the one-liners cause so many problems?
> >>
> >> Larry
> >> ---
> >>
> >> Index: wireless-testing/drivers/net/wireless/b43/main.c
> >> ===================================================================
> >> --- wireless-testing.orig/drivers/net/wireless/b43/main.c
> >> +++ wireless-testing/drivers/net/wireless/b43/main.c
> >> @@ -4501,8 +4501,8 @@ static void b43_op_stop(struct ieee80211
> >>
> >>      cancel_work_sync(&(wl->beacon_update_trigger));
> >>
> >> -    mutex_lock(&wl->mutex);
> >>      wiphy_rfkill_stop_polling(hw->wiphy);
> >> +    mutex_lock(&wl->mutex);
> >>      if (b43_status(dev) >= B43_STAT_STARTED) {
> >>          dev = b43_wireless_core_stop(dev);
> >>          if (!dev)
> >
> > OK, but why do we start polling under the lock but stop polling without
> > the lock? Should we start polling without holding the lock too?
>
> I'll test that, but I suspect it doesn't matter. Of course, the reason
> I put the stop under the lock was for symmetry, but then I got the
> following when shutting down:
>
> b43-phy0 debug: Removing Interface type 2
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.32-rc3-wl #225
> -------------------------------------------------------
> modprobe/25391 is trying to acquire lock:
>  (&(&rfkill->poll_work)->work){+.+...}, at: [] __cancel_work_timer+0xd9/0x224
>
> but task is already holding lock:
>  (&wl->mutex){+.+.+.}, at: [] b43_op_stop+0x30/0x7f [b43]
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&wl->mutex){+.+.+.}:
>        [] __lock_acquire+0x140e/0x174d
>        [] lock_acquire+0xbc/0xd9
>        [] mutex_lock_nested+0x58/0x29c
>        [] b43_rfkill_poll+0x3a/0xfc [b43]
>        [] ieee80211_rfkill_poll+0x26/0x28 [mac80211]
>        [] cfg80211_rfkill_poll+0x14/0x16 [cfg80211]
>        [] rfkill_poll+0x23/0x3d [rfkill]
>        [] worker_thread+0x22c/0x332
>        [] kthread+0x7d/0x85
>        [] child_rip+0xa/0x20
>
> Moving the stop outside the lock cured the problem.
>

Just move it right after the existing cancel_work_sync() call

-- 
Greetings, Michael.
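
For readers following the thread, here is a minimal userspace sketch of the ordering discussed above. It is not b43 code. It assumes POSIX threads as stand-ins for the kernel primitives: a poller thread that takes a mutex on every pass (as b43_rfkill_poll takes wl->mutex in the trace), and a synchronous stop that waits for that thread (as the cancel in the trace waits for the poll work). Every name in it (fake_wl, poll_fn, stop_polling_sync, run_stop_sequence) is hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not a b43 structure. */
struct fake_wl {
    pthread_mutex_t mutex;      /* plays the role of wl->mutex */
    pthread_t poller;           /* plays the role of the rfkill poll work */
    atomic_bool stop_polling;   /* set to ask the poller to exit */
    int polls;                  /* protected by mutex */
};

/* The poller takes the mutex on every pass, at least once. */
static void *poll_fn(void *arg)
{
    struct fake_wl *wl = arg;

    do {
        pthread_mutex_lock(&wl->mutex);
        wl->polls++;
        pthread_mutex_unlock(&wl->mutex);
    } while (!atomic_load(&wl->stop_polling));
    return NULL;
}

/*
 * Synchronous stop: returns only after the poller has finished.
 * Must be called WITHOUT wl->mutex held. If the caller held the mutex,
 * the poller could block on it forever while the caller waits in
 * pthread_join(), which is the circular dependency lockdep reported.
 */
static void stop_polling_sync(struct fake_wl *wl)
{
    atomic_store(&wl->stop_polling, true);
    pthread_join(wl->poller, NULL);
}

/* Returns 0 when the stop sequence completes and at least one poll ran. */
int run_stop_sequence(void)
{
    struct fake_wl wl;
    int polls;

    pthread_mutex_init(&wl.mutex, NULL);
    atomic_init(&wl.stop_polling, false);
    wl.polls = 0;
    if (pthread_create(&wl.poller, NULL, poll_fn, &wl) != 0)
        return -1;

    stop_polling_sync(&wl);          /* first: stop polling, no lock held */

    pthread_mutex_lock(&wl.mutex);   /* then: take the lock for the rest */
    polls = wl.polls;
    pthread_mutex_unlock(&wl.mutex);

    pthread_mutex_destroy(&wl.mutex);
    return polls >= 1 ? 0 : 1;
}
```

Calling run_stop_sequence() from a main() should return 0. Swapping the two steps, so that the mutex is taken before stop_polling_sync(), can hang in this sketch, which mirrors why the stop call belongs before mutex_lock() in the patch.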