Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758076Ab2EUS3M (ORCPT ); Mon, 21 May 2012 14:29:12 -0400 Received: from iolanthe.rowland.org ([192.131.102.54]:57664 "HELO iolanthe.rowland.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1750884Ab2EUS3K (ORCPT ); Mon, 21 May 2012 14:29:10 -0400 Date: Mon, 21 May 2012 14:29:09 -0400 (EDT) From: Alan Stern X-X-Sender: stern@iolanthe.rowland.org To: Ming Lei cc: Greg KH , Wedson Almeida Filho , Andrew Morton , , Linux PM List Subject: =?UTF-8?Q?Re=3A_Race_condition_between_driver=5Fprobe=5Fdevice_and_d?= =?UTF-8?Q?evice=5Fshutdown=E2=80=8F?= In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=UTF-8 Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2862 Lines: 83 On Mon, 21 May 2012, Ming Lei wrote: > Cc pm list because it is related with PM. > > Hi Greg, > > On Mon, May 21, 2012 at 3:51 AM, Greg KH wrote: > > > > > And how can that happen with a real bus? ?Don't we have a lock > > The races may be triggered when one device is just probed(triggered > by plug) or released(triggered by unplug) at the same time of running > reboot/poweroff. > > > somewhere per-bus that should be protecting this type of thing (sorry, > > can't dig through the code at the moment, on the road...) > > device_shutdown is called with only holding reboot_mutex, so I think no > any protection on dev->driver there. > > > > > How come no one has ever hit them in the past 10 years? ?What am I > > missing here? > > The window is so small that maybe it is very very difficult to trigger > the races, :-) > But looks Wedson is luck enough to observe it. > > >> Looks the above makes sense to serialize .shutdown with > >> .probe and .release. > > > > Let me look at the code when I get back in a few days, but I really > > thought we already had a lock protecting all of this... > > Also the previous patch don't cover the .runtime_resume races with > .probe or .release, so the right fix may be below: > > diff --git a/drivers/base/core.c b/drivers/base/core.c > index 346be8b..cbc8bd2 100644 > --- a/drivers/base/core.c > +++ b/drivers/base/core.c > @@ -1820,6 +1820,11 @@ void device_shutdown(void) > list_del_init(&dev->kobj.entry); > spin_unlock(&devices_kset->list_lock); > > + /*hold lock[s] to avoid races with .probe/.release*/ > + if (dev->parent) > + device_lock(dev->parent); > + device_lock(dev); > + > /* Don't allow any more runtime suspends */ > pm_runtime_get_noresume(dev); > pm_runtime_barrier(dev); > @@ -1831,6 +1836,9 @@ void device_shutdown(void) > dev_dbg(dev, "shutdown\n"); > dev->driver->shutdown(dev); > } > + device_unlock(dev); > + if (dev->parent) > + device_unlock(dev->parent); > put_device(dev); > > spin_lock(&devices_kset->list_lock); > > Another candidate fix is to register a reboot notifier in driver core to prevent > driver from being bound or unbound from start of reboot/shutdown, but looks > not easy as the way of holding device locks. I'd guess it was done this way so that the shutdown task wouldn't have to wait for a buggy driver that didn't want to release the device lock (or that crashed while holding the lock). It's not clear that the reboot notifier approach would work. What about probes that had already started when notifier was called? Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/