Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752722AbaG1OvQ (ORCPT ); Mon, 28 Jul 2014 10:51:16 -0400 Received: from cantor2.suse.de ([195.135.220.15]:55169 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751423AbaG1OvO (ORCPT ); Mon, 28 Jul 2014 10:51:14 -0400 Date: Mon, 28 Jul 2014 16:51:07 +0200 Message-ID: From: Takashi Iwai To: "Luis R. Rodriguez" Cc: gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org, "Luis R. Rodriguez" , Tetsuo Handa , Joseph Salisbury , Kay Sievers , One Thousand Gnomes , Tim Gardner , Pierre Fersing , Andrew Morton , Oleg Nesterov , Benjamin Poirier , Nagalakshmi Nandigama , Praveen Krishnamoorthy , Sreekanth Reddy , Abhijit Mahajan , Hariprasad S , Santosh Rastapur , MPT-FusionLinux.pdl@avagotech.com, linux-scsi@vger.kernel.org, netdev@vger.kernel.org Subject: Re: [PATCH 1/3] driver core: enable drivers to use deferred probe from init In-Reply-To: <1406558067-25308-1-git-send-email-mcgrof@do-not-panic.com> References: <1406558067-25308-1-git-send-email-mcgrof@do-not-panic.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI/1.14.6 (Maruoka) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 Emacs/24.3 (x86_64-suse-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka") Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org At Mon, 28 Jul 2014 07:34:25 -0700, Luis R. Rodriguez wrote: > > From: "Luis R. Rodriguez" > > Tetsuo bisected and found that commit 786235ee "kthread: make > kthread_create() killable" modified kthread_create() to bail as > soon as SIGKILL is received. This is causing some issues with > some drivers and at times boot. Joseph then found that failures > occur as the systemd-udevd process sends SIGKILL to modprobe if > probe on a driver takes over 30 seconds. When this happens probe > will fail on any driver, its why booting on some system will fail > if the driver happens to be a storage related driver. Some folks > have suggested fixing this by modifying kthread_create() to not > leave upon SIGKILL [3], upon review Oleg rejected this change and > the discussion was punted out to systemd to see if the default > timeout could be increased from 30 seconds to 120. The opinion of > the systemd maintainers is that the driver's behavior should > be fixed [4]. Linus seems to agree [5], however more recently even > networking drivers have been reported to fail on probe since just > writing the firmware to a device and kicking it can take easy over > 60 seconds [6]. Benjamim was able to trace the issues recently > reported on cxgb4 down to the same systemd-udevd 30 second timeout [6]. > > This is an alternative solution which enables drivers that are > known to take long to use deferred probe workqueue. This avoids > the 30 second timeout and lets us annotate drivers with long > init sequences. > > As drivers determine a component is not yet available and needs > to defer probe you'll be notified this happen upon init for each > device but now with a message such as: > > pci 0000:03:00.0: Driver cxgb4 requests probe deferral on init > > You should see one of these per struct device probed. > > [0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1276705 > [1] https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1297248 > [2] http://lists.freedesktop.org/archives/systemd-devel/2014-March/018006.html > [3] http://thread.gmane.org/gmane.linux.ubuntu.devel.kernel.general/39123 > [4] http://article.gmane.org/gmane.comp.sysutils.systemd.devel/17860 > [5] http://article.gmane.org/gmane.linux.kernel/1671333 > [6] https://bugzilla.novell.com/show_bug.cgi?id=877622 > > Cc: Greg Kroah-Hartman > Cc: Tetsuo Handa > Cc: Joseph Salisbury > Cc: Kay Sievers > Cc: One Thousand Gnomes > Cc: Tim Gardner > Cc: Pierre Fersing > Cc: Andrew Morton > Cc: Oleg Nesterov > Cc: Benjamin Poirier > Cc: Greg Kroah-Hartman > Cc: Nagalakshmi Nandigama > Cc: Praveen Krishnamoorthy > Cc: Sreekanth Reddy > Cc: Abhijit Mahajan > Cc: Hariprasad S > Cc: Santosh Rastapur > Cc: MPT-FusionLinux.pdl@avagotech.com > Cc: linux-scsi@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Cc: netdev@vger.kernel.org > Signed-off-by: Luis R. Rodriguez > --- > drivers/base/dd.c | 15 ++++++++++++++- > include/linux/device.h | 12 ++++++++++++ > 2 files changed, 26 insertions(+), 1 deletion(-) > > diff --git a/drivers/base/dd.c b/drivers/base/dd.c > index e4ffbcf..7a271dc 100644 > --- a/drivers/base/dd.c > +++ b/drivers/base/dd.c > @@ -374,6 +374,19 @@ void wait_for_device_probe(void) > } > EXPORT_SYMBOL_GPL(wait_for_device_probe); > > +static int __driver_probe_device(struct device_driver *drv, struct device *dev) > +{ > + if (drv->delay_probe && !dev->init_delayed_probe) { > + dev_info(dev, "Driver %s requests probe deferral on init\n", > + drv->name); > + dev->init_delayed_probe = true; > + driver_deferred_probe_add(dev); > + return -EPROBE_DEFER; > + } > + > + return really_probe(dev, drv); > +} > + > /** > * driver_probe_device - attempt to bind device & driver together > * @drv: driver to bind a device to > @@ -396,7 +409,7 @@ int driver_probe_device(struct device_driver *drv, struct device *dev) > drv->bus->name, __func__, dev_name(dev), drv->name); > > pm_runtime_barrier(dev); > - ret = really_probe(dev, drv); > + ret = __driver_probe_device(drv, dev); > pm_request_idle(dev); > > return ret; > diff --git a/include/linux/device.h b/include/linux/device.h > index af424ac..11da1b7 100644 > --- a/include/linux/device.h > +++ b/include/linux/device.h > @@ -200,6 +200,12 @@ extern struct klist *bus_get_device_klist(struct bus_type *bus); > * @owner: The module owner. > * @mod_name: Used for built-in modules. > * @suppress_bind_attrs: Disables bind/unbind via sysfs. > + * @delay_probe: this driver is requesting a deferred probe since > + * initialization. This can be desirable if its known the device probe > + * or initialization takes more than 30 seconds. > + * @delayed_probe_devs: devices which have gone through a delayed probe. This > + * is used internally by the driver core to keep track of which devices > + * have gone through a delayed probe. > * @of_match_table: The open firmware table. > * @acpi_match_table: The ACPI match table. > * @probe: Called to query the existence of a specific device, > @@ -234,6 +240,9 @@ struct device_driver { > > bool suppress_bind_attrs; /* disables bind/unbind via sysfs */ > > + bool delay_probe; /* requests deferred probe */ > + struct list_head delayed_probe_devs; > + This field isn't used anywhere actually in the patch...? Takashi > const struct of_device_id *of_match_table; > const struct acpi_device_id *acpi_match_table; > > @@ -715,6 +724,8 @@ struct acpi_dev_node { > * > * @offline_disabled: If set, the device is permanently online. > * @offline: Set after successful invocation of bus type's .offline(). > + * @init_delayed_probe: lets the coore keep track if the device has already > + * gone through a delayed probe upon init. > * > * At the lowest level, every device in a Linux system is represented by an > * instance of struct device. The device structure contains the information > @@ -793,6 +804,7 @@ struct device { > > bool offline_disabled:1; > bool offline:1; > + bool init_delayed_probe:1; > }; > > static inline struct device *kobj_to_dev(struct kobject *kobj) > -- > 2.0.1 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/