Date: Tue, 7 Jul 2015 01:23:15 +0200
From: "Luis R. Rodriguez"
To: Dan Williams
Cc: Tom Gundersen, Dmitry Torokhov, Greg Kroah-Hartman, Tejun Heo,
    Linux Kernel Mailing List, Arjan van de Ven, Rusty Russell,
    Olof Johansson, Tetsuo Handa
Subject: Re: [PATCH 2/8] driver-core: add asynchronous probing support for drivers
Message-ID: <20150706232315.GK7021@wotan.suse.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
> On Fri, Jul 3, 2015 at 11:30 AM, Luis R. Rodriguez wrote:
> > On Sat, Jun 27, 2015 at 04:45:25PM -0700, Dan Williams wrote:
> >> On Mon, Mar 30, 2015 at 4:20 PM, Dmitry Torokhov wrote:
> >> > Some devices take a long time when initializing, and not all drivers are
> >> > suited to initialize their devices when they are open. For example,
> >> > input drivers need to interrogate their devices in order to publish
> >> > device's capabilities before userspace will open them. When such drivers
> >> > are compiled into kernel they may stall entire kernel initialization.
> >> >
> >> > This change allows drivers request for their probe functions to be
> >> > called asynchronously during driver and device registration (manual
> >> > binding is still synchronous). Because async_schedule is used to perform
> >> > asynchronous calls module loading will still wait for the probing to
> >> > complete.
> >> >
> >> > Note that the end goal is to make the probing asynchronous by default,
> >> > so annotating drivers with PROBE_PREFER_ASYNCHRONOUS is a temporary
> >> > measure that allows us to speed up boot process while we validating and
> >> > fixing the rest of the drivers and preparing userspace.
> >> >
> >> > This change is based on earlier patch by "Luis R. Rodriguez"
> >> >
> >> > Signed-off-by: Dmitry Torokhov
> >> > ---
> >> >  drivers/base/base.h    |   1 +
> >> >  drivers/base/bus.c     |  31 +++++++---
> >> >  drivers/base/dd.c      | 149 ++++++++++++++++++++++++++++++++++++++++++-------
> >> >  include/linux/device.h |  28 ++++++++++
> >> >  4 files changed, 182 insertions(+), 27 deletions(-)
> >>
> >> Just noticed this patch. It caught my eye because I had a hard time
> >> getting an open coded implementation of asynchronous probing to work
> >> in the new libnvdimm subsystem. Especially the messy races of tearing
> >> things down while probing is still in flight. I ended up implementing
> >> asynchronous device registration which eliminated a lot of complexity
> >> and of course the bugs. In general I tend to think that async
> >> registration is less risky than async probe since it keeps wider
> >> portions of the traditional device model synchronous
> >
> > but its not see -DEFER_PROBE even before async probe.
>
> Except in that case you know probe has been seen by the driver at
> least once. So I see that as less of a surprise, but point taken.
>
> >> and leverages the
> >> fact that the device model is already well prepared for asynchronous
> >> arrival of devices due to hotplug.
> >
> > I think this sounds reasonable, do you have your code upstream or posted?
>
> Yes, see nd_device_register() in drivers/nvdimm/bus.c

I think it should be rather easy for Dmitry to check whether he can convert
this input driver (not yet upstream) to this API and see if the same issues
are fixed.
This, however, does not address systemd's assumption over detachment of
module load and probe. The inherent problem there was the timeout implemented
and carried in systemd over the worker that uses modlib to load modules. Upon
review the code was complex enough already, and surely increasing the timeout
helps, but that doesn't address all issues with a general timeout in place.
At SUSE we ended up not using a timeout for kmod built-in commands. That
leaves the original timeout purpose in place.

The code for async probe was not in the kernel then, but since it's now
upstream we should be able to replace that userspace systemd workaround with
async probe; the systemd folks would need to decide what they want to do.
For the full gory details refer to:

https://bugzilla.novell.com/show_bug.cgi?id=889297

> > If not will you be at Plumbers?
>
> Yes.

Great. Tom, Dmitry, will you be at Plumbers?

> > Maybe we should talk about this as although
> > ChromeOS already likely already jumped on async probe we should address a
> > way forward and path forward for other distributions and I don't think anyone
> > is looking too much into it. async probe came to Linux for two reasons:
> >
> > * chromeos wanting it
> > * an incorrect systemd assumption on how the driver core works
> >
> > So long term we still need to address the systemd approach, are they going
> > to be defaulting now to async probe for all modules? How about for built-ins?
> >
> > We should talk about this and maybe at plumbers.
> >
> >> Splitting the "initial probe" from
> >> the "manual probe" case seems like a recipe for confusion.
> >
> > If you can come up with pros / cons on both strategies it'd be
> > valuable.
>
> The problem I ran into was needing to remove devices that still had
> yet to be probed and not being able to use registration completion vs
> the device_lock() to effectively synchronize the sub-system.

Interesting, what cases would this happen under?

  Luis