On Fri, Jul 3, 2015 at 11:30 AM, Luis R. Rodriguez <[email protected]> wrote:
> On Sat, Jun 27, 2015 at 04:45:25PM -0700, Dan Williams wrote:
>> On Mon, Mar 30, 2015 at 4:20 PM, Dmitry Torokhov
>> <[email protected]> wrote:
>> > Some devices take a long time when initializing, and not all drivers are
>> > suited to initialize their devices when they are open. For example,
>> > input drivers need to interrogate their devices in order to publish
>> > device's capabilities before userspace will open them. When such drivers
>> > are compiled into kernel they may stall entire kernel initialization.
>> >
>> > This change allows drivers request for their probe functions to be
>> > called asynchronously during driver and device registration (manual
>> > binding is still synchronous). Because async_schedule is used to perform
>> > asynchronous calls module loading will still wait for the probing to
>> > complete.
>> >
>> > Note that the end goal is to make the probing asynchronous by default,
>> > so annotating drivers with PROBE_PREFER_ASYNCHRONOUS is a temporary
>> > measure that allows us to speed up boot process while we validating and
>> > fixing the rest of the drivers and preparing userspace.
>> >
>> > This change is based on earlier patch by "Luis R. Rodriguez"
>> > <[email protected]>
>> >
>> > Signed-off-by: Dmitry Torokhov <[email protected]>
>> > ---
>> > drivers/base/base.h | 1 +
>> > drivers/base/bus.c | 31 +++++++---
>> > drivers/base/dd.c | 149 ++++++++++++++++++++++++++++++++++++++++++-------
>> > include/linux/device.h | 28 ++++++++++
>> > 4 files changed, 182 insertions(+), 27 deletions(-)
>>
>> Just noticed this patch. It caught my eye because I had a hard time
>> getting an open coded implementation of asynchronous probing to work
>> in the new libnvdimm subsystem. Especially the messy races of tearing
>> things down while probing is still in flight. I ended up implementing
>> asynchronous device registration which eliminated a lot of complexity
>> and of course the bugs. In general I tend to think that async
>> registration is less risky than async probe since it keeps wider
>> portions of the traditional device model synchronous
>
> but its not see -DEFER_PROBE even before async probe.
Except in that case you know probe has been seen by the driver at
least once. So I see that as less of a surprise, but point taken.
>> and leverages the
>> fact that the device model is already well prepared for asynchronous
>> arrival of devices due to hotplug.
>
> I think this sounds reasonable, do you have your code upstream or posted?
Yes, see nd_device_register() in drivers/nvdimm/bus.c
> If not will you be at Plumbers?
Yes.
> Maybe we shoudl talk about this as although
> ChromeOS already likely already jumped on async probe we should address a
> way forward and path forward for other distributions and I don't think anyone
> is looking too much into it. async probe came to Linux for two reasons:
>
> * chromeos wanting it
> * an incorrect systemd assumption on how the driver core works
>
> So long term we still need to address the systemd approach, are they going
> to be defaulting now to async probe for all modules? How about for built-ins?
>
> We should talk about this and maybe at plumbers.
>
>> Splitting the "initial probe" from
>> the "manual probe" case seems like a recipe for confusion.
>
> If you can come up with pros / cons on both strategies it'd be
> valuable.
The problem I ran into was needing to remove devices that still had
yet to be probed and not being able to use registration completion vs
the device_lock() to effectively synchronize the sub-system.
On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
> On Fri, Jul 3, 2015 at 11:30 AM, Luis R. Rodriguez <[email protected]> wrote:
> > On Sat, Jun 27, 2015 at 04:45:25PM -0700, Dan Williams wrote:
> >> On Mon, Mar 30, 2015 at 4:20 PM, Dmitry Torokhov
> >> <[email protected]> wrote:
> >> > Some devices take a long time when initializing, and not all drivers are
> >> > suited to initialize their devices when they are open. For example,
> >> > input drivers need to interrogate their devices in order to publish
> >> > device's capabilities before userspace will open them. When such drivers
> >> > are compiled into kernel they may stall entire kernel initialization.
> >> >
> >> > This change allows drivers request for their probe functions to be
> >> > called asynchronously during driver and device registration (manual
> >> > binding is still synchronous). Because async_schedule is used to perform
> >> > asynchronous calls module loading will still wait for the probing to
> >> > complete.
> >> >
> >> > Note that the end goal is to make the probing asynchronous by default,
> >> > so annotating drivers with PROBE_PREFER_ASYNCHRONOUS is a temporary
> >> > measure that allows us to speed up boot process while we validating and
> >> > fixing the rest of the drivers and preparing userspace.
> >> >
> >> > This change is based on earlier patch by "Luis R. Rodriguez"
> >> > <[email protected]>
> >> >
> >> > Signed-off-by: Dmitry Torokhov <[email protected]>
> >> > ---
> >> > drivers/base/base.h | 1 +
> >> > drivers/base/bus.c | 31 +++++++---
> >> > drivers/base/dd.c | 149 ++++++++++++++++++++++++++++++++++++++++++-------
> >> > include/linux/device.h | 28 ++++++++++
> >> > 4 files changed, 182 insertions(+), 27 deletions(-)
> >>
> >> Just noticed this patch. It caught my eye because I had a hard time
> >> getting an open coded implementation of asynchronous probing to work
> >> in the new libnvdimm subsystem. Especially the messy races of tearing
> >> things down while probing is still in flight. I ended up implementing
> >> asynchronous device registration which eliminated a lot of complexity
> >> and of course the bugs. In general I tend to think that async
> >> registration is less risky than async probe since it keeps wider
> >> portions of the traditional device model synchronous
> >
> > but its not see -DEFER_PROBE even before async probe.
>
> Except in that case you know probe has been seen by the driver at
> least once. So I see that as less of a surprise, but point taken.
>
> >> and leverages the
> >> fact that the device model is already well prepared for asynchronous
> >> arrival of devices due to hotplug.
> >
> > I think this sounds reasonable, do you have your code upstream or posted?
>
> Yes, see nd_device_register() in drivers/nvdimm/bus.c
It should be I think rather easy for Dmitry to see if he can convert this input
driver (not yet upstream) to this API and see if the same issues are fixed.
This however does not address systemd's assumption over detachment of module
load and probe. The inherent problem there was the timeout implemented and
carried in systemd over the worker that uses modlib to load modules. Upon
review the code was complex enough already and surely increasing the timeout
helps but that doesn't address all issues with a general timeout in place.
At SUSE we ended up not using a timeout for kmod built-in commands. That
leaves the original timeout purpose in place. The code for async probe was
not put in the kernel though but since its now upstream we should be able
to replace that userspace systemd work around with async probe, but systemd
folks would need to decide what they want to do. For full gory details of
this refer to:
https://bugzilla.novell.com/show_bug.cgi?id=889297
> > If not will you be at Plumbers?
>
> Yes.
Great, Tom, Dmitry, will you be at Plumbers?
> > Maybe we shoudl talk about this as although
> > ChromeOS already likely already jumped on async probe we should address a
> > way forward and path forward for other distributions and I don't think anyone
> > is looking too much into it. async probe came to Linux for two reasons:
> >
> > * chromeos wanting it
> > * an incorrect systemd assumption on how the driver core works
> >
> > So long term we still need to address the systemd approach, are they going
> > to be defaulting now to async probe for all modules? How about for built-ins?
> >
> > We should talk about this and maybe at plumbers.
> >
> >> Splitting the "initial probe" from
> >> the "manual probe" case seems like a recipe for confusion.
> >
> > If you can come up with pros / cons on both strategies it'd be
> > valuable.
>
> The problem I ran into was needing to remove devices that still had
> yet to be probed and not being able to use registration completion vs
> the device_lock() to effectively synchronize the sub-system.
Interesting, what cases would this happen under?
Luis
On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
> On Fri, Jul 3, 2015 at 11:30 AM, Luis R. Rodriguez <[email protected]> wrote:
> > On Sat, Jun 27, 2015 at 04:45:25PM -0700, Dan Williams wrote:
> >> On Mon, Mar 30, 2015 at 4:20 PM, Dmitry Torokhov
> >> <[email protected]> wrote:
> >> > Some devices take a long time when initializing, and not all drivers are
> >> > suited to initialize their devices when they are open. For example,
> >> > input drivers need to interrogate their devices in order to publish
> >> > device's capabilities before userspace will open them. When such drivers
> >> > are compiled into kernel they may stall entire kernel initialization.
> >> >
> >> > This change allows drivers request for their probe functions to be
> >> > called asynchronously during driver and device registration (manual
> >> > binding is still synchronous). Because async_schedule is used to perform
> >> > asynchronous calls module loading will still wait for the probing to
> >> > complete.
> >> >
> >> > Note that the end goal is to make the probing asynchronous by default,
> >> > so annotating drivers with PROBE_PREFER_ASYNCHRONOUS is a temporary
> >> > measure that allows us to speed up boot process while we validating and
> >> > fixing the rest of the drivers and preparing userspace.
> >> >
> >> > This change is based on earlier patch by "Luis R. Rodriguez"
> >> > <[email protected]>
> >> >
> >> > Signed-off-by: Dmitry Torokhov <[email protected]>
> >> > ---
> >> > drivers/base/base.h | 1 +
> >> > drivers/base/bus.c | 31 +++++++---
> >> > drivers/base/dd.c | 149 ++++++++++++++++++++++++++++++++++++++++++-------
> >> > include/linux/device.h | 28 ++++++++++
> >> > 4 files changed, 182 insertions(+), 27 deletions(-)
> >>
> >> Just noticed this patch. It caught my eye because I had a hard time
> >> getting an open coded implementation of asynchronous probing to work
> >> in the new libnvdimm subsystem. Especially the messy races of tearing
> >> things down while probing is still in flight. I ended up implementing
> >> asynchronous device registration which eliminated a lot of complexity
> >> and of course the bugs. In general I tend to think that async
> >> registration is less risky than async probe since it keeps wider
> >> portions of the traditional device model synchronous
> >
> > but its not see -DEFER_PROBE even before async probe.
>
> Except in that case you know probe has been seen by the driver at
> least once. So I see that as less of a surprise, but point taken.
>
> >> and leverages the
> >> fact that the device model is already well prepared for asynchronous
> >> arrival of devices due to hotplug.
> >
> > I think this sounds reasonable, do you have your code upstream or posted?
>
> Yes, see nd_device_register() in drivers/nvdimm/bus.c
So no error handling whatsoever, as expected...
>
> > If not will you be at Plumbers?
>
> Yes.
Me too.
>
> > Maybe we shoudl talk about this as although
> > ChromeOS already likely already jumped on async probe we should address a
> > way forward and path forward for other distributions and I don't think anyone
> > is looking too much into it. async probe came to Linux for two reasons:
> >
> > * chromeos wanting it
> > * an incorrect systemd assumption on how the driver core works
> >
> > So long term we still need to address the systemd approach, are they going
> > to be defaulting now to async probe for all modules? How about for built-ins?
> >
> > We should talk about this and maybe at plumbers.
> >
> >> Splitting the "initial probe" from
> >> the "manual probe" case seems like a recipe for confusion.
> >
> > If you can come up with pros / cons on both strategies it'd be
> > valuable.
>
> The problem I ran into was needing to remove devices that still had
> yet to be probed and not being able to use registration completion vs
> the device_lock() to effectively synchronize the sub-system.
Why do you need to "synchronize the sub-system"? The asynchronous
probing should be transparent to the driver. Just unregister the device
(or the driver) and driver core will ensure that probe() is not in
flight.
Confused.
--
Dmitry
On Tue, Jul 07, 2015 at 01:23:15AM +0200, Luis R. Rodriguez wrote:
> On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
> > On Fri, Jul 3, 2015 at 11:30 AM, Luis R. Rodriguez <[email protected]> wrote:
> > > On Sat, Jun 27, 2015 at 04:45:25PM -0700, Dan Williams wrote:
> > >> On Mon, Mar 30, 2015 at 4:20 PM, Dmitry Torokhov
> > >> <[email protected]> wrote:
> > >> > Some devices take a long time when initializing, and not all drivers are
> > >> > suited to initialize their devices when they are open. For example,
> > >> > input drivers need to interrogate their devices in order to publish
> > >> > device's capabilities before userspace will open them. When such drivers
> > >> > are compiled into kernel they may stall entire kernel initialization.
> > >> >
> > >> > This change allows drivers request for their probe functions to be
> > >> > called asynchronously during driver and device registration (manual
> > >> > binding is still synchronous). Because async_schedule is used to perform
> > >> > asynchronous calls module loading will still wait for the probing to
> > >> > complete.
> > >> >
> > >> > Note that the end goal is to make the probing asynchronous by default,
> > >> > so annotating drivers with PROBE_PREFER_ASYNCHRONOUS is a temporary
> > >> > measure that allows us to speed up boot process while we validating and
> > >> > fixing the rest of the drivers and preparing userspace.
> > >> >
> > >> > This change is based on earlier patch by "Luis R. Rodriguez"
> > >> > <[email protected]>
> > >> >
> > >> > Signed-off-by: Dmitry Torokhov <[email protected]>
> > >> > ---
> > >> > drivers/base/base.h | 1 +
> > >> > drivers/base/bus.c | 31 +++++++---
> > >> > drivers/base/dd.c | 149 ++++++++++++++++++++++++++++++++++++++++++-------
> > >> > include/linux/device.h | 28 ++++++++++
> > >> > 4 files changed, 182 insertions(+), 27 deletions(-)
> > >>
> > >> Just noticed this patch. It caught my eye because I had a hard time
> > >> getting an open coded implementation of asynchronous probing to work
> > >> in the new libnvdimm subsystem. Especially the messy races of tearing
> > >> things down while probing is still in flight. I ended up implementing
> > >> asynchronous device registration which eliminated a lot of complexity
> > >> and of course the bugs. In general I tend to think that async
> > >> registration is less risky than async probe since it keeps wider
> > >> portions of the traditional device model synchronous
> > >
> > > but its not see -DEFER_PROBE even before async probe.
> >
> > Except in that case you know probe has been seen by the driver at
> > least once. So I see that as less of a surprise, but point taken.
> >
> > >> and leverages the
> > >> fact that the device model is already well prepared for asynchronous
> > >> arrival of devices due to hotplug.
> > >
> > > I think this sounds reasonable, do you have your code upstream or posted?
> >
> > Yes, see nd_device_register() in drivers/nvdimm/bus.c
>
> It should be I think rather easy for Dmitry to see if he can convert this input
> driver (not yet upstream) to this API and see if the same issues are fixed.
No, I would rather not as it means we lose error handling on device
registration.
Thanks.
--
Dmitry
On Tue, Jul 7, 2015 at 1:23 AM, Luis R. Rodriguez <[email protected]> wrote:
> On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
>> On Fri, Jul 3, 2015 at 11:30 AM, Luis R. Rodriguez <[email protected]> wrote:
>> > On Sat, Jun 27, 2015 at 04:45:25PM -0700, Dan Williams wrote:
>> >> On Mon, Mar 30, 2015 at 4:20 PM, Dmitry Torokhov
>> >> <[email protected]> wrote:
>> >> > Some devices take a long time when initializing, and not all drivers are
>> >> > suited to initialize their devices when they are open. For example,
>> >> > input drivers need to interrogate their devices in order to publish
>> >> > device's capabilities before userspace will open them. When such drivers
>> >> > are compiled into kernel they may stall entire kernel initialization.
>> >> >
>> >> > This change allows drivers request for their probe functions to be
>> >> > called asynchronously during driver and device registration (manual
>> >> > binding is still synchronous). Because async_schedule is used to perform
>> >> > asynchronous calls module loading will still wait for the probing to
>> >> > complete.
>> >> >
>> >> > Note that the end goal is to make the probing asynchronous by default,
>> >> > so annotating drivers with PROBE_PREFER_ASYNCHRONOUS is a temporary
>> >> > measure that allows us to speed up boot process while we validating and
>> >> > fixing the rest of the drivers and preparing userspace.
>> >> >
>> >> > This change is based on earlier patch by "Luis R. Rodriguez"
>> >> > <[email protected]>
>> >> >
>> >> > Signed-off-by: Dmitry Torokhov <[email protected]>
>> >> > ---
>> >> > drivers/base/base.h | 1 +
>> >> > drivers/base/bus.c | 31 +++++++---
>> >> > drivers/base/dd.c | 149 ++++++++++++++++++++++++++++++++++++++++++-------
>> >> > include/linux/device.h | 28 ++++++++++
>> >> > 4 files changed, 182 insertions(+), 27 deletions(-)
>> >>
>> >> Just noticed this patch. It caught my eye because I had a hard time
>> >> getting an open coded implementation of asynchronous probing to work
>> >> in the new libnvdimm subsystem. Especially the messy races of tearing
>> >> things down while probing is still in flight. I ended up implementing
>> >> asynchronous device registration which eliminated a lot of complexity
>> >> and of course the bugs. In general I tend to think that async
>> >> registration is less risky than async probe since it keeps wider
>> >> portions of the traditional device model synchronous
>> >
>> > but its not see -DEFER_PROBE even before async probe.
>>
>> Except in that case you know probe has been seen by the driver at
>> least once. So I see that as less of a surprise, but point taken.
>>
>> >> and leverages the
>> >> fact that the device model is already well prepared for asynchronous
>> >> arrival of devices due to hotplug.
>> >
>> > I think this sounds reasonable, do you have your code upstream or posted?
>>
>> Yes, see nd_device_register() in drivers/nvdimm/bus.c
>
> It should be I think rather easy for Dmitry to see if he can convert this input
> driver (not yet upstream) to this API and see if the same issues are fixed.
> This however does not address systemd's assumption over detachment of module
> load and probe. The inherent problem there was the timeout implemented and
> carried in systemd over the worker that uses modlib to load modules. Upon
> review the code was complex enough already and surely increasing the timeout
> helps but that doesn't address all issues with a general timeout in place.
> At SUSE we ended up not using a timeout for kmod built-in commands. That
> leaves the original timeout purpose in place. The code for async probe was
> not put in the kernel though but since its now upstream we should be able
> to replace that userspace systemd work around with async probe, but systemd
> folks would need to decide what they want to do. For full gory details of
> this refer to:
>
> https://bugzilla.novell.com/show_bug.cgi?id=889297
FTR, this does not appear to be public, so I was not able to see it.
>> > If not will you be at Plumbers?
>>
>> Yes.
>
> Great, Tom, Dmitry, will you be at Plumbers?
Sadly, I won't make plumbers this year.
Cheers,
Tom
On Mon, Jul 6, 2015 at 4:40 PM, Dmitry Torokhov
<[email protected]> wrote:
> On Tue, Jul 07, 2015 at 01:23:15AM +0200, Luis R. Rodriguez wrote:
>> On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
>> > On Fri, Jul 3, 2015 at 11:30 AM, Luis R. Rodriguez <[email protected]> wrote:
>> > > On Sat, Jun 27, 2015 at 04:45:25PM -0700, Dan Williams wrote:
>> > >> On Mon, Mar 30, 2015 at 4:20 PM, Dmitry Torokhov
>> > >> <[email protected]> wrote:
>> > >> > Some devices take a long time when initializing, and not all drivers are
>> > >> > suited to initialize their devices when they are open. For example,
>> > >> > input drivers need to interrogate their devices in order to publish
>> > >> > device's capabilities before userspace will open them. When such drivers
>> > >> > are compiled into kernel they may stall entire kernel initialization.
>> > >> >
>> > >> > This change allows drivers request for their probe functions to be
>> > >> > called asynchronously during driver and device registration (manual
>> > >> > binding is still synchronous). Because async_schedule is used to perform
>> > >> > asynchronous calls module loading will still wait for the probing to
>> > >> > complete.
>> > >> >
>> > >> > Note that the end goal is to make the probing asynchronous by default,
>> > >> > so annotating drivers with PROBE_PREFER_ASYNCHRONOUS is a temporary
>> > >> > measure that allows us to speed up boot process while we validating and
>> > >> > fixing the rest of the drivers and preparing userspace.
>> > >> >
>> > >> > This change is based on earlier patch by "Luis R. Rodriguez"
>> > >> > <[email protected]>
>> > >> >
>> > >> > Signed-off-by: Dmitry Torokhov <[email protected]>
>> > >> > ---
>> > >> > drivers/base/base.h | 1 +
>> > >> > drivers/base/bus.c | 31 +++++++---
>> > >> > drivers/base/dd.c | 149 ++++++++++++++++++++++++++++++++++++++++++-------
>> > >> > include/linux/device.h | 28 ++++++++++
>> > >> > 4 files changed, 182 insertions(+), 27 deletions(-)
>> > >>
>> > >> Just noticed this patch. It caught my eye because I had a hard time
>> > >> getting an open coded implementation of asynchronous probing to work
>> > >> in the new libnvdimm subsystem. Especially the messy races of tearing
>> > >> things down while probing is still in flight. I ended up implementing
>> > >> asynchronous device registration which eliminated a lot of complexity
>> > >> and of course the bugs. In general I tend to think that async
>> > >> registration is less risky than async probe since it keeps wider
>> > >> portions of the traditional device model synchronous
>> > >
>> > > but its not see -DEFER_PROBE even before async probe.
>> >
>> > Except in that case you know probe has been seen by the driver at
>> > least once. So I see that as less of a surprise, but point taken.
>> >
>> > >> and leverages the
>> > >> fact that the device model is already well prepared for asynchronous
>> > >> arrival of devices due to hotplug.
>> > >
>> > > I think this sounds reasonable, do you have your code upstream or posted?
>> >
>> > Yes, see nd_device_register() in drivers/nvdimm/bus.c
>>
>> It should be I think rather easy for Dmitry to see if he can convert this input
>> driver (not yet upstream) to this API and see if the same issues are fixed.
>
> No, I would rather not as it means we lose error handling on device
> registration.
>
I think this is a red herring as I don't see how async probing is any
better at handling device registration errors. The error is logged
and "handled" by the fact that a device fails to appear, what other
action would you take? In fact libnvdimm does detect registration
failures and reports that in a parent device attribute (at least for a
region device and their namespace child devices).
On Mon, Jul 6, 2015 at 4:38 PM, Dmitry Torokhov
<[email protected]> wrote:
> On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
>> The problem I ran into was needing to remove devices that still had
>> yet to be probed and not being able to use registration completion vs
>> the device_lock() to effectively synchronize the sub-system.
>
> Why do you need to "synchronize the sub-system"? The asynchronous
> probing should be transparent to the driver. Just unregister the device
> (or the driver) and driver core will ensure that probe() is not in
> flight.
Async registration is indeed transparent to the driver. The primary
need to "flush registration" is the case of "region" devices that
reference a set of NVDIMM devices. A region device requires all
related NVDIMMs to be active before the region can be enabled.
I'll look into a more concrete example of the tradeoffs between
asynchronous probing vs registration.
On Wed, Jul 08, 2015 at 05:36:04PM -0700, Dan Williams wrote:
> On Mon, Jul 6, 2015 at 4:40 PM, Dmitry Torokhov
> <[email protected]> wrote:
> > On Tue, Jul 07, 2015 at 01:23:15AM +0200, Luis R. Rodriguez wrote:
> >> On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
> >> > On Fri, Jul 3, 2015 at 11:30 AM, Luis R. Rodriguez <[email protected]> wrote:
> >> > > On Sat, Jun 27, 2015 at 04:45:25PM -0700, Dan Williams wrote:
> >> > >> On Mon, Mar 30, 2015 at 4:20 PM, Dmitry Torokhov
> >> > >> <[email protected]> wrote:
> >> > >> > Some devices take a long time when initializing, and not all drivers are
> >> > >> > suited to initialize their devices when they are open. For example,
> >> > >> > input drivers need to interrogate their devices in order to publish
> >> > >> > device's capabilities before userspace will open them. When such drivers
> >> > >> > are compiled into kernel they may stall entire kernel initialization.
> >> > >> >
> >> > >> > This change allows drivers request for their probe functions to be
> >> > >> > called asynchronously during driver and device registration (manual
> >> > >> > binding is still synchronous). Because async_schedule is used to perform
> >> > >> > asynchronous calls module loading will still wait for the probing to
> >> > >> > complete.
> >> > >> >
> >> > >> > Note that the end goal is to make the probing asynchronous by default,
> >> > >> > so annotating drivers with PROBE_PREFER_ASYNCHRONOUS is a temporary
> >> > >> > measure that allows us to speed up boot process while we validating and
> >> > >> > fixing the rest of the drivers and preparing userspace.
> >> > >> >
> >> > >> > This change is based on earlier patch by "Luis R. Rodriguez"
> >> > >> > <[email protected]>
> >> > >> >
> >> > >> > Signed-off-by: Dmitry Torokhov <[email protected]>
> >> > >> > ---
> >> > >> > drivers/base/base.h | 1 +
> >> > >> > drivers/base/bus.c | 31 +++++++---
> >> > >> > drivers/base/dd.c | 149 ++++++++++++++++++++++++++++++++++++++++++-------
> >> > >> > include/linux/device.h | 28 ++++++++++
> >> > >> > 4 files changed, 182 insertions(+), 27 deletions(-)
> >> > >>
> >> > >> Just noticed this patch. It caught my eye because I had a hard time
> >> > >> getting an open coded implementation of asynchronous probing to work
> >> > >> in the new libnvdimm subsystem. Especially the messy races of tearing
> >> > >> things down while probing is still in flight. I ended up implementing
> >> > >> asynchronous device registration which eliminated a lot of complexity
> >> > >> and of course the bugs. In general I tend to think that async
> >> > >> registration is less risky than async probe since it keeps wider
> >> > >> portions of the traditional device model synchronous
> >> > >
> >> > > but its not see -DEFER_PROBE even before async probe.
> >> >
> >> > Except in that case you know probe has been seen by the driver at
> >> > least once. So I see that as less of a surprise, but point taken.
> >> >
> >> > >> and leverages the
> >> > >> fact that the device model is already well prepared for asynchronous
> >> > >> arrival of devices due to hotplug.
> >> > >
> >> > > I think this sounds reasonable, do you have your code upstream or posted?
> >> >
> >> > Yes, see nd_device_register() in drivers/nvdimm/bus.c
> >>
> >> It should be I think rather easy for Dmitry to see if he can convert this input
> >> driver (not yet upstream) to this API and see if the same issues are fixed.
> >
> > No, I would rather not as it means we lose error handling on device
> > registration.
> >
>
> I think this is a red herring as I don't see how async probing is any
> better at handling device registration errors. The error is logged
> and "handled" by the fact that a device fails to appear, what other
> action would you take? In fact libnvdimm does detect registration
> failures and reports that in a parent device attribute (at least for a
> region device and their namespace child devices).
What is libnvdimm behavior if you try to unload a module that tries to
register a device but it failed? Memory leak or crash, right?
--
Dmitry
On Wed, Jul 08, 2015 at 05:43:23PM -0700, Dan Williams wrote:
> On Mon, Jul 6, 2015 at 4:38 PM, Dmitry Torokhov
> <[email protected]> wrote:
> > On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
> >> The problem I ran into was needing to remove devices that still had
> >> yet to be probed and not being able to use registration completion vs
> >> the device_lock() to effectively synchronize the sub-system.
> >
> > Why do you need to "synchronize the sub-system"? The asynchronous
> > probing should be transparent to the driver. Just unregister the device
> > (or the driver) and driver core will ensure that probe() is not in
> > flight.
>
> Async registration is indeed transparent to the driver. The primary
> need to "flush registration" is the case of "region" devices that
> reference a set of NVDIMM devices. A region device requires all
> related NVDIMMs to be active before the region can be enabled.
Sounds like you need to call into the subsystem to let it know that the
device is active and activate region devices when they are ready. Could
be either explicit call or you can try using bus notifiers for
bind/unbind events.
BTW, do you handle bind/unbind via sysfs (everyone forgets about this
mechanism)?
Thanks.
--
Dmitry
On Wed, Jul 8, 2015 at 5:52 PM, Dmitry Torokhov
<[email protected]> wrote:
> On Wed, Jul 08, 2015 at 05:43:23PM -0700, Dan Williams wrote:
>> On Mon, Jul 6, 2015 at 4:38 PM, Dmitry Torokhov
>> <[email protected]> wrote:
>> > On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
>> >> The problem I ran into was needing to remove devices that still had
>> >> yet to be probed and not being able to use registration completion vs
>> >> the device_lock() to effectively synchronize the sub-system.
>> >
>> > Why do you need to "synchronize the sub-system"? The asynchronous
>> > probing should be transparent to the driver. Just unregister the device
>> > (or the driver) and driver core will ensure that probe() is not in
>> > flight.
>>
>> Async registration is indeed transparent to the driver. The primary
>> need to "flush registration" is the case of "region" devices that
>> reference a set of NVDIMM devices. A region device requires all
>> related NVDIMMs to be active before the region can be enabled.
>
> Sounds like you need to call into the subsystem to let it know that the
> device is active and activate region devices when they are ready. Could
> be either explicit call or you can try using bus notifiers for
> bind/unbind events.
>
> BTW, do you handle bind/unbind via sysfs (everyone forgets about this
> mechanism)?
bind/unbind via systs is central to how libnvdimm operates. It's
covered by our unit tests.
On Wed, Jul 08, 2015 at 05:54:26PM -0700, Dan Williams wrote:
> On Wed, Jul 8, 2015 at 5:52 PM, Dmitry Torokhov
> <[email protected]> wrote:
> > On Wed, Jul 08, 2015 at 05:43:23PM -0700, Dan Williams wrote:
> >> On Mon, Jul 6, 2015 at 4:38 PM, Dmitry Torokhov
> >> <[email protected]> wrote:
> >> > On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
> >> >> The problem I ran into was needing to remove devices that still had
> >> >> yet to be probed and not being able to use registration completion vs
> >> >> the device_lock() to effectively synchronize the sub-system.
> >> >
> >> > Why do you need to "synchronize the sub-system"? The asynchronous
> >> > probing should be transparent to the driver. Just unregister the device
> >> > (or the driver) and driver core will ensure that probe() is not in
> >> > flight.
> >>
> >> Async registration is indeed transparent to the driver. The primary
> >> need to "flush registration" is the case of "region" devices that
> >> reference a set of NVDIMM devices. A region device requires all
> >> related NVDIMMs to be active before the region can be enabled.
> >
> > Sounds like you need to call into the subsystem to let it know that the
> > device is active and activate region devices when they are ready. Could
> > be either explicit call or you can try using bus notifiers for
> > bind/unbind events.
> >
> > BTW, do you handle bind/unbind via sysfs (everyone forgets about this
> > mechanism)?
>
> bind/unbind via systs is central to how libnvdimm operates. It's
> covered by our unit tests.
Ah, excellent then.
--
Dmitry
On Wed, Jul 8, 2015 at 5:49 PM, Dmitry Torokhov
<[email protected]> wrote:
> On Wed, Jul 08, 2015 at 05:36:04PM -0700, Dan Williams wrote:
>> On Mon, Jul 6, 2015 at 4:40 PM, Dmitry Torokhov
>> <[email protected]> wrote:
>> > On Tue, Jul 07, 2015 at 01:23:15AM +0200, Luis R. Rodriguez wrote:
>> >> On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
>> >> > On Fri, Jul 3, 2015 at 11:30 AM, Luis R. Rodriguez <[email protected]> wrote:
>> >> > > On Sat, Jun 27, 2015 at 04:45:25PM -0700, Dan Williams wrote:
>> >> > >> On Mon, Mar 30, 2015 at 4:20 PM, Dmitry Torokhov
>> >> > >> <[email protected]> wrote:
>> >> > >> > Some devices take a long time when initializing, and not all drivers are
>> >> > >> > suited to initialize their devices when they are open. For example,
>> >> > >> > input drivers need to interrogate their devices in order to publish
>> >> > >> > device's capabilities before userspace will open them. When such drivers
>> >> > >> > are compiled into kernel they may stall entire kernel initialization.
>> >> > >> >
>> >> > >> > This change allows drivers request for their probe functions to be
>> >> > >> > called asynchronously during driver and device registration (manual
>> >> > >> > binding is still synchronous). Because async_schedule is used to perform
>> >> > >> > asynchronous calls module loading will still wait for the probing to
>> >> > >> > complete.
>> >> > >> >
>> >> > >> > Note that the end goal is to make the probing asynchronous by default,
>> >> > >> > so annotating drivers with PROBE_PREFER_ASYNCHRONOUS is a temporary
>> >> > >> > measure that allows us to speed up boot process while we validating and
>> >> > >> > fixing the rest of the drivers and preparing userspace.
>> >> > >> >
>> >> > >> > This change is based on earlier patch by "Luis R. Rodriguez"
>> >> > >> > <[email protected]>
>> >> > >> >
>> >> > >> > Signed-off-by: Dmitry Torokhov <[email protected]>
>> >> > >> > ---
>> >> > >> > drivers/base/base.h | 1 +
>> >> > >> > drivers/base/bus.c | 31 +++++++---
>> >> > >> > drivers/base/dd.c | 149 ++++++++++++++++++++++++++++++++++++++++++-------
>> >> > >> > include/linux/device.h | 28 ++++++++++
>> >> > >> > 4 files changed, 182 insertions(+), 27 deletions(-)
>> >> > >>
>> >> > >> Just noticed this patch. It caught my eye because I had a hard time
>> >> > >> getting an open coded implementation of asynchronous probing to work
>> >> > >> in the new libnvdimm subsystem. Especially the messy races of tearing
>> >> > >> things down while probing is still in flight. I ended up implementing
>> >> > >> asynchronous device registration which eliminated a lot of complexity
>> >> > >> and of course the bugs. In general I tend to think that async
>> >> > >> registration is less risky than async probe since it keeps wider
>> >> > >> portions of the traditional device model synchronous
>> >> > >
>> >> > > but its not see -DEFER_PROBE even before async probe.
>> >> >
>> >> > Except in that case you know probe has been seen by the driver at
>> >> > least once. So I see that as less of a surprise, but point taken.
>> >> >
>> >> > >> and leverages the
>> >> > >> fact that the device model is already well prepared for asynchronous
>> >> > >> arrival of devices due to hotplug.
>> >> > >
>> >> > > I think this sounds reasonable, do you have your code upstream or posted?
>> >> >
>> >> > Yes, see nd_device_register() in drivers/nvdimm/bus.c
>> >>
>> >> It should be I think rather easy for Dmitry to see if he can convert this input
>> >> driver (not yet upstream) to this API and see if the same issues are fixed.
>> >
>> > No, I would rather not as it means we lose error handling on device
>> > registration.
>> >
>>
>> I think this is a red herring as I don't see how async probing is any
>> better at handling device registration errors. The error is logged
>> and "handled" by the fact that a device fails to appear, what other
>> action would you take? In fact libnvdimm does detect registration
>> failures and reports that in a parent device attribute (at least for a
>> region device and their namespace child devices).
>
> What is libnvdimm behavior if you try to unload a module that tries to
> register a device but it failed? Memory leak or crash, right?
No, in the case of the "region" driver it is part of the core
libnvdimm and it is pinned while any region device is active. But, it
is a fair point for a generic facility it would need to consider cases
where a driver registers children and then is unloaded. I'll need to
think about that one more relative to async probing.
On Wed, Jul 08, 2015 at 06:00:41PM -0700, Dan Williams wrote:
> On Wed, Jul 8, 2015 at 5:49 PM, Dmitry Torokhov
> <[email protected]> wrote:
> > On Wed, Jul 08, 2015 at 05:36:04PM -0700, Dan Williams wrote:
> >> On Mon, Jul 6, 2015 at 4:40 PM, Dmitry Torokhov
> >> <[email protected]> wrote:
> >> > On Tue, Jul 07, 2015 at 01:23:15AM +0200, Luis R. Rodriguez wrote:
> >> >> On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
> >> >> > On Fri, Jul 3, 2015 at 11:30 AM, Luis R. Rodriguez <[email protected]> wrote:
> >> >> > > On Sat, Jun 27, 2015 at 04:45:25PM -0700, Dan Williams wrote:
> >> >> > >> On Mon, Mar 30, 2015 at 4:20 PM, Dmitry Torokhov
> >> >> > >> <[email protected]> wrote:
> >> >> > >> > Some devices take a long time when initializing, and not all drivers are
> >> >> > >> > suited to initialize their devices when they are open. For example,
> >> >> > >> > input drivers need to interrogate their devices in order to publish
> >> >> > >> > device's capabilities before userspace will open them. When such drivers
> >> >> > >> > are compiled into kernel they may stall entire kernel initialization.
> >> >> > >> >
> >> >> > >> > This change allows drivers request for their probe functions to be
> >> >> > >> > called asynchronously during driver and device registration (manual
> >> >> > >> > binding is still synchronous). Because async_schedule is used to perform
> >> >> > >> > asynchronous calls module loading will still wait for the probing to
> >> >> > >> > complete.
> >> >> > >> >
> >> >> > >> > Note that the end goal is to make the probing asynchronous by default,
> >> >> > >> > so annotating drivers with PROBE_PREFER_ASYNCHRONOUS is a temporary
> >> >> > >> > measure that allows us to speed up boot process while we validating and
> >> >> > >> > fixing the rest of the drivers and preparing userspace.
> >> >> > >> >
> >> >> > >> > This change is based on earlier patch by "Luis R. Rodriguez"
> >> >> > >> > <[email protected]>
> >> >> > >> >
> >> >> > >> > Signed-off-by: Dmitry Torokhov <[email protected]>
> >> >> > >> > ---
> >> >> > >> > drivers/base/base.h | 1 +
> >> >> > >> > drivers/base/bus.c | 31 +++++++---
> >> >> > >> > drivers/base/dd.c | 149 ++++++++++++++++++++++++++++++++++++++++++-------
> >> >> > >> > include/linux/device.h | 28 ++++++++++
> >> >> > >> > 4 files changed, 182 insertions(+), 27 deletions(-)
> >> >> > >>
> >> >> > >> Just noticed this patch. It caught my eye because I had a hard time
> >> >> > >> getting an open coded implementation of asynchronous probing to work
> >> >> > >> in the new libnvdimm subsystem. Especially the messy races of tearing
> >> >> > >> things down while probing is still in flight. I ended up implementing
> >> >> > >> asynchronous device registration which eliminated a lot of complexity
> >> >> > >> and of course the bugs. In general I tend to think that async
> >> >> > >> registration is less risky than async probe since it keeps wider
> >> >> > >> portions of the traditional device model synchronous
> >> >> > >
> >> >> > > but its not see -DEFER_PROBE even before async probe.
> >> >> >
> >> >> > Except in that case you know probe has been seen by the driver at
> >> >> > least once. So I see that as less of a surprise, but point taken.
> >> >> >
> >> >> > >> and leverages the
> >> >> > >> fact that the device model is already well prepared for asynchronous
> >> >> > >> arrival of devices due to hotplug.
> >> >> > >
> >> >> > > I think this sounds reasonable, do you have your code upstream or posted?
> >> >> >
> >> >> > Yes, see nd_device_register() in drivers/nvdimm/bus.c
> >> >>
> >> >> It should be I think rather easy for Dmitry to see if he can convert this input
> >> >> driver (not yet upstream) to this API and see if the same issues are fixed.
> >> >
> >> > No, I would rather not as it means we lose error handling on device
> >> > registration.
> >> >
> >>
> >> I think this is a red herring as I don't see how async probing is any
> >> better at handling device registration errors. The error is logged
> >> and "handled" by the fact that a device fails to appear, what other
> >> action would you take? In fact libnvdimm does detect registration
> >> failures and reports that in a parent device attribute (at least for a
> >> region device and their namespace child devices).
> >
> > What is libnvdimm behavior if you try to unload a module that tries to
> > register a device but it failed? Memory leak or crash, right?
>
> No, in the case of the "region" driver it is part of the core
> libnvdimm and it is pinned while any region device is active.
No, not quite. Let's take a look for example at nd_btt_probe(). It calls
__nd_btt_create() which in turn calls __nd_device_register() which
returns void and asynchronously schedules device registration. Now
consider the device registration fails. The async code will drop 2
references to the device, effectively freeing it. In the mean time
nd_btt_probe() stores the device pointer which may or may no longer be
valid and goes on it's merry way using it.
The similar thing in nvdimm_create which returns a pointer that may no
longer be valid. I have not traced enough through the code to make sure
if it can blow up, but this kind of situation is not desirable,
especially if the async registration pattern is applied generally
throughout the kernel.
Thanks.
--
Dmitry
On Wed, Jul 8, 2015 at 9:44 PM, Dmitry Torokhov
<[email protected]> wrote:
> On Wed, Jul 08, 2015 at 06:00:41PM -0700, Dan Williams wrote:
>> On Wed, Jul 8, 2015 at 5:49 PM, Dmitry Torokhov
>> <[email protected]> wrote:
>> > On Wed, Jul 08, 2015 at 05:36:04PM -0700, Dan Williams wrote:
>> >> On Mon, Jul 6, 2015 at 4:40 PM, Dmitry Torokhov
>> >> <[email protected]> wrote:
>> >> > On Tue, Jul 07, 2015 at 01:23:15AM +0200, Luis R. Rodriguez wrote:
>> >> >> On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote:
>> >> >> > On Fri, Jul 3, 2015 at 11:30 AM, Luis R. Rodriguez <[email protected]> wrote:
>> >> >> > > On Sat, Jun 27, 2015 at 04:45:25PM -0700, Dan Williams wrote:
>> >> >> > >> On Mon, Mar 30, 2015 at 4:20 PM, Dmitry Torokhov
>> >> >> > >> <[email protected]> wrote:
>> >> >> > >> > Some devices take a long time when initializing, and not all drivers are
>> >> >> > >> > suited to initialize their devices when they are open. For example,
>> >> >> > >> > input drivers need to interrogate their devices in order to publish
>> >> >> > >> > device's capabilities before userspace will open them. When such drivers
>> >> >> > >> > are compiled into kernel they may stall entire kernel initialization.
>> >> >> > >> >
>> >> >> > >> > This change allows drivers request for their probe functions to be
>> >> >> > >> > called asynchronously during driver and device registration (manual
>> >> >> > >> > binding is still synchronous). Because async_schedule is used to perform
>> >> >> > >> > asynchronous calls module loading will still wait for the probing to
>> >> >> > >> > complete.
>> >> >> > >> >
>> >> >> > >> > Note that the end goal is to make the probing asynchronous by default,
>> >> >> > >> > so annotating drivers with PROBE_PREFER_ASYNCHRONOUS is a temporary
>> >> >> > >> > measure that allows us to speed up boot process while we validating and
>> >> >> > >> > fixing the rest of the drivers and preparing userspace.
>> >> >> > >> >
>> >> >> > >> > This change is based on earlier patch by "Luis R. Rodriguez"
>> >> >> > >> > <[email protected]>
>> >> >> > >> >
>> >> >> > >> > Signed-off-by: Dmitry Torokhov <[email protected]>
>> >> >> > >> > ---
>> >> >> > >> > drivers/base/base.h | 1 +
>> >> >> > >> > drivers/base/bus.c | 31 +++++++---
>> >> >> > >> > drivers/base/dd.c | 149 ++++++++++++++++++++++++++++++++++++++++++-------
>> >> >> > >> > include/linux/device.h | 28 ++++++++++
>> >> >> > >> > 4 files changed, 182 insertions(+), 27 deletions(-)
>> >> >> > >>
>> >> >> > >> Just noticed this patch. It caught my eye because I had a hard time
>> >> >> > >> getting an open coded implementation of asynchronous probing to work
>> >> >> > >> in the new libnvdimm subsystem. Especially the messy races of tearing
>> >> >> > >> things down while probing is still in flight. I ended up implementing
>> >> >> > >> asynchronous device registration which eliminated a lot of complexity
>> >> >> > >> and of course the bugs. In general I tend to think that async
>> >> >> > >> registration is less risky than async probe since it keeps wider
>> >> >> > >> portions of the traditional device model synchronous
>> >> >> > >
>> >> >> > > but its not see -DEFER_PROBE even before async probe.
>> >> >> >
>> >> >> > Except in that case you know probe has been seen by the driver at
>> >> >> > least once. So I see that as less of a surprise, but point taken.
>> >> >> >
>> >> >> > >> and leverages the
>> >> >> > >> fact that the device model is already well prepared for asynchronous
>> >> >> > >> arrival of devices due to hotplug.
>> >> >> > >
>> >> >> > > I think this sounds reasonable, do you have your code upstream or posted?
>> >> >> >
>> >> >> > Yes, see nd_device_register() in drivers/nvdimm/bus.c
>> >> >>
>> >> >> It should be I think rather easy for Dmitry to see if he can convert this input
>> >> >> driver (not yet upstream) to this API and see if the same issues are fixed.
>> >> >
>> >> > No, I would rather not as it means we lose error handling on device
>> >> > registration.
>> >> >
>> >>
>> >> I think this is a red herring as I don't see how async probing is any
>> >> better at handling device registration errors. The error is logged
>> >> and "handled" by the fact that a device fails to appear, what other
>> >> action would you take? In fact libnvdimm does detect registration
>> >> failures and reports that in a parent device attribute (at least for a
>> >> region device and their namespace child devices).
>> >
>> > What is libnvdimm behavior if you try to unload a module that tries to
>> > register a device but it failed? Memory leak or crash, right?
>>
>> No, in the case of the "region" driver it is part of the core
>> libnvdimm and it is pinned while any region device is active.
>
> No, not quite. Let's take a look for example at nd_btt_probe(). It calls
> __nd_btt_create() which in turn calls __nd_device_register() which
> returns void and asynchronously schedules device registration. Now
> consider the device registration fails. The async code will drop 2
> references to the device, effectively freeing it. In the mean time
> nd_btt_probe() stores the device pointer which may or may no longer be
> valid and goes on it's merry way using it.
nd_btt_probe() is the driver probe for the btt device. If
registration fails then the device is never probed.
> The similar thing in nvdimm_create which returns a pointer that may no
> longer be valid
Exactly, which is why we fail the entire bus probe if any of the
nvdimm devices failed to register. See nvdimm_bus_check_dimm_count().
> I have not traced enough through the code to make sure
> if it can blow up, but this kind of situation is not desirable,
> especially if the async registration pattern is applied generally
> throughout the kernel.
I do appreciate the review, but I don't think this signals the death
knell for async registration just yet.