2014-10-02 06:13:03

by Tom Gundersen

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] driver-core: add driver asynchronous probe support

On Tue, Sep 30, 2014 at 5:24 PM, Luis R. Rodriguez <[email protected]> wrote:
>> > commit e64fae5573e566ce4fd9b23c68ac8f3096603314
>> > Author: Kay Sievers <[email protected]>
>> > Date: Wed Jan 18 05:06:18 2012 +0100
>> >
>> > udevd: kill hanging event processes after 30 seconds
>> >
>> > Some broken kernel drivers load firmware synchronously in the module init
>> > path and block modprobe until the firmware request is fulfilled.
>> > <...>
>>
>> This was a workaround to avoid a deadlock between udev and the kernel.
>> The 180 s timeout was already in place before this change, and was not
>> motivated by firmware loading. Also note that this patch was not about
>> "tracking device drivers", just about avoiding dead-lock.
>
> Thanks, can you elaborate on how a deadlock can occur if the kmod
> worker is not at some point sigkilled?

This was only relevant whet udev did the firmware loading. modprobe
would wait for the kernel, which would wait for the firmware loading,
which would wait for modprobe. This is no longer a problem as udev
does not do firmware loading any more.

> Is the issue that if there is no extra worker available and all are
> idling on sleep / synchronous long work boot will potentially hang
> unless a new worker becomes available to do more work?

Correct.

> If so I can
> see the sigkill helping for hanging tasks but it doesn't necessarily
> mean its a good idea to kill modules loading taking a while. Also
> what if the sigkill is just avoided for *just* kmod workers?

Depending on the number of devices you have, I suppose we could still
exhaust the workers.

>> The way I see it, the current status from systemd's side is: our
>> short-term work-around is to increase the timeout, and at the moment
>> it appears no long-term solution is needed (i.e., it seems like the
>> right thing to do is to make sure insmod can be near instantaneous, it
>> appears people are working towards this goal, and so far no examples
>> have cropped up showing that it is fundamentally impossible (once/if
>> they do, we should of course revisit the problem)).
>
> That again would be reactive behaviour, what would prevent avoiding the
> sigkill only for kmod workers? Is it known the deadlock is immiment?
> If the amount of workers for kmod that would hit the timeout is
> considered low I don't see how that's possible and why not just lift
> the sigkill.

Making kmod a special case is of course possible. However, as long as
there is no fundamental reason why kmod should get this special
treatment, this just looks like a work-around to me. We already have a
work-around, which is to increase the global timeout. If you still
think we should do something different in systemd, it is probably best
to take the discussion to systemd-devel to make sure all the relevant
people are involved.

Cheers,

Tom


2014-10-02 20:06:23

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] driver-core: add driver asynchronous probe support

As per Tom, adding systemd-devel for advice / review / of the request to avoid
the sigkill for kmod workers. Keeping others on Cc as its a discussion that
I think can help if both camps are involved. Specially since we've been
ping ponging back and forth on this particular topic for a long time now.

On Thu, Oct 02, 2014 at 08:12:37AM +0200, Tom Gundersen wrote:
> On Tue, Sep 30, 2014 at 5:24 PM, Luis R. Rodriguez <[email protected]> wrote:
> >> > commit e64fae5573e566ce4fd9b23c68ac8f3096603314
> >> > Author: Kay Sievers <[email protected]>
> >> > Date: Wed Jan 18 05:06:18 2012 +0100
> >> >
> >> > udevd: kill hanging event processes after 30 seconds
> >> >
> >> > Some broken kernel drivers load firmware synchronously in the module init
> >> > path and block modprobe until the firmware request is fulfilled.
> >> > <...>
> >>
> >> This was a workaround to avoid a deadlock between udev and the kernel.
> >> The 180 s timeout was already in place before this change, and was not
> >> motivated by firmware loading. Also note that this patch was not about
> >> "tracking device drivers", just about avoiding dead-lock.
> >
> > Thanks, can you elaborate on how a deadlock can occur if the kmod
> > worker is not at some point sigkilled?
>
> This was only relevant whet udev did the firmware loading. modprobe
> would wait for the kernel, which would wait for the firmware loading,
> which would wait for modprobe. This is no longer a problem as udev
> does not do firmware loading any more.

Thanks for clarifying. So the deadlock concern is no longer there, therefore
it is not a reason to keep the sigkill for kmod.

> > Is the issue that if there is no extra worker available and all are
> > idling on sleep / synchronous long work boot will potentially hang
> > unless a new worker becomes available to do more work?
>
> Correct.

Ok.

> > If so I can
> > see the sigkill helping for hanging tasks but it doesn't necessarily
> > mean its a good idea to kill modules loading taking a while. Also
> > what if the sigkill is just avoided for *just* kmod workers?
>
> Depending on the number of devices you have, I suppose we could still
> exhaust the workers.

Ok can systemd dynamically create a worker or set of workers per device
that creeps up? Async probe for most drivers will help with this but
having it dynamic should help as well, specially since there are drivers
that will require probe synchronously -- and the fact that async probe
mechanism is not yet merged.

> >> The way I see it, the current status from systemd's side is: our
> >> short-term work-around is to increase the timeout, and at the moment
> >> it appears no long-term solution is needed (i.e., it seems like the
> >> right thing to do is to make sure insmod can be near instantaneous, it
> >> appears people are working towards this goal, and so far no examples
> >> have cropped up showing that it is fundamentally impossible (once/if
> >> they do, we should of course revisit the problem)).
> >
> > That again would be reactive behaviour, what would prevent avoiding the
> > sigkill only for kmod workers? Is it known the deadlock is immiment?
> > If the amount of workers for kmod that would hit the timeout is
> > considered low I don't see how that's possible and why not just lift
> > the sigkill.
>
> Making kmod a special case is of course possible. However, as long as
> there is no fundamental reason why kmod should get this special
> treatment, this just looks like a work-around to me.

I've mentioned a series of five reasons why its a bad idea right now to
sigkill modules [0], we're reviewed them each and still at least
items 2-4 remain particularly valid fundamental reasons to avoid it
specially if the deadlock is no longer possible. Running out of
workers because they are loading modules and that is taking a while
is not really a good standing reason to be killing them, specially
if the timeout already is set to a high value. All we're doing there is
limiting Linux / number of devices arbitrarily just to help free
workers, and it seems that should be dealt with differently. Killing
module loading arbitrarily in the middle is not advisable and can cause
more issue than help in any way.

Async probe mechanism will help free workers faster but this patch series is
still being evolved, we should still address the sigkill for kmod workers
separately and see what remaining reasons we have for it in light of the
possible issues highlighted that it can introduce if kept. If we want to
capture drivers taking long on probe each subsystem should handle that and WARN
/ pick up on it, we cannot however assume that this a generally bad things as
discussed before. We will also not be able to async probe *every* driver,
which is why the series allows a flag to specify for this.

[0] https://lkml.org/lkml/2014/9/26/879

> We already have a
> work-around, which is to increase the global timeout. If you still
> think we should do something different in systemd, it is probably best
> to take the discussion to systemd-devel to make sure all the relevant
> people are involved.

Sure, I've included systemd-devel. Hope is we can have a constructive
discussion on the sigkill for kmod.

Luis

2014-10-03 08:23:49

by Tom Gundersen

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] driver-core: add driver asynchronous probe support

On Thu, Oct 2, 2014 at 10:06 PM, Luis R. Rodriguez <[email protected]> wrote:
> On Thu, Oct 02, 2014 at 08:12:37AM +0200, Tom Gundersen wrote:
>> Making kmod a special case is of course possible. However, as long as
>> there is no fundamental reason why kmod should get this special
>> treatment, this just looks like a work-around to me.
>
> I've mentioned a series of five reasons why its a bad idea right now to
> sigkill modules [0], we're reviewed them each and still at least
> items 2-4 remain particularly valid fundamental reasons to avoid it

So items 2-4 basically say "there currently are drivers that cannot
deal with sigkill after a three minute timeout".

In the short-term we already have the solution: increase the timeout.
In the long-term, we have two choices, either permanently add some
heuristic to udev to deal with drivers taking a very long time to be
inserted, or fix the drivers not to take such a long time. A priori,
it makes no sense to me that drivers spend unbounded amounts of time
to get inserted, so fixing the drivers seems like the most reasonable
approach to me. That said, I'm of course open to be proven wrong if
there are some drivers that fundamentally _must_ take a long time to
insert (but we should then discuss why that is and how we can best
deal with the situation, rather than adding some hack up-front when we
don't even know if it is needed).

Your patch series should go a long way towards fixing the drivers (and
I imagine there being a lot of low-hanging fruit that can easily be
fixed once your series has landed), and the fact that we have now
increased the udev timeout from 30 to 180 seconds should also greatly
reduce the problem.

Cheers,

Tom

2014-10-03 16:55:17

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] driver-core: add driver asynchronous probe support

On Fri, Oct 3, 2014 at 1:23 AM, Tom Gundersen <[email protected]> wrote:
> On Thu, Oct 2, 2014 at 10:06 PM, Luis R. Rodriguez <[email protected]> wrote:
>> On Thu, Oct 02, 2014 at 08:12:37AM +0200, Tom Gundersen wrote:
>>> Making kmod a special case is of course possible. However, as long as
>>> there is no fundamental reason why kmod should get this special
>>> treatment, this just looks like a work-around to me.
>>
>> I've mentioned a series of five reasons why its a bad idea right now to
>> sigkill modules [0], we're reviewed them each and still at least
>> items 2-4 remain particularly valid fundamental reasons to avoid it
>
> So items 2-4 basically say "there currently are drivers that cannot
> deal with sigkill after a three minute timeout".

No, dealing with the sigkill gracefully is all related to 2) as it
says its probably a terrible idea to be triggering exit paths at
random points on device drivers on init / probe. And while one could
argue that perhaps that can be cleaned up I provided tons of
references and even *research effort* on this particular area so the
issues over this point should by no means easily be brushed off. And
it may be true that we can fix some things on Linux but a) that
requires a kernel upgrade on users and b) Some users may end up buying
hardware that only is supported through a proprietary driver and
getting those fixes is not trivial and almost impossible on some
cases.

3) says it is fundamentally incorrect to limit with any arbitrary
timeout the bus probe routine

4) talks about how the timeout is creating a limit on the number of
devices a device driver can support on Linux as follows give the
driver core batches *all* probes for one device driver serially:

number_devices = systemd_timeout
-------------------------------------
max known probe time for driver

We have device drivers which we *know* just on *probe* will take over
1 minute, this means that by default for these device drivers folks
can only install 3 devices of that type on a system. One can surely
address things on the kernel but again assuming folks use defaults and
don't upgrade their kernel the sigkill is simply limiting Linux right
now, even if it is for the short term.

> In the short-term we already have the solution: increase the timeout.

Short term implicates what will be supported for a while for tons of
deployments of systemd. The kernel command line work around for
increasing the timeout is a reactive measure, its not addressing the
problem architecturally. If the sigkill is going to be maintained for
kmod its implications should be well documented as well in terms of
the impact and limitations on both device drivers and number of
devices a driver can support.

> In the long-term, we have two choices, either permanently add some
> heuristic to udev to deal with drivers taking a very long time to be
> inserted, or fix the drivers not to take such a long time.

Drivers taking long on init should probably be addressed, drivers
taking long on probe are not broken specially since the driver core
probe's all supported devices on one device driver serially, so the
probe time is actually cumulative.

> A priori,
> it makes no sense to me that drivers spend unbounded amounts of time
> to get inserted, so fixing the drivers seems like the most reasonable
> approach to me. That said, I'm of course open to be proven wrong if
> there are some drivers that fundamentally _must_ take a long time to
> insert (but we should then discuss why that is and how we can best
> deal with the situation, rather than adding some hack up-front when we
> don't even know if it is needed).

Ok hold on. Async probe on the driver core will be a new feature and
there are even caveats that Tejun pointed out which are important for
distributions to consider before embracing it. Of course folks can
ignore these but by no means should it be considered that tons of
device device drivers were broken, what we are providing is a new
mechanism. And then there are device drivers which will need work in
order to use async probe, some will require fixes on init / probe
assumptions as I provided for the amd64_edac driver but for others
only time will tell what is required.

> Your patch series should go a long way towards fixing the drivers (and
> I imagine there being a lot of low-hanging fruit that can easily be
> fixed once your series has landed), and the fact that we have now
> increased the udev timeout from 30 to 180 seconds should also greatly
> reduce the problem.

Sure, I do ask for folks to revisit the short term solution though, I
did my best to communicate / document the issues.

Luis