I haven't been able to test the various rc-s on this machine.
Linux 5.12 hangs at reboot. No error message.
Bisected to commit 776a39b8196dbca4afb69669db0d9926ffac29ab, and
reverting this makes the machine reboot as usual.
The distribution is Void Linux. Config is attached.
--
Hilsen Harald
On Mon, Apr 26, 2021 at 11:47 AM Harald Arnesen <[email protected]> wrote:
>
> Bisected to commit 776a39b8196dbca4afb69669db0d9926ffac29ab, and
> reverting this makes the machine reboot as usual.
Hmm. That was already in rc1, so this isn't some late untested
last-minute commit that broke things for you.
Which implies that it's likely something fairly specific to your
setup (either the config or the hardware - or possibly Void Linux
doing something other distros don't).
Mind also attaching a dmesg of an affected kernel (or with the revert
in place, I guess - it shouldn't matter until the reboot ;)
There's a lockdep assertion there, but you don't seem to have lockdep
enabled. So it be interesting to see what happens if you
(a) enable lockdep
(b) make sure to reboot in text mode so that any lockdep messages
would actually be visible.
Maybe Johannes will go "Doh!" and see what's wrong.
Linus
Linus Torvalds [26.04.2021 20:59]:
> On Mon, Apr 26, 2021 at 11:47 AM Harald Arnesen <[email protected]> wrote:
>>
>> Bisected to commit 776a39b8196dbca4afb69669db0d9926ffac29ab, and
>> reverting this makes the machine reboot as usual.
>
> Hmm. That was already in rc1, so this isn't some late untested
> last-minute commit that broke things for you.
As I said, I could not test the various rcs this time. I had to use the
machine for some work, so I did'nt want to experiment too much.
> Which implies that it's likely something fairly specific to your
> setup (either the config or the hardware - or possibly Void Linux
> doing something other distros don't).
Init system is runit, not systemd? Could that affect something?
> Mind also attaching a dmesg of an affected kernel (or with the revert
> in place, I guess - it shouldn't matter until the reboot ;)
dmesg from 5.12 without the revert is attached.
> There's a lockdep assertion there, but you don't seem to have lockdep
> enabled. So it be interesting to see what happens if you
>
> (a) enable lockdep
Will report when I have tried it.
> (b) make sure to reboot in text mode so that any lockdep messages
> would actually be visible.
>
> Maybe Johannes will go "Doh!" and see what's wrong.
>
> Linus
>
--
Hilsen Harald
On Mon, 2021-04-26 at 19:29 +0000, Harald Arnesen wrote:
> > Which implies that it's likely something fairly specific to your
> > setup (either the config or the hardware - or possibly Void Linux
> > doing something other distros don't).
>
> Init system is runit, not systemd? Could that affect something?
Unlikely.
> > Mind also attaching a dmesg of an affected kernel (or with the revert
> > in place, I guess - it shouldn't matter until the reboot ;)
>
> dmesg from 5.12 without the revert is attached.
That's it? There it just hangs? Not even printing something about tasks
getting stuck? Did you let it sit for a couple of (2-3) minutes to see
the task stuck warnings?
What's weird though is that this is evidently with mac80211 - so
certainly all the patches I just sent out will do nothing for you, and
something else is strange, because mac80211 I've definitely not just
reviewed but also tested these code paths many times.
Hmm. But yeah, either way, a lockdep report or stack dump will surely
help. Perhaps something in the ath* driver you're using that I missed.
> > There's a lockdep assertion there, but you don't seem to have lockdep
> > enabled. So it be interesting to see what happens if you
> >
> > (a) enable lockdep
>
> Will report when I have tried it.
Thanks.
johannes
On Mon, Apr 26, 2021 at 12:38 PM Johannes Berg
<[email protected]> wrote:
>
> >
> > dmesg from 5.12 without the revert is attached.
>
> That's it? There it just hangs? Not even printing something about tasks
> getting stuck?
I assume the dmesg is from before trying to shut the system down.
Useful mainly to see what the hardware is, than to debug the hang
itself.
So it looks like it's an Atheros AR9462. That doesn't sound too unusual.
Getting a dmesg from the actual shutdown is tricky, because the system
is obviously no longer usable at that point. You tend to need serial
consoles etc.
Shutdown hangs can be a bit nasty to debug for that reason (similar to
suspend/resume issues).
Linus
On Mon, 2021-04-26 at 12:45 -0700, Linus Torvalds wrote:
> On Mon, Apr 26, 2021 at 12:38 PM Johannes Berg
> <[email protected]> wrote:
> >
> > >
> > > dmesg from 5.12 without the revert is attached.
> >
> > That's it? There it just hangs? Not even printing something about tasks
> > getting stuck?
>
> I assume the dmesg is from before trying to shut the system down.
Ahrg. I misread the subject as "boot" rather than "reboot".
Right. Maybe if it's modules, could try to remove them rather than
reboot?
johannes
On Mon, Apr 26, 2021 at 12:46 PM Johannes Berg
<[email protected]> wrote:
>
> Right. Maybe if it's modules, could try to remove them rather than
> reboot?
Yes, doing an 'rmmod ath9k' (or whatever that module is called)
sounds like a good idea, it might trigger the same lockup.
In fact, that might be the reason Harald sees this - maybe Void Linux
tries to unload modules before rebooting, and other distros don't?
Unloading modules doesn't tend to get a lot of testing, it's a fairly
unusual operation (and not one that is guaranteed to be safe - yes we
do basic module refcounting, but it's very basic indeed, not full).
Linus
On Mon, 2021-04-26 at 12:51 -0700, Linus Torvalds wrote:
> On Mon, Apr 26, 2021 at 12:46 PM Johannes Berg
> <[email protected]> wrote:
> >
> > Right. Maybe if it's modules, could try to remove them rather than
> > reboot?
>
> Yes, doing an 'rmmod ath9k' (or whatever that module is called)
> sounds like a good idea, it might trigger the same lockup.
>
> In fact, that might be the reason Harald sees this - maybe Void Linux
> tries to unload modules before rebooting, and other distros don't?
Seems odd if they would, but maybe?
I guess we're well into speculation here now - Harald, even taking a
picture of a stack dump will help, I'll likely only need an indication
where it's actually locking up, unless it's actually in
cfg80211_destroy_iface_wk() itself, but I can't see how that'd be
possible.
Looks like with mac80211 this really should just go down into
ieee80211_if_remove() and that looks OK.
And it's coming from a work struct, so I thought maybe some flushing
happened in a bad context, but that's only in wiphy_unregister(),
without the lock(s) held around it, as it should be. I figured then
maybe wiphy_unregister() could be called in a bad context, but then that
would've deadlocked itself earlier, unrelated to the destroy_iface_wk().
Oh, I have another idea - maybe void linux is using iwd instead of
wpa_supplicant, and that insists on doing the netlink owner stuff so
everything is deleted in case it crashes. But I've been looking at the
code pretty much assuming that we get actual calls down, so ...
Dunno. I don't see anything obvious right now, any additional
information (stack dump, or lockdep report) would be great.
johannes
Linus Torvalds [26.04.2021 20:59]:
> There's a lockdep assertion there, but you don't seem to have lockdep
> enabled. So it be interesting to see what happens if you
>
> (a) enable lockdep
At the risk of sounding stupid: where is this config option?
--
Hilsen Harald
Johannes Berg [26.04.2021 22:11]:
> Oh, I have another idea - maybe void linux is using iwd instead of
> wpa_supplicant, and that insists on doing the netlink owner stuff so
> everything is deleted in case it crashes. But I've been looking at the
> code pretty much assuming that we get actual calls down, so ...
It is indeed using iwd:
$ ps ax | grep wpa
$
$ ps ax | grep iwd
996 ? Ss 0:00 runsv iwd
1395 ? S 0:00 /usr/libexec/iwd
$
--
Hilsen Harald
On Mon, 2021-04-26 at 22:10 +0200, Harald Arnesen wrote:
> Linus Torvalds [26.04.2021 20:59]:
>
> > There's a lockdep assertion there, but you don't seem to have lockdep
> > enabled. So it be interesting to see what happens if you
> >
> > (a) enable lockdep
>
> At the risk of sounding stupid: where is this config option?
CONFIG_PROVE_LOCKING
I have no idea *where* it is either (in menuconfig or so), so if you're
in menuconfig, just hit / and enter PROVE_LOCKING. It'll come up with a
list of entries (probably only one) labeled (1), (2), ... Hit the number
key with the right number and menuconfig will take you there.
If you can't enable it, then you need DEBUG_KERNEL first.
But if you can e.g. switch to a VT (ctrl-alt-fn) and get a stackdump
shown on the screen while it hangs while shutting down (need to wait 2-3
minutes I guess), that's probably enough.
johannes
On Mon, Apr 26, 2021 at 3:18 PM Harald Arnesen <[email protected]> wrote:
>
> Photos of the screen with lockdep enabled.
Ok, so this is certainly not all of it, but the rest must have
scrolled off even the hw scroll buffer. But you've caught at least one
interesting backtrace:
cfg80211_destroy_iface_wk() takes wiphy_lock
-> cfg80211_destroy_ifaces()
->ieee80211_del_iface
->ieeee80211_if_remove
->cfg80211_unregister_wdev
->unregister_netdevice_queue
->dev_close_many
->__dev_close_many
->raw_notifier_call_chain
->cfg80211_netdev_notifier_call
and that wants the rtnl lock. Which it won't get, because something
else is holding on to it.
At a guess, there is some other sequence that takes the rtnl lock, and
then takes the wiphy_lock inside of it, and we have a ABBA deadlock.
<insert-shocked-pikachu face>
I _hate_ that stupid rtnl lock. It's come up before. Several times.
It's probably the most broken lock in the kernel.
Linus
Linus Torvalds <[email protected]> wrote:
>
> At a guess, there is some other sequence that takes the rtnl lock, and
> then takes the wiphy_lock inside of it, and we have a ABBA deadlock.
The most common cause of these lockups is a stray reference on
some kind of a network object, e.g., a socket or some such. This
then causes the underlying network device to be reference counted
and triggers a lock-up if that network device needs to be removed.
> I _hate_ that stupid rtnl lock. It's come up before. Several times.
> It's probably the most broken lock in the kernel.
Yes, it's the networking equivalent of the BKL.
Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Hi,
On Mon, 2021-04-26 at 15:49 -0700, Linus Torvalds wrote:
>
> cfg80211_destroy_iface_wk() takes wiphy_lock
> -> cfg80211_destroy_ifaces()
> ->ieee80211_del_iface
> ->ieeee80211_if_remove
> ->cfg80211_unregister_wdev
> ->unregister_netdevice_queue
> ->dev_close_many
> ->__dev_close_many
> ->raw_notifier_call_chain
> ->cfg80211_netdev_notifier_call
>
> and that wants the rtnl lock. Which it won't get, because something
> else is holding on to it.
Thanks. I don't think your conclusion is quite right - it doesn't want
the RTNL, it wants the "wiphy->mtx" mutex, otherwise the revert wouldn't
help anyway since the RTNL is held, and also the code in
cfg80211_netdev_notifier_call() never acquires the RTNL in the first
place since it always assumes it's called with it held.
> At a guess, there is some other sequence that takes the rtnl lock, and
> then takes the wiphy_lock inside of it, and we have a ABBA deadlock.
I think we just have "AA" deadlock, but we don't have the full lockdep
report if there was one, the 'task stuck' obliterated it.
> <insert-shocked-pikachu face>
>
> I _hate_ that stupid rtnl lock. It's come up before. Several times.
> It's probably the most broken lock in the kernel.
Well, it's also _everywhere_, kind of like the "NBKL". In fact, in many
cases it has become a de-facto BLK again since there's basically no
(desktop) userspace that won't do _anything_ at all related to
networking, and thus it gets hit all the time. I've had things like a
crash with the RTNL held, and then I couldn't open new shell windows
anymore...
The sad thing is that the original commit that caused all this was meant
to alleviate problems with the RTNL, if you look at commit a05829a7222e
("cfg80211: avoid holding the RTNL when calling the driver"). The only
problem with that is of course that the RTNL is still required for all
the netdev management, and as such we need to be really careful around
it.
Anyway, back to the topic. This problem was supposed to be avoided by
the fact that cfg80211_unregister_wdev() sets
wdev->registered = false;
and then in the notifier call we check
static int cfg80211_netdev_notifier_call(struct notifier_block *nb,
unsigned long state, void *ptr)
{
...
case NETDEV_UNREGISTER:
/*
* It is possible to get NETDEV_UNREGISTER multiple
times,
* so check wdev->registered.
*/
if (wdev->registered && !wdev->registering) {
But, looking carefully at the stack trace, that's not what we see - we
see in there "dev_close_many()", so that means we got here with IFF_UP
still set on the netdev.
You could confirm this by taking gdb, this:
$ gdb vmlinux
(gdb) l *cfg80211_netdev_notifier_call+0xe5
should show you that that's actually line 1415.
This means that 'iwd' is actually just getting killed without ever
having set the interface down (removing IFF_UP, the equivalent of what
"ip link set wlan0 down" would do).
Which - yeah, that's a special case with iwd.
A trivial fix for this special case would be to close the interface(s)
first, such as:
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -345,11 +345,18 @@ ...
static void cfg80211_destroy_iface_wk(struct work_struct *work)
{
struct cfg80211_registered_device *rdev;
+ struct wireless_dev *wdev;
rdev = container_of(work, struct cfg80211_registered_device,
destroy_work);
rtnl_lock();
+
+ list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) {
+ if (wdev->nl_owner_dead && wdev->netdev)
+ dev_close(wdev->netdev);
+ }
+
wiphy_lock(&rdev->wiphy);
cfg80211_destroy_ifaces(rdev);
wiphy_unlock(&rdev->wiphy);
But while that will almost certainly fix all the issues you're seeing,
it's not sufficient in general because it leaves a race if the other
caller of cfg80211_destroy_ifaces() ever needs to do anything (which
relatively unlikely), since it would still run into the same deadlock.
Looking at this more carefully, even the original commit that added this
code missed something in this case, however. Basically, it relied on the
driver unregistering the netdev, this causing the notifier, and this
would actually stop operation (e.g. disconnect from the AP) before
actually removing the interface etc. However, there are some virtual
interfaces that do not have a netdev at all, and in those cases the
operations would never be stopped appropriately. Most drivers will
probably kind of obviously do that ("if you tell me to remove the thing
I better stop using it first") but I wouldn't be surprised if there are
bugs in this area in the drivers.
In any case, I'll test it now/soon, but this should fix the issues:
https://p.sipsolutions.net/d458baf9b04f2c8f.txt
I'll test it and send a proper patch.
johannes