2018-01-30 12:11:55

by Dmitry Vyukov

[permalink] [raw]
Subject: net: hang in unregister_netdevice: waiting for lo to become free

Hello,

The following program creates a hang in unregister_netdevice.
cleanup_net work hangs there forever periodically printing
"unregister_netdevice: waiting for lo to become free. Usage count = 3"
and creation of any new network namespaces hangs forever.

[ 101.320142] unregister_netdevice: waiting for lo to become free.
Usage count = 3
[ 103.368144] unregister_netdevice: waiting for lo to become free.
Usage count = 3
[ 105.416149] unregister_netdevice: waiting for lo to become free.
Usage count = 3

[ 951.239250] Workqueue: netns cleanup_net
[ 951.240312] Call Trace:
[ 951.247450] schedule+0xe8/0x420
[ 951.256257] schedule_timeout+0x4cb/0x14b0
[ 951.277475] msleep+0x66/0x90
[ 951.278071] netdev_wait_allrefs+0x13a/0x410
[ 951.290040] netdev_run_todo+0x4a5/0xb40
[ 951.302415] rtnl_unlock+0xe/0x10
[ 951.302919] default_device_exit_batch+0x4bf/0x5d0
[ 951.313823] ops_exit_list.isra.6+0x100/0x150
[ 951.315009] cleanup_net+0x5db/0xbb0
[ 951.327867] process_one_work+0x9d5/0x1480
[ 951.343357] worker_thread+0x1cc/0x1410
[ 951.354108] kthread+0x304/0x3c0

I've currently reproduced it on 4.15, but I think I saw something
similar for a long time, so most likely this is not a new bug. I've
also attached my .config.

// autogenerated by syzkaller (http://github.com/google/syzkaller)
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <sched.h>
#include <memory.h>
#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>

int main()
{
int sock;
struct sockaddr_in6 addr;

unshare(CLONE_NEWNET);
system("ip -6 addr add fe80::00:0a/120 dev gretap0");
system("ip link set dev ip6gre0 up");
sock = socket(AF_INET6, SOCK_SEQPACKET, IPPROTO_IP);
memset(&addr, 0, sizeof(addr));
addr.sin6_family = AF_INET6;
addr.sin6_port = htobe16(0x4e20);
bind(sock, &addr, sizeof(addr));
memset(&addr, 0, sizeof(addr));
addr.sin6_family = AF_INET6;
inet_pton(AF_INET6, "fe80::bb", &addr.sin6_addr);
addr.sin6_scope_id = if_nametoindex("gretap0");
fcntl(sock, F_SETFL, O_NONBLOCK);
connect(sock, &addr, sizeof(addr));
return 0;
}


Attachments:
.config (132.77 kB)

2018-01-30 18:49:39

by Cong Wang

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov <[email protected]> wrote:
> Hello,
>
> The following program creates a hang in unregister_netdevice.
> cleanup_net work hangs there forever periodically printing
> "unregister_netdevice: waiting for lo to become free. Usage count = 3"
> and creation of any new network namespaces hangs forever.

Interestingly, this is not reproducible on net-next.

2018-01-30 20:24:39

by Daniel Borkmann

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On 01/30/2018 07:32 PM, Cong Wang wrote:
> On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov <[email protected]> wrote:
>> Hello,
>>
>> The following program creates a hang in unregister_netdevice.
>> cleanup_net work hangs there forever periodically printing
>> "unregister_netdevice: waiting for lo to become free. Usage count = 3"
>> and creation of any new network namespaces hangs forever.
>
> Interestingly, this is not reproducible on net-next.

The most recent change on netns refcnt was 4ee806d51176 ("net: tcp: close
sock if net namespace is exiting") in net/net-next from 5 days ago, maybe
fixed due to that?

2018-01-30 22:14:10

by David Ahern

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On 1/30/18 1:08 PM, Daniel Borkmann wrote:
> On 01/30/2018 07:32 PM, Cong Wang wrote:
>> On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov <[email protected]> wrote:
>>> Hello,
>>>
>>> The following program creates a hang in unregister_netdevice.
>>> cleanup_net work hangs there forever periodically printing
>>> "unregister_netdevice: waiting for lo to become free. Usage count = 3"
>>> and creation of any new network namespaces hangs forever.
>>
>> Interestingly, this is not reproducible on net-next.
>
> The most recent change on netns refcnt was 4ee806d51176 ("net: tcp: close
> sock if net namespace is exiting") in net/net-next from 5 days ago, maybe
> fixed due to that?
>

This appears to be the commit introducing the refcnt leak:

$ git bisect bad
dbc2b5e9a09e9a6664679a667ff81cff6e5f2641 is the first bad commit
commit dbc2b5e9a09e9a6664679a667ff81cff6e5f2641
Author: Xin Long <[email protected]>
Date: Fri May 12 14:39:52 2017 +0800

sctp: fix src address selection if using secondary addresses for ipv6


v4.14 is bad. Running bisect in the background while doing other things....

2018-01-30 23:00:46

by David Ahern

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On 1/30/18 1:57 PM, David Ahern wrote:
> On 1/30/18 1:08 PM, Daniel Borkmann wrote:
>> On 01/30/2018 07:32 PM, Cong Wang wrote:
>>> On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov <[email protected]> wrote:
>>>> Hello,
>>>>
>>>> The following program creates a hang in unregister_netdevice.
>>>> cleanup_net work hangs there forever periodically printing
>>>> "unregister_netdevice: waiting for lo to become free. Usage count = 3"
>>>> and creation of any new network namespaces hangs forever.
>>>
>>> Interestingly, this is not reproducible on net-next.
>>
>> The most recent change on netns refcnt was 4ee806d51176 ("net: tcp: close
>> sock if net namespace is exiting") in net/net-next from 5 days ago, maybe
>> fixed due to that?
>>
>
> This appears to be the commit introducing the refcnt leak:
>
> $ git bisect bad
> dbc2b5e9a09e9a6664679a667ff81cff6e5f2641 is the first bad commit
> commit dbc2b5e9a09e9a6664679a667ff81cff6e5f2641
> Author: Xin Long <[email protected]>
> Date: Fri May 12 14:39:52 2017 +0800
>
> sctp: fix src address selection if using secondary addresses for ipv6
>
>
> v4.14 is bad. Running bisect in the background while doing other things....
>

Interesting. The commit that avoids the refcnt leak is

commit 955ec4cb3b54c7c389a9f830be7d3ae2056b9212
Author: David Ahern <[email protected]>
Date: Wed Jan 24 19:45:29 2018 -0800

net/ipv6: Do not allow route add with a device that is down

That commit does not intentionally address the problem so it is just
masking the problematic code introduced by the commit above.

2018-02-01 00:50:25

by Xin Long

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Tue, Jan 30, 2018 at 11:59 PM, David Ahern <[email protected]> wrote:
> On 1/30/18 1:57 PM, David Ahern wrote:
>> On 1/30/18 1:08 PM, Daniel Borkmann wrote:
>>> On 01/30/2018 07:32 PM, Cong Wang wrote:
>>>> On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov <[email protected]> wrote:
>>>>> Hello,
>>>>>
>>>>> The following program creates a hang in unregister_netdevice.
>>>>> cleanup_net work hangs there forever periodically printing
>>>>> "unregister_netdevice: waiting for lo to become free. Usage count = 3"
>>>>> and creation of any new network namespaces hangs forever.
>>>>
>>>> Interestingly, this is not reproducible on net-next.
>>>
>>> The most recent change on netns refcnt was 4ee806d51176 ("net: tcp: close
>>> sock if net namespace is exiting") in net/net-next from 5 days ago, maybe
>>> fixed due to that?
>>>
>>
>> This appears to be the commit introducing the refcnt leak:
>>
>> $ git bisect bad
>> dbc2b5e9a09e9a6664679a667ff81cff6e5f2641 is the first bad commit
>> commit dbc2b5e9a09e9a6664679a667ff81cff6e5f2641
>> Author: Xin Long <[email protected]>
>> Date: Fri May 12 14:39:52 2017 +0800
>>
>> sctp: fix src address selection if using secondary addresses for ipv6
>>
>>
>> v4.14 is bad. Running bisect in the background while doing other things....
>>
>
> Interesting. The commit that avoids the refcnt leak is
>
> commit 955ec4cb3b54c7c389a9f830be7d3ae2056b9212
> Author: David Ahern <[email protected]>
> Date: Wed Jan 24 19:45:29 2018 -0800
>
> net/ipv6: Do not allow route add with a device that is down
>
> That commit does not intentionally address the problem so it is just
> masking the problematic code introduced by the commit above.
Thanks, David A.

I'm still on a trip. will look into this asap.

2018-02-03 12:17:32

by Xin Long

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Thu, Feb 1, 2018 at 1:49 AM, Xin Long <[email protected]> wrote:
> On Tue, Jan 30, 2018 at 11:59 PM, David Ahern <[email protected]> wrote:
>> On 1/30/18 1:57 PM, David Ahern wrote:
>>> On 1/30/18 1:08 PM, Daniel Borkmann wrote:
>>>> On 01/30/2018 07:32 PM, Cong Wang wrote:
>>>>> On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov <[email protected]> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> The following program creates a hang in unregister_netdevice.
>>>>>> cleanup_net work hangs there forever periodically printing
>>>>>> "unregister_netdevice: waiting for lo to become free. Usage count = 3"
>>>>>> and creation of any new network namespaces hangs forever.
>>>>>
>>>>> Interestingly, this is not reproducible on net-next.
>>>>
>>>> The most recent change on netns refcnt was 4ee806d51176 ("net: tcp: close
>>>> sock if net namespace is exiting") in net/net-next from 5 days ago, maybe
>>>> fixed due to that?
>>>>
>>>
>>> This appears to be the commit introducing the refcnt leak:
>>>
>>> $ git bisect bad
>>> dbc2b5e9a09e9a6664679a667ff81cff6e5f2641 is the first bad commit
>>> commit dbc2b5e9a09e9a6664679a667ff81cff6e5f2641
>>> Author: Xin Long <[email protected]>
>>> Date: Fri May 12 14:39:52 2017 +0800
>>>
>>> sctp: fix src address selection if using secondary addresses for ipv6
>>>
>>>
>>> v4.14 is bad. Running bisect in the background while doing other things....
>>>
>>
>> Interesting. The commit that avoids the refcnt leak is
>>
>> commit 955ec4cb3b54c7c389a9f830be7d3ae2056b9212
>> Author: David Ahern <[email protected]>
>> Date: Wed Jan 24 19:45:29 2018 -0800
>>
>> net/ipv6: Do not allow route add with a device that is down
>>
>> That commit does not intentionally address the problem so it is just
>> masking the problematic code introduced by the commit above.
> Thanks, David A.
>
> I'm still on a trip. will look into this asap.

Alexey and Tommi already had the patches for this issue on
both SCTP v4 and v6 dst_get, Thanks.

2018-02-19 19:01:00

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Sat, Feb 3, 2018 at 1:15 PM, Xin Long <[email protected]> wrote:
>>> On 1/30/18 1:57 PM, David Ahern wrote:
>>>> On 1/30/18 1:08 PM, Daniel Borkmann wrote:
>>>>> On 01/30/2018 07:32 PM, Cong Wang wrote:
>>>>>> On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov <[email protected]> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> The following program creates a hang in unregister_netdevice.
>>>>>>> cleanup_net work hangs there forever periodically printing
>>>>>>> "unregister_netdevice: waiting for lo to become free. Usage count = 3"
>>>>>>> and creation of any new network namespaces hangs forever.
>>>>>>
>>>>>> Interestingly, this is not reproducible on net-next.
>>>>>
>>>>> The most recent change on netns refcnt was 4ee806d51176 ("net: tcp: close
>>>>> sock if net namespace is exiting") in net/net-next from 5 days ago, maybe
>>>>> fixed due to that?
>>>>>
>>>>
>>>> This appears to be the commit introducing the refcnt leak:
>>>>
>>>> $ git bisect bad
>>>> dbc2b5e9a09e9a6664679a667ff81cff6e5f2641 is the first bad commit
>>>> commit dbc2b5e9a09e9a6664679a667ff81cff6e5f2641
>>>> Author: Xin Long <[email protected]>
>>>> Date: Fri May 12 14:39:52 2017 +0800
>>>>
>>>> sctp: fix src address selection if using secondary addresses for ipv6
>>>>
>>>>
>>>> v4.14 is bad. Running bisect in the background while doing other things....
>>>>
>>>
>>> Interesting. The commit that avoids the refcnt leak is
>>>
>>> commit 955ec4cb3b54c7c389a9f830be7d3ae2056b9212
>>> Author: David Ahern <[email protected]>
>>> Date: Wed Jan 24 19:45:29 2018 -0800
>>>
>>> net/ipv6: Do not allow route add with a device that is down
>>>
>>> That commit does not intentionally address the problem so it is just
>>> masking the problematic code introduced by the commit above.
>> Thanks, David A.
>>
>> I'm still on a trip. will look into this asap.
>
> Alexey and Tommi already had the patches for this issue on
> both SCTP v4 and v6 dst_get, Thanks.



Is this meant to be fixed already? I am still seeing this on the
latest upstream tree.

2018-02-20 07:57:55

by Tommi Rantala

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On 19.02.2018 20:59, Dmitry Vyukov wrote:
> On Sat, Feb 3, 2018 at 1:15 PM, Xin Long <[email protected]> wrote:
>>>> On 1/30/18 1:57 PM, David Ahern wrote:
>>>>> On 1/30/18 1:08 PM, Daniel Borkmann wrote:
>>>>>> On 01/30/2018 07:32 PM, Cong Wang wrote:
>>>>>>> On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov <[email protected]> wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> The following program creates a hang in unregister_netdevice.
>>>>>>>> cleanup_net work hangs there forever periodically printing
>>>>>>>> "unregister_netdevice: waiting for lo to become free. Usage count = 3"
>>>>>>>> and creation of any new network namespaces hangs forever.
>>>>>>>
>>>>>>> Interestingly, this is not reproducible on net-next.
>>>>>>
>>>>>> The most recent change on netns refcnt was 4ee806d51176 ("net: tcp: close
>>>>>> sock if net namespace is exiting") in net/net-next from 5 days ago, maybe
>>>>>> fixed due to that?
>>>>>>
>>>>>
>>>>> This appears to be the commit introducing the refcnt leak:
>>>>>
>>>>> $ git bisect bad
>>>>> dbc2b5e9a09e9a6664679a667ff81cff6e5f2641 is the first bad commit
>>>>> commit dbc2b5e9a09e9a6664679a667ff81cff6e5f2641
>>>>> Author: Xin Long <[email protected]>
>>>>> Date: Fri May 12 14:39:52 2017 +0800
>>>>>
>>>>> sctp: fix src address selection if using secondary addresses for ipv6
>>>>>
>>>>>
>>>>> v4.14 is bad. Running bisect in the background while doing other things....
>>>>>
>>>>
>>>> Interesting. The commit that avoids the refcnt leak is
>>>>
>>>> commit 955ec4cb3b54c7c389a9f830be7d3ae2056b9212
>>>> Author: David Ahern <[email protected]>
>>>> Date: Wed Jan 24 19:45:29 2018 -0800
>>>>
>>>> net/ipv6: Do not allow route add with a device that is down
>>>>
>>>> That commit does not intentionally address the problem so it is just
>>>> masking the problematic code introduced by the commit above.
>>> Thanks, David A.
>>>
>>> I'm still on a trip. will look into this asap.
>>
>> Alexey and Tommi already had the patches for this issue on
>> both SCTP v4 and v6 dst_get, Thanks.
>
>
>
> Is this meant to be fixed already? I am still seeing this on the
> latest upstream tree.
>

These two commits are in v4.16-rc1:

commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
Author: Tommi Rantala <[email protected]>
Date: Mon Feb 5 21:48:14 2018 +0200

sctp: fix dst refcnt leak in sctp_v4_get_dst
...
Fixes: 410f03831 ("sctp: add routing output fallback")
Fixes: 0ca50d12f ("sctp: fix src address selection if using
secondary addresses")


commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
Author: Alexey Kodanev <[email protected]>
Date: Mon Feb 5 15:10:35 2018 +0300

sctp: fix dst refcnt leak in sctp_v6_get_dst()
...
Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using
secondary addresses for ipv6")


I guess we missed something if it's still reproducible.

I can check it later this week, unless someone else beat me to it.

Tommi

2018-02-20 08:15:56

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
<[email protected]> wrote:
> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>
>> On Sat, Feb 3, 2018 at 1:15 PM, Xin Long <[email protected]> wrote:
>>>>>
>>>>> On 1/30/18 1:57 PM, David Ahern wrote:
>>>>>>
>>>>>> On 1/30/18 1:08 PM, Daniel Borkmann wrote:
>>>>>>>
>>>>>>> On 01/30/2018 07:32 PM, Cong Wang wrote:
>>>>>>>>
>>>>>>>> On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov <[email protected]>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> The following program creates a hang in unregister_netdevice.
>>>>>>>>> cleanup_net work hangs there forever periodically printing
>>>>>>>>> "unregister_netdevice: waiting for lo to become free. Usage count =
>>>>>>>>> 3"
>>>>>>>>> and creation of any new network namespaces hangs forever.
>>>>>>>>
>>>>>>>>
>>>>>>>> Interestingly, this is not reproducible on net-next.
>>>>>>>
>>>>>>>
>>>>>>> The most recent change on netns refcnt was 4ee806d51176 ("net: tcp:
>>>>>>> close
>>>>>>> sock if net namespace is exiting") in net/net-next from 5 days ago,
>>>>>>> maybe
>>>>>>> fixed due to that?
>>>>>>>
>>>>>>
>>>>>> This appears to be the commit introducing the refcnt leak:
>>>>>>
>>>>>> $ git bisect bad
>>>>>> dbc2b5e9a09e9a6664679a667ff81cff6e5f2641 is the first bad commit
>>>>>> commit dbc2b5e9a09e9a6664679a667ff81cff6e5f2641
>>>>>> Author: Xin Long <[email protected]>
>>>>>> Date: Fri May 12 14:39:52 2017 +0800
>>>>>>
>>>>>> sctp: fix src address selection if using secondary addresses for
>>>>>> ipv6
>>>>>>
>>>>>>
>>>>>> v4.14 is bad. Running bisect in the background while doing other
>>>>>> things....
>>>>>>
>>>>>
>>>>> Interesting. The commit that avoids the refcnt leak is
>>>>>
>>>>> commit 955ec4cb3b54c7c389a9f830be7d3ae2056b9212
>>>>> Author: David Ahern <[email protected]>
>>>>> Date: Wed Jan 24 19:45:29 2018 -0800
>>>>>
>>>>> net/ipv6: Do not allow route add with a device that is down
>>>>>
>>>>> That commit does not intentionally address the problem so it is just
>>>>> masking the problematic code introduced by the commit above.
>>>>
>>>> Thanks, David A.
>>>>
>>>> I'm still on a trip. will look into this asap.
>>>
>>>
>>> Alexey and Tommi already had the patches for this issue on
>>> both SCTP v4 and v6 dst_get, Thanks.
>>
>>
>>
>>
>> Is this meant to be fixed already? I am still seeing this on the
>> latest upstream tree.
>>
>
> These two commits are in v4.16-rc1:
>
> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
> Author: Tommi Rantala <[email protected]>
> Date: Mon Feb 5 21:48:14 2018 +0200
>
> sctp: fix dst refcnt leak in sctp_v4_get_dst
> ...
> Fixes: 410f03831 ("sctp: add routing output fallback")
> Fixes: 0ca50d12f ("sctp: fix src address selection if using secondary
> addresses")
>
>
> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
> Author: Alexey Kodanev <[email protected]>
> Date: Mon Feb 5 15:10:35 2018 +0300
>
> sctp: fix dst refcnt leak in sctp_v6_get_dst()
> ...
> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary
> addresses for ipv6")
>
>
> I guess we missed something if it's still reproducible.
>
> I can check it later this week, unless someone else beat me to it.

Hi Tommi,

Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
another one then. But I am still seeing these:

[ 58.799130] unregister_netdevice: waiting for lo to become free.
Usage count = 4
[ 60.847138] unregister_netdevice: waiting for lo to become free.
Usage count = 4
[ 62.895093] unregister_netdevice: waiting for lo to become free.
Usage count = 4
[ 64.943103] unregister_netdevice: waiting for lo to become free.
Usage count = 4

on upstream tree pulled ~12 hours ago.

Kernel does not detect this as any kind of BUG/WARNING, so
syzkaller/syzbot do not catch it as bug and do not try to reproduce,
localize and report.

2018-02-20 16:28:45

by Neil Horman

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
> <[email protected]> wrote:
> > On 19.02.2018 20:59, Dmitry Vyukov wrote:
> >>
> >> On Sat, Feb 3, 2018 at 1:15 PM, Xin Long <[email protected]> wrote:
> >>>>>
> >>>>> On 1/30/18 1:57 PM, David Ahern wrote:
> >>>>>>
> >>>>>> On 1/30/18 1:08 PM, Daniel Borkmann wrote:
> >>>>>>>
> >>>>>>> On 01/30/2018 07:32 PM, Cong Wang wrote:
> >>>>>>>>
> >>>>>>>> On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov <[email protected]>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> Hello,
> >>>>>>>>>
> >>>>>>>>> The following program creates a hang in unregister_netdevice.
> >>>>>>>>> cleanup_net work hangs there forever periodically printing
> >>>>>>>>> "unregister_netdevice: waiting for lo to become free. Usage count =
> >>>>>>>>> 3"
> >>>>>>>>> and creation of any new network namespaces hangs forever.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Interestingly, this is not reproducible on net-next.
> >>>>>>>
> >>>>>>>
> >>>>>>> The most recent change on netns refcnt was 4ee806d51176 ("net: tcp:
> >>>>>>> close
> >>>>>>> sock if net namespace is exiting") in net/net-next from 5 days ago,
> >>>>>>> maybe
> >>>>>>> fixed due to that?
> >>>>>>>
> >>>>>>
> >>>>>> This appears to be the commit introducing the refcnt leak:
> >>>>>>
> >>>>>> $ git bisect bad
> >>>>>> dbc2b5e9a09e9a6664679a667ff81cff6e5f2641 is the first bad commit
> >>>>>> commit dbc2b5e9a09e9a6664679a667ff81cff6e5f2641
> >>>>>> Author: Xin Long <[email protected]>
> >>>>>> Date: Fri May 12 14:39:52 2017 +0800
> >>>>>>
> >>>>>> sctp: fix src address selection if using secondary addresses for
> >>>>>> ipv6
> >>>>>>
> >>>>>>
> >>>>>> v4.14 is bad. Running bisect in the background while doing other
> >>>>>> things....
> >>>>>>
> >>>>>
> >>>>> Interesting. The commit that avoids the refcnt leak is
> >>>>>
> >>>>> commit 955ec4cb3b54c7c389a9f830be7d3ae2056b9212
> >>>>> Author: David Ahern <[email protected]>
> >>>>> Date: Wed Jan 24 19:45:29 2018 -0800
> >>>>>
> >>>>> net/ipv6: Do not allow route add with a device that is down
> >>>>>
> >>>>> That commit does not intentionally address the problem so it is just
> >>>>> masking the problematic code introduced by the commit above.
> >>>>
> >>>> Thanks, David A.
> >>>>
> >>>> I'm still on a trip. will look into this asap.
> >>>
> >>>
> >>> Alexey and Tommi already had the patches for this issue on
> >>> both SCTP v4 and v6 dst_get, Thanks.
> >>
> >>
> >>
> >>
> >> Is this meant to be fixed already? I am still seeing this on the
> >> latest upstream tree.
> >>
> >
> > These two commits are in v4.16-rc1:
> >
> > commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
> > Author: Tommi Rantala <[email protected]>
> > Date: Mon Feb 5 21:48:14 2018 +0200
> >
> > sctp: fix dst refcnt leak in sctp_v4_get_dst
> > ...
> > Fixes: 410f03831 ("sctp: add routing output fallback")
> > Fixes: 0ca50d12f ("sctp: fix src address selection if using secondary
> > addresses")
> >
> >
> > commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
> > Author: Alexey Kodanev <[email protected]>
> > Date: Mon Feb 5 15:10:35 2018 +0300
> >
> > sctp: fix dst refcnt leak in sctp_v6_get_dst()
> > ...
> > Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary
> > addresses for ipv6")
> >
> >
> > I guess we missed something if it's still reproducible.
> >
> > I can check it later this week, unless someone else beat me to it.
>
> Hi Tommi,
>
> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
> another one then. But I am still seeing these:
>
> [ 58.799130] unregister_netdevice: waiting for lo to become free.
> Usage count = 4
> [ 60.847138] unregister_netdevice: waiting for lo to become free.
> Usage count = 4
> [ 62.895093] unregister_netdevice: waiting for lo to become free.
> Usage count = 4
> [ 64.943103] unregister_netdevice: waiting for lo to become free.
> Usage count = 4
>
> on upstream tree pulled ~12 hours ago.
>
Can you write a systemtap script to probe dev_hold, and dev_put, printing out a
backtrace if the device name matches "lo". That should tell us definitively if
the problem is in the same location or not

Neil

> Kernel does not detect this as any kind of BUG/WARNING, so
> syzkaller/syzbot do not catch it as bug and do not try to reproduce,
> localize and report.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2018-02-21 18:54:52

by Tommi Rantala

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On 20.02.2018 18:26, Neil Horman wrote:
> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
>> <[email protected]> wrote:
>>> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>>> Is this meant to be fixed already? I am still seeing this on the
>>>> latest upstream tree.
>>>>
>>>
>>> These two commits are in v4.16-rc1:
>>>
>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
>>> Author: Tommi Rantala <[email protected]>
>>> Date: Mon Feb 5 21:48:14 2018 +0200
>>>
>>> sctp: fix dst refcnt leak in sctp_v4_get_dst
>>> ...
>>> Fixes: 410f03831 ("sctp: add routing output fallback")
>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using secondary
>>> addresses")
>>>
>>>
>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
>>> Author: Alexey Kodanev <[email protected]>
>>> Date: Mon Feb 5 15:10:35 2018 +0300
>>>
>>> sctp: fix dst refcnt leak in sctp_v6_get_dst()
>>> ...
>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary
>>> addresses for ipv6")
>>>
>>>
>>> I guess we missed something if it's still reproducible.
>>>
>>> I can check it later this week, unless someone else beat me to it.
>>
>> Hi Tommi,
>>
>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
>> another one then. But I am still seeing these:
>>
>> [ 58.799130] unregister_netdevice: waiting for lo to become free.
>> Usage count = 4
>> [ 60.847138] unregister_netdevice: waiting for lo to become free.
>> Usage count = 4
>> [ 62.895093] unregister_netdevice: waiting for lo to become free.
>> Usage count = 4
>> [ 64.943103] unregister_netdevice: waiting for lo to become free.
>> Usage count = 4
>>
>> on upstream tree pulled ~12 hours ago.
>>
> Can you write a systemtap script to probe dev_hold, and dev_put, printing out a
> backtrace if the device name matches "lo". That should tell us definitively if
> the problem is in the same location or not

Hi Dmitry, I tested with the reproducer and the kernel .config file that
you sent in the first email in this thread:

With 4.16-rc2 unable to reproduce.

With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting
for lo to become free. Usage count = 3"

With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in
sctp_v6_get_dst()" cherry-picked on top, unable to reproduce.


Is syzkaller doing something else now to trigger the bug...?
Can you still trigger the bug with the same reproducer?


Tommi

2018-04-12 12:19:22

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Wed, Feb 21, 2018 at 3:53 PM, Tommi Rantala
<[email protected]> wrote:
> On 20.02.2018 18:26, Neil Horman wrote:
>>
>> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
>>>
>>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
>>> <[email protected]> wrote:
>>>>
>>>> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>>>>
>>>>> Is this meant to be fixed already? I am still seeing this on the
>>>>> latest upstream tree.
>>>>>
>>>>
>>>> These two commits are in v4.16-rc1:
>>>>
>>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
>>>> Author: Tommi Rantala <[email protected]>
>>>> Date: Mon Feb 5 21:48:14 2018 +0200
>>>>
>>>> sctp: fix dst refcnt leak in sctp_v4_get_dst
>>>> ...
>>>> Fixes: 410f03831 ("sctp: add routing output fallback")
>>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using
>>>> secondary
>>>> addresses")
>>>>
>>>>
>>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
>>>> Author: Alexey Kodanev <[email protected]>
>>>> Date: Mon Feb 5 15:10:35 2018 +0300
>>>>
>>>> sctp: fix dst refcnt leak in sctp_v6_get_dst()
>>>> ...
>>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using
>>>> secondary
>>>> addresses for ipv6")
>>>>
>>>>
>>>> I guess we missed something if it's still reproducible.
>>>>
>>>> I can check it later this week, unless someone else beat me to it.
>>>
>>>
>>> Hi Tommi,
>>>
>>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
>>> another one then. But I am still seeing these:
>>>
>>> [ 58.799130] unregister_netdevice: waiting for lo to become free.
>>> Usage count = 4
>>> [ 60.847138] unregister_netdevice: waiting for lo to become free.
>>> Usage count = 4
>>> [ 62.895093] unregister_netdevice: waiting for lo to become free.
>>> Usage count = 4
>>> [ 64.943103] unregister_netdevice: waiting for lo to become free.
>>> Usage count = 4
>>>
>>> on upstream tree pulled ~12 hours ago.
>>>
>> Can you write a systemtap script to probe dev_hold, and dev_put, printing
>> out a
>> backtrace if the device name matches "lo". That should tell us
>> definitively if
>> the problem is in the same location or not
>
>
> Hi Dmitry, I tested with the reproducer and the kernel .config file that you
> sent in the first email in this thread:
>
> With 4.16-rc2 unable to reproduce.
>
> With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for
> lo to become free. Usage count = 3"
>
> With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()"
> cherry-picked on top, unable to reproduce.
>
>
> Is syzkaller doing something else now to trigger the bug...?
> Can you still trigger the bug with the same reproducer?

Hi Neil, Tommi,

Reviving this old thread about "unregister_netdevice: waiting for lo
to become free. Usage count = 3" hangs.
I still did not have time to deep dive into what happens there (too
many bugs coming from syzbot). But this still actively happens and I
suspect accounts to a significant portion of various hang reports,
which are quite unpleasant.

One idea that could make it all simpler:

Is this wait loop in netdev_wait_allrefs() supposed to wait for any
prolonged periods of time under any non-buggy conditions? E.g. more
than 1-2 minutes?
If it only supposed to wait briefly for things that already supposed
to be shutting down, and we add a WARNING there after some timeout,
then syzbot will report all info how/when it happens, hopefully
extracting reproducers, and all the nice things.
But this WARNING should not have any false positives under any
realistic conditions (e.g. waiting for arrival of remote packets with
large timeouts).

Looking at some task hung reports, it seems that this code holds some
mutexes, takes workqueue thread and prevents any progress with
destruction of other devices (and net namespace creation/destruction),
so I guess it should not wait for any indefinite periods of time?

2018-04-13 12:41:49

by Neil Horman

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Thu, Apr 12, 2018 at 02:15:30PM +0200, Dmitry Vyukov wrote:
> On Wed, Feb 21, 2018 at 3:53 PM, Tommi Rantala
> <[email protected]> wrote:
> > On 20.02.2018 18:26, Neil Horman wrote:
> >>
> >> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
> >>>
> >>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
> >>> <[email protected]> wrote:
> >>>>
> >>>> On 19.02.2018 20:59, Dmitry Vyukov wrote:
> >>>>>
> >>>>> Is this meant to be fixed already? I am still seeing this on the
> >>>>> latest upstream tree.
> >>>>>
> >>>>
> >>>> These two commits are in v4.16-rc1:
> >>>>
> >>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
> >>>> Author: Tommi Rantala <[email protected]>
> >>>> Date: Mon Feb 5 21:48:14 2018 +0200
> >>>>
> >>>> sctp: fix dst refcnt leak in sctp_v4_get_dst
> >>>> ...
> >>>> Fixes: 410f03831 ("sctp: add routing output fallback")
> >>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using
> >>>> secondary
> >>>> addresses")
> >>>>
> >>>>
> >>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
> >>>> Author: Alexey Kodanev <[email protected]>
> >>>> Date: Mon Feb 5 15:10:35 2018 +0300
> >>>>
> >>>> sctp: fix dst refcnt leak in sctp_v6_get_dst()
> >>>> ...
> >>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using
> >>>> secondary
> >>>> addresses for ipv6")
> >>>>
> >>>>
> >>>> I guess we missed something if it's still reproducible.
> >>>>
> >>>> I can check it later this week, unless someone else beat me to it.
> >>>
> >>>
> >>> Hi Tommi,
> >>>
> >>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
> >>> another one then. But I am still seeing these:
> >>>
> >>> [ 58.799130] unregister_netdevice: waiting for lo to become free.
> >>> Usage count = 4
> >>> [ 60.847138] unregister_netdevice: waiting for lo to become free.
> >>> Usage count = 4
> >>> [ 62.895093] unregister_netdevice: waiting for lo to become free.
> >>> Usage count = 4
> >>> [ 64.943103] unregister_netdevice: waiting for lo to become free.
> >>> Usage count = 4
> >>>
> >>> on upstream tree pulled ~12 hours ago.
> >>>
> >> Can you write a systemtap script to probe dev_hold, and dev_put, printing
> >> out a
> >> backtrace if the device name matches "lo". That should tell us
> >> definitively if
> >> the problem is in the same location or not
> >
> >
> > Hi Dmitry, I tested with the reproducer and the kernel .config file that you
> > sent in the first email in this thread:
> >
> > With 4.16-rc2 unable to reproduce.
> >
> > With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for
> > lo to become free. Usage count = 3"
> >
> > With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()"
> > cherry-picked on top, unable to reproduce.
> >
> >
> > Is syzkaller doing something else now to trigger the bug...?
> > Can you still trigger the bug with the same reproducer?
>
> Hi Neil, Tommi,
>
> Reviving this old thread about "unregister_netdevice: waiting for lo
> to become free. Usage count = 3" hangs.
> I still did not have time to deep dive into what happens there (too
> many bugs coming from syzbot). But this still actively happens and I
> suspect accounts to a significant portion of various hang reports,
> which are quite unpleasant.
>
> One idea that could make it all simpler:
>
> Is this wait loop in netdev_wait_allrefs() supposed to wait for any
> prolonged periods of time under any non-buggy conditions? E.g. more
> than 1-2 minutes?
As the name implies, its supposed to wait for the reference count to be zero
indefinately, but yes, under normal operation, its intended to not have to wait
very long at all. The issuance of the NETDEV_UNREGISTER_FINAL notification is
meant to be a subscribable signal to any code path holding a reference that it
needs to be dropped so that the progress can be made.

Note that the "waiting for %s to become free" message is triggered after 10
seconds of waiting, and is likely the trigger you want, Its just an emergency
level log message rather a WARN. I don't think we want to change that
permanently, but you could certainly alter it in the code to cause syzbot to
catch it (i.e. WARN_ON(time_after(jiffies, warning_time + 10 * HZ)) )


> If it only supposed to wait briefly for things that already supposed
> to be shutting down, and we add a WARNING there after some timeout,
> then syzbot will report all info how/when it happens, hopefully
> extracting reproducers, and all the nice things.
> But this WARNING should not have any false positives under any
> realistic conditions (e.g. waiting for arrival of remote packets with
> large timeouts).
>
> Looking at some task hung reports, it seems that this code holds some
> mutexes, takes workqueue thread and prevents any progress with
> destruction of other devices (and net namespace creation/destruction),
> so I guess it should not wait for any indefinite periods of time?
Well, it drops everything and sleeps periodically, so its safe in and of itself.
The problem is its waiting for the reference count of a device to drop to zero,
and some other execution context isn't taking the appropriate action to do that.

Neil

>

2018-04-13 12:49:34

by Dan Streetman

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Thu, Apr 12, 2018 at 8:15 AM, Dmitry Vyukov <[email protected]> wrote:
> On Wed, Feb 21, 2018 at 3:53 PM, Tommi Rantala
> <[email protected]> wrote:
>> On 20.02.2018 18:26, Neil Horman wrote:
>>>
>>> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
>>>>
>>>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
>>>> <[email protected]> wrote:
>>>>>
>>>>> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>>>>>
>>>>>> Is this meant to be fixed already? I am still seeing this on the
>>>>>> latest upstream tree.
>>>>>>
>>>>>
>>>>> These two commits are in v4.16-rc1:
>>>>>
>>>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
>>>>> Author: Tommi Rantala <[email protected]>
>>>>> Date: Mon Feb 5 21:48:14 2018 +0200
>>>>>
>>>>> sctp: fix dst refcnt leak in sctp_v4_get_dst
>>>>> ...
>>>>> Fixes: 410f03831 ("sctp: add routing output fallback")
>>>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using
>>>>> secondary
>>>>> addresses")
>>>>>
>>>>>
>>>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
>>>>> Author: Alexey Kodanev <[email protected]>
>>>>> Date: Mon Feb 5 15:10:35 2018 +0300
>>>>>
>>>>> sctp: fix dst refcnt leak in sctp_v6_get_dst()
>>>>> ...
>>>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using
>>>>> secondary
>>>>> addresses for ipv6")
>>>>>
>>>>>
>>>>> I guess we missed something if it's still reproducible.
>>>>>
>>>>> I can check it later this week, unless someone else beat me to it.
>>>>
>>>>
>>>> Hi Tommi,
>>>>
>>>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
>>>> another one then. But I am still seeing these:
>>>>
>>>> [ 58.799130] unregister_netdevice: waiting for lo to become free.
>>>> Usage count = 4
>>>> [ 60.847138] unregister_netdevice: waiting for lo to become free.
>>>> Usage count = 4
>>>> [ 62.895093] unregister_netdevice: waiting for lo to become free.
>>>> Usage count = 4
>>>> [ 64.943103] unregister_netdevice: waiting for lo to become free.
>>>> Usage count = 4
>>>>
>>>> on upstream tree pulled ~12 hours ago.
>>>>
>>> Can you write a systemtap script to probe dev_hold, and dev_put, printing
>>> out a
>>> backtrace if the device name matches "lo". That should tell us
>>> definitively if
>>> the problem is in the same location or not
>>
>>
>> Hi Dmitry, I tested with the reproducer and the kernel .config file that you
>> sent in the first email in this thread:
>>
>> With 4.16-rc2 unable to reproduce.
>>
>> With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for
>> lo to become free. Usage count = 3"
>>
>> With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()"
>> cherry-picked on top, unable to reproduce.
>>
>>
>> Is syzkaller doing something else now to trigger the bug...?
>> Can you still trigger the bug with the same reproducer?
>
> Hi Neil, Tommi,
>
> Reviving this old thread about "unregister_netdevice: waiting for lo
> to become free. Usage count = 3" hangs.
> I still did not have time to deep dive into what happens there (too
> many bugs coming from syzbot). But this still actively happens and I
> suspect accounts to a significant portion of various hang reports,
> which are quite unpleasant.
>
> One idea that could make it all simpler:
>
> Is this wait loop in netdev_wait_allrefs() supposed to wait for any
> prolonged periods of time under any non-buggy conditions? E.g. more
> than 1-2 minutes?
> If it only supposed to wait briefly for things that already supposed
> to be shutting down, and we add a WARNING there after some timeout,
> then syzbot will report all info how/when it happens, hopefully
> extracting reproducers, and all the nice things.
> But this WARNING should not have any false positives under any
> realistic conditions (e.g. waiting for arrival of remote packets with
> large timeouts).
>
> Looking at some task hung reports, it seems that this code holds some
> mutexes, takes workqueue thread and prevents any progress with
> destruction of other devices (and net namespace creation/destruction),
> so I guess it should not wait for any indefinite periods of time?

I'm working on this currently:
https://bugs.launchpad.net/ubuntu/zesty/+source/linux/+bug/1711407

I added a summary of what I've found to be the cause (or at least, one
possible cause) of this:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/72

I'm working on a patch to work around the main side-effect of this,
which is hanging while holding the global net mutex. Hangs will still
happen (e.g. if a dst leaks) but should not affect anything else,
other than a leak of the dst and its net namespace.

Fixing the dst leaks is important too, of course, but a dst leak (or
other cause) shouldn't break the entire system.

2018-04-13 15:56:16

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Fri, Apr 13, 2018 at 2:43 PM, Dan Streetman <[email protected]> wrote:
> On Thu, Apr 12, 2018 at 8:15 AM, Dmitry Vyukov <[email protected]> wrote:
>> On Wed, Feb 21, 2018 at 3:53 PM, Tommi Rantala
>> <[email protected]> wrote:
>>> On 20.02.2018 18:26, Neil Horman wrote:
>>>>
>>>> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
>>>>>
>>>>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>>>>>>
>>>>>>> Is this meant to be fixed already? I am still seeing this on the
>>>>>>> latest upstream tree.
>>>>>>>
>>>>>>
>>>>>> These two commits are in v4.16-rc1:
>>>>>>
>>>>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
>>>>>> Author: Tommi Rantala <[email protected]>
>>>>>> Date: Mon Feb 5 21:48:14 2018 +0200
>>>>>>
>>>>>> sctp: fix dst refcnt leak in sctp_v4_get_dst
>>>>>> ...
>>>>>> Fixes: 410f03831 ("sctp: add routing output fallback")
>>>>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using
>>>>>> secondary
>>>>>> addresses")
>>>>>>
>>>>>>
>>>>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
>>>>>> Author: Alexey Kodanev <[email protected]>
>>>>>> Date: Mon Feb 5 15:10:35 2018 +0300
>>>>>>
>>>>>> sctp: fix dst refcnt leak in sctp_v6_get_dst()
>>>>>> ...
>>>>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using
>>>>>> secondary
>>>>>> addresses for ipv6")
>>>>>>
>>>>>>
>>>>>> I guess we missed something if it's still reproducible.
>>>>>>
>>>>>> I can check it later this week, unless someone else beat me to it.
>>>>>
>>>>>
>>>>> Hi Tommi,
>>>>>
>>>>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
>>>>> another one then. But I am still seeing these:
>>>>>
>>>>> [ 58.799130] unregister_netdevice: waiting for lo to become free.
>>>>> Usage count = 4
>>>>> [ 60.847138] unregister_netdevice: waiting for lo to become free.
>>>>> Usage count = 4
>>>>> [ 62.895093] unregister_netdevice: waiting for lo to become free.
>>>>> Usage count = 4
>>>>> [ 64.943103] unregister_netdevice: waiting for lo to become free.
>>>>> Usage count = 4
>>>>>
>>>>> on upstream tree pulled ~12 hours ago.
>>>>>
>>>> Can you write a systemtap script to probe dev_hold, and dev_put, printing
>>>> out a
>>>> backtrace if the device name matches "lo". That should tell us
>>>> definitively if
>>>> the problem is in the same location or not
>>>
>>>
>>> Hi Dmitry, I tested with the reproducer and the kernel .config file that you
>>> sent in the first email in this thread:
>>>
>>> With 4.16-rc2 unable to reproduce.
>>>
>>> With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for
>>> lo to become free. Usage count = 3"
>>>
>>> With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()"
>>> cherry-picked on top, unable to reproduce.
>>>
>>>
>>> Is syzkaller doing something else now to trigger the bug...?
>>> Can you still trigger the bug with the same reproducer?
>>
>> Hi Neil, Tommi,
>>
>> Reviving this old thread about "unregister_netdevice: waiting for lo
>> to become free. Usage count = 3" hangs.
>> I still did not have time to deep dive into what happens there (too
>> many bugs coming from syzbot). But this still actively happens and I
>> suspect accounts to a significant portion of various hang reports,
>> which are quite unpleasant.
>>
>> One idea that could make it all simpler:
>>
>> Is this wait loop in netdev_wait_allrefs() supposed to wait for any
>> prolonged periods of time under any non-buggy conditions? E.g. more
>> than 1-2 minutes?
>> If it only supposed to wait briefly for things that already supposed
>> to be shutting down, and we add a WARNING there after some timeout,
>> then syzbot will report all info how/when it happens, hopefully
>> extracting reproducers, and all the nice things.
>> But this WARNING should not have any false positives under any
>> realistic conditions (e.g. waiting for arrival of remote packets with
>> large timeouts).
>>
>> Looking at some task hung reports, it seems that this code holds some
>> mutexes, takes workqueue thread and prevents any progress with
>> destruction of other devices (and net namespace creation/destruction),
>> so I guess it should not wait for any indefinite periods of time?
>
> I'm working on this currently:
> https://bugs.launchpad.net/ubuntu/zesty/+source/linux/+bug/1711407
>
> I added a summary of what I've found to be the cause (or at least, one
> possible cause) of this:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/72
>
> I'm working on a patch to work around the main side-effect of this,
> which is hanging while holding the global net mutex. Hangs will still
> happen (e.g. if a dst leaks) but should not affect anything else,
> other than a leak of the dst and its net namespace.
>
> Fixing the dst leaks is important too, of course, but a dst leak (or
> other cause) shouldn't break the entire system.

Leaking some memory is definitely better than hanging the system.

So I've made syzkaller to recognize "unregister_netdevice: waiting for
(.*) to become free" as a kernel bug:
https://github.com/google/syzkaller/commit/7a67784ca8bdc3b26cce2f0ec9a40d2dd9ec9396
Unfortunately it does not make it catch these bugs because creating a
net namespace per test is too damn slow, so namespaces are reused for
lots of tests and when/if it's eventually destroyed it's already too
late to find root cause.

But I've run a one-off experiment with prompt net namespace
destruction and syzkaller was able to easily extract a C reproducer:
https://gist.githubusercontent.com/dvyukov/d571e8fff24e127ca48a8c4790d42bfa/raw/52050e93ba9afbb5126b9d7bb39b7e71a82af016/gistfile1.txt

On upstream 16e205cf42da1f497b10a4a24f563e6c0d574eec with this config:
https://gist.githubusercontent.com/dvyukov/9663c57443adb21f2795b92ef0829d62/raw/bbea0652e23746096dd56855a28f6c681aebcdee/gistfile1.txt

this gives me:

[ 83.183198] unregister_netdevice: waiting for lo to become free.
Usage count = 9
[ 85.231202] unregister_netdevice: waiting for lo to become free.
Usage count = 9
...
[ 523.511205] unregister_netdevice: waiting for lo to become free.
Usage count = 9
...

This is generated from this syzkaller program:

r0 = socket$inet6(0xa, 0x1, 0x84)
setsockopt$inet6_IPV6_XFRM_POLICY(r0, 0x29, 0x23,
&(0x7f0000000380)={{{@in6=@remote={0xfe, 0x80, [], 0xbb},
@in=@dev={0xac, 0x14, 0x14}, 0x0, 0x0, 0x0, 0x0, 0xa}, {}, {}, 0x0,
0x0, 0x1}, {{@in=@local={0xac, 0x14, 0x14, 0xaa}, 0x0, 0x32}, 0x0,
@in=@local={0xac, 0x14, 0x14, 0xaa}, 0x3504}}, 0xe8)
bind$inet6(r0, &(0x7f0000000000)={0xa, 0x4e20}, 0x1c)
connect$inet(r0, &(0x7f0000000040)={0x2, 0x4e20, @dev={0xac, 0x14,
0x14, 0xd}}, 0x10)
syz_emit_ethernet(0x3e, &(0x7f00000001c0)={@local={[0xaa, 0xaa, 0xaa,
0xaa, 0xaa], 0xaa}, @dev={[0xaa, 0xaa, 0xaa, 0xaa, 0xaa]}, [],
{@ipv6={0x86dd, {0x0, 0x6, "50a09c", 0x8, 0xffffff11, 0x0,
@remote={0xfe, 0x80, [], 0xbb}, @local={0xfe, 0x80, [], 0xaa}, {[],
@udp={0x0, 0x4e20, 0x8}}}}}}, &(0x7f0000000040))

So this seems to be related to IPv6 and/or xfrm and is potentially
caused by external packets (that syz_emit_ethernet call).

2018-04-16 07:37:02

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Fri, Apr 13, 2018 at 5:54 PM, Dmitry Vyukov <[email protected]> wrote:
> On Fri, Apr 13, 2018 at 2:43 PM, Dan Streetman <[email protected]> wrote:
>> On Thu, Apr 12, 2018 at 8:15 AM, Dmitry Vyukov <[email protected]> wrote:
>>> On Wed, Feb 21, 2018 at 3:53 PM, Tommi Rantala
>>> <[email protected]> wrote:
>>>> On 20.02.2018 18:26, Neil Horman wrote:
>>>>>
>>>>> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
>>>>>>
>>>>>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>>>>>>>
>>>>>>>> Is this meant to be fixed already? I am still seeing this on the
>>>>>>>> latest upstream tree.
>>>>>>>>
>>>>>>>
>>>>>>> These two commits are in v4.16-rc1:
>>>>>>>
>>>>>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
>>>>>>> Author: Tommi Rantala <[email protected]>
>>>>>>> Date: Mon Feb 5 21:48:14 2018 +0200
>>>>>>>
>>>>>>> sctp: fix dst refcnt leak in sctp_v4_get_dst
>>>>>>> ...
>>>>>>> Fixes: 410f03831 ("sctp: add routing output fallback")
>>>>>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using
>>>>>>> secondary
>>>>>>> addresses")
>>>>>>>
>>>>>>>
>>>>>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
>>>>>>> Author: Alexey Kodanev <[email protected]>
>>>>>>> Date: Mon Feb 5 15:10:35 2018 +0300
>>>>>>>
>>>>>>> sctp: fix dst refcnt leak in sctp_v6_get_dst()
>>>>>>> ...
>>>>>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using
>>>>>>> secondary
>>>>>>> addresses for ipv6")
>>>>>>>
>>>>>>>
>>>>>>> I guess we missed something if it's still reproducible.
>>>>>>>
>>>>>>> I can check it later this week, unless someone else beat me to it.
>>>>>>
>>>>>>
>>>>>> Hi Tommi,
>>>>>>
>>>>>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
>>>>>> another one then. But I am still seeing these:
>>>>>>
>>>>>> [ 58.799130] unregister_netdevice: waiting for lo to become free.
>>>>>> Usage count = 4
>>>>>> [ 60.847138] unregister_netdevice: waiting for lo to become free.
>>>>>> Usage count = 4
>>>>>> [ 62.895093] unregister_netdevice: waiting for lo to become free.
>>>>>> Usage count = 4
>>>>>> [ 64.943103] unregister_netdevice: waiting for lo to become free.
>>>>>> Usage count = 4
>>>>>>
>>>>>> on upstream tree pulled ~12 hours ago.
>>>>>>
>>>>> Can you write a systemtap script to probe dev_hold, and dev_put, printing
>>>>> out a
>>>>> backtrace if the device name matches "lo". That should tell us
>>>>> definitively if
>>>>> the problem is in the same location or not
>>>>
>>>>
>>>> Hi Dmitry, I tested with the reproducer and the kernel .config file that you
>>>> sent in the first email in this thread:
>>>>
>>>> With 4.16-rc2 unable to reproduce.
>>>>
>>>> With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for
>>>> lo to become free. Usage count = 3"
>>>>
>>>> With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()"
>>>> cherry-picked on top, unable to reproduce.
>>>>
>>>>
>>>> Is syzkaller doing something else now to trigger the bug...?
>>>> Can you still trigger the bug with the same reproducer?
>>>
>>> Hi Neil, Tommi,
>>>
>>> Reviving this old thread about "unregister_netdevice: waiting for lo
>>> to become free. Usage count = 3" hangs.
>>> I still did not have time to deep dive into what happens there (too
>>> many bugs coming from syzbot). But this still actively happens and I
>>> suspect accounts to a significant portion of various hang reports,
>>> which are quite unpleasant.
>>>
>>> One idea that could make it all simpler:
>>>
>>> Is this wait loop in netdev_wait_allrefs() supposed to wait for any
>>> prolonged periods of time under any non-buggy conditions? E.g. more
>>> than 1-2 minutes?
>>> If it only supposed to wait briefly for things that already supposed
>>> to be shutting down, and we add a WARNING there after some timeout,
>>> then syzbot will report all info how/when it happens, hopefully
>>> extracting reproducers, and all the nice things.
>>> But this WARNING should not have any false positives under any
>>> realistic conditions (e.g. waiting for arrival of remote packets with
>>> large timeouts).
>>>
>>> Looking at some task hung reports, it seems that this code holds some
>>> mutexes, takes workqueue thread and prevents any progress with
>>> destruction of other devices (and net namespace creation/destruction),
>>> so I guess it should not wait for any indefinite periods of time?
>>
>> I'm working on this currently:
>> https://bugs.launchpad.net/ubuntu/zesty/+source/linux/+bug/1711407
>>
>> I added a summary of what I've found to be the cause (or at least, one
>> possible cause) of this:
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/72
>>
>> I'm working on a patch to work around the main side-effect of this,
>> which is hanging while holding the global net mutex. Hangs will still
>> happen (e.g. if a dst leaks) but should not affect anything else,
>> other than a leak of the dst and its net namespace.
>>
>> Fixing the dst leaks is important too, of course, but a dst leak (or
>> other cause) shouldn't break the entire system.
>
> Leaking some memory is definitely better than hanging the system.
>
> So I've made syzkaller to recognize "unregister_netdevice: waiting for
> (.*) to become free" as a kernel bug:
> https://github.com/google/syzkaller/commit/7a67784ca8bdc3b26cce2f0ec9a40d2dd9ec9396
> Unfortunately it does not make it catch these bugs because creating a
> net namespace per test is too damn slow, so namespaces are reused for
> lots of tests and when/if it's eventually destroyed it's already too
> late to find root cause.
>
> But I've run a one-off experiment with prompt net namespace
> destruction and syzkaller was able to easily extract a C reproducer:
> https://gist.githubusercontent.com/dvyukov/d571e8fff24e127ca48a8c4790d42bfa/raw/52050e93ba9afbb5126b9d7bb39b7e71a82af016/gistfile1.txt
>
> On upstream 16e205cf42da1f497b10a4a24f563e6c0d574eec with this config:
> https://gist.githubusercontent.com/dvyukov/9663c57443adb21f2795b92ef0829d62/raw/bbea0652e23746096dd56855a28f6c681aebcdee/gistfile1.txt
>
> this gives me:
>
> [ 83.183198] unregister_netdevice: waiting for lo to become free.
> Usage count = 9
> [ 85.231202] unregister_netdevice: waiting for lo to become free.
> Usage count = 9
> ...
> [ 523.511205] unregister_netdevice: waiting for lo to become free.
> Usage count = 9
> ...
>
> This is generated from this syzkaller program:
>
> r0 = socket$inet6(0xa, 0x1, 0x84)
> setsockopt$inet6_IPV6_XFRM_POLICY(r0, 0x29, 0x23,
> &(0x7f0000000380)={{{@in6=@remote={0xfe, 0x80, [], 0xbb},
> @in=@dev={0xac, 0x14, 0x14}, 0x0, 0x0, 0x0, 0x0, 0xa}, {}, {}, 0x0,
> 0x0, 0x1}, {{@in=@local={0xac, 0x14, 0x14, 0xaa}, 0x0, 0x32}, 0x0,
> @in=@local={0xac, 0x14, 0x14, 0xaa}, 0x3504}}, 0xe8)
> bind$inet6(r0, &(0x7f0000000000)={0xa, 0x4e20}, 0x1c)
> connect$inet(r0, &(0x7f0000000040)={0x2, 0x4e20, @dev={0xac, 0x14,
> 0x14, 0xd}}, 0x10)
> syz_emit_ethernet(0x3e, &(0x7f00000001c0)={@local={[0xaa, 0xaa, 0xaa,
> 0xaa, 0xaa], 0xaa}, @dev={[0xaa, 0xaa, 0xaa, 0xaa, 0xaa]}, [],
> {@ipv6={0x86dd, {0x0, 0x6, "50a09c", 0x8, 0xffffff11, 0x0,
> @remote={0xfe, 0x80, [], 0xbb}, @local={0xfe, 0x80, [], 0xaa}, {[],
> @udp={0x0, 0x4e20, 0x8}}}}}}, &(0x7f0000000040))
>
> So this seems to be related to IPv6 and/or xfrm and is potentially
> caused by external packets (that syz_emit_ethernet call).



Here is another repro which seems to be a different bug (note that it
requires fault injection):

https://gist.githubusercontent.com/dvyukov/1c56623016cc4c24a69d433c5114ad5b/raw/530478f571b195193101b912aa646948528baa8e/gistfile1.txt

Dan, do you mind taking a look at them? Fixing these should eliminate
root causes of these hangs/leaks.

2018-04-16 19:44:18

by Dan Streetman

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Mon, Apr 16, 2018 at 3:35 AM, Dmitry Vyukov <[email protected]> wrote:
> On Fri, Apr 13, 2018 at 5:54 PM, Dmitry Vyukov <[email protected]> wrote:
>> On Fri, Apr 13, 2018 at 2:43 PM, Dan Streetman <[email protected]> wrote:
>>> On Thu, Apr 12, 2018 at 8:15 AM, Dmitry Vyukov <[email protected]> wrote:
>>>> On Wed, Feb 21, 2018 at 3:53 PM, Tommi Rantala
>>>> <[email protected]> wrote:
>>>>> On 20.02.2018 18:26, Neil Horman wrote:
>>>>>>
>>>>>> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
>>>>>>>
>>>>>>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>>>>>>>>
>>>>>>>>> Is this meant to be fixed already? I am still seeing this on the
>>>>>>>>> latest upstream tree.
>>>>>>>>>
>>>>>>>>
>>>>>>>> These two commits are in v4.16-rc1:
>>>>>>>>
>>>>>>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
>>>>>>>> Author: Tommi Rantala <[email protected]>
>>>>>>>> Date: Mon Feb 5 21:48:14 2018 +0200
>>>>>>>>
>>>>>>>> sctp: fix dst refcnt leak in sctp_v4_get_dst
>>>>>>>> ...
>>>>>>>> Fixes: 410f03831 ("sctp: add routing output fallback")
>>>>>>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using
>>>>>>>> secondary
>>>>>>>> addresses")
>>>>>>>>
>>>>>>>>
>>>>>>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
>>>>>>>> Author: Alexey Kodanev <[email protected]>
>>>>>>>> Date: Mon Feb 5 15:10:35 2018 +0300
>>>>>>>>
>>>>>>>> sctp: fix dst refcnt leak in sctp_v6_get_dst()
>>>>>>>> ...
>>>>>>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using
>>>>>>>> secondary
>>>>>>>> addresses for ipv6")
>>>>>>>>
>>>>>>>>
>>>>>>>> I guess we missed something if it's still reproducible.
>>>>>>>>
>>>>>>>> I can check it later this week, unless someone else beat me to it.
>>>>>>>
>>>>>>>
>>>>>>> Hi Tommi,
>>>>>>>
>>>>>>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
>>>>>>> another one then. But I am still seeing these:
>>>>>>>
>>>>>>> [ 58.799130] unregister_netdevice: waiting for lo to become free.
>>>>>>> Usage count = 4
>>>>>>> [ 60.847138] unregister_netdevice: waiting for lo to become free.
>>>>>>> Usage count = 4
>>>>>>> [ 62.895093] unregister_netdevice: waiting for lo to become free.
>>>>>>> Usage count = 4
>>>>>>> [ 64.943103] unregister_netdevice: waiting for lo to become free.
>>>>>>> Usage count = 4
>>>>>>>
>>>>>>> on upstream tree pulled ~12 hours ago.
>>>>>>>
>>>>>> Can you write a systemtap script to probe dev_hold, and dev_put, printing
>>>>>> out a
>>>>>> backtrace if the device name matches "lo". That should tell us
>>>>>> definitively if
>>>>>> the problem is in the same location or not
>>>>>
>>>>>
>>>>> Hi Dmitry, I tested with the reproducer and the kernel .config file that you
>>>>> sent in the first email in this thread:
>>>>>
>>>>> With 4.16-rc2 unable to reproduce.
>>>>>
>>>>> With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for
>>>>> lo to become free. Usage count = 3"
>>>>>
>>>>> With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()"
>>>>> cherry-picked on top, unable to reproduce.
>>>>>
>>>>>
>>>>> Is syzkaller doing something else now to trigger the bug...?
>>>>> Can you still trigger the bug with the same reproducer?
>>>>
>>>> Hi Neil, Tommi,
>>>>
>>>> Reviving this old thread about "unregister_netdevice: waiting for lo
>>>> to become free. Usage count = 3" hangs.
>>>> I still did not have time to deep dive into what happens there (too
>>>> many bugs coming from syzbot). But this still actively happens and I
>>>> suspect accounts to a significant portion of various hang reports,
>>>> which are quite unpleasant.
>>>>
>>>> One idea that could make it all simpler:
>>>>
>>>> Is this wait loop in netdev_wait_allrefs() supposed to wait for any
>>>> prolonged periods of time under any non-buggy conditions? E.g. more
>>>> than 1-2 minutes?
>>>> If it only supposed to wait briefly for things that already supposed
>>>> to be shutting down, and we add a WARNING there after some timeout,
>>>> then syzbot will report all info how/when it happens, hopefully
>>>> extracting reproducers, and all the nice things.
>>>> But this WARNING should not have any false positives under any
>>>> realistic conditions (e.g. waiting for arrival of remote packets with
>>>> large timeouts).
>>>>
>>>> Looking at some task hung reports, it seems that this code holds some
>>>> mutexes, takes workqueue thread and prevents any progress with
>>>> destruction of other devices (and net namespace creation/destruction),
>>>> so I guess it should not wait for any indefinite periods of time?
>>>
>>> I'm working on this currently:
>>> https://bugs.launchpad.net/ubuntu/zesty/+source/linux/+bug/1711407
>>>
>>> I added a summary of what I've found to be the cause (or at least, one
>>> possible cause) of this:
>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/72
>>>
>>> I'm working on a patch to work around the main side-effect of this,
>>> which is hanging while holding the global net mutex. Hangs will still
>>> happen (e.g. if a dst leaks) but should not affect anything else,
>>> other than a leak of the dst and its net namespace.
>>>
>>> Fixing the dst leaks is important too, of course, but a dst leak (or
>>> other cause) shouldn't break the entire system.
>>
>> Leaking some memory is definitely better than hanging the system.
>>
>> So I've made syzkaller to recognize "unregister_netdevice: waiting for
>> (.*) to become free" as a kernel bug:
>> https://github.com/google/syzkaller/commit/7a67784ca8bdc3b26cce2f0ec9a40d2dd9ec9396
>> Unfortunately it does not make it catch these bugs because creating a
>> net namespace per test is too damn slow, so namespaces are reused for
>> lots of tests and when/if it's eventually destroyed it's already too
>> late to find root cause.
>>
>> But I've run a one-off experiment with prompt net namespace
>> destruction and syzkaller was able to easily extract a C reproducer:
>> https://gist.githubusercontent.com/dvyukov/d571e8fff24e127ca48a8c4790d42bfa/raw/52050e93ba9afbb5126b9d7bb39b7e71a82af016/gistfile1.txt
>>
>> On upstream 16e205cf42da1f497b10a4a24f563e6c0d574eec with this config:
>> https://gist.githubusercontent.com/dvyukov/9663c57443adb21f2795b92ef0829d62/raw/bbea0652e23746096dd56855a28f6c681aebcdee/gistfile1.txt
>>
>> this gives me:
>>
>> [ 83.183198] unregister_netdevice: waiting for lo to become free.
>> Usage count = 9
>> [ 85.231202] unregister_netdevice: waiting for lo to become free.
>> Usage count = 9
>> ...
>> [ 523.511205] unregister_netdevice: waiting for lo to become free.
>> Usage count = 9
>> ...
>>
>> This is generated from this syzkaller program:
>>
>> r0 = socket$inet6(0xa, 0x1, 0x84)
>> setsockopt$inet6_IPV6_XFRM_POLICY(r0, 0x29, 0x23,
>> &(0x7f0000000380)={{{@in6=@remote={0xfe, 0x80, [], 0xbb},
>> @in=@dev={0xac, 0x14, 0x14}, 0x0, 0x0, 0x0, 0x0, 0xa}, {}, {}, 0x0,
>> 0x0, 0x1}, {{@in=@local={0xac, 0x14, 0x14, 0xaa}, 0x0, 0x32}, 0x0,
>> @in=@local={0xac, 0x14, 0x14, 0xaa}, 0x3504}}, 0xe8)
>> bind$inet6(r0, &(0x7f0000000000)={0xa, 0x4e20}, 0x1c)
>> connect$inet(r0, &(0x7f0000000040)={0x2, 0x4e20, @dev={0xac, 0x14,
>> 0x14, 0xd}}, 0x10)
>> syz_emit_ethernet(0x3e, &(0x7f00000001c0)={@local={[0xaa, 0xaa, 0xaa,
>> 0xaa, 0xaa], 0xaa}, @dev={[0xaa, 0xaa, 0xaa, 0xaa, 0xaa]}, [],
>> {@ipv6={0x86dd, {0x0, 0x6, "50a09c", 0x8, 0xffffff11, 0x0,
>> @remote={0xfe, 0x80, [], 0xbb}, @local={0xfe, 0x80, [], 0xaa}, {[],
>> @udp={0x0, 0x4e20, 0x8}}}}}}, &(0x7f0000000040))
>>
>> So this seems to be related to IPv6 and/or xfrm and is potentially
>> caused by external packets (that syz_emit_ethernet call).
>
>
>
> Here is another repro which seems to be a different bug (note that it
> requires fault injection):
>
> https://gist.githubusercontent.com/dvyukov/1c56623016cc4c24a69d433c5114ad5b/raw/530478f571b195193101b912aa646948528baa8e/gistfile1.txt
>
> Dan, do you mind taking a look at them? Fixing these should eliminate
> root causes of these hangs/leaks.

Yep I will look at them, thanks for the reproducers.

2018-05-10 06:48:32

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Mon, Apr 16, 2018 at 9:42 PM, Dan Streetman <[email protected]> wrote:
>>>>> On Wed, Feb 21, 2018 at 3:53 PM, Tommi Rantala
>>>>> <[email protected]> wrote:
>>>>>> On 20.02.2018 18:26, Neil Horman wrote:
>>>>>>>
>>>>>>> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
>>>>>>>>
>>>>>>>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>>>>>>>>>
>>>>>>>>>> Is this meant to be fixed already? I am still seeing this on the
>>>>>>>>>> latest upstream tree.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> These two commits are in v4.16-rc1:
>>>>>>>>>
>>>>>>>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
>>>>>>>>> Author: Tommi Rantala <[email protected]>
>>>>>>>>> Date: Mon Feb 5 21:48:14 2018 +0200
>>>>>>>>>
>>>>>>>>> sctp: fix dst refcnt leak in sctp_v4_get_dst
>>>>>>>>> ...
>>>>>>>>> Fixes: 410f03831 ("sctp: add routing output fallback")
>>>>>>>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using
>>>>>>>>> secondary
>>>>>>>>> addresses")
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
>>>>>>>>> Author: Alexey Kodanev <[email protected]>
>>>>>>>>> Date: Mon Feb 5 15:10:35 2018 +0300
>>>>>>>>>
>>>>>>>>> sctp: fix dst refcnt leak in sctp_v6_get_dst()
>>>>>>>>> ...
>>>>>>>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using
>>>>>>>>> secondary
>>>>>>>>> addresses for ipv6")
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I guess we missed something if it's still reproducible.
>>>>>>>>>
>>>>>>>>> I can check it later this week, unless someone else beat me to it.
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Tommi,
>>>>>>>>
>>>>>>>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
>>>>>>>> another one then. But I am still seeing these:
>>>>>>>>
>>>>>>>> [ 58.799130] unregister_netdevice: waiting for lo to become free.
>>>>>>>> Usage count = 4
>>>>>>>> [ 60.847138] unregister_netdevice: waiting for lo to become free.
>>>>>>>> Usage count = 4
>>>>>>>> [ 62.895093] unregister_netdevice: waiting for lo to become free.
>>>>>>>> Usage count = 4
>>>>>>>> [ 64.943103] unregister_netdevice: waiting for lo to become free.
>>>>>>>> Usage count = 4
>>>>>>>>
>>>>>>>> on upstream tree pulled ~12 hours ago.
>>>>>>>>
>>>>>>> Can you write a systemtap script to probe dev_hold, and dev_put, printing
>>>>>>> out a
>>>>>>> backtrace if the device name matches "lo". That should tell us
>>>>>>> definitively if
>>>>>>> the problem is in the same location or not
>>>>>>
>>>>>>
>>>>>> Hi Dmitry, I tested with the reproducer and the kernel .config file that you
>>>>>> sent in the first email in this thread:
>>>>>>
>>>>>> With 4.16-rc2 unable to reproduce.
>>>>>>
>>>>>> With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for
>>>>>> lo to become free. Usage count = 3"
>>>>>>
>>>>>> With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()"
>>>>>> cherry-picked on top, unable to reproduce.
>>>>>>
>>>>>>
>>>>>> Is syzkaller doing something else now to trigger the bug...?
>>>>>> Can you still trigger the bug with the same reproducer?
>>>>>
>>>>> Hi Neil, Tommi,
>>>>>
>>>>> Reviving this old thread about "unregister_netdevice: waiting for lo
>>>>> to become free. Usage count = 3" hangs.
>>>>> I still did not have time to deep dive into what happens there (too
>>>>> many bugs coming from syzbot). But this still actively happens and I
>>>>> suspect accounts to a significant portion of various hang reports,
>>>>> which are quite unpleasant.
>>>>>
>>>>> One idea that could make it all simpler:
>>>>>
>>>>> Is this wait loop in netdev_wait_allrefs() supposed to wait for any
>>>>> prolonged periods of time under any non-buggy conditions? E.g. more
>>>>> than 1-2 minutes?
>>>>> If it only supposed to wait briefly for things that already supposed
>>>>> to be shutting down, and we add a WARNING there after some timeout,
>>>>> then syzbot will report all info how/when it happens, hopefully
>>>>> extracting reproducers, and all the nice things.
>>>>> But this WARNING should not have any false positives under any
>>>>> realistic conditions (e.g. waiting for arrival of remote packets with
>>>>> large timeouts).
>>>>>
>>>>> Looking at some task hung reports, it seems that this code holds some
>>>>> mutexes, takes workqueue thread and prevents any progress with
>>>>> destruction of other devices (and net namespace creation/destruction),
>>>>> so I guess it should not wait for any indefinite periods of time?
>>>>
>>>> I'm working on this currently:
>>>> https://bugs.launchpad.net/ubuntu/zesty/+source/linux/+bug/1711407
>>>>
>>>> I added a summary of what I've found to be the cause (or at least, one
>>>> possible cause) of this:
>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/72
>>>>
>>>> I'm working on a patch to work around the main side-effect of this,
>>>> which is hanging while holding the global net mutex. Hangs will still
>>>> happen (e.g. if a dst leaks) but should not affect anything else,
>>>> other than a leak of the dst and its net namespace.
>>>>
>>>> Fixing the dst leaks is important too, of course, but a dst leak (or
>>>> other cause) shouldn't break the entire system.
>>>
>>> Leaking some memory is definitely better than hanging the system.
>>>
>>> So I've made syzkaller to recognize "unregister_netdevice: waiting for
>>> (.*) to become free" as a kernel bug:
>>> https://github.com/google/syzkaller/commit/7a67784ca8bdc3b26cce2f0ec9a40d2dd9ec9396
>>> Unfortunately it does not make it catch these bugs because creating a
>>> net namespace per test is too damn slow, so namespaces are reused for
>>> lots of tests and when/if it's eventually destroyed it's already too
>>> late to find root cause.
>>>
>>> But I've run a one-off experiment with prompt net namespace
>>> destruction and syzkaller was able to easily extract a C reproducer:
>>> https://gist.githubusercontent.com/dvyukov/d571e8fff24e127ca48a8c4790d42bfa/raw/52050e93ba9afbb5126b9d7bb39b7e71a82af016/gistfile1.txt
>>>
>>> On upstream 16e205cf42da1f497b10a4a24f563e6c0d574eec with this config:
>>> https://gist.githubusercontent.com/dvyukov/9663c57443adb21f2795b92ef0829d62/raw/bbea0652e23746096dd56855a28f6c681aebcdee/gistfile1.txt
>>>
>>> this gives me:
>>>
>>> [ 83.183198] unregister_netdevice: waiting for lo to become free.
>>> Usage count = 9
>>> [ 85.231202] unregister_netdevice: waiting for lo to become free.
>>> Usage count = 9
>>> ...
>>> [ 523.511205] unregister_netdevice: waiting for lo to become free.
>>> Usage count = 9
>>> ...
>>>
>>> This is generated from this syzkaller program:
>>>
>>> r0 = socket$inet6(0xa, 0x1, 0x84)
>>> setsockopt$inet6_IPV6_XFRM_POLICY(r0, 0x29, 0x23,
>>> &(0x7f0000000380)={{{@in6=@remote={0xfe, 0x80, [], 0xbb},
>>> @in=@dev={0xac, 0x14, 0x14}, 0x0, 0x0, 0x0, 0x0, 0xa}, {}, {}, 0x0,
>>> 0x0, 0x1}, {{@in=@local={0xac, 0x14, 0x14, 0xaa}, 0x0, 0x32}, 0x0,
>>> @in=@local={0xac, 0x14, 0x14, 0xaa}, 0x3504}}, 0xe8)
>>> bind$inet6(r0, &(0x7f0000000000)={0xa, 0x4e20}, 0x1c)
>>> connect$inet(r0, &(0x7f0000000040)={0x2, 0x4e20, @dev={0xac, 0x14,
>>> 0x14, 0xd}}, 0x10)
>>> syz_emit_ethernet(0x3e, &(0x7f00000001c0)={@local={[0xaa, 0xaa, 0xaa,
>>> 0xaa, 0xaa], 0xaa}, @dev={[0xaa, 0xaa, 0xaa, 0xaa, 0xaa]}, [],
>>> {@ipv6={0x86dd, {0x0, 0x6, "50a09c", 0x8, 0xffffff11, 0x0,
>>> @remote={0xfe, 0x80, [], 0xbb}, @local={0xfe, 0x80, [], 0xaa}, {[],
>>> @udp={0x0, 0x4e20, 0x8}}}}}}, &(0x7f0000000040))
>>>
>>> So this seems to be related to IPv6 and/or xfrm and is potentially
>>> caused by external packets (that syz_emit_ethernet call).
>>
>>
>>
>> Here is another repro which seems to be a different bug (note that it
>> requires fault injection):
>>
>> https://gist.githubusercontent.com/dvyukov/1c56623016cc4c24a69d433c5114ad5b/raw/530478f571b195193101b912aa646948528baa8e/gistfile1.txt
>>
>> Dan, do you mind taking a look at them? Fixing these should eliminate
>> root causes of these hangs/leaks.
>
> Yep I will look at them, thanks for the reproducers.

Hi Dan,

Any updates on this? syzbot is hitting this all the time.

2018-05-10 10:25:50

by Dan Streetman

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Thu, May 10, 2018 at 2:46 AM, Dmitry Vyukov <[email protected]> wrote:
> On Mon, Apr 16, 2018 at 9:42 PM, Dan Streetman <[email protected]> wrote:
>>>>>> On Wed, Feb 21, 2018 at 3:53 PM, Tommi Rantala
>>>>>> <[email protected]> wrote:
>>>>>>> On 20.02.2018 18:26, Neil Horman wrote:
>>>>>>>>
>>>>>>>> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
>>>>>>>>>
>>>>>>>>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>>>>>>>>>>
>>>>>>>>>>> Is this meant to be fixed already? I am still seeing this on the
>>>>>>>>>>> latest upstream tree.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> These two commits are in v4.16-rc1:
>>>>>>>>>>
>>>>>>>>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
>>>>>>>>>> Author: Tommi Rantala <[email protected]>
>>>>>>>>>> Date: Mon Feb 5 21:48:14 2018 +0200
>>>>>>>>>>
>>>>>>>>>> sctp: fix dst refcnt leak in sctp_v4_get_dst
>>>>>>>>>> ...
>>>>>>>>>> Fixes: 410f03831 ("sctp: add routing output fallback")
>>>>>>>>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using
>>>>>>>>>> secondary
>>>>>>>>>> addresses")
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
>>>>>>>>>> Author: Alexey Kodanev <[email protected]>
>>>>>>>>>> Date: Mon Feb 5 15:10:35 2018 +0300
>>>>>>>>>>
>>>>>>>>>> sctp: fix dst refcnt leak in sctp_v6_get_dst()
>>>>>>>>>> ...
>>>>>>>>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using
>>>>>>>>>> secondary
>>>>>>>>>> addresses for ipv6")
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I guess we missed something if it's still reproducible.
>>>>>>>>>>
>>>>>>>>>> I can check it later this week, unless someone else beat me to it.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Tommi,
>>>>>>>>>
>>>>>>>>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
>>>>>>>>> another one then. But I am still seeing these:
>>>>>>>>>
>>>>>>>>> [ 58.799130] unregister_netdevice: waiting for lo to become free.
>>>>>>>>> Usage count = 4
>>>>>>>>> [ 60.847138] unregister_netdevice: waiting for lo to become free.
>>>>>>>>> Usage count = 4
>>>>>>>>> [ 62.895093] unregister_netdevice: waiting for lo to become free.
>>>>>>>>> Usage count = 4
>>>>>>>>> [ 64.943103] unregister_netdevice: waiting for lo to become free.
>>>>>>>>> Usage count = 4
>>>>>>>>>
>>>>>>>>> on upstream tree pulled ~12 hours ago.
>>>>>>>>>
>>>>>>>> Can you write a systemtap script to probe dev_hold, and dev_put, printing
>>>>>>>> out a
>>>>>>>> backtrace if the device name matches "lo". That should tell us
>>>>>>>> definitively if
>>>>>>>> the problem is in the same location or not
>>>>>>>
>>>>>>>
>>>>>>> Hi Dmitry, I tested with the reproducer and the kernel .config file that you
>>>>>>> sent in the first email in this thread:
>>>>>>>
>>>>>>> With 4.16-rc2 unable to reproduce.
>>>>>>>
>>>>>>> With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for
>>>>>>> lo to become free. Usage count = 3"
>>>>>>>
>>>>>>> With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()"
>>>>>>> cherry-picked on top, unable to reproduce.
>>>>>>>
>>>>>>>
>>>>>>> Is syzkaller doing something else now to trigger the bug...?
>>>>>>> Can you still trigger the bug with the same reproducer?
>>>>>>
>>>>>> Hi Neil, Tommi,
>>>>>>
>>>>>> Reviving this old thread about "unregister_netdevice: waiting for lo
>>>>>> to become free. Usage count = 3" hangs.
>>>>>> I still did not have time to deep dive into what happens there (too
>>>>>> many bugs coming from syzbot). But this still actively happens and I
>>>>>> suspect accounts to a significant portion of various hang reports,
>>>>>> which are quite unpleasant.
>>>>>>
>>>>>> One idea that could make it all simpler:
>>>>>>
>>>>>> Is this wait loop in netdev_wait_allrefs() supposed to wait for any
>>>>>> prolonged periods of time under any non-buggy conditions? E.g. more
>>>>>> than 1-2 minutes?
>>>>>> If it only supposed to wait briefly for things that already supposed
>>>>>> to be shutting down, and we add a WARNING there after some timeout,
>>>>>> then syzbot will report all info how/when it happens, hopefully
>>>>>> extracting reproducers, and all the nice things.
>>>>>> But this WARNING should not have any false positives under any
>>>>>> realistic conditions (e.g. waiting for arrival of remote packets with
>>>>>> large timeouts).
>>>>>>
>>>>>> Looking at some task hung reports, it seems that this code holds some
>>>>>> mutexes, takes workqueue thread and prevents any progress with
>>>>>> destruction of other devices (and net namespace creation/destruction),
>>>>>> so I guess it should not wait for any indefinite periods of time?
>>>>>
>>>>> I'm working on this currently:
>>>>> https://bugs.launchpad.net/ubuntu/zesty/+source/linux/+bug/1711407
>>>>>
>>>>> I added a summary of what I've found to be the cause (or at least, one
>>>>> possible cause) of this:
>>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/72
>>>>>
>>>>> I'm working on a patch to work around the main side-effect of this,
>>>>> which is hanging while holding the global net mutex. Hangs will still
>>>>> happen (e.g. if a dst leaks) but should not affect anything else,
>>>>> other than a leak of the dst and its net namespace.
>>>>>
>>>>> Fixing the dst leaks is important too, of course, but a dst leak (or
>>>>> other cause) shouldn't break the entire system.
>>>>
>>>> Leaking some memory is definitely better than hanging the system.
>>>>
>>>> So I've made syzkaller to recognize "unregister_netdevice: waiting for
>>>> (.*) to become free" as a kernel bug:
>>>> https://github.com/google/syzkaller/commit/7a67784ca8bdc3b26cce2f0ec9a40d2dd9ec9396
>>>> Unfortunately it does not make it catch these bugs because creating a
>>>> net namespace per test is too damn slow, so namespaces are reused for
>>>> lots of tests and when/if it's eventually destroyed it's already too
>>>> late to find root cause.
>>>>
>>>> But I've run a one-off experiment with prompt net namespace
>>>> destruction and syzkaller was able to easily extract a C reproducer:
>>>> https://gist.githubusercontent.com/dvyukov/d571e8fff24e127ca48a8c4790d42bfa/raw/52050e93ba9afbb5126b9d7bb39b7e71a82af016/gistfile1.txt
>>>>
>>>> On upstream 16e205cf42da1f497b10a4a24f563e6c0d574eec with this config:
>>>> https://gist.githubusercontent.com/dvyukov/9663c57443adb21f2795b92ef0829d62/raw/bbea0652e23746096dd56855a28f6c681aebcdee/gistfile1.txt
>>>>
>>>> this gives me:
>>>>
>>>> [ 83.183198] unregister_netdevice: waiting for lo to become free.
>>>> Usage count = 9
>>>> [ 85.231202] unregister_netdevice: waiting for lo to become free.
>>>> Usage count = 9
>>>> ...
>>>> [ 523.511205] unregister_netdevice: waiting for lo to become free.
>>>> Usage count = 9
>>>> ...
>>>>
>>>> This is generated from this syzkaller program:
>>>>
>>>> r0 = socket$inet6(0xa, 0x1, 0x84)
>>>> setsockopt$inet6_IPV6_XFRM_POLICY(r0, 0x29, 0x23,
>>>> &(0x7f0000000380)={{{@in6=@remote={0xfe, 0x80, [], 0xbb},
>>>> @in=@dev={0xac, 0x14, 0x14}, 0x0, 0x0, 0x0, 0x0, 0xa}, {}, {}, 0x0,
>>>> 0x0, 0x1}, {{@in=@local={0xac, 0x14, 0x14, 0xaa}, 0x0, 0x32}, 0x0,
>>>> @in=@local={0xac, 0x14, 0x14, 0xaa}, 0x3504}}, 0xe8)
>>>> bind$inet6(r0, &(0x7f0000000000)={0xa, 0x4e20}, 0x1c)
>>>> connect$inet(r0, &(0x7f0000000040)={0x2, 0x4e20, @dev={0xac, 0x14,
>>>> 0x14, 0xd}}, 0x10)
>>>> syz_emit_ethernet(0x3e, &(0x7f00000001c0)={@local={[0xaa, 0xaa, 0xaa,
>>>> 0xaa, 0xaa], 0xaa}, @dev={[0xaa, 0xaa, 0xaa, 0xaa, 0xaa]}, [],
>>>> {@ipv6={0x86dd, {0x0, 0x6, "50a09c", 0x8, 0xffffff11, 0x0,
>>>> @remote={0xfe, 0x80, [], 0xbb}, @local={0xfe, 0x80, [], 0xaa}, {[],
>>>> @udp={0x0, 0x4e20, 0x8}}}}}}, &(0x7f0000000040))
>>>>
>>>> So this seems to be related to IPv6 and/or xfrm and is potentially
>>>> caused by external packets (that syz_emit_ethernet call).
>>>
>>>
>>>
>>> Here is another repro which seems to be a different bug (note that it
>>> requires fault injection):
>>>
>>> https://gist.githubusercontent.com/dvyukov/1c56623016cc4c24a69d433c5114ad5b/raw/530478f571b195193101b912aa646948528baa8e/gistfile1.txt
>>>
>>> Dan, do you mind taking a look at them? Fixing these should eliminate
>>> root causes of these hangs/leaks.
>>
>> Yep I will look at them, thanks for the reproducers.
>
> Hi Dan,
>
> Any updates on this? syzbot is hitting this all the time.

Sorry, the recent changes from net_mutex -> net_rwsem/pernet_ops_rwsem
have complicated what I had done to workaround this, but I'm still
working on it. Apologies for the delay.

2018-05-11 09:22:04

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Thu, May 10, 2018 at 12:23 PM, Dan Streetman <[email protected]> wrote:
>>>>>>> <[email protected]> wrote:
>>>>>>>> On 20.02.2018 18:26, Neil Horman wrote:
>>>>>>>>>
>>>>>>>>> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Is this meant to be fixed already? I am still seeing this on the
>>>>>>>>>>>> latest upstream tree.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> These two commits are in v4.16-rc1:
>>>>>>>>>>>
>>>>>>>>>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
>>>>>>>>>>> Author: Tommi Rantala <[email protected]>
>>>>>>>>>>> Date: Mon Feb 5 21:48:14 2018 +0200
>>>>>>>>>>>
>>>>>>>>>>> sctp: fix dst refcnt leak in sctp_v4_get_dst
>>>>>>>>>>> ...
>>>>>>>>>>> Fixes: 410f03831 ("sctp: add routing output fallback")
>>>>>>>>>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using
>>>>>>>>>>> secondary
>>>>>>>>>>> addresses")
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
>>>>>>>>>>> Author: Alexey Kodanev <[email protected]>
>>>>>>>>>>> Date: Mon Feb 5 15:10:35 2018 +0300
>>>>>>>>>>>
>>>>>>>>>>> sctp: fix dst refcnt leak in sctp_v6_get_dst()
>>>>>>>>>>> ...
>>>>>>>>>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using
>>>>>>>>>>> secondary
>>>>>>>>>>> addresses for ipv6")
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I guess we missed something if it's still reproducible.
>>>>>>>>>>>
>>>>>>>>>>> I can check it later this week, unless someone else beat me to it.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Tommi,
>>>>>>>>>>
>>>>>>>>>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
>>>>>>>>>> another one then. But I am still seeing these:
>>>>>>>>>>
>>>>>>>>>> [ 58.799130] unregister_netdevice: waiting for lo to become free.
>>>>>>>>>> Usage count = 4
>>>>>>>>>> [ 60.847138] unregister_netdevice: waiting for lo to become free.
>>>>>>>>>> Usage count = 4
>>>>>>>>>> [ 62.895093] unregister_netdevice: waiting for lo to become free.
>>>>>>>>>> Usage count = 4
>>>>>>>>>> [ 64.943103] unregister_netdevice: waiting for lo to become free.
>>>>>>>>>> Usage count = 4
>>>>>>>>>>
>>>>>>>>>> on upstream tree pulled ~12 hours ago.
>>>>>>>>>>
>>>>>>>>> Can you write a systemtap script to probe dev_hold, and dev_put, printing
>>>>>>>>> out a
>>>>>>>>> backtrace if the device name matches "lo". That should tell us
>>>>>>>>> definitively if
>>>>>>>>> the problem is in the same location or not
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Dmitry, I tested with the reproducer and the kernel .config file that you
>>>>>>>> sent in the first email in this thread:
>>>>>>>>
>>>>>>>> With 4.16-rc2 unable to reproduce.
>>>>>>>>
>>>>>>>> With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for
>>>>>>>> lo to become free. Usage count = 3"
>>>>>>>>
>>>>>>>> With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()"
>>>>>>>> cherry-picked on top, unable to reproduce.
>>>>>>>>
>>>>>>>>
>>>>>>>> Is syzkaller doing something else now to trigger the bug...?
>>>>>>>> Can you still trigger the bug with the same reproducer?
>>>>>>>
>>>>>>> Hi Neil, Tommi,
>>>>>>>
>>>>>>> Reviving this old thread about "unregister_netdevice: waiting for lo
>>>>>>> to become free. Usage count = 3" hangs.
>>>>>>> I still did not have time to deep dive into what happens there (too
>>>>>>> many bugs coming from syzbot). But this still actively happens and I
>>>>>>> suspect accounts to a significant portion of various hang reports,
>>>>>>> which are quite unpleasant.
>>>>>>>
>>>>>>> One idea that could make it all simpler:
>>>>>>>
>>>>>>> Is this wait loop in netdev_wait_allrefs() supposed to wait for any
>>>>>>> prolonged periods of time under any non-buggy conditions? E.g. more
>>>>>>> than 1-2 minutes?
>>>>>>> If it only supposed to wait briefly for things that already supposed
>>>>>>> to be shutting down, and we add a WARNING there after some timeout,
>>>>>>> then syzbot will report all info how/when it happens, hopefully
>>>>>>> extracting reproducers, and all the nice things.
>>>>>>> But this WARNING should not have any false positives under any
>>>>>>> realistic conditions (e.g. waiting for arrival of remote packets with
>>>>>>> large timeouts).
>>>>>>>
>>>>>>> Looking at some task hung reports, it seems that this code holds some
>>>>>>> mutexes, takes workqueue thread and prevents any progress with
>>>>>>> destruction of other devices (and net namespace creation/destruction),
>>>>>>> so I guess it should not wait for any indefinite periods of time?
>>>>>>
>>>>>> I'm working on this currently:
>>>>>> https://bugs.launchpad.net/ubuntu/zesty/+source/linux/+bug/1711407
>>>>>>
>>>>>> I added a summary of what I've found to be the cause (or at least, one
>>>>>> possible cause) of this:
>>>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/72
>>>>>>
>>>>>> I'm working on a patch to work around the main side-effect of this,
>>>>>> which is hanging while holding the global net mutex. Hangs will still
>>>>>> happen (e.g. if a dst leaks) but should not affect anything else,
>>>>>> other than a leak of the dst and its net namespace.
>>>>>>
>>>>>> Fixing the dst leaks is important too, of course, but a dst leak (or
>>>>>> other cause) shouldn't break the entire system.
>>>>>
>>>>> Leaking some memory is definitely better than hanging the system.
>>>>>
>>>>> So I've made syzkaller to recognize "unregister_netdevice: waiting for
>>>>> (.*) to become free" as a kernel bug:
>>>>> https://github.com/google/syzkaller/commit/7a67784ca8bdc3b26cce2f0ec9a40d2dd9ec9396
>>>>> Unfortunately it does not make it catch these bugs because creating a
>>>>> net namespace per test is too damn slow, so namespaces are reused for
>>>>> lots of tests and when/if it's eventually destroyed it's already too
>>>>> late to find root cause.
>>>>>
>>>>> But I've run a one-off experiment with prompt net namespace
>>>>> destruction and syzkaller was able to easily extract a C reproducer:
>>>>> https://gist.githubusercontent.com/dvyukov/d571e8fff24e127ca48a8c4790d42bfa/raw/52050e93ba9afbb5126b9d7bb39b7e71a82af016/gistfile1.txt
>>>>>
>>>>> On upstream 16e205cf42da1f497b10a4a24f563e6c0d574eec with this config:
>>>>> https://gist.githubusercontent.com/dvyukov/9663c57443adb21f2795b92ef0829d62/raw/bbea0652e23746096dd56855a28f6c681aebcdee/gistfile1.txt
>>>>>
>>>>> this gives me:
>>>>>
>>>>> [ 83.183198] unregister_netdevice: waiting for lo to become free.
>>>>> Usage count = 9
>>>>> [ 85.231202] unregister_netdevice: waiting for lo to become free.
>>>>> Usage count = 9
>>>>> ...
>>>>> [ 523.511205] unregister_netdevice: waiting for lo to become free.
>>>>> Usage count = 9
>>>>> ...
>>>>>
>>>>> This is generated from this syzkaller program:
>>>>>
>>>>> r0 = socket$inet6(0xa, 0x1, 0x84)
>>>>> setsockopt$inet6_IPV6_XFRM_POLICY(r0, 0x29, 0x23,
>>>>> &(0x7f0000000380)={{{@in6=@remote={0xfe, 0x80, [], 0xbb},
>>>>> @in=@dev={0xac, 0x14, 0x14}, 0x0, 0x0, 0x0, 0x0, 0xa}, {}, {}, 0x0,
>>>>> 0x0, 0x1}, {{@in=@local={0xac, 0x14, 0x14, 0xaa}, 0x0, 0x32}, 0x0,
>>>>> @in=@local={0xac, 0x14, 0x14, 0xaa}, 0x3504}}, 0xe8)
>>>>> bind$inet6(r0, &(0x7f0000000000)={0xa, 0x4e20}, 0x1c)
>>>>> connect$inet(r0, &(0x7f0000000040)={0x2, 0x4e20, @dev={0xac, 0x14,
>>>>> 0x14, 0xd}}, 0x10)
>>>>> syz_emit_ethernet(0x3e, &(0x7f00000001c0)={@local={[0xaa, 0xaa, 0xaa,
>>>>> 0xaa, 0xaa], 0xaa}, @dev={[0xaa, 0xaa, 0xaa, 0xaa, 0xaa]}, [],
>>>>> {@ipv6={0x86dd, {0x0, 0x6, "50a09c", 0x8, 0xffffff11, 0x0,
>>>>> @remote={0xfe, 0x80, [], 0xbb}, @local={0xfe, 0x80, [], 0xaa}, {[],
>>>>> @udp={0x0, 0x4e20, 0x8}}}}}}, &(0x7f0000000040))
>>>>>
>>>>> So this seems to be related to IPv6 and/or xfrm and is potentially
>>>>> caused by external packets (that syz_emit_ethernet call).
>>>>
>>>>
>>>>
>>>> Here is another repro which seems to be a different bug (note that it
>>>> requires fault injection):
>>>>
>>>> https://gist.githubusercontent.com/dvyukov/1c56623016cc4c24a69d433c5114ad5b/raw/530478f571b195193101b912aa646948528baa8e/gistfile1.txt
>>>>
>>>> Dan, do you mind taking a look at them? Fixing these should eliminate
>>>> root causes of these hangs/leaks.
>>>
>>> Yep I will look at them, thanks for the reproducers.
>>
>> Hi Dan,
>>
>> Any updates on this? syzbot is hitting this all the time.
>
> Sorry, the recent changes from net_mutex -> net_rwsem/pernet_ops_rwsem
> have complicated what I had done to workaround this, but I'm still
> working on it. Apologies for the delay.

Are you looking at the mitigation? Or the bugs that trigger it? Or both?

2018-05-11 11:42:09

by Dan Streetman

[permalink] [raw]
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free

On Fri, May 11, 2018 at 5:19 AM, Dmitry Vyukov <[email protected]> wrote:
> On Thu, May 10, 2018 at 12:23 PM, Dan Streetman <[email protected]> wrote:
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> On 20.02.2018 18:26, Neil Horman wrote:
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is this meant to be fixed already? I am still seeing this on the
>>>>>>>>>>>>> latest upstream tree.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> These two commits are in v4.16-rc1:
>>>>>>>>>>>>
>>>>>>>>>>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
>>>>>>>>>>>> Author: Tommi Rantala <[email protected]>
>>>>>>>>>>>> Date: Mon Feb 5 21:48:14 2018 +0200
>>>>>>>>>>>>
>>>>>>>>>>>> sctp: fix dst refcnt leak in sctp_v4_get_dst
>>>>>>>>>>>> ...
>>>>>>>>>>>> Fixes: 410f03831 ("sctp: add routing output fallback")
>>>>>>>>>>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using
>>>>>>>>>>>> secondary
>>>>>>>>>>>> addresses")
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
>>>>>>>>>>>> Author: Alexey Kodanev <[email protected]>
>>>>>>>>>>>> Date: Mon Feb 5 15:10:35 2018 +0300
>>>>>>>>>>>>
>>>>>>>>>>>> sctp: fix dst refcnt leak in sctp_v6_get_dst()
>>>>>>>>>>>> ...
>>>>>>>>>>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using
>>>>>>>>>>>> secondary
>>>>>>>>>>>> addresses for ipv6")
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I guess we missed something if it's still reproducible.
>>>>>>>>>>>>
>>>>>>>>>>>> I can check it later this week, unless someone else beat me to it.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hi Tommi,
>>>>>>>>>>>
>>>>>>>>>>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
>>>>>>>>>>> another one then. But I am still seeing these:
>>>>>>>>>>>
>>>>>>>>>>> [ 58.799130] unregister_netdevice: waiting for lo to become free.
>>>>>>>>>>> Usage count = 4
>>>>>>>>>>> [ 60.847138] unregister_netdevice: waiting for lo to become free.
>>>>>>>>>>> Usage count = 4
>>>>>>>>>>> [ 62.895093] unregister_netdevice: waiting for lo to become free.
>>>>>>>>>>> Usage count = 4
>>>>>>>>>>> [ 64.943103] unregister_netdevice: waiting for lo to become free.
>>>>>>>>>>> Usage count = 4
>>>>>>>>>>>
>>>>>>>>>>> on upstream tree pulled ~12 hours ago.
>>>>>>>>>>>
>>>>>>>>>> Can you write a systemtap script to probe dev_hold, and dev_put, printing
>>>>>>>>>> out a
>>>>>>>>>> backtrace if the device name matches "lo". That should tell us
>>>>>>>>>> definitively if
>>>>>>>>>> the problem is in the same location or not
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Dmitry, I tested with the reproducer and the kernel .config file that you
>>>>>>>>> sent in the first email in this thread:
>>>>>>>>>
>>>>>>>>> With 4.16-rc2 unable to reproduce.
>>>>>>>>>
>>>>>>>>> With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for
>>>>>>>>> lo to become free. Usage count = 3"
>>>>>>>>>
>>>>>>>>> With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()"
>>>>>>>>> cherry-picked on top, unable to reproduce.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Is syzkaller doing something else now to trigger the bug...?
>>>>>>>>> Can you still trigger the bug with the same reproducer?
>>>>>>>>
>>>>>>>> Hi Neil, Tommi,
>>>>>>>>
>>>>>>>> Reviving this old thread about "unregister_netdevice: waiting for lo
>>>>>>>> to become free. Usage count = 3" hangs.
>>>>>>>> I still did not have time to deep dive into what happens there (too
>>>>>>>> many bugs coming from syzbot). But this still actively happens and I
>>>>>>>> suspect accounts to a significant portion of various hang reports,
>>>>>>>> which are quite unpleasant.
>>>>>>>>
>>>>>>>> One idea that could make it all simpler:
>>>>>>>>
>>>>>>>> Is this wait loop in netdev_wait_allrefs() supposed to wait for any
>>>>>>>> prolonged periods of time under any non-buggy conditions? E.g. more
>>>>>>>> than 1-2 minutes?
>>>>>>>> If it only supposed to wait briefly for things that already supposed
>>>>>>>> to be shutting down, and we add a WARNING there after some timeout,
>>>>>>>> then syzbot will report all info how/when it happens, hopefully
>>>>>>>> extracting reproducers, and all the nice things.
>>>>>>>> But this WARNING should not have any false positives under any
>>>>>>>> realistic conditions (e.g. waiting for arrival of remote packets with
>>>>>>>> large timeouts).
>>>>>>>>
>>>>>>>> Looking at some task hung reports, it seems that this code holds some
>>>>>>>> mutexes, takes workqueue thread and prevents any progress with
>>>>>>>> destruction of other devices (and net namespace creation/destruction),
>>>>>>>> so I guess it should not wait for any indefinite periods of time?
>>>>>>>
>>>>>>> I'm working on this currently:
>>>>>>> https://bugs.launchpad.net/ubuntu/zesty/+source/linux/+bug/1711407
>>>>>>>
>>>>>>> I added a summary of what I've found to be the cause (or at least, one
>>>>>>> possible cause) of this:
>>>>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/72
>>>>>>>
>>>>>>> I'm working on a patch to work around the main side-effect of this,
>>>>>>> which is hanging while holding the global net mutex. Hangs will still
>>>>>>> happen (e.g. if a dst leaks) but should not affect anything else,
>>>>>>> other than a leak of the dst and its net namespace.
>>>>>>>
>>>>>>> Fixing the dst leaks is important too, of course, but a dst leak (or
>>>>>>> other cause) shouldn't break the entire system.
>>>>>>
>>>>>> Leaking some memory is definitely better than hanging the system.
>>>>>>
>>>>>> So I've made syzkaller to recognize "unregister_netdevice: waiting for
>>>>>> (.*) to become free" as a kernel bug:
>>>>>> https://github.com/google/syzkaller/commit/7a67784ca8bdc3b26cce2f0ec9a40d2dd9ec9396
>>>>>> Unfortunately it does not make it catch these bugs because creating a
>>>>>> net namespace per test is too damn slow, so namespaces are reused for
>>>>>> lots of tests and when/if it's eventually destroyed it's already too
>>>>>> late to find root cause.
>>>>>>
>>>>>> But I've run a one-off experiment with prompt net namespace
>>>>>> destruction and syzkaller was able to easily extract a C reproducer:
>>>>>> https://gist.githubusercontent.com/dvyukov/d571e8fff24e127ca48a8c4790d42bfa/raw/52050e93ba9afbb5126b9d7bb39b7e71a82af016/gistfile1.txt
>>>>>>
>>>>>> On upstream 16e205cf42da1f497b10a4a24f563e6c0d574eec with this config:
>>>>>> https://gist.githubusercontent.com/dvyukov/9663c57443adb21f2795b92ef0829d62/raw/bbea0652e23746096dd56855a28f6c681aebcdee/gistfile1.txt
>>>>>>
>>>>>> this gives me:
>>>>>>
>>>>>> [ 83.183198] unregister_netdevice: waiting for lo to become free.
>>>>>> Usage count = 9
>>>>>> [ 85.231202] unregister_netdevice: waiting for lo to become free.
>>>>>> Usage count = 9
>>>>>> ...
>>>>>> [ 523.511205] unregister_netdevice: waiting for lo to become free.
>>>>>> Usage count = 9
>>>>>> ...
>>>>>>
>>>>>> This is generated from this syzkaller program:
>>>>>>
>>>>>> r0 = socket$inet6(0xa, 0x1, 0x84)
>>>>>> setsockopt$inet6_IPV6_XFRM_POLICY(r0, 0x29, 0x23,
>>>>>> &(0x7f0000000380)={{{@in6=@remote={0xfe, 0x80, [], 0xbb},
>>>>>> @in=@dev={0xac, 0x14, 0x14}, 0x0, 0x0, 0x0, 0x0, 0xa}, {}, {}, 0x0,
>>>>>> 0x0, 0x1}, {{@in=@local={0xac, 0x14, 0x14, 0xaa}, 0x0, 0x32}, 0x0,
>>>>>> @in=@local={0xac, 0x14, 0x14, 0xaa}, 0x3504}}, 0xe8)
>>>>>> bind$inet6(r0, &(0x7f0000000000)={0xa, 0x4e20}, 0x1c)
>>>>>> connect$inet(r0, &(0x7f0000000040)={0x2, 0x4e20, @dev={0xac, 0x14,
>>>>>> 0x14, 0xd}}, 0x10)
>>>>>> syz_emit_ethernet(0x3e, &(0x7f00000001c0)={@local={[0xaa, 0xaa, 0xaa,
>>>>>> 0xaa, 0xaa], 0xaa}, @dev={[0xaa, 0xaa, 0xaa, 0xaa, 0xaa]}, [],
>>>>>> {@ipv6={0x86dd, {0x0, 0x6, "50a09c", 0x8, 0xffffff11, 0x0,
>>>>>> @remote={0xfe, 0x80, [], 0xbb}, @local={0xfe, 0x80, [], 0xaa}, {[],
>>>>>> @udp={0x0, 0x4e20, 0x8}}}}}}, &(0x7f0000000040))
>>>>>>
>>>>>> So this seems to be related to IPv6 and/or xfrm and is potentially
>>>>>> caused by external packets (that syz_emit_ethernet call).
>>>>>
>>>>>
>>>>>
>>>>> Here is another repro which seems to be a different bug (note that it
>>>>> requires fault injection):
>>>>>
>>>>> https://gist.githubusercontent.com/dvyukov/1c56623016cc4c24a69d433c5114ad5b/raw/530478f571b195193101b912aa646948528baa8e/gistfile1.txt
>>>>>
>>>>> Dan, do you mind taking a look at them? Fixing these should eliminate
>>>>> root causes of these hangs/leaks.
>>>>
>>>> Yep I will look at them, thanks for the reproducers.
>>>
>>> Hi Dan,
>>>
>>> Any updates on this? syzbot is hitting this all the time.
>>
>> Sorry, the recent changes from net_mutex -> net_rwsem/pernet_ops_rwsem
>> have complicated what I had done to workaround this, but I'm still
>> working on it. Apologies for the delay.
>
> Are you looking at the mitigation? Or the bugs that trigger it? Or both?

Both - workaround first, as most important (and relatively easiest) is
allowing the system to continue to create and destroy net namespaces
once this happens (instead of hanging all further netns init/cleanup
until a system reboot).

Then, I do want to try to add some debug to make it easier to debug a
leaked dst (or other cause).