Received: by 2002:ac0:a594:0:0:0:0:0 with SMTP id m20-v6csp671145imm; Fri, 11 May 2018 04:42:09 -0700 (PDT) X-Google-Smtp-Source: AB8JxZqklOR++hms5UJx9UpPm/wRcnrfa56smgLbn0OqXsyWSvrkbl8HAZwPKkn+jLuhPr1s9vGI X-Received: by 2002:a17:902:b28:: with SMTP id 37-v6mr5128387plq.207.1526038929616; Fri, 11 May 2018 04:42:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1526038929; cv=none; d=google.com; s=arc-20160816; b=MDzM9TxFqCE3aead3m/j6a+uZuRBU+xNFFhHHAtZvpcw5W/h2Kp65zIi2Lkq89A8Mw EttBlff8kI/LlhpHad1WxOvhMcOfeEuIvr5Tj0urjKzcTAjiw0WEtVS1pT2D3ntGot62 5CQl1LfByVnsr9rz4A2R2stHVIZ02SUOxwaFcgIpetfck12D7qICNMpttBPlrqYLmR2V U7aTlblPljWRM7fClTOXGXWT7kgandMTbcraaT7sjfJTvw12I66cIJX8WHCNauFyAxIg GvqK8lcwq+AVMOdvtEq8ol3ZKYUL0LfNsuUazQkcZkNFHteAqt7C4UgmSKDMFE7XDpsc vm/w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature:dkim-signature :arc-authentication-results; bh=5J6/5VFY104jLfo7qQmI4RhlSjoKV/74TeHargAgHNY=; b=QqU+g7nSH4xmg4i/cOVaVb1q0JbAzMZZ+oWfoSjfFrBlrFjzaml5at4ie4jpm6jL1f 7hWAWBfKVcZuBwp2IFJ6XZe8URzF3CijlkBXN/DoIOzixrUhjqT+cfIVB2mlUw7lwL/R oGVTr9MGAPm+jw7MXaCn7mTxIdRP2OmPgEE2Wcbw4JaF7tESjEJWxohjZxS5mgA9TIhu NkJCkUHbDcZy4Rds2aLFNnGxIEoe3nBC05jhx41PycewoAAMFlPZ1b+lM9d4inb4oDHq j4Rx6HoWb3r6XKY4GqFcIXTke4W45qMxNdAP+QTSSNAUDsZW7kjc5juzwUF9VX1rh/we 5PzA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=DVBQDNPF; dkim=fail header.i=@ieee-org.20150623.gappssmtp.com header.s=20150623 header.b=W9yP3Qqb; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u7-v6si2996961plr.553.2018.05.11.04.41.53; Fri, 11 May 2018 04:42:09 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=DVBQDNPF; dkim=fail header.i=@ieee-org.20150623.gappssmtp.com header.s=20150623 header.b=W9yP3Qqb; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752862AbeEKLln (ORCPT + 99 others); Fri, 11 May 2018 07:41:43 -0400 Received: from mail-io0-f179.google.com ([209.85.223.179]:43638 "EHLO mail-io0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752386AbeEKLll (ORCPT ); Fri, 11 May 2018 07:41:41 -0400 Received: by mail-io0-f179.google.com with SMTP id t23-v6so6625389ioc.10; Fri, 11 May 2018 04:41:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=5J6/5VFY104jLfo7qQmI4RhlSjoKV/74TeHargAgHNY=; b=DVBQDNPFW867c+Eja4NP7+BfkE4W5lPRARaR0KCkER7cVypdEmPfL6JmpJq6K8IIHP orBZAXua//mt3MkwvmfVaZYquwJ+n5S/bnZ4TI6gP4LcyQpOV7fU8h0iENrv8OfdrI47 OB2RAMMjR6k9zqwfkI7SFDvuQ1mHIV5fyx0eb+71p9IT9BZIk117knj3j1kjI6z3aCHr LmMS5RXu0XY4N2KiS9jlcROwoG6UVyb0XOxJ3ll8Gyvp76TttCnZ9DzfgrhHMy3Z0uuz xSTjXApQmBzI+QVb/TltAI8Pb9Ji5R7qKxrY/3ReJJQ+nCU0tzY5BSWU+NEfkSXLtbZ3 KWtw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ieee-org.20150623.gappssmtp.com; s=20150623; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=5J6/5VFY104jLfo7qQmI4RhlSjoKV/74TeHargAgHNY=; b=W9yP3QqboPoTGEm89K3EavwMkZEjsSMEBXTetxWtRbBIg+E2Sd4XA6dJ3zxyucvX8p CPxUF6N3Q3Ko0h/jBNHbJz13CpUwA2ddqCZCepVFqE/o+CObLKpwy8BW0FgU6mu9+6Tk 0VmbAqziFskjm0kkkws7V0xU+3MqyCT4DEyKO15uBuLaTZaJb+UnnXaILPfcCW/0TtV0 1Lhu3eD2IDlzI6NhIqbBM7JNmQVjiidQdNAhsgFQqVfcNDhQnphtZ8lxN/VGjKaWUt6p 6v4VRi2B1xHmsoJQ2L6YlpsDBlrRGCIfVNoHyE2JDN4NIvkk7vfhGqQHGup9j/VBmzxk 5iyA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=5J6/5VFY104jLfo7qQmI4RhlSjoKV/74TeHargAgHNY=; b=ADbA7q8BXbb6BRcoZTdhi5XQJaBLzQF8WjLrKS0Ro+nKwfwJUVDnJukL19IQuYqg45 gZ00OnfObXVyHb8p9GOga9/x0hnUZrd+c/nJPVHN72BK6brnXe1q6oqbIemCCltEVANG O6nn2+qmllFpqRFGobPhlNXXqUKunsVYZfdYyC0jdwK6GcBCxtMabYcVIEa1jgN+W/L/ xx93PnVlJp4P0ZlsJJDX9p1IJLTv6alv0L7krvE32vh1wWphMvkNeDNyZPpJOYjp+l1E uqeSRFfuYr0DkEbr3P0IP8TSx/RuYJhOphgVjKMYvn8F/MyGhl1eLOHkX1ZYGtp4UQG/ ulzw== X-Gm-Message-State: ALKqPwekxXn4CFS+ExtOEJc5p/4ZXBKiN6XWdOe8i3EXyLTxzG60/dC+ uTeGC9aeO5fFIwmxAgSgfvT0UQzIQiP/+NXXYrw= X-Received: by 2002:a6b:895c:: with SMTP id l89-v6mr5224991iod.22.1526038900102; Fri, 11 May 2018 04:41:40 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a02:8a02:0:0:0:0:0 with HTTP; Fri, 11 May 2018 04:40:59 -0700 (PDT) In-Reply-To: References: <7fd7e3b3-77b1-0936-b169-d08b946bedc7@iogearbox.net> <991243e2-e7c2-f2b2-72b9-d37b0d569b3b@gmail.com> <5973966e-fcd9-7ee5-a9c4-b79d22c1b9dd@nokia.com> <20180220162622.GA32068@hmswarspite.think-freely.org> <7d98027d-e810-a079-49c5-0bf8beef390e@nokia.com> From: Dan Streetman Date: Fri, 11 May 2018 07:40:59 -0400 X-Google-Sender-Auth: GP9nF2AriLlv4z_9ds0due-ACPY Message-ID: Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free To: Dmitry Vyukov Cc: Tommi Rantala , Neil Horman , Xin Long , David Ahern , Daniel Borkmann , Cong Wang , David Miller , Eric Dumazet , Willem de Bruijn , Jakub Kicinski , Rasmus Villemoes , netdev , LKML , Alexey Kuznetsov , Hideaki YOSHIFUJI , syzkaller , Dan Streetman , "Eric W. Biederman" , Alexey Kodanev , Marcelo Ricardo Leitner , linux-sctp@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, May 11, 2018 at 5:19 AM, Dmitry Vyukov wrote: > On Thu, May 10, 2018 at 12:23 PM, Dan Streetman wrote: >>>>>>>> wrote: >>>>>>>>> On 20.02.2018 18:26, Neil Horman wrote: >>>>>>>>>> >>>>>>>>>> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote: >>>>>>>>>>> >>>>>>>>>>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala >>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> On 19.02.2018 20:59, Dmitry Vyukov wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Is this meant to be fixed already? I am still seeing this on the >>>>>>>>>>>>> latest upstream tree. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> These two commits are in v4.16-rc1: >>>>>>>>>>>> >>>>>>>>>>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8 >>>>>>>>>>>> Author: Tommi Rantala >>>>>>>>>>>> Date: Mon Feb 5 21:48:14 2018 +0200 >>>>>>>>>>>> >>>>>>>>>>>> sctp: fix dst refcnt leak in sctp_v4_get_dst >>>>>>>>>>>> ... >>>>>>>>>>>> Fixes: 410f03831 ("sctp: add routing output fallback") >>>>>>>>>>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using >>>>>>>>>>>> secondary >>>>>>>>>>>> addresses") >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2 >>>>>>>>>>>> Author: Alexey Kodanev >>>>>>>>>>>> Date: Mon Feb 5 15:10:35 2018 +0300 >>>>>>>>>>>> >>>>>>>>>>>> sctp: fix dst refcnt leak in sctp_v6_get_dst() >>>>>>>>>>>> ... >>>>>>>>>>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using >>>>>>>>>>>> secondary >>>>>>>>>>>> addresses for ipv6") >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I guess we missed something if it's still reproducible. >>>>>>>>>>>> >>>>>>>>>>>> I can check it later this week, unless someone else beat me to it. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hi Tommi, >>>>>>>>>>> >>>>>>>>>>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's >>>>>>>>>>> another one then. But I am still seeing these: >>>>>>>>>>> >>>>>>>>>>> [ 58.799130] unregister_netdevice: waiting for lo to become free. >>>>>>>>>>> Usage count = 4 >>>>>>>>>>> [ 60.847138] unregister_netdevice: waiting for lo to become free. >>>>>>>>>>> Usage count = 4 >>>>>>>>>>> [ 62.895093] unregister_netdevice: waiting for lo to become free. >>>>>>>>>>> Usage count = 4 >>>>>>>>>>> [ 64.943103] unregister_netdevice: waiting for lo to become free. >>>>>>>>>>> Usage count = 4 >>>>>>>>>>> >>>>>>>>>>> on upstream tree pulled ~12 hours ago. >>>>>>>>>>> >>>>>>>>>> Can you write a systemtap script to probe dev_hold, and dev_put, printing >>>>>>>>>> out a >>>>>>>>>> backtrace if the device name matches "lo". That should tell us >>>>>>>>>> definitively if >>>>>>>>>> the problem is in the same location or not >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi Dmitry, I tested with the reproducer and the kernel .config file that you >>>>>>>>> sent in the first email in this thread: >>>>>>>>> >>>>>>>>> With 4.16-rc2 unable to reproduce. >>>>>>>>> >>>>>>>>> With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for >>>>>>>>> lo to become free. Usage count = 3" >>>>>>>>> >>>>>>>>> With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()" >>>>>>>>> cherry-picked on top, unable to reproduce. >>>>>>>>> >>>>>>>>> >>>>>>>>> Is syzkaller doing something else now to trigger the bug...? >>>>>>>>> Can you still trigger the bug with the same reproducer? >>>>>>>> >>>>>>>> Hi Neil, Tommi, >>>>>>>> >>>>>>>> Reviving this old thread about "unregister_netdevice: waiting for lo >>>>>>>> to become free. Usage count = 3" hangs. >>>>>>>> I still did not have time to deep dive into what happens there (too >>>>>>>> many bugs coming from syzbot). But this still actively happens and I >>>>>>>> suspect accounts to a significant portion of various hang reports, >>>>>>>> which are quite unpleasant. >>>>>>>> >>>>>>>> One idea that could make it all simpler: >>>>>>>> >>>>>>>> Is this wait loop in netdev_wait_allrefs() supposed to wait for any >>>>>>>> prolonged periods of time under any non-buggy conditions? E.g. more >>>>>>>> than 1-2 minutes? >>>>>>>> If it only supposed to wait briefly for things that already supposed >>>>>>>> to be shutting down, and we add a WARNING there after some timeout, >>>>>>>> then syzbot will report all info how/when it happens, hopefully >>>>>>>> extracting reproducers, and all the nice things. >>>>>>>> But this WARNING should not have any false positives under any >>>>>>>> realistic conditions (e.g. waiting for arrival of remote packets with >>>>>>>> large timeouts). >>>>>>>> >>>>>>>> Looking at some task hung reports, it seems that this code holds some >>>>>>>> mutexes, takes workqueue thread and prevents any progress with >>>>>>>> destruction of other devices (and net namespace creation/destruction), >>>>>>>> so I guess it should not wait for any indefinite periods of time? >>>>>>> >>>>>>> I'm working on this currently: >>>>>>> https://bugs.launchpad.net/ubuntu/zesty/+source/linux/+bug/1711407 >>>>>>> >>>>>>> I added a summary of what I've found to be the cause (or at least, one >>>>>>> possible cause) of this: >>>>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/72 >>>>>>> >>>>>>> I'm working on a patch to work around the main side-effect of this, >>>>>>> which is hanging while holding the global net mutex. Hangs will still >>>>>>> happen (e.g. if a dst leaks) but should not affect anything else, >>>>>>> other than a leak of the dst and its net namespace. >>>>>>> >>>>>>> Fixing the dst leaks is important too, of course, but a dst leak (or >>>>>>> other cause) shouldn't break the entire system. >>>>>> >>>>>> Leaking some memory is definitely better than hanging the system. >>>>>> >>>>>> So I've made syzkaller to recognize "unregister_netdevice: waiting for >>>>>> (.*) to become free" as a kernel bug: >>>>>> https://github.com/google/syzkaller/commit/7a67784ca8bdc3b26cce2f0ec9a40d2dd9ec9396 >>>>>> Unfortunately it does not make it catch these bugs because creating a >>>>>> net namespace per test is too damn slow, so namespaces are reused for >>>>>> lots of tests and when/if it's eventually destroyed it's already too >>>>>> late to find root cause. >>>>>> >>>>>> But I've run a one-off experiment with prompt net namespace >>>>>> destruction and syzkaller was able to easily extract a C reproducer: >>>>>> https://gist.githubusercontent.com/dvyukov/d571e8fff24e127ca48a8c4790d42bfa/raw/52050e93ba9afbb5126b9d7bb39b7e71a82af016/gistfile1.txt >>>>>> >>>>>> On upstream 16e205cf42da1f497b10a4a24f563e6c0d574eec with this config: >>>>>> https://gist.githubusercontent.com/dvyukov/9663c57443adb21f2795b92ef0829d62/raw/bbea0652e23746096dd56855a28f6c681aebcdee/gistfile1.txt >>>>>> >>>>>> this gives me: >>>>>> >>>>>> [ 83.183198] unregister_netdevice: waiting for lo to become free. >>>>>> Usage count = 9 >>>>>> [ 85.231202] unregister_netdevice: waiting for lo to become free. >>>>>> Usage count = 9 >>>>>> ... >>>>>> [ 523.511205] unregister_netdevice: waiting for lo to become free. >>>>>> Usage count = 9 >>>>>> ... >>>>>> >>>>>> This is generated from this syzkaller program: >>>>>> >>>>>> r0 = socket$inet6(0xa, 0x1, 0x84) >>>>>> setsockopt$inet6_IPV6_XFRM_POLICY(r0, 0x29, 0x23, >>>>>> &(0x7f0000000380)={{{@in6=@remote={0xfe, 0x80, [], 0xbb}, >>>>>> @in=@dev={0xac, 0x14, 0x14}, 0x0, 0x0, 0x0, 0x0, 0xa}, {}, {}, 0x0, >>>>>> 0x0, 0x1}, {{@in=@local={0xac, 0x14, 0x14, 0xaa}, 0x0, 0x32}, 0x0, >>>>>> @in=@local={0xac, 0x14, 0x14, 0xaa}, 0x3504}}, 0xe8) >>>>>> bind$inet6(r0, &(0x7f0000000000)={0xa, 0x4e20}, 0x1c) >>>>>> connect$inet(r0, &(0x7f0000000040)={0x2, 0x4e20, @dev={0xac, 0x14, >>>>>> 0x14, 0xd}}, 0x10) >>>>>> syz_emit_ethernet(0x3e, &(0x7f00000001c0)={@local={[0xaa, 0xaa, 0xaa, >>>>>> 0xaa, 0xaa], 0xaa}, @dev={[0xaa, 0xaa, 0xaa, 0xaa, 0xaa]}, [], >>>>>> {@ipv6={0x86dd, {0x0, 0x6, "50a09c", 0x8, 0xffffff11, 0x0, >>>>>> @remote={0xfe, 0x80, [], 0xbb}, @local={0xfe, 0x80, [], 0xaa}, {[], >>>>>> @udp={0x0, 0x4e20, 0x8}}}}}}, &(0x7f0000000040)) >>>>>> >>>>>> So this seems to be related to IPv6 and/or xfrm and is potentially >>>>>> caused by external packets (that syz_emit_ethernet call). >>>>> >>>>> >>>>> >>>>> Here is another repro which seems to be a different bug (note that it >>>>> requires fault injection): >>>>> >>>>> https://gist.githubusercontent.com/dvyukov/1c56623016cc4c24a69d433c5114ad5b/raw/530478f571b195193101b912aa646948528baa8e/gistfile1.txt >>>>> >>>>> Dan, do you mind taking a look at them? Fixing these should eliminate >>>>> root causes of these hangs/leaks. >>>> >>>> Yep I will look at them, thanks for the reproducers. >>> >>> Hi Dan, >>> >>> Any updates on this? syzbot is hitting this all the time. >> >> Sorry, the recent changes from net_mutex -> net_rwsem/pernet_ops_rwsem >> have complicated what I had done to workaround this, but I'm still >> working on it. Apologies for the delay. > > Are you looking at the mitigation? Or the bugs that trigger it? Or both? Both - workaround first, as most important (and relatively easiest) is allowing the system to continue to create and destroy net namespaces once this happens (instead of hanging all further netns init/cleanup until a system reboot). Then, I do want to try to add some debug to make it easier to debug a leaked dst (or other cause).