Received: by 10.192.165.148 with SMTP id m20csp955431imm; Thu, 10 May 2018 03:25:50 -0700 (PDT) X-Google-Smtp-Source: AB8JxZrHuamA/BZRuJ8W6oTeld+hNrmuoFmvIhgRxOD8cqL7uz/enqoMyKogE/M1YlD5w6DWGNHT X-Received: by 2002:a65:4dca:: with SMTP id q10-v6mr692947pgt.22.1525947949962; Thu, 10 May 2018 03:25:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1525947949; cv=none; d=google.com; s=arc-20160816; b=cxYMjODMbpTThupgoIwDMi14ao1UUnG+/0UUPViHfCAKXjJs27e3luJS79/c0S05Bu 7xEw1ycuVdbI7pjIu1XeVKAHWw0mp20bo3vRO9+HybvZr/Dz1/cRbyCLH0RmKxOIzjUz rZ1khuLBVHyp6UWMW1lRHnzS3h2D00ownmn3oz1PZwMij9YCwrisUpEMsMxNr0yIlsER cpXAsv8FZ1uo5LEpAHOTx+IWI4oyrJuTaQc38YuINaZLQl2oNjqWTXLkX5h1DZSVdLJk tvWW0ao1HDnB/gJoIIQ3VhzxSwv6R5CYuVeM71YoSbTMx4JuuWYdkhuWZG8IrQA7Ucyy NCqQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature:dkim-signature :arc-authentication-results; bh=3fCWnTUbPiW+UnOu+G2K+wiKcQvnezNL1uJi3jZISJg=; b=X5aD5EL0O3A/JHxRdGwSIk9k0D+yzebFmWC2X8oHZWnWg2HzOI4k05PrjEeY5p9ERN Xa5Vl6QlIPmLQPaU3uchm4UOh/L287X8A1wMI1GpMKyDjuAEgZ+RVwJ8yM5quWLCz1Vp 1cZLlKqMQD2frd6UX1OktAD38Bc0ThzrTnnFzyC8Kya1DTBQ9wdwIFWnwOEh1IOp/4PK 1OtZhhmFLnjqxVEPIG0O6/vB5b6TUP8xfJwrosHho6pCqSSgcwiMhJjsl6WhROLsEvcU sAdYbSYhuqfXid7PbG6zLjp5EFpT8NqBMGBDSwnXfsIq4HHejZf6hmGy5KuVP0geUjFj 5OfQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=JThYwRvG; dkim=fail header.i=@ieee-org.20150623.gappssmtp.com header.s=20150623 header.b=AVqin4oY; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d124-v6si493239pfc.176.2018.05.10.03.25.35; Thu, 10 May 2018 03:25:49 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=JThYwRvG; dkim=fail header.i=@ieee-org.20150623.gappssmtp.com header.s=20150623 header.b=AVqin4oY; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935052AbeEJKYb (ORCPT + 99 others); Thu, 10 May 2018 06:24:31 -0400 Received: from mail-it0-f66.google.com ([209.85.214.66]:36984 "EHLO mail-it0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934266AbeEJKY1 (ORCPT ); Thu, 10 May 2018 06:24:27 -0400 Received: by mail-it0-f66.google.com with SMTP id 70-v6so2508233ity.2; Thu, 10 May 2018 03:24:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=3fCWnTUbPiW+UnOu+G2K+wiKcQvnezNL1uJi3jZISJg=; b=JThYwRvG1KstLTXqiA4JAOuKUUu5doJyqJVjXSlZO6naI/P7p+7Jj7UqLt/hit1r72 kSNW1NMdgu/gBW/5Wi/jmvU5hx1mmu35/WNjkZwNrissQcbGX6b4pnDtwMTTVMXFPQln ZpyohkJNYcfTTz4Yb3tJ1WlWDYF26tMYQ6BbvW84KFfxzVCDoFAV+lDU/XqwS4cNgEt/ DzML8mVoIKVkNTTNi/owFEt4gvXlaDOA1W45YSbLI/i7zagfSVuQeASi9SpaXFgwFW2N Xul195SROXdiX9ZrSqQ/EFy5eKDVpwJr9WjZsEExFSn4X7Dc51iN1XbnR2676x3bLrm4 ad2A== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ieee-org.20150623.gappssmtp.com; s=20150623; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=3fCWnTUbPiW+UnOu+G2K+wiKcQvnezNL1uJi3jZISJg=; b=AVqin4oY0UtX5qGtIJx3/AftRpuwIvUhpaoDXP/tEUo9sedogd9zQ0PEYMlj6uZJSE oiy4UEu0HK/NdwL3qXJ62y6smH84QX9vrX3Sk9xSuCnDErLqCWWE3aTyJKUYq2Q8Z9Ps Hjgo03pC6b4zoUA+mZgw0KpGI0ypjVanr0pJxxquJdv9NyR4qijigADQl8cFQbo6rDge EYJhe/Cq/wGMTXdpgCnuVtFLOs3gGIjYyQoSo3BYir9JQOUkxsI4C+meg+kmfJsW+N8H vkOk23ICe3TJJbxqfaBhBF5/fAtf/F31FSlkTgJJxF1lazpOcOFn0GioKbZGPTHm0cIn mhdA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=3fCWnTUbPiW+UnOu+G2K+wiKcQvnezNL1uJi3jZISJg=; b=LYWzt9e2tKL0FF599smE2/i3DEQKCHWMNK2ignPC1BjMgkuC3HQhNFJR1v7pZPVJwx KEmByCmmHe7K/BBHuQNWlaukY3uSrCg7OgjAKuj4GR9lx/zIlrZoqRJXAlfQFgiutXkr cgQdkvzwX9QzEgtyCqemoil9d+us4kxhotUaldRwrxJ/MDyjob4A4TNsLIXQ9uY0Ps3b yfPNrVq/V4NKzy4RtVwiZi5vLkBo+oMKmlhvr05sZo/ShF/cny6JdxHzIAw+0Z134q3Q W/OLDHWQ7vm1o1Pza3p0FNnbD4UaUOXi32TLg79xaLzQbwVcxoS3oaPI552DNXoiMKKN AHTw== X-Gm-Message-State: ALKqPwc0arnhpZq54rp4+hUN2fwI1NULHk8Pr6PQ9TC+vMY5rxKJAr7Z GJIQ/TVspjMu238XWyuxsXZhi+QG4wrLp1Xwm8Q= X-Received: by 2002:a24:5954:: with SMTP id p81-v6mr1085515itb.152.1525947866477; Thu, 10 May 2018 03:24:26 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a02:8a02:0:0:0:0:0 with HTTP; Thu, 10 May 2018 03:23:45 -0700 (PDT) In-Reply-To: References: <7fd7e3b3-77b1-0936-b169-d08b946bedc7@iogearbox.net> <991243e2-e7c2-f2b2-72b9-d37b0d569b3b@gmail.com> <5973966e-fcd9-7ee5-a9c4-b79d22c1b9dd@nokia.com> <20180220162622.GA32068@hmswarspite.think-freely.org> <7d98027d-e810-a079-49c5-0bf8beef390e@nokia.com> From: Dan Streetman Date: Thu, 10 May 2018 06:23:45 -0400 X-Google-Sender-Auth: kyFq5_0j3-skoPjYFDApN3A2K1I Message-ID: Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free To: Dmitry Vyukov Cc: Tommi Rantala , Neil Horman , Xin Long , David Ahern , Daniel Borkmann , Cong Wang , David Miller , Eric Dumazet , Willem de Bruijn , Jakub Kicinski , Rasmus Villemoes , netdev , LKML , Alexey Kuznetsov , Hideaki YOSHIFUJI , syzkaller , Dan Streetman , "Eric W. Biederman" , Alexey Kodanev , Marcelo Ricardo Leitner , linux-sctp@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, May 10, 2018 at 2:46 AM, Dmitry Vyukov wrote: > On Mon, Apr 16, 2018 at 9:42 PM, Dan Streetman wrote: >>>>>> On Wed, Feb 21, 2018 at 3:53 PM, Tommi Rantala >>>>>> wrote: >>>>>>> On 20.02.2018 18:26, Neil Horman wrote: >>>>>>>> >>>>>>>> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote: >>>>>>>>> >>>>>>>>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On 19.02.2018 20:59, Dmitry Vyukov wrote: >>>>>>>>>>> >>>>>>>>>>> Is this meant to be fixed already? I am still seeing this on the >>>>>>>>>>> latest upstream tree. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> These two commits are in v4.16-rc1: >>>>>>>>>> >>>>>>>>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8 >>>>>>>>>> Author: Tommi Rantala >>>>>>>>>> Date: Mon Feb 5 21:48:14 2018 +0200 >>>>>>>>>> >>>>>>>>>> sctp: fix dst refcnt leak in sctp_v4_get_dst >>>>>>>>>> ... >>>>>>>>>> Fixes: 410f03831 ("sctp: add routing output fallback") >>>>>>>>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using >>>>>>>>>> secondary >>>>>>>>>> addresses") >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2 >>>>>>>>>> Author: Alexey Kodanev >>>>>>>>>> Date: Mon Feb 5 15:10:35 2018 +0300 >>>>>>>>>> >>>>>>>>>> sctp: fix dst refcnt leak in sctp_v6_get_dst() >>>>>>>>>> ... >>>>>>>>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using >>>>>>>>>> secondary >>>>>>>>>> addresses for ipv6") >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I guess we missed something if it's still reproducible. >>>>>>>>>> >>>>>>>>>> I can check it later this week, unless someone else beat me to it. >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi Tommi, >>>>>>>>> >>>>>>>>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's >>>>>>>>> another one then. But I am still seeing these: >>>>>>>>> >>>>>>>>> [ 58.799130] unregister_netdevice: waiting for lo to become free. >>>>>>>>> Usage count = 4 >>>>>>>>> [ 60.847138] unregister_netdevice: waiting for lo to become free. >>>>>>>>> Usage count = 4 >>>>>>>>> [ 62.895093] unregister_netdevice: waiting for lo to become free. >>>>>>>>> Usage count = 4 >>>>>>>>> [ 64.943103] unregister_netdevice: waiting for lo to become free. >>>>>>>>> Usage count = 4 >>>>>>>>> >>>>>>>>> on upstream tree pulled ~12 hours ago. >>>>>>>>> >>>>>>>> Can you write a systemtap script to probe dev_hold, and dev_put, printing >>>>>>>> out a >>>>>>>> backtrace if the device name matches "lo". That should tell us >>>>>>>> definitively if >>>>>>>> the problem is in the same location or not >>>>>>> >>>>>>> >>>>>>> Hi Dmitry, I tested with the reproducer and the kernel .config file that you >>>>>>> sent in the first email in this thread: >>>>>>> >>>>>>> With 4.16-rc2 unable to reproduce. >>>>>>> >>>>>>> With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for >>>>>>> lo to become free. Usage count = 3" >>>>>>> >>>>>>> With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()" >>>>>>> cherry-picked on top, unable to reproduce. >>>>>>> >>>>>>> >>>>>>> Is syzkaller doing something else now to trigger the bug...? >>>>>>> Can you still trigger the bug with the same reproducer? >>>>>> >>>>>> Hi Neil, Tommi, >>>>>> >>>>>> Reviving this old thread about "unregister_netdevice: waiting for lo >>>>>> to become free. Usage count = 3" hangs. >>>>>> I still did not have time to deep dive into what happens there (too >>>>>> many bugs coming from syzbot). But this still actively happens and I >>>>>> suspect accounts to a significant portion of various hang reports, >>>>>> which are quite unpleasant. >>>>>> >>>>>> One idea that could make it all simpler: >>>>>> >>>>>> Is this wait loop in netdev_wait_allrefs() supposed to wait for any >>>>>> prolonged periods of time under any non-buggy conditions? E.g. more >>>>>> than 1-2 minutes? >>>>>> If it only supposed to wait briefly for things that already supposed >>>>>> to be shutting down, and we add a WARNING there after some timeout, >>>>>> then syzbot will report all info how/when it happens, hopefully >>>>>> extracting reproducers, and all the nice things. >>>>>> But this WARNING should not have any false positives under any >>>>>> realistic conditions (e.g. waiting for arrival of remote packets with >>>>>> large timeouts). >>>>>> >>>>>> Looking at some task hung reports, it seems that this code holds some >>>>>> mutexes, takes workqueue thread and prevents any progress with >>>>>> destruction of other devices (and net namespace creation/destruction), >>>>>> so I guess it should not wait for any indefinite periods of time? >>>>> >>>>> I'm working on this currently: >>>>> https://bugs.launchpad.net/ubuntu/zesty/+source/linux/+bug/1711407 >>>>> >>>>> I added a summary of what I've found to be the cause (or at least, one >>>>> possible cause) of this: >>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/72 >>>>> >>>>> I'm working on a patch to work around the main side-effect of this, >>>>> which is hanging while holding the global net mutex. Hangs will still >>>>> happen (e.g. if a dst leaks) but should not affect anything else, >>>>> other than a leak of the dst and its net namespace. >>>>> >>>>> Fixing the dst leaks is important too, of course, but a dst leak (or >>>>> other cause) shouldn't break the entire system. >>>> >>>> Leaking some memory is definitely better than hanging the system. >>>> >>>> So I've made syzkaller to recognize "unregister_netdevice: waiting for >>>> (.*) to become free" as a kernel bug: >>>> https://github.com/google/syzkaller/commit/7a67784ca8bdc3b26cce2f0ec9a40d2dd9ec9396 >>>> Unfortunately it does not make it catch these bugs because creating a >>>> net namespace per test is too damn slow, so namespaces are reused for >>>> lots of tests and when/if it's eventually destroyed it's already too >>>> late to find root cause. >>>> >>>> But I've run a one-off experiment with prompt net namespace >>>> destruction and syzkaller was able to easily extract a C reproducer: >>>> https://gist.githubusercontent.com/dvyukov/d571e8fff24e127ca48a8c4790d42bfa/raw/52050e93ba9afbb5126b9d7bb39b7e71a82af016/gistfile1.txt >>>> >>>> On upstream 16e205cf42da1f497b10a4a24f563e6c0d574eec with this config: >>>> https://gist.githubusercontent.com/dvyukov/9663c57443adb21f2795b92ef0829d62/raw/bbea0652e23746096dd56855a28f6c681aebcdee/gistfile1.txt >>>> >>>> this gives me: >>>> >>>> [ 83.183198] unregister_netdevice: waiting for lo to become free. >>>> Usage count = 9 >>>> [ 85.231202] unregister_netdevice: waiting for lo to become free. >>>> Usage count = 9 >>>> ... >>>> [ 523.511205] unregister_netdevice: waiting for lo to become free. >>>> Usage count = 9 >>>> ... >>>> >>>> This is generated from this syzkaller program: >>>> >>>> r0 = socket$inet6(0xa, 0x1, 0x84) >>>> setsockopt$inet6_IPV6_XFRM_POLICY(r0, 0x29, 0x23, >>>> &(0x7f0000000380)={{{@in6=@remote={0xfe, 0x80, [], 0xbb}, >>>> @in=@dev={0xac, 0x14, 0x14}, 0x0, 0x0, 0x0, 0x0, 0xa}, {}, {}, 0x0, >>>> 0x0, 0x1}, {{@in=@local={0xac, 0x14, 0x14, 0xaa}, 0x0, 0x32}, 0x0, >>>> @in=@local={0xac, 0x14, 0x14, 0xaa}, 0x3504}}, 0xe8) >>>> bind$inet6(r0, &(0x7f0000000000)={0xa, 0x4e20}, 0x1c) >>>> connect$inet(r0, &(0x7f0000000040)={0x2, 0x4e20, @dev={0xac, 0x14, >>>> 0x14, 0xd}}, 0x10) >>>> syz_emit_ethernet(0x3e, &(0x7f00000001c0)={@local={[0xaa, 0xaa, 0xaa, >>>> 0xaa, 0xaa], 0xaa}, @dev={[0xaa, 0xaa, 0xaa, 0xaa, 0xaa]}, [], >>>> {@ipv6={0x86dd, {0x0, 0x6, "50a09c", 0x8, 0xffffff11, 0x0, >>>> @remote={0xfe, 0x80, [], 0xbb}, @local={0xfe, 0x80, [], 0xaa}, {[], >>>> @udp={0x0, 0x4e20, 0x8}}}}}}, &(0x7f0000000040)) >>>> >>>> So this seems to be related to IPv6 and/or xfrm and is potentially >>>> caused by external packets (that syz_emit_ethernet call). >>> >>> >>> >>> Here is another repro which seems to be a different bug (note that it >>> requires fault injection): >>> >>> https://gist.githubusercontent.com/dvyukov/1c56623016cc4c24a69d433c5114ad5b/raw/530478f571b195193101b912aa646948528baa8e/gistfile1.txt >>> >>> Dan, do you mind taking a look at them? Fixing these should eliminate >>> root causes of these hangs/leaks. >> >> Yep I will look at them, thanks for the reproducers. > > Hi Dan, > > Any updates on this? syzbot is hitting this all the time. Sorry, the recent changes from net_mutex -> net_rwsem/pernet_ops_rwsem have complicated what I had done to workaround this, but I'm still working on it. Apologies for the delay.