From: Dmitry Vyukov
Date: Tue, 20 Feb 2018 09:14:41 +0100
Subject: Re: net: hang in unregister_netdevice: waiting for lo to become free
To: Tommi Rantala
Cc: Xin Long, David Ahern, Daniel Borkmann, Cong Wang, David Miller,
    Eric Dumazet, Willem de Bruijn, Jakub Kicinski, Rasmus Villemoes,
    netdev, LKML, Alexey Kuznetsov, Hideaki YOSHIFUJI, syzkaller,
    Dan Streetman, "Eric W. Biederman", Alexey Kodanev, Neil Horman,
    Marcelo Ricardo Leitner, linux-sctp@vger.kernel.org
In-Reply-To: <5973966e-fcd9-7ee5-a9c4-b79d22c1b9dd@nokia.com>
References: <7fd7e3b3-77b1-0936-b169-d08b946bedc7@iogearbox.net>
    <991243e2-e7c2-f2b2-72b9-d37b0d569b3b@gmail.com>
    <5973966e-fcd9-7ee5-a9c4-b79d22c1b9dd@nokia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala wrote:
> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>
>> On Sat, Feb 3, 2018 at 1:15 PM, Xin Long wrote:
>>>>>
>>>>> On 1/30/18 1:57 PM, David Ahern wrote:
>>>>>>
>>>>>> On 1/30/18 1:08 PM, Daniel Borkmann wrote:
>>>>>>>
>>>>>>> On 01/30/2018 07:32 PM, Cong Wang wrote:
>>>>>>>>
>>>>>>>> On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> The following program creates a hang in unregister_netdevice.
>>>>>>>>> cleanup_net work hangs there forever periodically printing
>>>>>>>>> "unregister_netdevice: waiting for lo to become free. Usage count = 3"
>>>>>>>>> and creation of any new network namespaces hangs forever.
>>>>>>>>
>>>>>>>>
>>>>>>>> Interestingly, this is not reproducible on net-next.
>>>>>>>
>>>>>>>
>>>>>>> The most recent change on netns refcnt was 4ee806d51176 ("net: tcp:
>>>>>>> close sock if net namespace is exiting") in net/net-next from 5 days
>>>>>>> ago, maybe fixed due to that?
>>>>>>>
>>>>>>
>>>>>> This appears to be the commit introducing the refcnt leak:
>>>>>>
>>>>>> $ git bisect bad
>>>>>> dbc2b5e9a09e9a6664679a667ff81cff6e5f2641 is the first bad commit
>>>>>> commit dbc2b5e9a09e9a6664679a667ff81cff6e5f2641
>>>>>> Author: Xin Long
>>>>>> Date: Fri May 12 14:39:52 2017 +0800
>>>>>>
>>>>>>     sctp: fix src address selection if using secondary addresses for ipv6
>>>>>>
>>>>>>
>>>>>> v4.14 is bad. Running bisect in the background while doing other
>>>>>> things....
>>>>>>
>>>>>
>>>>> Interesting. The commit that avoids the refcnt leak is
>>>>>
>>>>> commit 955ec4cb3b54c7c389a9f830be7d3ae2056b9212
>>>>> Author: David Ahern
>>>>> Date: Wed Jan 24 19:45:29 2018 -0800
>>>>>
>>>>>     net/ipv6: Do not allow route add with a device that is down
>>>>>
>>>>> That commit does not intentionally address the problem so it is just
>>>>> masking the problematic code introduced by the commit above.
>>>>
>>>> Thanks, David A.
>>>>
>>>> I'm still on a trip. will look into this asap.
>>>
>>>
>>> Alexey and Tommi already had the patches for this issue on
>>> both SCTP v4 and v6 dst_get, Thanks.
>>
>>
>>
>>
>> Is this meant to be fixed already? I am still seeing this on the
>> latest upstream tree.
>>
>
> These two commits are in v4.16-rc1:
>
> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
> Author: Tommi Rantala
> Date: Mon Feb 5 21:48:14 2018 +0200
>
>     sctp: fix dst refcnt leak in sctp_v4_get_dst
>     ...
>     Fixes: 410f03831 ("sctp: add routing output fallback")
>     Fixes: 0ca50d12f ("sctp: fix src address selection if using secondary addresses")
>
>
> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
> Author: Alexey Kodanev
> Date: Mon Feb 5 15:10:35 2018 +0300
>
>     sctp: fix dst refcnt leak in sctp_v6_get_dst()
>     ...
>     Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary addresses for ipv6")
>
>
> I guess we missed something if it's still reproducible.
>
> I can check it later this week, unless someone else beat me to it.

Hi Tommi,

Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
another one then. But I am still seeing these:

[   58.799130] unregister_netdevice: waiting for lo to become free. Usage count = 4
[   60.847138] unregister_netdevice: waiting for lo to become free. Usage count = 4
[   62.895093] unregister_netdevice: waiting for lo to become free. Usage count = 4
[   64.943103] unregister_netdevice: waiting for lo to become free. Usage count = 4

on the upstream tree pulled ~12 hours ago.

The kernel does not detect this as any kind of BUG/WARNING, so
syzkaller/syzbot do not catch it as a bug and do not try to reproduce,
localize and report it.
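
To illustrate the point about detection: since no BUG/WARNING is raised, the only visible signal is the repeated console message. Below is a minimal sketch in Python of flagging that pattern from captured log lines. It is not how syzkaller works, and the function name `find_stuck_devices` and the `min_repeats` threshold are hypothetical choices made for this example.

```python
import re

# Matches the console message quoted above; device name and count vary.
PATTERN = re.compile(
    r"unregister_netdevice: waiting for (?P<dev>\S+) to become free\. "
    r"Usage count = (?P<count>\d+)"
)

# Sample input: the four log lines from this message.
SAMPLE = [
    "[   58.799130] unregister_netdevice: waiting for lo to become free. Usage count = 4",
    "[   60.847138] unregister_netdevice: waiting for lo to become free. Usage count = 4",
    "[   62.895093] unregister_netdevice: waiting for lo to become free. Usage count = 4",
    "[   64.943103] unregister_netdevice: waiting for lo to become free. Usage count = 4",
]


def find_stuck_devices(log_lines, min_repeats=3):
    """Return {device: last reported usage count} for every device whose
    "waiting for ... to become free" message appears at least
    min_repeats times (hypothetical threshold, an assumption of this sketch)."""
    repeats = {}
    last_count = {}
    for line in log_lines:
        match = PATTERN.search(line)
        if match is None:
            continue
        dev = match.group("dev")
        repeats[dev] = repeats.get(dev, 0) + 1
        last_count[dev] = int(match.group("count"))
    return {dev: last_count[dev] for dev, n in repeats.items() if n >= min_repeats}


if __name__ == "__main__":
    print(find_stuck_devices(SAMPLE))  # {'lo': 4}
```

Requiring several repeats rather than a single occurrence is a design choice for the sketch: a single message may just reflect a slow but successful teardown, whereas the hang reported here prints the same line every couple of seconds.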