Date: Fri, 2 Nov 2018 23:45:38 +0100
From: Dominique Martinet
To: Dmitry Vyukov
Cc: Tetsuo Handa, Eric Van Hensbergen, Ron Minnich, Latchesar Ionkov,
    v9fs-developer@lists.sourceforge.net, syzbot, linux-fsdevel, LKML,
    syzkaller-bugs, Al Viro
Subject: Re: INFO: task hung in grab_super
Message-ID: <20181102224538.GB9565@nautica>
References: <0000000000002f5541057143a85e@google.com>
    <0adc592b-d4a3-f6da-3c5c-22490f641eb9@i-love.sakura.ne.jp>
    <727110bb-0154-e5df-4b2f-e965e3b98c62@i-love.sakura.ne.jp>
X-Mailing-List: linux-kernel@vger.kernel.org

Dmitry Vyukov wrote on Fri, Nov 02, 2018:
> >> I guess that's the problem, right? SIGKILL-ed task must not ignore
> >> SIGKILL and hang in infinite loop. This would explain a bunch of hangs
> >> in 9p.
> >
> > Did you check /proc/18253/task/*/stack after manually sending SIGKILL?
>
> Yes:
>
> root@syzkaller:~# ps afxu | grep syz
> root 18253 0.0 0.0 0 0 ttyS0 Zl 10:16 0:00 \_
> [syz-executor]
> root@syzkaller:~# cat /proc/18253/task/*/stack
> [<0>] p9_client_rpc+0x3a2/0x1400
> [<0>] p9_client_flush+0x134/0x2a0
> [<0>] p9_client_rpc+0x122c/0x1400
> [<0>] p9_client_create+0xc56/0x16af
> [<0>] v9fs_session_init+0x21a/0x1a80
> [<0>] v9fs_mount+0x7c/0x900
> [<0>] mount_fs+0xae/0x328
> [<0>] vfs_kern_mount.part.34+0xdc/0x4e0
> [<0>] do_mount+0x581/0x30e0
> [<0>] ksys_mount+0x12d/0x140
> [<0>] __x64_sys_mount+0xbe/0x150
> [<0>] do_syscall_64+0x1b9/0x820
> [<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [<0>] 0xffffffffffffffff

Yes, that's a known problem with the current code: since everything
must be cleaned up on the spot, the first kill sends a flush and then
waits again for the flush reply to come; the second kill is completely
ignored.

With the refcounting work we've done that went in this merge window
we're halfway there - memory can now have a lifetime independent of the
current request and won't be freed when the process exits
p9_client_rpc, so we can send the flush and return immediately, then
have the rest of the cleanup happen asynchronously when the flush reply
comes or the client is torn down, whichever happens first.

I've got this planned for 4.21 if I can find the time to do it early in
this cycle and I get it to work on first try, 4.22 if I run into
complications, to make sure it's well tested in -next first.
My free time is pretty limited this year, so unless you want to help
it'll get done when it's ready :)

-- 
Dominique
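
To illustrate the idea of "send the flush and return immediately, clean
up when the reply comes", here is a rough userspace sketch in C. It is
not the kernel code and not the planned patch: struct req, req_alloc(),
req_put(), rpc_interrupted(), flush_done() and live_reqs are all
hypothetical names made up for this example, and the refcount is a
plain int with no locking, which would not be good enough with real
concurrency.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a request; NOT the kernel's struct p9_req_t. */
struct req {
	int refcount;	/* one reference per holder (caller, transport) */
	int flushed;	/* set once a flush has been queued for this request */
};

/* Number of requests allocated and not yet freed (for illustration). */
static int live_reqs;

/* Allocate a request held by two parties: the caller and the transport. */
static struct req *req_alloc(void)
{
	struct req *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	r->refcount = 2;
	live_reqs++;
	return r;
}

/* Drop one reference; the last holder to drop frees the memory. */
static void req_put(struct req *r)
{
	assert(r->refcount > 0);
	if (--r->refcount == 0) {
		live_reqs--;
		free(r);
	}
}

/*
 * The caller was killed: note that a flush was sent, drop the caller's
 * reference and return right away instead of waiting for the reply.
 * The request stays alive because the transport still holds a reference.
 */
static void rpc_interrupted(struct req *r)
{
	r->flushed = 1;	/* stands in for actually sending a flush */
	req_put(r);
}

/*
 * Called later, when the flush reply arrives or the client is torn
 * down, whichever happens first: drop the transport's reference.
 */
static void flush_done(struct req *r)
{
	req_put(r);
}
```

With this split, calling rpc_interrupted() leaves the request allocated
(live_reqs stays at 1) and only the later flush_done() frees it, so the
killed task never has to sit waiting for the flush reply.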