References: <000000000000bc17b60571a60434@google.com>
From: Miklos Szeredi
Date: Tue, 24 Jul 2018 17:17:59 +0200
Subject: Re: INFO: task hung in fuse_reverse_inval_entry
To: Dmitry Vyukov
Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 23, 2018 at 5:19 PM, Dmitry Vyukov wrote:
> On Mon, Jul 23, 2018 at 5:09 PM, Miklos Szeredi wrote:
>> On Mon, Jul 23, 2018 at 3:37 PM, Dmitry Vyukov wrote:
>>> On Mon, Jul 23, 2018 at 3:05 PM, Miklos Szeredi wrote:
>>
>>>> Biggest conceptual problem: your definition of fuse-server is weak.
>>>> Take the following example: process A is holding the fuse device fd
>>>> and is forwarding requests and replies to/from process B via a pipe.
>>>> So basically A is just a proxy that does nothing interesting, the
>>>> "real" server is B. But according to your definition B is not a
>>>> server, only A is.
>>>
>>> I proposed to abort fuse conn when all fuse device fd's are "killed"
>>> (all processes having the fd opened are killed). So if _only_ process
>>> B is killed, then, yes, it will still hang. However if A is killed or
>>> both A and B (say, process group, everything inside of pid namespace,
>>> etc) then the deadlock will be autoresolved without human
>>> intervention.
>>
>> Okay, so you're saying:
>>
>> 1) when process gets SIGKILL and is uninterruptible sleep mark process as doomed
>> 2) for a particular fuse instance find set of fuse device fd
>> references that are in non-doomed tasks; if there are none then abort
>> fuse instance
>>
>> Right?
>
>
> Yes, something like this.
> Perhaps checking for "uninterruptible sleep" is excessive. If it has
> SIGKILL pending it's pretty much doomed already. This info should be
> already available for tasks.
> Not saying that it's better, but what I described was the other way
> around: when a task killed it drops a reference to all opened fuse
> fds, when the last fd is dropped, the connection can be aborted.

struct task_struct {
	[...]
	struct files_struct *files;
	[...]
};

struct files_struct {
	[...]
	struct fdtable __rcu *fdt;
	[...]
};

struct fdtable {
	[...]
	struct file __rcu **fd;      /* current fd array */
	[...]
};

So there we have an array of pointers to struct file.  Suppose we'd
magically be able to find files that point to fuse devices upon
receiving SIGKILL, what would we do with them?  We can't close them:
other tasks might still be pointing to the same files_struct.

We could do a global search for non-doomed tasks referencing the same
fuse device, but I have no clue how we'd go about doing that without
racing with forks, fd sending, etc...

Thanks,
Miklos
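
For readers following the thread, the "global search" idea in the last
paragraph can be sketched as a toy userspace model. Everything in it is
an assumption made for illustration: the toy_* types, the doomed flag
and the fuse_conn_id field are hypothetical and are not kernel
definitions. The sketch takes no locks, so it ignores exactly the races
with fork and fd passing that the message raises.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct file; fuse_conn_id is hypothetical,
 * 0 means "not a fuse device". */
struct toy_file {
	int fuse_conn_id;
};

/* Toy stand-in for files_struct plus fdtable: one fd array. */
struct toy_files_struct {
	struct toy_file **fd;	/* entries may be NULL (closed fd) */
	size_t max_fds;
};

/* Toy stand-in for task_struct. */
struct toy_task {
	bool doomed;			/* hypothetical: SIGKILL pending */
	struct toy_files_struct *files;	/* may be shared between tasks */
};

/* Return true if this fd table holds an open fd on connection conn. */
static bool toy_files_reference_conn(const struct toy_files_struct *files,
				     int conn)
{
	assert(conn != 0);
	for (size_t i = 0; i < files->max_fds; i++) {
		if (files->fd[i] && files->fd[i]->fuse_conn_id == conn)
			return true;
	}
	return false;
}

/* Global search: the connection may be aborted only if no non-doomed
 * task still references it through its fd table. */
static bool toy_conn_should_abort(const struct toy_task *tasks,
				  size_t ntasks, int conn)
{
	for (size_t i = 0; i < ntasks; i++) {
		if (tasks[i].doomed || tasks[i].files == NULL)
			continue;
		if (toy_files_reference_conn(tasks[i].files, conn))
			return false;
	}
	return true;
}
```

With two tasks sharing one toy_files_struct, dooming only one of them
leaves the connection alive, which is the shared files_struct case
described above. Only once every holder is doomed does the search
report that the connection could be aborted.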