From: Miklos Szeredi
Date: Mon, 23 Jul 2018 17:09:15 +0200
Subject: Re: INFO: task hung in fuse_reverse_inval_entry
To: Dmitry Vyukov
Cc: linux-fsdevel, LKML, syzkaller-bugs, syzbot
References: <000000000000bc17b60571a60434@google.com>

On Mon, Jul 23, 2018 at 3:37 PM, Dmitry Vyukov wrote:
> On Mon, Jul 23, 2018 at 3:05 PM, Miklos Szeredi wrote:
>> Biggest conceptual problem: your definition of fuse-server is weak.
>> Take the following example: process A is holding the fuse device fd
>> and is forwarding requests and replies to/from process B via a pipe.
>> So basically A is just a proxy that does nothing interesting, the
>> "real" server is B. But according to your definition B is not a
>> server, only A is.
>
> I proposed to abort fuse conn when all fuse device fd's are "killed"
> (all processes having the fd opened are killed). So if _only_ process
> B is killed, then, yes, it will still hang. However if A is killed or
> both A and B (say, process group, everything inside of pid namespace,
> etc) then the deadlock will be autoresolved without human
> intervention.

Okay, so you're saying:

1) when a process gets SIGKILL and is in uninterruptible sleep, mark
   the process as doomed

2) for a particular fuse instance, find the set of fuse device fd
   references that are held by non-doomed tasks; if there are none,
   abort the fuse instance

Right?

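To make the two steps concrete, here is a toy userspace model in Python.
It is only an illustration of the concept, not kernel code: Task,
FuseInstance, sigkill and reap_if_orphaned are hypothetical names, and
the model assumes a killed task is always in uninterruptible sleep.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a process; not a kernel structure."""
    name: str
    doomed: bool = False  # set when SIGKILLed while in uninterruptible sleep


@dataclass
class FuseInstance:
    """Hypothetical model of one fuse connection."""
    holders: list = field(default_factory=list)  # tasks holding the fuse device fd
    aborted: bool = False

    def reap_if_orphaned(self) -> None:
        # Step 2: abort when no non-doomed task still holds the device fd.
        if not any(not task.doomed for task in self.holders):
            self.aborted = True


def sigkill(task: Task, instances: list) -> None:
    # Step 1: mark the task as doomed (assumed to be in uninterruptible sleep).
    task.doomed = True
    for instance in instances:
        instance.reap_if_orphaned()


# The proxy example from the thread: A holds the device fd, B is the
# "real" server behind a pipe and holds no fd reference.
a, b = Task("A"), Task("B")
conn = FuseInstance(holders=[a])

sigkill(b, [conn])
print(conn.aborted)  # False: A is not doomed, so the instance would still hang

sigkill(a, [conn])
print(conn.aborted)  # True: no non-doomed holder is left
```

In this model, killing only B leaves the instance alive, while killing A
(or both) aborts it, which matches the behaviour described in the quoted
text.
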
The above is not an implementation proposal, just to get us on the same
page regarding the concept.

>> And this is just a simple example, parts of the server might be on
>> different machines, etc... It's impossible to automatically detect if
>> a process is acting as a fuse server or not.
>
> It does not seem we need the precise definition. If no one ever can
> write anything into the fd, we can safely abort the connection (?).

Seems to me so.

> If
> we don't, we can either get that the process exits normally and the
> connection is doomed anyway, so no difference in behavior, or we can
> get a deadlock.
>
>> We could let the fuse server itself notify the kernel that it's a fuse
>> server. That might help in the cases where the deadlock is
>> accidental, but obviously not in the case when done by a malicious
>> agent. I'm not sure it's worth the effort. Also I have no idea how
>> the respective maintainers would take the idea of "kill hooks"... It
>> would probably be a lot of work for little gain.
>
> What looks wrong to me here is that fuse is the only (?) subsystem in
> the kernel that stops SIGKILL from working and requires complex custom
> dance performed by a human operator (which is not necessary there at
> all). Say, if a process has opened a socket, whatever, I don't need to
> locate and abort something in socketctl fs, just SIGKILL. If a
> process has opened a file, I don't need to locate the fd in /proc
> and abort it, just SIGKILL. If a process has created an ipc object, I
> don't need to do any special dance, just SIGKILL. fuse is somehow very
> special, if we have more such cases, it definitely won't scale.
> I understand that there can be implementation difficulties, but
> fundamentally that's how things should work -- choose target
> processes, kill, done, right?

Yes, it would be nice. But I'm not sure it will fly due to
implementation difficulties. It's definitely not a high prio feature
currently for me, but I'll happily accept patches.

Thanks,
Miklos