Received: by 2002:a05:6a10:16a7:0:0:0:0 with SMTP id gp39csp3486403pxb; Mon, 9 Nov 2020 12:26:34 -0800 (PST) X-Google-Smtp-Source: ABdhPJznC8Jf9FzigUKQg++cv/mqDbYSQ4R6gcq1E/ySaHy8BqE28hVPIvIQdV7p5Q0dbWKQffWC X-Received: by 2002:a17:906:31cb:: with SMTP id f11mr16660406ejf.142.1604953594229; Mon, 09 Nov 2020 12:26:34 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1604953594; cv=none; d=google.com; s=arc-20160816; b=u6Wd6SiSPDWeekt0qNHo9VcEEC/qKMENP96z33tJuhxHpfvXGfi7/sB4SbjsxKDcRr uDH5JdAHYycF0yz6DdW3vPu49plz5K2j1gkPs+ft4x20qm+h3mGK98vMv5O2pbRPNjI4 7FltuSy5aMKa8FANXav2intVGRfITm5qty4E4JA4F07sSq1UUhfQIfAvnDcvlHcMUJqs 7XUQnG6j8wUkMobXfM0deEYkH2P+3M9nGmlXitkB1KaUyORGDBagPGWfy6Y8UUFdadpA kbWFlSR8MaDxEY/0a9WDtspY9PKpFbxtq8lntnsAWAllkYpd68CxOlbLSIlZtXKUHWLV xacQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=pUe6St4e112YPrcwx/bHzV1CZPra+32RU4VD8kX7rTA=; b=FhFWeSr9nLspWOdh+rp0nXukZFWeHjs3V1Jv670Gq67/eieOEZyIFBMl+BCQIam1QB +MgeiRSMB/3SRZ6FbNjtCdg/EPY+u8JBpTuqpfRuivFzwvD4p06bFeWI7LIj4bm8+nYh oeF4WGKsHJ+mDodSfZZyzke2dqviVQ6VYeSCkO1BhH9dtwQ8H5GefX5tyUrYgdDmvbuX NtcCTK5X4hLTuJrfOw5biTIMZi6poRnUIulNh/PcsPoFDqxZ1u0LG7bqRqLGm1GzF3lb eDLSauMVylCItenWAF+uXWjkdFCl8Fes9ynbS1REVxHo/jIFszUld4XsHLlsAAJUvuFF zYdQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=temperror (no key for signature) header.i=@szeredi.hu header.s=google header.b="Y/Gf37WY"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id v7si8288053ejg.393.2020.11.09.12.25.58; Mon, 09 Nov 2020 12:26:34 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=temperror (no key for signature) header.i=@szeredi.hu header.s=google header.b="Y/Gf37WY"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729658AbgKIUYZ (ORCPT + 99 others); Mon, 9 Nov 2020 15:24:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33620 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726952AbgKIUYZ (ORCPT ); Mon, 9 Nov 2020 15:24:25 -0500 Received: from mail-vs1-xe41.google.com (mail-vs1-xe41.google.com [IPv6:2607:f8b0:4864:20::e41]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 26024C0613CF for ; Mon, 9 Nov 2020 12:24:25 -0800 (PST) Received: by mail-vs1-xe41.google.com with SMTP id y78so5710417vsy.6 for ; Mon, 09 Nov 2020 12:24:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=szeredi.hu; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=pUe6St4e112YPrcwx/bHzV1CZPra+32RU4VD8kX7rTA=; b=Y/Gf37WY5oe59YcUlJfCoH8cHuLia92ZRNE4gH8Mez2WvU5TfDl4STaOCDGPtxUzxd 6FpwuAzGIp0YilKcfIfS476bONNqmBYS/WrTfgGHLuDsFDh/Sm5dPmSaGos44IEKwsyR tEryLHvUTS6l9XnRLsnny9HKOCOj3FDZquM/E= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=pUe6St4e112YPrcwx/bHzV1CZPra+32RU4VD8kX7rTA=; b=VT6g8QHJ3Y2Yxz5+Ojezeep5fCgl2ACW3MWvxrjpdD4PHK6HZ9lwxcb9YP48AjmP/L v19rMasoKys+NZN9SN2JdH37IplfW+OzbO+ukwj1EU4N7BmDgQUjt13wCKnz+8xW8lRL 3ywQkDRpbrpH2178ealxDmkq1jJkW6lBgLnbCtIs9r5iUMUoJXKpfKwwERK+yx1IV06U wDM3qsBYNoi5lcRjQa4xRVhwiNgXeHu8woFGejYy00Vzc4HNXoD5gQu8k8DqkQcjEZ1e /NP99thmEHWavtOBcEzw1kRxKBMgqrlyX23kQuYzCUsCtRCA26iDYFrsU0hqf+m4FHXw d1+w== X-Gm-Message-State: AOAM530Qo2CQU+mZsMipV51Nm17xv8CSnG6fktMdCV9mX11GwxGQBuWm vtxysmohuBvoDVXfRS9fpKjWFiF/AruWWLX5PupdPA== X-Received: by 2002:a67:1442:: with SMTP id 63mr9140605vsu.0.1604953464346; Mon, 09 Nov 2020 12:24:24 -0800 (PST) MIME-Version: 1.0 References: <1e796f9e008fb78fb96358ff74f39bd4865a7c88.1604926010.git.gladkov.alexey@gmail.com> <87v9ee2wer.fsf@x220.int.ebiederm.org> In-Reply-To: <87v9ee2wer.fsf@x220.int.ebiederm.org> From: Miklos Szeredi Date: Mon, 9 Nov 2020 21:24:13 +0100 Message-ID: Subject: Re: [RESEND PATCH v3] fuse: Abort waiting for a response if the daemon receives a fatal signal To: "Eric W. Biederman" Cc: Alexey Gladkov , LKML , linux-fsdevel@vger.kernel.org, Alexey Gladkov Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Nov 9, 2020 at 7:54 PM Eric W. Biederman wrote: > > Miklos Szeredi writes: > > > On Mon, Nov 9, 2020 at 1:48 PM Alexey Gladkov wrote: > >> > >> This patch removes one kind of the deadlocks inside the fuse daemon. The > >> problem appear when the fuse daemon itself makes a file operation on its > >> filesystem and receives a fatal signal. > >> > >> This deadlock can be interrupted via fusectl filesystem. But if you have > >> many fuse mountpoints, it will be difficult to figure out which > >> connection to break. > >> > >> This patch aborts the connection if the fuse server receives a fatal > >> signal. > > > > The patch itself might be acceptable, but I have some questions. > > > > To logic of this patch says: > > > > "If a task having the fuse device open in it's fd table receives > > SIGKILL (and filesystem was initially mounted in a non-init user > > namespace), then abort the filesystem operation" > > > > You just say "server" instead of "task having the fuse device open in > > it's fd table" which is sloppy to say the least. It might also lead > > to regressions, although I agree that it's unlikely. > > > > Also how is this solving any security issue? Just create the request > > loop using two fuse filesystems and the deadlock avoidance has just > > been circumvented. So AFAICS "selling" this as a CVE fix is not > > appropriate. > > The original report came in with a CVE on it. So referencing that CVE > seems reasonable. Even if the issue isn't particularly serious. It is > very annoying not to be able to kill processes with SIGKILL or the OOM > killer. > > You have a good point about the looping issue. I wonder if there is a > way to enhance this comparatively simple approach to prevent the more > complex scenario you mention. Let's take a concrete example: - task A is "server" for fuse fs a - task B is "server" for fuse fs b - task C: chmod(/a/x, ...) - task A: read UNLINK request - task A: chmod(/b/x, ...) - task B: read UNLINK request - task B: chmod (/a/x, ...) Now B is blocking on i_mutex on x , A is waiting for reply from B, C is holding i_mutex on x and waiting for reply from A. At this point B is truly uninterruptible (and I'm not betting large sums on Al accepting killable VFS locks patches), so killing B is out. Killing A with this patch does nothing, since A does not have b's dev fd in its fdtable. Killing C again does nothing, since it has no fuse dev fd at all. > Does tweaking the code to close every connection represented by a fuse > file descriptor after a SIGKILL has been delevered create any problems? In the above example are you suggesting that SIGKILL on A would abort "a" from fs b's code? Yeah, that would work, I guess. Poking into another instance this way sounds pretty horrid, though. > > What's the reason for making this user-ns only? If we drop the > > security aspect, then I don't see any reason not to do this > > unconditionally. > > > > Also note, there's a proper solution for making fuse requests always > > killable, and that is to introduce a shadow locking that ensures > > correct fs operation in the face of requests that have returned and > > released their respective VFS locks. Now this would be a much more > > complex solution, but also a much more correct one, not having issues > > with correctly defining what a server is (which is not a solvable > > problem). > > Is this the solution that was removed at some point from fuse, > or are you talking about something else? > > I think you are talking about adding a set of fuse specific locks > so fuse does not need to rely on the vfs locks. I don't quite have > enough insight to see that bigger problem so if you can expand in more > detail I would appreciate it. Okay, so the problem with making the wait_event() at the end of request_wait_answer() killable is that it would allow compromising the server's integrity by unlocking the VFS level lock (which protects the fs) while the server hasn't yet finished the request. The way this would be solvable is to add a fuse level lock for each VFS level lock. That lock would be taken before the request is sent to userspace and would be released when the answer is received. Normally there would be zero contention on these shadow locks, but if a request is forcibly killed, then the VFS lock is released and the shadow lock now protects the filesystem. This wouldn't solve the case where a fuse fs is deadlocked on a VFS lock (e.g. task B), but would allow tasks blocked directly on a fuse filesystem to be killed (e.g. task A or C, both of which would unwind the deadlock). Thanks, Miklos