Message-ID: <9666141f-ad0f-4224-ac48-eba63145c61e@linux.alibaba.com>
Date: Tue, 28 May 2024 17:58:09 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [RFC 0/2] fuse: introduce fuse server recovery mechanism
To: Christian Brauner
Cc: Jingbo Xu, Miklos Szeredi, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, winters.zc@antgroup.com
References: <20240524064030.4944-1-jefflexu@linux.alibaba.com>
 <858d23ec-ea81-45cb-9629-ace5d6c2f6d9@linux.alibaba.com>
 <6a3c3035-b4c4-41d9-a7b0-65f72f479571@linux.alibaba.com>
 <20240528-pegel-karpfen-fd16814adc50@brauner>
 <36c14658-2c38-4515-92e1-839553971477@linux.alibaba.com>
 <20240528-umstritten-liedchen-30e6ca6632b2@brauner>
From: Gao Xiang
In-Reply-To: <20240528-umstritten-liedchen-30e6ca6632b2@brauner>

On 2024/5/28 17:32, Christian Brauner wrote:
> On Tue, May 28, 2024 at 05:13:04PM +0800, Gao Xiang wrote:
>> Hi Christian,
>>
>> On 2024/5/28 16:43, Christian Brauner wrote:
>>> On Tue, May 28, 2024 at 12:02:46PM +0800, Gao Xiang wrote:
>>>>
>>>>
>>>> On 2024/5/28 11:08, Jingbo Xu wrote:
>>>>>
>>>>>
>>>>> On 5/28/24 10:45 AM, Jingbo Xu wrote:
>>>>>>
>>>>>>
>>>>>> On 5/27/24 11:16 PM, Miklos Szeredi wrote:
>>>>>>> On Fri, 24 May 2024 at 08:40, Jingbo Xu wrote:
>>>>>>>
>>>>>>>> 3. I don't know if a kernel based recovery mechanism is welcome on the
>>>>>>>> community side. Any comment is welcome. Thanks!
>>>>>>>
>>>>>>> I'd prefer something external to fuse.
>>>>>>
>>>>>> Okay, understood.
>>>>>>
>>>>>>>
>>>>>>> Maybe a kernel based fdstore (lifetime connected to that of the
>>>>>>> container) would be a useful service more generally?
>>>>>>
>>>>>> Yeah I indeed had considered this, but I'm afraid VFS guys would be
>>>>>> concerned about why we do this on kernel side rather than in user space.
>>>>
>>>> Just from my own perspective, even if it's in FUSE, the concern is
>>>> almost the same.
>>>>
>>>> I wonder if on-demand cachefiles can keep fds too in the future
>>>> (thus e.g. daemonless feature could even be implemented entirely
>>>> with kernel fdstore) but it still has the same concern or it's
>>>> a source of duplication.
>>>>
>>>> Thanks,
>>>> Gao Xiang
>>>>
>>>>>>
>>>>>> I'm not sure what the VFS guys think about this and if the kernel side
>>>>>> shall care about this.
>>>
>>> Fwiw, I'm not convinced and I think that's a big can of worms security
>>> wise and semantics wise. I have discussed whether a kernel-side fdstore
>>> would be something that systemd would use if available multiple times
>>> and they wouldn't use it because it provides them with no benefits over
>>> having it in userspace.
>>
>> As far as I know, currently there are approximately two ways to do
>> failover mechanisms in kernel.
>>
>> The first model is much like a fuse-like model: in this model, we should
>> keep and pass fd to maintain the active state. And currently,
>> userspace should be responsible for the permission/security issues
>> when doing something like passing fds.
>>
>> The second model is like a one device-one instance model, for example
>> ublk (If I understand correctly): each active instance (/dev/ublkbX)
>> has its own unique control device (/dev/ublkcX). Users could
>> assign/change DAC/MAC for each control device. And failover
>> recovery just needs to reopen the control device with proper
>> permission and do recovery.
>>
>> So just my own thought, a kernel-side fdstore pseudo filesystem may
>> provide a DAC/MAC mechanism for the first model. That is a much
>> cleaner way than doing some similar thing independently in each
>> subsystem which may need a DAC/MAC-like mechanism. But that is
>> just my own thought.
>
> The failover mechanism for /dev/ublkcX could easily be implemented using
> the fdstore. The fact that they rolled their own thing is orthogonal to
> this imho. Implementing retrieval policies like this in the kernel is
> slowly advancing into /proc/$pid/fd/ levels of complexity. That's all
> better handled with appropriate policies in userspace. And cachefilesd
> can similarly just stash their fds in the fdstore.

Ok, got it. I just wanted to know how a kernel fdstore currently
sounds (Miklos mentioned it, so I wondered whether it's feasible, since
it could also benefit non-fuse cases).

I think a userspace fdstore works for me (unless some other interesting
use cases come up for evaluation later).

Jingbo has an internal requirement for fuse; that is purely fuse stuff
and out of my scope, though.

Thanks,
Gao Xiang
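
For readers following the thread, the "keep and pass fd" model discussed
above can be sketched in userspace roughly as follows. This is only a
minimal illustration, assuming a Unix domain socket and SCM_RIGHTS fd
passing via Python's socket.send_fds()/socket.recv_fds() (Python 3.9+);
the helper names stash_fd, retrieve_fd and demo are hypothetical and are
not part of the RFC, of systemd, or of any kernel interface. A pipe
stands in for the fd a server would want to survive a restart.

```python
import os
import socket


def stash_fd(sock, name, fd):
    """Hand one fd, tagged with a name, to the peer (uses SCM_RIGHTS)."""
    socket.send_fds(sock, [name.encode()], [fd])


def retrieve_fd(sock):
    """Receive one named fd from the peer; returns (name, fd)."""
    msg, fds, _flags, _addr = socket.recv_fds(sock, 1024, 1)
    return msg.decode(), fds[0]


def demo():
    # Hypothetical setup: one end belongs to the server, the other to
    # whatever process plays the role of the fd store.
    server_side, store_side = socket.socketpair(socket.AF_UNIX,
                                                socket.SOCK_STREAM)

    # A pipe with some buffered data stands in for server state.
    r, w = os.pipe()
    os.write(w, b"state")
    os.close(w)

    stash_fd(server_side, "conn", r)
    os.close(r)  # the "crashed" server loses its own copy of the fd

    # The passed reference keeps the pipe alive; the receiver gets a
    # fresh fd number referring to the same open file description.
    name, fd = retrieve_fd(store_side)
    data = os.read(fd, 16)

    os.close(fd)
    server_side.close()
    store_side.close()
    return name, data


if __name__ == "__main__":
    print(demo())
```

As noted in the quoted discussion, in this model the permission and
security policy (who may connect to the store and retrieve which fd) is
left to userspace; the sketch does not attempt to implement any.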