From: Amir Goldstein
Date: Sat, 12 Sep 2020 14:06:02 +0300
Subject: Re: [PATCH V8 1/3] fuse: Definitions and ioctl() for passthrough
To: Alessio Balsini
Cc: Miklos Szeredi, Akilesh Kailash, David Anderson, Eric Yan, Jann Horn,
    Jens Axboe, Martijn Coenen, Palmer Dabbelt, Paul Lawrence, Stefano Duo,
    Zimuzo Ezeozue, fuse-devel, kernel-team, linux-fsdevel, linux-kernel
References: <20200911163403.79505-1-balsini@android.com>
    <20200911163403.79505-2-balsini@android.com>
In-Reply-To: <20200911163403.79505-2-balsini@android.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Sep 11, 2020 at 7:34 PM Alessio Balsini wrote:
>
> Introduce the new FUSE passthrough ioctl(), which allows userspace to
> specify a direct connection between a FUSE file and a lower file system
> file.
> Such ioctl() requires userspace to specify:
> - the file descriptor of one of its opened files,
> - the unique identifier of the FUSE request associated with a pending
>   open/create operation,
> both encapsulated into a fuse_passthrough_out data structure.
> The ioctl() will search for the pending FUSE request matching the unique
> identifier, and update the passthrough file pointer of the request with the
> file pointer referenced by the passed file descriptor.
> When that pending FUSE request is handled, the passthrough file pointer
> is copied to the fuse_file data structure, so that the link between FUSE
> and lower file system is consolidated.
>
> In order for the passthrough mode to be successfully activated, the lower
> file system file must implement both read_ and write_iter file operations.
> This extra check avoids special pseudofiles to be targets for this feature.
> An additional enforced limitation is that when FUSE passthrough is enabled,
> no further file system stacking is allowed.
>
> Signed-off-by: Alessio Balsini
> ---
[...]
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index bba747520e9b..eb223130a917 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -965,6 +965,12 @@ static void process_init_reply(struct fuse_conn *fc, struct fuse_args *args,
>  					min_t(unsigned int, FUSE_MAX_MAX_PAGES,
>  					max_t(unsigned int, arg->max_pages, 1));
>  			}
> +			if (arg->flags & FUSE_PASSTHROUGH) {
> +				fc->passthrough = 1;
> +				/* Prevent further stacking */
> +				fc->sb->s_stack_depth =
> +					FILESYSTEM_MAX_STACK_DEPTH;
> +			}

That seems a bit limiting.
I suppose what you really want to avoid is loops into a FUSE fd.
There may be a way to do this by forbidding overlay over FUSE passthrough
or the other way around.

You can set fc->sb->s_stack_depth = FILESYSTEM_MAX_STACK_DEPTH - 1 here
and in the passthrough ioctl you can check for looping into a fuse fs with
passthrough enabled on the passed fd (see below) ...

>  		} else {
>  			ra_pages = fc->max_read / PAGE_SIZE;
>  			fc->no_lock = 1;
> @@ -1002,7 +1008,8 @@ void fuse_send_init(struct fuse_conn *fc)
>  		FUSE_WRITEBACK_CACHE | FUSE_NO_OPEN_SUPPORT |
>  		FUSE_PARALLEL_DIROPS | FUSE_HANDLE_KILLPRIV | FUSE_POSIX_ACL |
>  		FUSE_ABORT_ERROR | FUSE_MAX_PAGES | FUSE_CACHE_SYMLINKS |
> -		FUSE_NO_OPENDIR_SUPPORT | FUSE_EXPLICIT_INVAL_DATA;
> +		FUSE_NO_OPENDIR_SUPPORT | FUSE_EXPLICIT_INVAL_DATA |
> +		FUSE_PASSTHROUGH;
>  	ia->args.opcode = FUSE_INIT;
>  	ia->args.in_numargs = 1;
>  	ia->args.in_args[0].size = sizeof(ia->in);
> diff --git a/fs/fuse/passthrough.c b/fs/fuse/passthrough.c
> new file mode 100644
> index 000000000000..86ab4eafa7bf
> --- /dev/null
> +++ b/fs/fuse/passthrough.c
> @@ -0,0 +1,55 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include "fuse_i.h"
> +
> +int fuse_passthrough_setup(struct fuse_req *req, unsigned int fd)
> +{
> +	int ret;
> +	int fs_stack_depth;
> +	struct file *passthrough_filp;
> +	struct inode *passthrough_inode;
> +	struct super_block *passthrough_sb;
> +
> +	/* Passthrough mode can only be enabled at file open/create time */
> +	if (req->in.h.opcode != FUSE_OPEN && req->in.h.opcode != FUSE_CREATE) {
> +		pr_err("FUSE: invalid OPCODE for request.\n");
> +		return -EINVAL;
> +	}
> +
> +	passthrough_filp = fget(fd);
> +	if (!passthrough_filp) {
> +		pr_err("FUSE: invalid file descriptor for passthrough.\n");
> +		return -EINVAL;
> +	}
> +
> +	ret = -EINVAL;
> +	if (!passthrough_filp->f_op->read_iter ||
> +	    !passthrough_filp->f_op->write_iter) {
> +		pr_err("FUSE: passthrough file misses file operations.\n");
> +		goto out;
> +	}
> +
> +	passthrough_inode = file_inode(passthrough_filp);
> +	passthrough_sb = passthrough_inode->i_sb;
> +	fs_stack_depth = passthrough_sb->s_stack_depth + 1;

... for example:

       if (fs_stack_depth && passthrough_sb->s_type == fuse_fs_type) {
               pr_err("FUSE: stacked passthrough file\n");
               goto out;
       }

But maybe we want to ban passthrough to any lower FUSE
at least for start.

> +	ret = -EEXIST;

Why EEXIST? Why not EINVAL?

> +	if (fs_stack_depth > FILESYSTEM_MAX_STACK_DEPTH) {
> +		pr_err("FUSE: maximum fs stacking depth exceeded for passthrough\n");
> +		goto out;
> +	}
> +
> +	req->args->passthrough_filp = passthrough_filp;
> +	return 0;
> +out:
> +	fput(passthrough_filp);
> +	return ret;
> +}
> +

And speaking of overlayfs, I believe you may be able to test your code with
fuse-overlayfs [1] (passthrough to upper files).

This is a project with real users running real workloads who
may be able to provide you with valuable feedback from testing.

Thanks,
Amir.

[1] https://github.com/containers/fuse-overlayfs
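
To make the combined effect of the suggestions above easier to see, here is a
rough userspace model of the checks. It is only a sketch and not kernel code:
struct sb_model, struct lower_file_model, enum fs_kind, passthrough_check() and
fuse_passthrough_sb_depth() are hypothetical names invented for illustration,
FILESYSTEM_MAX_STACK_DEPTH is assumed to be 2, the loop test is read as "the
lower superblock is FUSE and already has a non-zero stack depth", and every
rejection returns -EINVAL rather than -EEXIST.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed value; stands in for the kernel constant of the same name. */
#define FILESYSTEM_MAX_STACK_DEPTH 2

/* Hypothetical stand-in for comparing sb->s_type against the FUSE fs type. */
enum fs_kind { FS_KIND_OTHER, FS_KIND_FUSE };

/* Hypothetical, heavily reduced model of struct super_block. */
struct sb_model {
	enum fs_kind kind;
	int s_stack_depth;
};

/* Hypothetical model of the lower file handed to the passthrough ioctl. */
struct lower_file_model {
	const struct sb_model *sb;
	bool has_read_iter;
	bool has_write_iter;
};

/*
 * Stack depth a passthrough-enabled FUSE superblock would advertise if it
 * kept one level in reserve instead of claiming the maximum.
 */
int fuse_passthrough_sb_depth(void)
{
	return FILESYSTEM_MAX_STACK_DEPTH - 1;
}

/* Returns 0 if the lower file is acceptable, -EINVAL otherwise. */
int passthrough_check(const struct lower_file_model *lower)
{
	int fs_stack_depth;

	if (lower == NULL || lower->sb == NULL)
		return -EINVAL;

	/* The lower file must provide both read_iter and write_iter. */
	if (!lower->has_read_iter || !lower->has_write_iter)
		return -EINVAL;

	/* Refuse a lower FUSE fs that is itself stacked, to avoid loops. */
	if (lower->sb->s_stack_depth > 0 && lower->sb->kind == FS_KIND_FUSE)
		return -EINVAL;

	/* Generic depth limit, counting the FUSE layer being added. */
	fs_stack_depth = lower->sb->s_stack_depth + 1;
	if (fs_stack_depth > FILESYSTEM_MAX_STACK_DEPTH)
		return -EINVAL;

	return 0;
}
```

In this model a lower file on an unstacked file system passes, a lower file on
a passthrough-enabled FUSE superblock is refused, and a non-FUSE stacked file
system with depth 1 is still allowed as a lower layer, which is the extra
room gained by not pinning s_stack_depth to the maximum.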