Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp2459060pxb; Tue, 23 Feb 2021 07:33:54 -0800 (PST) X-Google-Smtp-Source: ABdhPJws1HthMPEg7Ogf2TM3dZCSs/4GHGKhHxxQ16m/Y2BzNX+R+HeZFUQFzA+Y3TvGPC+O+pXz X-Received: by 2002:a50:e14d:: with SMTP id i13mr28279544edl.106.1614094434545; Tue, 23 Feb 2021 07:33:54 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1614094434; cv=none; d=google.com; s=arc-20160816; b=f3dbrz6gQO/vyYka1Db+ZTup5lQeW1sfEPgqzfTnOVzRi6bKa6Z1kWhlIyQrgtiu/g OwutVkAMpXzpZXIFbneSNhJwq+18gpcTocL3VSPV72wUOuIfj74AwhxfOu9UUGEoX95w ZwAgIAPicd6tjYHVi8IcLl/28UPVP31gtYbuERE04Ux/uT2tzv1bQhcc7UIoH7Z8YlD+ y737iRywWMK7SpIoEy9EkYy3oXVr89BCgWwwMKC+U04aFwpUceT0+ISt7Wr8tbCs7Ox0 cD7xHfEsxsYk5sZJrOKBxnaG0QUl9cOSTrFT69A90PrjHr0bbF9hPAxycf1IA3H8oTNK FtqA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=l5ifbEjsbGSX5vHrh8zBq3RBKZpyn9Min7nzOJJGzBI=; b=JpmMvAuOfX6VbsK0HLGLe45jyr255ASrWd0GRJiY57z+HeqNvAfIveN6FOhnhTDlqO yE0E/TNyvIp7OMw1FdgFvSMb1dsPP+2EOVVrfisuKlQ5+YoyEb5pe6m+cK+kKFrCDIts T6PQQfFSDEvBiZfYnx3O2j5DNLpyhoq3Pbh81xrwswDEQ2OC8N1Zh0gimAnw5+fLWnwR K4kMyz2X0hvfdEVmXqGm6EUrkjE2ex2SP3a1NZZdqruWgO9LhZH55+AzzWMEZ5LifD8V RJuGQvl5PfcB7AOevdIzhodO7Q9/8xk1FPrbGdmdF2hsop+MbWWy9XabiGpCurx9aRaO msKQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=ksA16mo4; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id de1si14301927edb.508.2021.02.23.07.33.19; Tue, 23 Feb 2021 07:33:54 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=ksA16mo4; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233220AbhBWPaa (ORCPT + 99 others); Tue, 23 Feb 2021 10:30:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41996 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233315AbhBWP3q (ORCPT ); Tue, 23 Feb 2021 10:29:46 -0500 Received: from mail-il1-x131.google.com (mail-il1-x131.google.com [IPv6:2607:f8b0:4864:20::131]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EA02DC06174A; Tue, 23 Feb 2021 07:29:02 -0800 (PST) Received: by mail-il1-x131.google.com with SMTP id d5so5539347iln.6; Tue, 23 Feb 2021 07:29:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=l5ifbEjsbGSX5vHrh8zBq3RBKZpyn9Min7nzOJJGzBI=; b=ksA16mo4RC1UhD5rSo8ROkbsX6VXxyJVYaWiPxwAugmQFn8P4I3OZOe24DYfpHf0rT 9CeFMj7jgxGPVH1BaP84MIH+MaIizqDemz2ynLVDOorkeTYPvZw6XKSOr2Xw+7emc/BY Xuigd25bTmpM/8RxdZk4xjOdDx8J23XzwESTa7LHSPo+QHJDl5/rHwASz1t2zlWxjwGR 9RFN6nvYdErKDzCKGibsPFaVwiRXs5KOEY5pLMRJnd8n2DREBCM/IoODEPxyJny/eweP b6XYDcUzIANcx+V+6kB3hQoAQC4RmWF4E2+Hios2TbciZ//LRoyCf5dAfvJ59k6N2tdU VV0g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=l5ifbEjsbGSX5vHrh8zBq3RBKZpyn9Min7nzOJJGzBI=; b=TrfpvMdoaghFhb7Sy+3lRVNwMj89toNEvwSxbVNegGg3QMMxhEg+wPQgL/klw0c2aT fFuFW4fmMMuV2rD0Xrq5NJmI8BC6zsMq4ahrStzjRcbm61xpIu2h6N2XbumEoddEsKlS xymZbvOqj7tPeKeDtlp3dm70JSeeIehXfZXglprVQ1BgFDHQ5RnYa5s1FFHc3Hro5JDz FKrWdJtJz2o7GnxxDtxKx2yahxiBtQMEu7TgqWNllAk8hDOovUGrjWtGUC60mrJBaCXd k4qLD6aiqq6BiMVqkH1ltYjhTHaKzGs/+bK3cGjiIeZY1r/CHgnpgHw2SAad1Qq8q5gU nXuw== X-Gm-Message-State: AOAM5326V7ahKXreTTWXmewN9zYtxKTiB++HjMu3cyZHXnBOyl3ZVVWa CM8DSLJ0Q+GClUyJiK1FW9FFMHTqaq/rUP30dks= X-Received: by 2002:a92:8e42:: with SMTP id k2mr20780810ilh.250.1614094142335; Tue, 23 Feb 2021 07:29:02 -0800 (PST) MIME-Version: 1.0 References: <20210221195833.23828-1-lhenriques@suse.de> <20210222102456.6692-1-lhenriques@suse.de> <26a22719-427a-75cf-92eb-dda10d442ded@oracle.com> In-Reply-To: From: Amir Goldstein Date: Tue, 23 Feb 2021 17:28:51 +0200 Message-ID: Subject: Re: [PATCH v8] vfs: fix copy_file_range regression in cross-fs copies To: Luis Henriques Cc: dai.ngo@oracle.com, Jeff Layton , Steve French , Miklos Szeredi , Trond Myklebust , Anna Schumaker , Alexander Viro , "Darrick J. Wong" , Dave Chinner , Greg KH , Nicolas Boichat , Ian Lance Taylor , Luis Lozano , Andreas Dilger , Olga Kornievskaia , Christoph Hellwig , ceph-devel , linux-kernel , CIFS , samba-technical , linux-fsdevel , Linux NFS Mailing List Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Feb 23, 2021 at 12:31 PM Luis Henriques wrote: > > On Mon, Feb 22, 2021 at 08:25:27AM -0800, dai.ngo@oracle.com wrote: > > > > On 2/22/21 2:24 AM, Luis Henriques wrote: > > > A regression has been reported by Nicolas Boichat, found while using the > > > copy_file_range syscall to copy a tracefs file. Before commit > > > 5dae222a5ff0 ("vfs: allow copy_file_range to copy across devices") the > > > kernel would return -EXDEV to userspace when trying to copy a file across > > > different filesystems. After this commit, the syscall doesn't fail anymore > > > and instead returns zero (zero bytes copied), as this file's content is > > > generated on-the-fly and thus reports a size of zero. > > > > > > This patch restores some cross-filesystem copy restrictions that existed > > > prior to commit 5dae222a5ff0 ("vfs: allow copy_file_range to copy across > > > devices"). Filesystems are still allowed to fall-back to the VFS > > > generic_copy_file_range() implementation, but that has now to be done > > > explicitly. > > > > > > nfsd is also modified to fall-back into generic_copy_file_range() in case > > > vfs_copy_file_range() fails with -EOPNOTSUPP or -EXDEV. > > > > > > Fixes: 5dae222a5ff0 ("vfs: allow copy_file_range to copy across devices") > > > Link: https://urldefense.com/v3/__https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@chromium.org/__;!!GqivPVa7Brio!P1UWThiSkxbjfjFQWNYJmCxGEkiLFyvHjH6cS-G1ZTt1z-TeqwGQgQmi49dC6w$ > > > Link: https://urldefense.com/v3/__https://lore.kernel.org/linux-fsdevel/CANMq1KDZuxir2LM5jOTm0xx*BnvW=ZmpsG47CyHFJwnw7zSX6Q@mail.gmail.com/__;Kw!!GqivPVa7Brio!P1UWThiSkxbjfjFQWNYJmCxGEkiLFyvHjH6cS-G1ZTt1z-TeqwGQgQmgCmMHzA$ > > > Link: https://urldefense.com/v3/__https://lore.kernel.org/linux-fsdevel/20210126135012.1.If45b7cdc3ff707bc1efa17f5366057d60603c45f@changeid/__;!!GqivPVa7Brio!P1UWThiSkxbjfjFQWNYJmCxGEkiLFyvHjH6cS-G1ZTt1z-TeqwGQgQmzqItkrQ$ > > > Reported-by: Nicolas Boichat > > > Signed-off-by: Luis Henriques > > > --- > > > Changes since v7 > > > - set 'ret' to '-EOPNOTSUPP' before the clone 'if' statement so that the > > > error returned is always related to the 'copy' operation > > > Changes since v6 > > > - restored i_sb checks for the clone operation > > > Changes since v5 > > > - check if ->copy_file_range is NULL before calling it > > > Changes since v4 > > > - nfsd falls-back to generic_copy_file_range() only *if* it gets -EOPNOTSUPP > > > or -EXDEV. > > > Changes since v3 > > > - dropped the COPY_FILE_SPLICE flag > > > - kept the f_op's checks early in generic_copy_file_checks, implementing > > > Amir's suggestions > > > - modified nfsd to use generic_copy_file_range() > > > Changes since v2 > > > - do all the required checks earlier, in generic_copy_file_checks(), > > > adding new checks for ->remap_file_range > > > - new COPY_FILE_SPLICE flag > > > - don't remove filesystem's fallback to generic_copy_file_range() > > > - updated commit changelog (and subject) > > > Changes since v1 (after Amir review) > > > - restored do_copy_file_range() helper > > > - return -EOPNOTSUPP if fs doesn't implement CFR > > > - updated commit description > > > > > > fs/nfsd/vfs.c | 8 +++++++- > > > fs/read_write.c | 49 ++++++++++++++++++++++++------------------------- > > > 2 files changed, 31 insertions(+), 26 deletions(-) > > > > > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c > > > index 04937e51de56..23dab0fa9087 100644 > > > --- a/fs/nfsd/vfs.c > > > +++ b/fs/nfsd/vfs.c > > > @@ -568,6 +568,7 @@ __be32 nfsd4_clone_file_range(struct nfsd_file *nf_src, u64 src_pos, > > > ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file *dst, > > > u64 dst_pos, u64 count) > > > { > > > + ssize_t ret; > > > /* > > > * Limit copy to 4MB to prevent indefinitely blocking an nfsd > > > @@ -578,7 +579,12 @@ ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file *dst, > > > * limit like this and pipeline multiple COPY requests. > > > */ > > > count = min_t(u64, count, 1 << 22); > > > - return vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0); > > > + ret = vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0); > > > + > > > + if (ret == -EOPNOTSUPP || ret == -EXDEV) > > > + ret = generic_copy_file_range(src, src_pos, dst, dst_pos, > > > + count, 0); > > > + return ret; > > > } > > > __be32 nfsd4_vfs_fallocate(struct svc_rqst *rqstp, struct svc_fh *fhp, > > > diff --git a/fs/read_write.c b/fs/read_write.c > > > index 75f764b43418..5a26297fd410 100644 > > > --- a/fs/read_write.c > > > +++ b/fs/read_write.c > > > @@ -1388,28 +1388,6 @@ ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in, > > > } > > > EXPORT_SYMBOL(generic_copy_file_range); > > > -static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in, > > > - struct file *file_out, loff_t pos_out, > > > - size_t len, unsigned int flags) > > > -{ > > > - /* > > > - * Although we now allow filesystems to handle cross sb copy, passing > > > - * a file of the wrong filesystem type to filesystem driver can result > > > - * in an attempt to dereference the wrong type of ->private_data, so > > > - * avoid doing that until we really have a good reason. NFS defines > > > - * several different file_system_type structures, but they all end up > > > - * using the same ->copy_file_range() function pointer. > > > - */ > > > - if (file_out->f_op->copy_file_range && > > > - file_out->f_op->copy_file_range == file_in->f_op->copy_file_range) > > > - return file_out->f_op->copy_file_range(file_in, pos_in, > > > - file_out, pos_out, > > > - len, flags); > > > - > > > - return generic_copy_file_range(file_in, pos_in, file_out, pos_out, len, > > > - flags); > > > -} > > > - > > > /* > > > * Performs necessary checks before doing a file copy > > > * > > > @@ -1427,6 +1405,25 @@ static int generic_copy_file_checks(struct file *file_in, loff_t pos_in, > > > loff_t size_in; > > > int ret; > > > + /* > > > + * Although we now allow filesystems to handle cross sb copy, passing > > > + * a file of the wrong filesystem type to filesystem driver can result > > > + * in an attempt to dereference the wrong type of ->private_data, so > > > + * avoid doing that until we really have a good reason. NFS defines > > > + * several different file_system_type structures, but they all end up > > > + * using the same ->copy_file_range() function pointer. > > > + */ > > > + if (file_out->f_op->copy_file_range) { > > > + if (file_in->f_op->copy_file_range != > > > + file_out->f_op->copy_file_range) > > > + return -EXDEV; > > > + } else if (file_in->f_op->remap_file_range) { > > > + if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb) > > > + return -EXDEV; > > > > I think this check is redundant, it's done in vfs_copy_file_range. > > If this check is removed then the else clause below should be removed > > also. Once this check and the else clause are removed then might as > > well move the the check of copy_file_range from here to vfs_copy_file_range. > > > > I don't think it's really redundant, although I agree is messy due to the > fact we try to clone first instead of copying them. > It was put here in early checks on purpose before the check for zero size file. I'm pretty sure this wasn't the case in earlier versions of the path and then it did not solve the reported problem. Thanks, Amir.