Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp5624317pxb; Tue, 16 Feb 2021 03:31:14 -0800 (PST) X-Google-Smtp-Source: ABdhPJwnpNFCRIajlN7PGt08SIGvMbNXzHPj6u+/oE+kPgjhD+3sd2TqlX4KX/G5/Ki73ZxXvAza X-Received: by 2002:a17:906:36cc:: with SMTP id b12mr20389351ejc.323.1613475074476; Tue, 16 Feb 2021 03:31:14 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1613475074; cv=none; d=google.com; s=arc-20160816; b=gZn9PAWg8cVeEvNPFWG8C8YW74uuzKvFJreEPp8F0QqzG2sPCaqy9JyBsUxh7TxmSU hnL8+OamDsCnQ6UeCL4Cjc1JYoDOQU6HKikms9FbZCFU5v90QknFYbqVIx0JceV9pSWk hfBLgRFL0fL9qyPoYyJNYWSQH5Mxf7jV/o5l7nXVMhZOVH1X4Rs6Qlef61GqNISe6e1b h5gqMbndQFGcyKvYMTpP5XOLZsSSwqsQTxshu4qCLJ+4sMXxwG5d2K6Q/M5HDhKFRSmQ +C1uA+6giQCLPGNVbugHKHvKbumDsdGS9K1yGeqVMzALJP9sPELpEccVsDAmQaHkJlHr IFkg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=sEOJkscSuATRcfsVh3wS+QPv3K2p2RCOwfktsubLfCc=; b=QRhV2Qem7Y44NMNvuAYKRbD2Cm1VHPH2we+jZpxmRhlPmHgoPCgmYJTLANeLqo/BLZ vcteFL/FJ5wuZ9TMPZX81s/7PDgSQqnZwaJCwLpx2pggy1tI9C09FzPO+5K7OtTM5ZPN 4JbnWe31I4lHITp8klf3IcroJfZfBMdP936PthvckCqeCDkAF2rgq5jaOBBhVNxr36Ry Y6QSZg9FuYEve2Am3tLUvFMmO0+WBrhAHS0Df53/MRW4H1dRmI/PVMmUQm2kMs/1NxSr THqDFHkWNq4Xez6OOaGp7vid4ER6SvGXdX2vxHdDGnhH8J/8K8IBgEudwvq6j7+25rSZ HEoQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=btdYIVxr; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id dn20si15538119ejc.418.2021.02.16.03.30.47; Tue, 16 Feb 2021 03:31:14 -0800 (PST) Received-SPF: pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=btdYIVxr; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230049AbhBPL3t (ORCPT + 99 others); Tue, 16 Feb 2021 06:29:49 -0500 Received: from mail.kernel.org ([198.145.29.99]:39720 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229907AbhBPL3m (ORCPT ); Tue, 16 Feb 2021 06:29:42 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id A94BF64DFF; Tue, 16 Feb 2021 11:28:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1613474940; bh=RPjPaTvayhktPzqcqsg1ndjbnlMY4ZXLaVp+CQgT4h4=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=btdYIVxrhY203EYeuIVn+hf+O9MCnU4jc3+ifo20WyjTunezcWc/ss5NCNm9kD2yM NiRP82OU5FeL+iR1Mvd+m2h+4eXigLIZK63k3VD2u8bFvrnyL4+rNZRd10a4PFVrtT KAd3AWg1RQAU09Lp2Ite+6kqNCeEtH4cUeyD7KJc= Date: Tue, 16 Feb 2021 12:28:57 +0100 From: "gregkh@linuxfoundation.org" To: Luis Henriques Cc: Amir Goldstein , Trond Myklebust , "samba-technical@lists.samba.org" , "drinkcat@chromium.org" , "iant@google.com" , "linux-cifs@vger.kernel.org" , "darrick.wong@oracle.com" , "linux-kernel@vger.kernel.org" , "jlayton@kernel.org" , "anna.schumaker@netapp.com" , "llozano@chromium.org" , "linux-nfs@vger.kernel.org" , "miklos@szeredi.hu" , "viro@zeniv.linux.org.uk" , "dchinner@redhat.com" , "linux-fsdevel@vger.kernel.org" , "sfrench@samba.org" , "ceph-devel@vger.kernel.org" Subject: Re: [PATCH v2] vfs: prevent copy_file_range to copy across devices Message-ID: References: <20210215154317.8590-1-lhenriques@suse.de> <73ab4951f48d69f0183548c7a82f7ae37e286d1c.camel@hammerspace.com> <92d27397479984b95883197d90318ee76995b42e.camel@hammerspace.com> <87r1lgjm7l.fsf@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87r1lgjm7l.fsf@suse.de> Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org On Tue, Feb 16, 2021 at 11:17:34AM +0000, Luis Henriques wrote: > Amir Goldstein writes: > > > On Mon, Feb 15, 2021 at 8:57 PM Trond Myklebust wrote: > >> > >> On Mon, 2021-02-15 at 19:24 +0200, Amir Goldstein wrote: > >> > On Mon, Feb 15, 2021 at 6:53 PM Trond Myklebust < > >> > trondmy@hammerspace.com> wrote: > >> > > > >> > > On Mon, 2021-02-15 at 18:34 +0200, Amir Goldstein wrote: > >> > > > On Mon, Feb 15, 2021 at 5:42 PM Luis Henriques < > >> > > > lhenriques@suse.de> > >> > > > wrote: > >> > > > > > >> > > > > Nicolas Boichat reported an issue when trying to use the > >> > > > > copy_file_range > >> > > > > syscall on a tracefs file. It failed silently because the file > >> > > > > content is > >> > > > > generated on-the-fly (reporting a size of zero) and > >> > > > > copy_file_range > >> > > > > needs > >> > > > > to know in advance how much data is present. > >> > > > > > >> > > > > This commit restores the cross-fs restrictions that existed > >> > > > > prior > >> > > > > to > >> > > > > 5dae222a5ff0 ("vfs: allow copy_file_range to copy across > >> > > > > devices") > >> > > > > and > >> > > > > removes generic_copy_file_range() calls from ceph, cifs, fuse, > >> > > > > and > >> > > > > nfs. > >> > > > > > >> > > > > Fixes: 5dae222a5ff0 ("vfs: allow copy_file_range to copy across > >> > > > > devices") > >> > > > > Link: > >> > > > > https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@chromium.org/ > >> > > > > Cc: Nicolas Boichat > >> > > > > Signed-off-by: Luis Henriques > >> > > > > >> > > > Code looks ok. > >> > > > You may add: > >> > > > > >> > > > Reviewed-by: Amir Goldstein > >> > > > > >> > > > I agree with Trond that the first paragraph of the commit message > >> > > > could > >> > > > be improved. > >> > > > The purpose of this change is to fix the change of behavior that > >> > > > caused the regression. > >> > > > > >> > > > Before v5.3, behavior was -EXDEV and userspace could fallback to > >> > > > read. > >> > > > After v5.3, behavior is zero size copy. > >> > > > > >> > > > It does not matter so much what makes sense for CFR to do in this > >> > > > case (generic cross-fs copy). What matters is that nobody asked > >> > > > for > >> > > > this change and that it caused problems. > >> > > > > >> > > > >> > > No. I'm saying that this patch should be NACKed unless there is a > >> > > real > >> > > explanation for why we give crap about this tracefs corner case and > >> > > why > >> > > it can't be fixed. > >> > > > >> > > There are plenty of reasons why copy offload across filesystems > >> > > makes > >> > > sense, and particularly when you're doing NAS. Clone just doesn't > >> > > cut > >> > > it when it comes to disaster recovery (whereas backup to a > >> > > different > >> > > storage unit does). If the client has to do the copy, then you're > >> > > effectively doubling the load on the server, and you're adding > >> > > potentially unnecessary network traffic (or at the very least you > >> > > are > >> > > doubling that traffic). > >> > > > >> > > >> > I don't understand the use case you are describing. > >> > > >> > Which filesystem types are you talking about for source and target > >> > of copy_file_range()? > >> > > >> > To be clear, the original change was done to support NFS/CIFS server- > >> > side > >> > copy and those should not be affected by this change. > >> > > >> > >> That is incorrect: > >> > >> ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file > >> *dst, > >> u64 dst_pos, u64 count) > >> { > >> > >> /* > >> * Limit copy to 4MB to prevent indefinitely blocking an nfsd > >> * thread and client rpc slot. The choice of 4MB is somewhat > >> * arbitrary. We might instead base this on r/wsize, or make it > >> * tunable, or use a time instead of a byte limit, or implement > >> * asynchronous copy. In theory a client could also recognize a > >> * limit like this and pipeline multiple COPY requests. > >> */ > >> count = min_t(u64, count, 1 << 22); > >> return vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0); > >> } > >> > >> You are now explicitly changing the behaviour of knfsd when the source > >> and destination filesystem differ. > >> > >> For one thing, you are disallowing the NFSv4.2 copy offload use case of > >> copying from a local filesystem to a remote NFS server. However you are > >> also disallowing the copy from, say, an XFS formatted partition to an > >> ext4 partition. > >> > > > > Got it. > > Ugh. And I guess overlayfs may have a similar problem. > > > This is easy to solve with a flag COPY_FILE_SPLICE (or something) that > > is internal to kernel users. > > > > FWIW, you may want to look at the loop in ovl_copy_up_data() > > for improvements to nfsd_copy_file_range(). > > > > We can move the check out to copy_file_range syscall: > > > > if (flags != 0) > > return -EINVAL; > > > > Leave the fallback from all filesystems and check for the > > COPY_FILE_SPLICE flag inside generic_copy_file_range(). > > Ok, the diff bellow is just to make sure I understood your suggestion. > > The patch will also need to: > > - change nfs and overlayfs calls to vfs_copy_file_range() so that they > use the new flag. > > - check flags in generic_copy_file_checks() to make sure only valid flags > are used (COPY_FILE_SPLICE at the moment). > > Also, where should this flag be defined? include/uapi/linux/fs.h? Why would userspace want/need this flag?