Date: Fri, 10 Apr 2015 20:24:06 -0400
Subject: Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
From: Trond Myklebust
To: Zach Brown
Cc: Linux Kernel Mailing List, Linux FS-devel Mailing List,
    linux-btrfs@vger.kernel.org, Linux NFS Mailing List,
    linux-scsi@vger.kernel.org
In-Reply-To: <20150411000208.GA20949@lenny.home.zabbo.net>
References: <1428703236-24735-1-git-send-email-zab@redhat.com>
    <1428703236-24735-2-git-send-email-zab@redhat.com>
    <20150411000208.GA20949@lenny.home.zabbo.net>

On Fri, Apr 10, 2015 at 8:02 PM, Zach Brown wrote:
> On Fri, Apr 10, 2015 at 06:36:41PM -0400, Trond Myklebust wrote:
>> On Fri, Apr 10, 2015 at 6:00 PM, Zach Brown wrote:
>
>> > +
>> > +/*
>> > + * copy_file_range() differs from regular file read and write in that it
>> > + * specifically allows return partial success. When it does so is up to
>> > + * the copy_file_range method.
>> > + */
>> > +ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>> > +                            struct file *file_out, loff_t pos_out,
>> > +                            size_t len, int flags)
>>
>> I'm going to repeat a gripe with this interface. I really don't think
>> we should treat copy_file_range() as taking a size_t length, since
>> that is not sufficient to do a full file copy on 32-bit systems w/ LFS
>> support.
>
> *nod*. The length type is limited by the syscall return type and the
> arbitrary desire to mimic read/write.
>
> I sympathize with wanting to copy giant files with operations that don't
> scale with file size because files can be enormous but sparse.

The other argument against using a size_t is that there is no memory
buffer involved here. size_t is, after all, a type describing
in-memory objects, not files.

>> Could we perhaps instead of a length, define a 'pos_in_start' and a
>> 'pos_in_end' offset (with the latter being -1 for a full-file copy)
>> and then return an 'loff_t' value stating where the copy ended?
>
> Well, the resulting offset will be set if the caller provided it. So
> they could already be getting the copied length from that. But they
> might not specify the offsets. Maybe they're just using the results to
> total up a completion indicator.
>
> Maybe we could make the length a pointer like the offsets that's set to
> the copied length on return.

That works, but why do we care so much about the difference between a
length and an offset as a return value?

To be fair, the NFS copy offload also allows the copy to proceed out
of order, in which case the range of copied data could be
non-contiguous in the case of a failure. However neither the length
nor the offset case will give you the full story in that case. Any
return value can at best be considered to define an offset range whose
contents need to be checked for success/failure.

> This all seems pretty gross. Does anyone else have a vote?
>
> (And I'll argue strongly against creating magical offset values that
> change behaviour. If we want to ignore arguments and get the length
> from the source file we'd add a flag to do so.)

The '-1' was not intended to be a special/magical value: as far as I'm
concerned any end offset that covers the full range of supported file
lengths would be OK.

>> Note that both btrfs and NFSv4.2 allow for 64-bit lengths, so this
>> interface would be closer to what is already in use anyway.
>
> Yeah, btrfs doesn't allow partial progress. It returns 0 on success.
> We could also do that but people have expressed an interest in returning
> partial progress.

Returning an end offset would satisfy the partial progress requirement
(with the caveat mentioned above).

Cheers
  Trond
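
The size_t concern raised in this thread can be made concrete with a
small sketch. It assumes a 32-bit ABI with LFS support, where file
offsets are 64 bits but an ssize_t return value cannot exceed INT32_MAX.
The names SSIZE32_MAX, calls_needed() and copied_length() are
hypothetical helpers for illustration only; they are not part of the
patch or of any kernel interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: on a 32-bit ABI with LFS, offsets are 64-bit but an
 * ssize_t return value cannot report more than INT32_MAX bytes per call.
 */
#define SSIZE32_MAX ((uint64_t)INT32_MAX)

/* Number of calls needed to cover file_len bytes at per_call_max each. */
static uint64_t calls_needed(uint64_t file_len, uint64_t per_call_max)
{
	assert(per_call_max > 0);
	/* Ceiling division, written to avoid overflow in an addition. */
	return file_len / per_call_max + (file_len % per_call_max != 0);
}

/*
 * Hypothetical range-style result: the caller supplies a start offset
 * and is told the 64-bit offset where the copy stopped. The copied
 * length is then just the difference of the two offsets.
 */
static int64_t copied_length(int64_t pos_in_start, int64_t end_reached)
{
	assert(end_reached >= pos_in_start);
	return end_reached - pos_in_start;
}
```

Under these assumptions an 8 GiB file needs at least five calls when
each call is capped at INT32_MAX bytes, while a 64-bit end offset can
describe the whole range in a single call. As noted above, neither form
describes a non-contiguous result from an out-of-order copy.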