Return-Path: linux-nfs-owner@vger.kernel.org
Received: from mx1.redhat.com ([209.132.183.28]:33054 "EHLO mx1.redhat.com"
  rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
  id S1754028Ab3IZSzy (ORCPT ); Thu, 26 Sep 2013 14:55:54 -0400
Date: Thu, 26 Sep 2013 11:55:08 -0700
From: Zach Brown
To: Miklos Szeredi
Cc: "J. Bruce Fields", Anna Schumaker, Kernel Mailing List,
  Linux-Fsdevel, "linux-nfs@vger.kernel.org", Trond Myklebust,
  Bryan Schumaker, "Martin K. Petersen", Jens Axboe, Mark Fasheh,
  Joel Becker, Eric Wong
Subject: Re: [RFC] extending splice for copy offloading
Message-ID: <20130926185508.GO30372@lenny.home.zabbo.net>
References: <1378919210-10372-1-git-send-email-zab@redhat.com>
  <20130925183828.GA30372@lenny.home.zabbo.net>
  <20130925190620.GB30372@lenny.home.zabbo.net>
  <20130925195526.GA18971@fieldses.org>
  <20130925210742.GG30372@lenny.home.zabbo.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To:
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On Thu, Sep 26, 2013 at 10:58:05AM +0200, Miklos Szeredi wrote:
> On Wed, Sep 25, 2013 at 11:07 PM, Zach Brown wrote:
> >> A client-side copy will be slower, but I guess it does have the
> >> advantage that the application can track progress to some degree, and
> >> abort it fairly quickly without leaving the file in a totally undefined
> >> state--and both might be useful if the copy's not a simple constant-time
> >> operation.
> >
> > I suppose, but can't the app achieve a nice middle ground by copying the
> > file in smaller syscalls? Avoid bulk data motion back to the client,
> > but still get notification every, I dunno, few hundred meg?
>
> Yes. And if "cp" could just be switched from a read+write syscall
> pair to a single splice syscall using the same buffer size. And then
> the user would only notice that things got faster in case of server
> side copy. No problems with long blocking times (at least not much
> worse than it was).

Hmm, yes, that would be a nice outcome.

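To make the "smaller syscalls" idea concrete, here is a rough userspace sketch of a copy loop that issues one bounded call per step and reports progress between steps. It is an illustration under assumptions, not the interface proposed in this thread: the function name copy_in_chunks, the chunk size, and the on_progress callback are all hypothetical, and Python's os.copy_file_range (a wrapper for the later Linux copy_file_range(2) call) stands in for the extended splice. If the offload call is missing or fails, the loop falls back to plain read+write with the same buffer size.

```python
import os


def copy_in_chunks(src_path, dst_path, chunk=1 << 20, on_progress=None):
    """Copy src_path to dst_path one bounded call at a time.

    on_progress (hypothetical hook) is called with the running byte total
    after each step; returning False aborts the copy early.
    Returns the number of bytes copied.
    """
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            # Assumption: copy_file_range stands in for the offloaded call.
            offload = hasattr(os, "copy_file_range")
            total = 0
            while True:
                n = None
                if offload:
                    try:
                        n = os.copy_file_range(src, dst, chunk)
                    except OSError:
                        # Not supported here: use read+write from now on.
                        offload = False
                if n is None:
                    buf = os.read(src, chunk)
                    n = len(buf)
                    off = 0
                    while off < len(buf):
                        off += os.write(dst, buf[off:])
                if n == 0:
                    return total  # end of file
                total += n
                if on_progress is not None and on_progress(total) is False:
                    return total  # caller asked to stop between steps
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

A caller that wants a notification every few hundred megabytes would pass a larger chunk and a callback that prints or updates a progress bar. Because each call is bounded, an abort leaves the destination with a well-defined prefix of the source.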
> However "cp" doesn't do reflinking by default, it has a switch for
> that. If we just want "cp" and the like to use splice without fearing
> side effects then by default we should try to be as close to
> read+write behavior as possible. No?

I guess? I don't find requiring --reflink hugely compelling. But there
it is.

> That's what I'm really
> worrying about when you want to wire up splice to reflink by default.
> I do think there should be a flag for that. And if on the block level
> some magic happens, so be it. It's not the fs developer's worry any
> more ;)

Sure. So we'd have:

- no flag default that forbids knowingly copying with shared references
  so that it will be used by default by people who feel strongly about
  their assumptions about independent write durability.

- a flag that allows shared references for people who would otherwise
  use the file system shared reference ioctls (ocfs2 reflink, btrfs
  clone) but would like it to also do server-side read/write copies
  over nfs without additional intervention.

- a flag that requires shared references for callers who don't want
  giant copies to take forever if they aren't instant. (The qemu guys
  asked for this at Plumbers.)

I think I can live with that.

- z
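
The three behaviors listed above can be summarized as a small decision table. The sketch below is only a model of that policy, not kernel code: the names ShareMode and plan_copy, the "share"/"copy" result strings, and the choice of EOPNOTSUPP as the failure errno are all assumptions made for illustration. Here "copy" means an independent data copy (possibly done server-side), and "share" means a shared-reference copy such as a reflink or clone.

```python
import errno
from enum import Enum


class ShareMode(Enum):
    """Hypothetical names for the three cases in the list above."""
    FORBID = "forbid"    # no flag: never knowingly share references
    ALLOW = "allow"      # share if possible, otherwise copy the data
    REQUIRE = "require"  # share or fail, never fall back to a long copy


def plan_copy(mode, can_share):
    """Return "share" or "copy" for a request, or raise if impossible.

    can_share says whether the file system could satisfy the request
    with shared references.
    """
    if mode is ShareMode.FORBID:
        return "copy"
    if can_share:
        return "share"
    if mode is ShareMode.REQUIRE:
        # Assumed errno; the thread does not say what the failure looks like.
        raise OSError(errno.EOPNOTSUPP, "shared references unavailable")
    return "copy"
```

Under this model a default "cp" stays as close to read+write semantics as possible, while a caller that needs the copy to be instant asks for REQUIRE and handles the error itself.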