2017-03-02 16:08:35

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v1 0/3] VFS changes for NFSv4.2 "inter" server-to-server COPY op

One patch adds a check for validity of arguments.

In another patch is crucial to the existance of NFSv4.2 inter SSC COPY.
In order to implement the "inter" SSC COPY that involves two files from
different superblock, it requires removal of the cross-devices check in VFS.
Given that in the future the "cp" command will be written to use
copy_file_range(), it would be important for "inter" SSC COPY to use the
same interface.

This patch has first been met with an objection from Christoph saying that
"no other VFS operation is supported between mountpoints" and another objection
that since been addressed that "no xfstest coverage". We argue that no other
VFS operation needed such support and now there exists one.

An alternative approach might be to define a new file system function
(eg., bool allow_cross_device_op() that a filesystem can define and NFS will
define it and return true. And based on the existence of that function and
its result the check could be relaxed)?

If that check were to be removed, it would be beneficial to check the
the superblocks for the clone operation so that it's not called unnecessarily.

Anna Schumaker (1):
fs: Don't copy beyond the end of the file

Andy Adamson (1):
VFS permit cross device vfs_copy_file_range

Olga Kornievskaia (1):
VFS don't try clone if superblocks are different

fs/read_write.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

--
1.8.3.1



2017-03-02 16:08:49

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v1 3/3] VFS don't try clone if superblocks are different

Only try clone_file_range if superblocks are the same.

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/read_write.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 75084cd..00161bd 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1543,7 +1543,8 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
* Try cloning first, this is supported by more file systems, and
* more efficient if both clone and copy are supported (e.g. NFS).
*/
- if (file_in->f_op->clone_file_range) {
+ if (inode_in->i_sb == inode_out->i_sb &&
+ file_in->f_op->clone_file_range) {
ret = file_in->f_op->clone_file_range(file_in, pos_in,
file_out, pos_out, len);
if (ret == 0) {
--
1.8.3.1


2017-03-02 16:08:36

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v1 1/3] fs: Don't copy beyond the end of the file

From: Anna Schumaker <[email protected]>

Signed-off-by: Anna Schumaker <[email protected]>
---
fs/read_write.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index 5816d4c..1d9e305 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1526,6 +1526,9 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
if (unlikely(ret))
return ret;

+ if (pos_in >= i_size_read(inode_in))
+ return -EINVAL;
+
if (!(file_in->f_mode & FMODE_READ) ||
!(file_out->f_mode & FMODE_WRITE) ||
(file_out->f_flags & O_APPEND))
--
1.8.3.1


2017-03-02 16:08:37

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v1 2/3] VFS permit cross device vfs_copy_file_range

Allow nfs_copy_file_range to copy across devices.
NFSv4.2 inter server to server copy always copies across devices, and
NFSv4.2 intra server to server copy can copy across devices on the same
server.

If a file system's fileoperations copy_file_range operation prohibits
cross-device copies, fall back to do_splice_direct. This is needed for
nfsd_copy_file_range() which is called by the inter server to server
destination server acting as an NFS client, and reading the file from
the source server.

Signed-off-by: Andy Adamson <[email protected]>
---
fs/read_write.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 1d9e305..75084cd 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1534,10 +1534,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
(file_out->f_flags & O_APPEND))
return -EBADF;

- /* this could be relaxed once a method supports cross-fs copies */
- if (inode_in->i_sb != inode_out->i_sb)
- return -EXDEV;
-
if (len == 0)
return 0;

@@ -1559,7 +1555,7 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
if (file_out->f_op->copy_file_range) {
ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
pos_out, len, flags);
- if (ret != -EOPNOTSUPP)
+ if (ret != -EOPNOTSUPP && ret != -EXDEV)
goto done;
}

--
1.8.3.1


2017-03-02 16:50:39

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] VFS permit cross device vfs_copy_file_range


> On Mar 2, 2017, at 11:07 AM, Christoph Hellwig <[email protected]> wrote:
>=20
> On Thu, Mar 02, 2017 at 11:02:10AM -0500, Olga Kornievskaia wrote:
>> Allow nfs_copy_file_range to copy across devices.
>> NFSv4.2 inter server to server copy always copies across devices, and
>> NFSv4.2 intra server to server copy can copy across devices on the =
same
>> server.
>>=20
>> If a file system's fileoperations copy_file_range operation prohibits
>> cross-device copies, fall back to do_splice_direct. This is needed =
for
>> nfsd_copy_file_range() which is called by the inter server to server
>> destination server acting as an NFS client, and reading the file from
>> the source server.
>=20
> NAK, we really should not do operations between different superblocks.

Can you provide some reasoning as to why? What would it break? The =
reasoning for including one is to allow for a file system to achieve =
better performance which seems like a feature that would be of great =
benefit.=20=

2017-03-02 16:59:13

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] fs: Don't copy beyond the end of the file

On Thu, Mar 02, 2017 at 11:02:09AM -0500, Olga Kornievskaia wrote:
> From: Anna Schumaker <[email protected]>
>
> Signed-off-by: Anna Schumaker <[email protected]>
> ---
> fs/read_write.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 5816d4c..1d9e305 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1526,6 +1526,9 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> if (unlikely(ret))
> return ret;
>
> + if (pos_in >= i_size_read(inode_in))
> + return -EINVAL;
> +

This ought to go in vfs_clone_file_prep_inodes.

(He says, noticing that btrfs never got updated to use that validator...)

--D

> if (!(file_in->f_mode & FMODE_READ) ||
> !(file_out->f_mode & FMODE_WRITE) ||
> (file_out->f_flags & O_APPEND))
> --
> 1.8.3.1
>

2017-03-02 18:23:37

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] VFS permit cross device vfs_copy_file_range

On Thu, Mar 02, 2017 at 11:02:10AM -0500, Olga Kornievskaia wrote:
> Allow nfs_copy_file_range to copy across devices.
> NFSv4.2 inter server to server copy always copies across devices, and
> NFSv4.2 intra server to server copy can copy across devices on the same
> server.
>
> If a file system's fileoperations copy_file_range operation prohibits
> cross-device copies, fall back to do_splice_direct. This is needed for
> nfsd_copy_file_range() which is called by the inter server to server
> destination server acting as an NFS client, and reading the file from
> the source server.

NAK, we really should not do operations between different superblocks.

2017-03-02 18:34:11

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] fs: Don't copy beyond the end of the file


> On Mar 2, 2017, at 11:58 AM, Darrick J. Wong <[email protected]> =
wrote:
>=20
> On Thu, Mar 02, 2017 at 11:02:09AM -0500, Olga Kornievskaia wrote:
>> From: Anna Schumaker <[email protected]>
>>=20
>> Signed-off-by: Anna Schumaker <[email protected]>
>> ---
>> fs/read_write.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>=20
>> diff --git a/fs/read_write.c b/fs/read_write.c
>> index 5816d4c..1d9e305 100644
>> --- a/fs/read_write.c
>> +++ b/fs/read_write.c
>> @@ -1526,6 +1526,9 @@ ssize_t vfs_copy_file_range(struct file =
*file_in, loff_t pos_in,
>> if (unlikely(ret))
>> return ret;
>>=20
>> + if (pos_in >=3D i_size_read(inode_in))
>> + return -EINVAL;
>> +
>=20
> This ought to go in vfs_clone_file_prep_inodes.
>=20
> (He says, noticing that btrfs never got updated to use that =
validator=E2=80=A6)

I apologize I=E2=80=99m not fully understanding the suggestion here. How =
btrfs is related to the check that I=E2=80=99m suggesting for the =
copy_file_range(). I don=E2=80=99t see how it would fix the problem for =
the copy_file_range().

Is the suggestion that NFS=E2=80=99s clone implementation is suppose to =
call vfs_clone_file_prep_inodes() where the check would be added and =
thus because vfs_copy_file_range() first decides to call clone() instead =
of copy() then that check would be met?

>=20
> --D
>=20
>> if (!(file_in->f_mode & FMODE_READ) ||
>> !(file_out->f_mode & FMODE_WRITE) ||
>> (file_out->f_flags & O_APPEND))
>> --=20
>> 1.8.3.1
>>=20


2017-03-02 18:41:05

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] fs: Don't copy beyond the end of the file

On Thu, Mar 02, 2017 at 01:21:49PM -0500, Olga Kornievskaia wrote:
>
> > On Mar 2, 2017, at 11:58 AM, Darrick J. Wong <[email protected]> wrote:
> >
> > On Thu, Mar 02, 2017 at 11:02:09AM -0500, Olga Kornievskaia wrote:
> >> From: Anna Schumaker <[email protected]>
> >>
> >> Signed-off-by: Anna Schumaker <[email protected]>
> >> ---
> >> fs/read_write.c | 3 +++
> >> 1 file changed, 3 insertions(+)
> >>
> >> diff --git a/fs/read_write.c b/fs/read_write.c
> >> index 5816d4c..1d9e305 100644
> >> --- a/fs/read_write.c
> >> +++ b/fs/read_write.c
> >> @@ -1526,6 +1526,9 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> >> if (unlikely(ret))
> >> return ret;
> >>
> >> + if (pos_in >= i_size_read(inode_in))
> >> + return -EINVAL;
> >> +
> >
> > This ought to go in vfs_clone_file_prep_inodes.
> >
> > (He says, noticing that btrfs never got updated to use that validator…)
>
> I apologize I’m not fully understanding the suggestion here. How btrfs is
> related to the check that I’m suggesting for the copy_file_range(). I don’t
> see how it would fix the problem for the copy_file_range().

Never mind, I misread vfs_copy as vfs_clone and thought you were talking
about something else. :(

Sorry about the noise.

--D

> Is the suggestion that NFS’s clone implementation is suppose to call
> vfs_clone_file_prep_inodes() where the check would be added and thus
> because vfs_copy_file_range() first decides to call clone() instead of
> copy() then that check would be met?
>
> >
> > --D
> >
> >> if (!(file_in->f_mode & FMODE_READ) ||
> >> !(file_out->f_mode & FMODE_WRITE) ||
> >> (file_out->f_flags & O_APPEND))
> >> --
> >> 1.8.3.1
> >>
>

2017-03-07 22:08:40

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] VFS permit cross device vfs_copy_file_range


> On Mar 2, 2017, at 11:38 AM, Olga Kornievskaia <[email protected]> =
wrote:
>=20
>=20
>> On Mar 2, 2017, at 11:07 AM, Christoph Hellwig <[email protected]> wrote:
>>=20
>> On Thu, Mar 02, 2017 at 11:02:10AM -0500, Olga Kornievskaia wrote:
>>> Allow nfs_copy_file_range to copy across devices.
>>> NFSv4.2 inter server to server copy always copies across devices, =
and
>>> NFSv4.2 intra server to server copy can copy across devices on the =
same
>>> server.
>>>=20
>>> If a file system's fileoperations copy_file_range operation =
prohibits
>>> cross-device copies, fall back to do_splice_direct. This is needed =
for
>>> nfsd_copy_file_range() which is called by the inter server to server
>>> destination server acting as an NFS client, and reading the file =
from
>>> the source server.
>>=20
>> NAK, we really should not do operations between different =
superblocks.
>=20
> Can you provide some reasoning as to why? What would it break? The =
reasoning for including one is to allow for a file system to achieve =
better performance which seems like a feature that would be of great =
benefit.

Christoph, could you please elaborate on your objection.

Al, could you weigh in with regards to if and what would it take to =
allow for copy_file_range() to allow for copies between different =
superblocks.

We would appreciate the feedback of how can we enable this performance =
feature to be useful to users.

Thank you.




2017-03-07 23:50:12

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] fs: Don't copy beyond the end of the file


> On Mar 7, 2017, at 6:43 PM, Christoph Hellwig <[email protected]> =
wrote:
>=20
> On Thu, Mar 02, 2017 at 10:40:57AM -0800, Darrick J. Wong wrote:
>> Never mind, I misread vfs_copy as vfs_clone and thought you were =
talking
>> about something else. :(
>>=20
>> Sorry about the noise.
>=20
> And either way we'll need test coverage for copy_file_range before we
> do any more changes to it. I'm sick and tired of this doctoring =
around
> without having any test coverage. I've asked for it again and again
> and we've made very little progress despite having the code in the =
tree
> for almost a year and a half.


Anna has added test cases to the xfstest suite. Is there something =
that=E2=80=99s missing?=20=

2017-03-08 00:01:53

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] fs: Don't copy beyond the end of the file

On Thu, Mar 02, 2017 at 10:40:57AM -0800, Darrick J. Wong wrote:
> Never mind, I misread vfs_copy as vfs_clone and thought you were talking
> about something else. :(
>
> Sorry about the noise.

And either way we'll need test coverage for copy_file_range before we
do any more changes to it. I'm sick and tired of this doctoring around
without having any test coverage. I've asked for it again and again
and we've made very little progress despite having the code in the tree
for almost a year and a half.

2017-03-08 00:35:55

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] fs: Don't copy beyond the end of the file

On Tue, Mar 07, 2017 at 06:46:55PM -0500, Olga Kornievskaia wrote:
> Anna has added test cases to the xfstest suite. Is there something that’s missing?

No, we still don't have that. Anna did one attempt (or maybe two) but
didn't seem to have time to follow up the review comments. Maybe you
can give it a spin to make sure we at least have basic test coverage,
including the cases mentioned in the man page like this.

2017-03-08 15:40:50

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] fs: Don't copy beyond the end of the file


> On Mar 7, 2017, at 6:50 PM, Christoph Hellwig <[email protected]> =
wrote:
>=20
> On Tue, Mar 07, 2017 at 06:46:55PM -0500, Olga Kornievskaia wrote:
>> Anna has added test cases to the xfstest suite. Is there something =
that=E2=80=99s missing?=20
>=20
> No, we still don't have that. Anna did one attempt (or maybe two) but
> didn't seem to have time to follow up the review comments. Maybe you
> can give it a spin to make sure we at least have basic test coverage,
> including the cases mentioned in the man page like this.

I=E2=80=99m reading some mail archives and I see that Anna submitted =
patches for xfstests. The comment was to re-work it as a part of xfs_io =
command. She posted the patches for that. According to the announcement =
I see here: http://oss.sgi.com/archives/xfs/2016-08/msg00221.html it =
says it=E2=80=99s including the new copy_file_range command in xfs_io. I =
don=E2=80=99t run these tests so I don=E2=80=99t know but it look to me =
that the tests are in.

Can you please what I=E2=80=99m getting wrong?

Thank you.




2017-03-08 16:14:41

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] fs: Don't copy beyond the end of the file

On Wed, Mar 08, 2017 at 10:39:11AM -0500, Olga Kornievskaia wrote:
>
> I’m reading some mail archives and I see that Anna submitted patches for xfstests. The comment was to re-work it as a part of xfs_io command. She posted the patches for that. According to the announcement I see here: http://oss.sgi.com/archives/xfs/2016-08/msg00221.html it says it’s including the new copy_file_range command in xfs_io. I don’t run these tests so I don’t know but it look to me that the tests are in.

That's xfsprogs and the xfs_io command. Now that this step is done
someone (e.g. Anna or you) needs to send the patches to xfstests that
use this command.

2017-03-15 18:09:46

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] VFS permit cross device vfs_copy_file_range

On Thu, Mar 02, 2017 at 11:38:03AM -0500, Olga Kornievskaia wrote:
> > On Mar 2, 2017, at 11:07 AM, Christoph Hellwig <[email protected]> wrote:
> > NAK, we really should not do operations between different
> > superblocks.
>
> Can you provide some reasoning as to why? What would it break? The
> reasoning for including one is to allow for a file system to achieve
> better performance which seems like a feature that would be of great
> benefit. --

Yes, this has come up a few times. What's going on?:

- There was an explanation, and I missed it.
- The explanation is complicated and Christoph hasn't had time
to write it up.
- Christoph has a strong suspicion there are issues without
being sure exactly where they are.
- Something else?

I mean, the whole enterprise is dead in the water without this, as far
as I can tell, so I'd really like to understand the answer one way or
the other.

--b.

2017-03-21 15:50:02

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] VFS permit cross device vfs_copy_file_range

On Wed, Mar 15, 2017 at 02:09:13PM -0400, J. Bruce Fields wrote:
> On Thu, Mar 02, 2017 at 11:38:03AM -0500, Olga Kornievskaia wrote:
> > > On Mar 2, 2017, at 11:07 AM, Christoph Hellwig <[email protected]> wrote:
> > > NAK, we really should not do operations between different
> > > superblocks.
> >
> > Can you provide some reasoning as to why? What would it break? The
> > reasoning for including one is to allow for a file system to achieve
> > better performance which seems like a feature that would be of great
> > benefit. --
>
> Yes, this has come up a few times. What's going on?:
>
> - There was an explanation, and I missed it.
> - The explanation is complicated and Christoph hasn't had time
> to write it up.
> - Christoph has a strong suspicion there are issues without
> being sure exactly where they are.
> - Something else?

As far as I can tell from talking at LSF:

- file system operations crossing superblocks is unusual. No
specific known issue. Also not sure I understand why splice
isn't precedent.
- implementation (with server acting as a client, long-running
process, etc.) will be complicated and ugly. Sure, I guess
we'll see how complicated and weigh that against any
advantages. (Though I'm curious about case e.g. of btrfs
clone--can't it easily clone across filesystem boundaries in
some cases when they share storage?)

Anyway. I'll read the patches. Probably not this week.

--b.

2017-03-21 19:03:41

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] VFS permit cross device vfs_copy_file_range

On Tue, Mar 21, 2017 at 12:03:08PM -0400, Olga Kornievskaia wrote:
> Thank you for the update. I guess I don’t see how the proposed NFS
> implementation is complicated and ugly (but I’m biased). I’ll try to
> give you some performance number. My 1 data point (1gb) inter copy
> showed 30% improvement (how can that be ignored).

That would be useful, thanks--if it comes with some details about the
setup.

I'm not so curious about percent improvement, as how to predict the
performance on a given network.

If server-to-server copy looks like it's normally able to use close to
the available bandwidth between the two servers, and if a traditional
read-write-copy loop is similarly able to use the available bandwidth,
then I can figure out whether server-to-server copy will help on my
setup.

--b.

2017-03-22 17:16:21

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] VFS permit cross device vfs_copy_file_range

On Tue, Mar 21, 2017 at 3:03 PM, J. Bruce Fields <[email protected]> wro=
te:
> On Tue, Mar 21, 2017 at 12:03:08PM -0400, Olga Kornievskaia wrote:
>> Thank you for the update. I guess I don=E2=80=99t see how the proposed N=
FS
>> implementation is complicated and ugly (but I=E2=80=99m biased). I=E2=80=
=99ll try to
>> give you some performance number. My 1 data point (1gb) inter copy
>> showed 30% improvement (how can that be ignored).
>
> That would be useful, thanks--if it comes with some details about the
> setup.

What I have available to me are two laptops that I run my VMs on. It
is not a setup that is representative of a real setup. I think this
setup can only provide percent improvement numbers.

Andy was suggesting that perhaps the performance lab at Redhat would
be able to do some testing of the patches for some real world
performance?

> I'm not so curious about percent improvement, as how to predict the
> performance on a given network.
>
> If server-to-server copy looks like it's normally able to use close to
> the available bandwidth between the two servers, and if a traditional
> read-write-copy loop is similarly able to use the available bandwidth,
> then I can figure out whether server-to-server copy will help on my
> setup.
>
> --b.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2018-08-31 20:21:20

by Florian Weimer

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] VFS permit cross device vfs_copy_file_range

On 03/02/2017 05:02 PM, Olga Kornievskaia wrote:
> Allow nfs_copy_file_range to copy across devices.
> NFSv4.2 inter server to server copy always copies across devices, and
> NFSv4.2 intra server to server copy can copy across devices on the same
> server.
>
> If a file system's fileoperations copy_file_range operation prohibits
> cross-device copies, fall back to do_splice_direct. This is needed for
> nfsd_copy_file_range() which is called by the inter server to server
> destination server acting as an NFS client, and reading the file from
> the source server.
>
> Signed-off-by: Andy Adamson<[email protected]>

What has happened to the patch? We unwittingly used copy_file_range in
the glibc test suite, without realizing that it does not support
cross-device copies.

Thanks,
Florian

2018-08-31 20:33:49

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] VFS permit cross device vfs_copy_file_range

On Fri, Aug 31, 2018 at 12:14 PM Florian Weimer <[email protected]> wrote:
>
> On 03/02/2017 05:02 PM, Olga Kornievskaia wrote:
> > Allow nfs_copy_file_range to copy across devices.
> > NFSv4.2 inter server to server copy always copies across devices, and
> > NFSv4.2 intra server to server copy can copy across devices on the same
> > server.
> >
> > If a file system's fileoperations copy_file_range operation prohibits
> > cross-device copies, fall back to do_splice_direct. This is needed for
> > nfsd_copy_file_range() which is called by the inter server to server
> > destination server acting as an NFS client, and reading the file from
> > the source server.
> >
> > Signed-off-by: Andy Adamson<[email protected]>
>
> What has happened to the patch? We unwittingly used copy_file_range in
> the glibc test suite, without realizing that it does not support
> cross-device copies.

I'm still planning to fight for the patch to be included. As per
maintainers request, I separated out the async copy patches out and
hopefully that will be going into 4.20. I'm working on the "inter"
copy offload functionality which does require this patch. I will start
submitting and pushing this patch along with the rest of the patches.

Are you interested in the copy_file_range() functionality that does
support cross-device copies? If so can you tell me how are you using
it? It would be also helpful to watch for the submission of the patch
and argue in its favor.

Thank you.
>
> Thanks,
> Florian

2018-09-01 03:06:47

by Steve French

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] VFS permit cross device vfs_copy_file_range

On Fri, Aug 31, 2018 at 11:41 AM Olga Kornievskaia <[email protected]> wrote:
>
> On Fri, Aug 31, 2018 at 12:14 PM Florian Weimer <[email protected]> wrote:
> >
> > On 03/02/2017 05:02 PM, Olga Kornievskaia wrote:
> > > Allow nfs_copy_file_range to copy across devices.
> > > NFSv4.2 inter server to server copy always copies across devices, and
> > > NFSv4.2 intra server to server copy can copy across devices on the same
> > > server.
> > >
> > > If a file system's fileoperations copy_file_range operation prohibits
> > > cross-device copies, fall back to do_splice_direct. This is needed for
> > > nfsd_copy_file_range() which is called by the inter server to server
> > > destination server acting as an NFS client, and reading the file from
> > > the source server.
> > >
> > > Signed-off-by: Andy Adamson<[email protected]>
> >
> > What has happened to the patch? We unwittingly used copy_file_range in
> > the glibc test suite, without realizing that it does not support
> > cross-device copies.
>
> I'm still planning to fight for the patch to be included. As per
> maintainers request, I separated out the async copy patches out and
> hopefully that will be going into 4.20. I'm working on the "inter"
> copy offload functionality which does require this patch. I will start
> submitting and pushing this patch along with the rest of the patches.
>
> Are you interested in the copy_file_range() functionality that does
> support cross-device copies? If so can you tell me how are you using
> it? It would be also helpful to watch for the submission of the patch
> and argue in its favor.

SMB3 clients and server (Windows, Macs, Samba and probably most
every modern NAS out there) support the SMB3 "CopyChunk"
operation used by the Linux client.

I would expect that one of the most useful cases is cross device
For example you have two mounts to your server or server cluster
\\server\share1 is mounted to /mnt1
and
\\server\backup is mounted to /mnt2

and you want to do copy_file or copy_file_range of various
files from /mnt1 to /mnt2

As long as they are both the same file system type, why not
let the file system tell you whether it can do the copy.

Given that this is 10x to 100x faster than the alternative
and this is a common case (and there are many 100s of millions
of SMB3 server capable systems out there which already support
copychunk (and thus would support copy file) - why would we
want to restrict it.

It is much saner to let the file system tell the VFS if it can't
support the cross device copy.

---- and to make it even more obvious why this patch matters ---
Virtualization clients (and various Windows and NetApp server)
support another copy offload mechanism (not just CopyChunk)
ie via ODX which would support cross server not just cross share
copy. Enabling this would be a BIG incentive for finishing up
ODX copy support in Linux (cifs.ko using SMB3). This is
fairly widely supported by servers.




--
Thanks,

Steve