2018-10-30 20:56:27

by Olga Kornievskaia

Subject: [PATCH v7 00/11] client-side support for "inter" SSC copy

From: Olga Kornievskaia <[email protected]>

This patch series adds client-side support for doing NFSv4.2 "inter"
copy offload between different NFS servers.

In the "inter" SSC case the files reside on different servers, and
thus under different superblocks, so the VFS must drop its restriction
that the src and dst files be on the same superblock.

NFS's copy_file_range() determines if the copy is "intra" or "inter"
and for "inter" it sends the COPY_NOTIFY to the source server. Then,
it sends an asynchronous COPY to the destination server. If
an application cancels an in-flight COPY, OFFLOAD_CANCEL is sent to
both of the servers.
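
The sequence described above can be modelled in a few lines of
userspace C. This is only a sketch, not the kernel implementation: the
enum, ssc_plan_copy() and ssc_plan_cancel() are hypothetical names, and
the same_server flag stands in for the server-owner comparison that the
series performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels; they only mirror the NFSv4.2 operation names. */
enum ssc_op {
	SSC_COPY_NOTIFY_TO_SRC,
	SSC_COPY_TO_DST,
	SSC_CANCEL_TO_DST,
	SSC_CANCEL_TO_SRC,
};

/*
 * Fill ops[] with the operations a client issues to start a copy.
 * same_server == true models "intra", false models "inter".
 * Returns the number of entries written (at most 2).
 */
static size_t ssc_plan_copy(bool same_server, enum ssc_op ops[2])
{
	size_t n = 0;

	if (!same_server)
		ops[n++] = SSC_COPY_NOTIFY_TO_SRC; /* source learns about the copy */
	ops[n++] = SSC_COPY_TO_DST;                /* destination performs it */
	return n;
}

/*
 * Fill ops[] with the operations a client issues when the application
 * cancels an in-flight copy. Returns the number of entries written.
 */
static size_t ssc_plan_cancel(bool same_server, enum ssc_op ops[2])
{
	size_t n = 0;

	ops[n++] = SSC_CANCEL_TO_DST;
	if (!same_server)
		ops[n++] = SSC_CANCEL_TO_SRC;      /* "inter" only */
	return n;
}
```

For an "inter" copy the model yields COPY_NOTIFY followed by COPY, and
a cancel reaches both servers; for "intra" only the destination is
involved.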

This patch series also includes the client-side additions that are
used by the destination server. The server needs an NFS open that
represents the source file without doing an actual open. Two functions,
nfs42_ssc_open() and nfs42_ssc_close(), are introduced to accomplish
this; they make use of the VFS's alloc_file_pseudo().

v7:
--- in VFS patch: remove the generic cross-device copy_file_range
support for all file systems and only allow individual filesystems to
support cross-device copy_file_range. The superblock check is moved
to just before do_splice_direct()
--- modified the man page

Olga Kornievskaia (11):
VFS: move cross device copy_file_range() check into filesystems
NFS: validity check for source offset in copy_file_range
NFS NFSD: defining nl4_servers structure needed by both
NFS: add COPY_NOTIFY operation
NFS: add ca_source_server<> to COPY
NFS: also send OFFLOAD_CANCEL to source server
NFS: inter ssc open
NFS: skip recovery of copy open on dest server
NFS: for "inter" copy treat ESTALE as ENOTSUPP
NFS: COPY handle ERR_OFFLOAD_DENIED
NFS: replace cross device check in copy_file_range

Documentation/filesystems/porting | 7 ++
fs/cifs/cifsfs.c | 3 +
fs/nfs/nfs42.h | 15 ++-
fs/nfs/nfs42proc.c | 129 ++++++++++++++++++++++---
fs/nfs/nfs42xdr.c | 193 +++++++++++++++++++++++++++++++++++++-
fs/nfs/nfs4_fs.h | 10 ++
fs/nfs/nfs4client.c | 2 +-
fs/nfs/nfs4file.c | 125 +++++++++++++++++++++++-
fs/nfs/nfs4proc.c | 6 +-
fs/nfs/nfs4state.c | 14 +++
fs/nfs/nfs4xdr.c | 1 +
fs/overlayfs/file.c | 3 +
fs/read_write.c | 12 ++-
include/linux/nfs4.h | 25 +++++
include/linux/nfs_fs_sb.h | 1 +
include/linux/nfs_xdr.h | 17 ++++
16 files changed, 539 insertions(+), 24 deletions(-)

--
1.8.3.1



2018-10-30 20:56:29

by Olga Kornievskaia

Subject: [PATCH v7 01/11] VFS: move cross device copy_file_range() check into filesystems

From: Olga Kornievskaia <[email protected]>

This patch makes it the responsibility of individual filesystems to
allow or deny cross device copies. Both NFS and CIFS have operations
for cross-server copies, and later patches will implement this feature.

Note that as of this patch, the copy_file_range() function might be passed
superblocks from different filesystem types. -EXDEV should be returned
if cross device copies aren't supported.

Reviewed-by: Amir Goldstein <[email protected]>
Reviewed-by: Matthew Wilcox <[email protected]>
Reviewed-by: Steve French <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Olga Kornievskaia <[email protected]>
---
Documentation/filesystems/porting | 7 +++++++
fs/cifs/cifsfs.c | 3 +++
fs/nfs/nfs4file.c | 3 +++
fs/overlayfs/file.c | 3 +++
fs/read_write.c | 12 +++++++-----
5 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index 7b7b845..897e1e7 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -622,3 +622,10 @@ in your dentry operations instead.
alloc_file_clone(file, flags, ops) does not affect any caller's references.
On success you get a new struct file sharing the mount/dentry with the
original, on failure - ERR_PTR().
+--
+[mandatory]
+ ->copy_file_range() may now be passed files which belong to two
+ different superblocks of the same file system type or which belong
+ to two different filesystems types all together. As before, the
+ destination's copy_file_range() is the function which is called.
+ If it cannot copy ranges from the source, it should return -EXDEV.
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index 7065426..ca8fc87 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -1114,6 +1114,9 @@ static ssize_t cifs_copy_file_range(struct file *src_file, loff_t off,
unsigned int xid = get_xid();
ssize_t rc;

+ if (file_inode(src_file)->i_sb != file_inode(dst_file)->i_sb)
+ return -EXDEV;
+
rc = cifs_file_copychunk_range(xid, src_file, off, dst_file, destoff,
len, flags);
free_xid(xid);
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 4288a6e..5a73c90 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -135,6 +135,9 @@ static ssize_t nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
{
ssize_t ret;

+ if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb)
+ return -EXDEV;
+
if (file_inode(file_in) == file_inode(file_out))
return -EINVAL;
retry:
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index aeaefd2..0331e33 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -483,6 +483,9 @@ static ssize_t ovl_copy_file_range(struct file *file_in, loff_t pos_in,
struct file *file_out, loff_t pos_out,
size_t len, unsigned int flags)
{
+ if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb)
+ return -EXDEV;
+
return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, flags,
OVL_COPY);
}
diff --git a/fs/read_write.c b/fs/read_write.c
index 39b4a21..c5bed2e 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1575,10 +1575,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
(file_out->f_flags & O_APPEND))
return -EBADF;

- /* this could be relaxed once a method supports cross-fs copies */
- if (inode_in->i_sb != inode_out->i_sb)
- return -EXDEV;
-
if (len == 0)
return 0;

@@ -1588,7 +1584,8 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
* Try cloning first, this is supported by more file systems, and
* more efficient if both clone and copy are supported (e.g. NFS).
*/
- if (file_in->f_op->clone_file_range) {
+ if (inode_in->i_sb == inode_out->i_sb &&
+ file_in->f_op->clone_file_range) {
ret = file_in->f_op->clone_file_range(file_in, pos_in,
file_out, pos_out, len);
if (ret == 0) {
@@ -1604,6 +1601,11 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
goto done;
}

+ if (inode_in->i_sb != inode_out->i_sb) {
+ ret = -EXDEV;
+ goto done;
+ }
+
ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);

--
1.8.3.1


2018-10-30 20:56:31

by Olga Kornievskaia

Subject: [PATCH v7 02/11] NFS: validity check for source offset in copy_file_range

From: Olga Kornievskaia <[email protected]>

The input source offset cannot be at or beyond the end of the file.

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs4file.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 5a73c90..7838bdf 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -135,6 +135,9 @@ static ssize_t nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
{
ssize_t ret;

+ if (pos_in >= i_size_read(file_inode(file_in)))
+ return -EINVAL;
+
if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb)
return -EXDEV;

--
1.8.3.1


2018-10-30 20:56:32

by Olga Kornievskaia

Subject: [PATCH v7 03/11] NFS NFSD: defining nl4_servers structure needed by both

From: Olga Kornievskaia <[email protected]>

These structures are needed by COPY_NOTIFY on the client and by
nfsd as well.

Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Olga Kornievskaia <[email protected]>
---
include/linux/nfs4.h | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index 1b06f0b..4803507 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -16,6 +16,7 @@
#include <linux/list.h>
#include <linux/uidgid.h>
#include <uapi/linux/nfs4.h>
+#include <linux/sunrpc/msg_prot.h>

enum nfs4_acl_whotype {
NFS4_ACL_WHO_NAMED = 0,
@@ -672,4 +673,27 @@ struct nfs4_op_map {
} u;
};

+struct nfs42_netaddr {
+ char netid[RPCBIND_MAXNETIDLEN];
+ char addr[RPCBIND_MAXUADDRLEN + 1];
+ u32 netid_len;
+ u32 addr_len;
+};
+
+enum netloc_type4 {
+ NL4_NAME = 1,
+ NL4_URL = 2,
+ NL4_NETADDR = 3,
+};
+
+struct nl4_server {
+ enum netloc_type4 nl4_type;
+ union {
+ struct { /* NL4_NAME, NL4_URL */
+ int nl4_str_sz;
+ char nl4_str[NFS4_OPAQUE_LIMIT + 1];
+ };
+ struct nfs42_netaddr nl4_addr; /* NL4_NETADDR */
+ } u;
+};
#endif
--
1.8.3.1


2018-10-30 20:56:34

by Olga Kornievskaia

Subject: [PATCH 1/1] man-page: copy_file_range(2) allow for cross-device copies

From: Olga Kornievskaia <[email protected]>

A proposed VFS change removes the check that both files reside on
the same file system before a copy is attempted. Instead, the file
system driver of the destination file is allowed to perform a
cross-device copy_file_range(). If that file system does not support
cross-device copy, it returns -EXDEV.
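
From an application's point of view, a caller that wants to cope with
EXDEV might look like the sketch below. It is an illustration only:
copy_with_fallback() is a hypothetical helper, the set of errno values
treated as "offload refused" is an assumption, and the system call is
invoked through syscall(2) so that the sketch does not depend on the
C library providing a copy_file_range() wrapper.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy up to len bytes from fd_in to fd_out at their current offsets.
 * Tries copy_file_range(2) first and falls back to read(2)/write(2)
 * when the kernel or the destination filesystem refuses the offload.
 * Returns the number of bytes copied, or -1 with errno set.
 */
static ssize_t copy_with_fallback(int fd_in, int fd_out, size_t len)
{
	char buf[65536];
	ssize_t total = 0;

	/* Phase 1: ask the kernel (and filesystem) to do the copy. */
	while (len > 0) {
		long n = syscall(SYS_copy_file_range, fd_in, NULL,
				 fd_out, NULL, len, 0);
		if (n < 0) {
			if (errno == EXDEV || errno == EOPNOTSUPP ||
			    errno == ENOSYS || errno == EINVAL)
				break;		/* refused: use phase 2 */
			return -1;
		}
		if (n == 0)
			return total;		/* end of input */
		total += n;
		len -= (size_t)n;
	}

	/* Phase 2: plain read/write loop for whatever is left. */
	while (len > 0) {
		size_t want = len < sizeof(buf) ? len : sizeof(buf);
		ssize_t r = read(fd_in, buf, want);

		if (r < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (r == 0)
			break;			/* end of input */
		for (ssize_t off = 0; off < r; ) {
			ssize_t w = write(fd_out, buf + off, (size_t)(r - off));

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += w;
		}
		total += r;
		len -= (size_t)r;
	}
	return total;
}
```

Whether the first phase succeeds across two mounts depends on the
kernel version and on the destination filesystem; the fallback keeps
the caller working either way.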

Signed-off-by: Olga Kornievskaia <[email protected]>
---
man2/copy_file_range.2 | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/man2/copy_file_range.2 b/man2/copy_file_range.2
index 20374ab..03750c5 100644
--- a/man2/copy_file_range.2
+++ b/man2/copy_file_range.2
@@ -39,7 +39,8 @@ The
.BR copy_file_range ()
system call performs an in-kernel copy between two file descriptors
without the additional cost of transferring data from the kernel to user space
-and then back into the kernel.
+and then back into the kernel. Starting kernel version 4.21 passed in
+file descriptors are not required to be under the same mounted file system.
It copies up to
.I len
bytes of data from file descriptor
@@ -129,9 +130,9 @@ Out of memory.
There is not enough space on the target filesystem to complete the copy.
.TP
.B EXDEV
-The files referred to by
-.IR file_in " and " file_out
-are not on the same mounted filesystem.
+The file system of the
+.I file_out
+does not support cross device file copy.
.SH VERSIONS
The
.BR copy_file_range ()
--
1.8.3.1


2018-10-30 20:56:43

by Olga Kornievskaia

Subject: [PATCH v7 08/11] NFS: skip recovery of copy open on dest server

From: Olga Kornievskaia <[email protected]>

Mark the open created for the source file on the destination
server. If this open goes through recovery, fail the recovery,
since there is no need to recover a "fake" open. The ongoing
READs and vfs_copy_file_range() must fail as well.
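
The intended behaviour of the recovery loop can be sketched in
userspace as follows. This is a simplified model, not the kernel code:
struct model_state and model_reclaim_open_states() are hypothetical,
locking and reference counting are left out, and the ssc_copy field
stands in for the NFS_SRV_SSC_COPY_STATE flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nfs4_state; only what the sketch needs. */
struct model_state {
	bool ssc_copy;		/* models the NFS_SRV_SSC_COPY_STATE flag */
	bool recovery_failed;	/* recovery was abandoned for this open */
	bool reclaimed;		/* the open was reclaimed normally */
};

/*
 * Walk all open states during recovery. Ordinary opens are reclaimed;
 * "fake" SSC opens are marked failed and skipped. Returns 0 if no SSC
 * open was seen, or -EIO if at least one was found.
 */
static int model_reclaim_open_states(struct model_state *states, size_t n)
{
	bool found_ssc_copy_state = false;

	for (size_t i = 0; i < n; i++) {
		if (states[i].ssc_copy) {
			states[i].recovery_failed = true;
			found_ssc_copy_state = true;
			continue;	/* never try to reclaim a fake open */
		}
		states[i].reclaimed = true;
	}
	return found_ssc_copy_state ? -EIO : 0;
}
```

Ordinary opens are still reclaimed even when an SSC open is present;
only the final return value changes to -EIO.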

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs4_fs.h | 1 +
fs/nfs/nfs4file.c | 1 +
fs/nfs/nfs4state.c | 14 ++++++++++++++
3 files changed, 16 insertions(+)

diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 9c566a4..482754d 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -165,6 +165,7 @@ enum {
NFS_STATE_CHANGE_WAIT, /* A state changing operation is outstanding */
#ifdef CONFIG_NFS_V4_2
NFS_CLNT_DST_SSC_COPY_STATE, /* dst server open state on client*/
+ NFS_SRV_SSC_COPY_STATE, /* ssc state on the dst server */
#endif /* CONFIG_NFS_V4_2 */
};

diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 0b1dcf9..989f174 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -333,6 +333,7 @@ struct file *
if (ctx->state == NULL)
goto out_stateowner;

+ set_bit(NFS_SRV_SSC_COPY_STATE, &ctx->state->flags);
set_bit(NFS_OPEN_STATE, &ctx->state->flags);
memcpy(&ctx->state->open_stateid.other, &stateid->other,
NFS4_STATEID_OTHER_SIZE);
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 62ae0fd..b0b82c6 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1606,6 +1606,9 @@ static int nfs4_reclaim_open_state(struct nfs4_state_owner *sp, const struct nfs
{
struct nfs4_state *state;
int status = 0;
+#ifdef CONFIG_NFS_V4_2
+ bool found_ssc_copy_state = false;
+#endif /* CONFIG_NFS_V4_2 */

/* Note: we rely on the sp->so_states list being ordered
* so that we always reclaim open(O_RDWR) and/or open(O_WRITE)
@@ -1625,6 +1628,13 @@ static int nfs4_reclaim_open_state(struct nfs4_state_owner *sp, const struct nfs
continue;
if (state->state == 0)
continue;
+#ifdef CONFIG_NFS_V4_2
+ if (test_bit(NFS_SRV_SSC_COPY_STATE, &state->flags)) {
+ nfs4_state_mark_recovery_failed(state, -EIO);
+ found_ssc_copy_state = true;
+ continue;
+ }
+#endif /* CONFIG_NFS_V4_2 */
refcount_inc(&state->count);
spin_unlock(&sp->so_lock);
status = __nfs4_reclaim_open_state(sp, state, ops);
@@ -1671,6 +1681,10 @@ static int nfs4_reclaim_open_state(struct nfs4_state_owner *sp, const struct nfs
}
raw_write_seqcount_end(&sp->so_reclaim_seqcount);
spin_unlock(&sp->so_lock);
+#ifdef CONFIG_NFS_V4_2
+ if (found_ssc_copy_state)
+ return -EIO;
+#endif /* CONFIG_NFS_V4_2 */
return 0;
out_err:
nfs4_put_open_state(state);
--
1.8.3.1


2018-10-30 20:56:43

by Olga Kornievskaia

Subject: [PATCH v7 09/11] NFS: for "inter" copy treat ESTALE as ENOTSUPP

From: Olga Kornievskaia <[email protected]>

If the client sends an "inter" copy to the destination server but
it only supports "intra" copy, it can return ESTALE (since it
doesn't know anything about the file handle from the other server
and does not recognize the special case of "inter" copy). Translate
this error as ENOTSUPP and also send OFFLOAD_CANCEL to the source
server so that it can clean up state.

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42proc.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 98fe00b..00809b3 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -395,6 +395,11 @@ ssize_t nfs42_proc_copy(struct file *src, loff_t pos_src,
args.sync = true;
dst_exception.retry = 1;
continue;
+ } else if (err == -ESTALE &&
+ !nfs42_files_from_same_server(src, dst)) {
+ nfs42_do_offload_cancel_async(src, &args.src_stateid);
+ err = -EOPNOTSUPP;
+ break;
}

err2 = nfs4_handle_exception(server, err, &src_exception);
--
1.8.3.1


2018-10-30 20:56:46

by Olga Kornievskaia

Subject: [PATCH v7 10/11] NFS: COPY handle ERR_OFFLOAD_DENIED

From: Olga Kornievskaia <[email protected]>

If the server returns ERR_OFFLOAD_DENIED, the client must fall
back to doing the copy the normal way. Return ENOTSUPP to the VFS
so that it falls back to a regular copy.
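
The error handling for an "inter" copy can be modelled in userspace as
below. It is a sketch under stated assumptions:
model_translate_copy_error() is a hypothetical name, the numeric value
of NFS4ERR_OFFLOAD_DENIED is taken here as 10091 (the value assigned in
RFC 7862) and defined locally, and the translated error is
-EOPNOTSUPP, which is what the diff below uses.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed protocol value; defined locally for the sketch only. */
#define MODEL_NFS4ERR_OFFLOAD_DENIED 10091

/*
 * Translate the result of a COPY sent to the destination server.
 * For an "inter" copy (same_server == false), -ESTALE and
 * -NFS4ERR_OFFLOAD_DENIED become -EOPNOTSUPP, and *cancel_on_src is
 * set so the caller knows to send OFFLOAD_CANCEL to the source server.
 * Every other combination is passed through unchanged.
 */
static int model_translate_copy_error(int err, bool same_server,
				      bool *cancel_on_src)
{
	*cancel_on_src = false;

	if (!same_server &&
	    (err == -ESTALE || err == -MODEL_NFS4ERR_OFFLOAD_DENIED)) {
		*cancel_on_src = true;
		return -EOPNOTSUPP;
	}
	return err;
}
```

An "intra" copy that fails with ESTALE is deliberately left alone by
the model, since there is no second server holding state to clean up.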

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42proc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 00809b3..c7c2ffa 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -395,7 +395,8 @@ ssize_t nfs42_proc_copy(struct file *src, loff_t pos_src,
args.sync = true;
dst_exception.retry = 1;
continue;
- } else if (err == -ESTALE &&
+ } else if ((err == -ESTALE ||
+ err == -NFS4ERR_OFFLOAD_DENIED) &&
!nfs42_files_from_same_server(src, dst)) {
nfs42_do_offload_cancel_async(src, &args.src_stateid);
err = -EOPNOTSUPP;
--
1.8.3.1


2018-10-30 20:56:46

by Olga Kornievskaia

Subject: [PATCH v7 07/11] NFS: inter ssc open

From: Olga Kornievskaia <[email protected]>

NFSv4.2 inter server to server copy requires the destination server to
READ the data from the source server using the provided stateid and
file handle.

Given an NFSv4 stateid and filehandle from the COPY operation, provide the
destination server with an NFS client function to create a struct file
suitable for the destination server to READ the data to be copied.

Signed-off-by: Olga Kornievskaia <[email protected]>
Signed-off-by: Andy Adamson <[email protected]>
---
fs/nfs/nfs4_fs.h | 7 +++++
fs/nfs/nfs4file.c | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/nfs/nfs4proc.c | 5 ++-
3 files changed, 103 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 7d17b31..9c566a4 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -307,6 +307,13 @@ extern int nfs4_set_rw_stateid(nfs4_stateid *stateid,
const struct nfs_open_context *ctx,
const struct nfs_lock_context *l_ctx,
fmode_t fmode);
+extern int nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle,
+ struct nfs_fattr *fattr, struct nfs4_label *label,
+ struct inode *inode);
+extern int update_open_stateid(struct nfs4_state *state,
+ const nfs4_stateid *open_stateid,
+ const nfs4_stateid *deleg_stateid,
+ fmode_t fmode);

#if defined(CONFIG_NFS_V4_1)
extern int nfs41_sequence_done(struct rpc_task *, struct nfs4_sequence_res *);
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index e5c1a68..0b1dcf9 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -8,6 +8,7 @@
#include <linux/file.h>
#include <linux/falloc.h>
#include <linux/nfs_fs.h>
+#include <linux/file.h>
#include "delegation.h"
#include "internal.h"
#include "iostat.h"
@@ -267,6 +268,99 @@ static int nfs42_clone_file_range(struct file *src_file, loff_t src_off,
out:
return ret;
}
+
+static int read_name_gen = 1;
+#define SSC_READ_NAME_BODY "ssc_read_%d"
+
+struct file *
+nfs42_ssc_open(struct vfsmount *ss_mnt, struct nfs_fh *src_fh,
+ nfs4_stateid *stateid)
+{
+ struct nfs_fattr fattr;
+ struct file *filep, *res;
+ struct nfs_server *server;
+ struct inode *r_ino = NULL;
+ struct nfs_open_context *ctx;
+ struct nfs4_state_owner *sp;
+ char *read_name;
+ int len, status = 0;
+
+ server = NFS_SERVER(ss_mnt->mnt_root->d_inode);
+
+ nfs_fattr_init(&fattr);
+
+ status = nfs4_proc_getattr(server, src_fh, &fattr, NULL, NULL);
+ if (status < 0) {
+ res = ERR_PTR(status);
+ goto out;
+ }
+
+ res = ERR_PTR(-ENOMEM);
+ len = strlen(SSC_READ_NAME_BODY) + 16;
+ read_name = kzalloc(len, GFP_NOFS);
+ if (read_name == NULL)
+ goto out;
+ snprintf(read_name, len, SSC_READ_NAME_BODY, read_name_gen++);
+
+ r_ino = nfs_fhget(ss_mnt->mnt_root->d_inode->i_sb, src_fh, &fattr,
+ NULL);
+ if (IS_ERR(r_ino)) {
+ res = ERR_CAST(r_ino);
+ goto out;
+ }
+
+ filep = alloc_file_pseudo(r_ino, ss_mnt, read_name, FMODE_READ,
+ r_ino->i_fop);
+ if (IS_ERR(filep)) {
+ res = ERR_CAST(filep);
+ goto out;
+ }
+ filep->f_mode |= FMODE_READ;
+
+ ctx = alloc_nfs_open_context(filep->f_path.dentry, filep->f_mode,
+ filep);
+ if (IS_ERR(ctx)) {
+ res = ERR_CAST(ctx);
+ goto out_filep;
+ }
+
+ res = ERR_PTR(-EINVAL);
+ sp = nfs4_get_state_owner(server, ctx->cred, GFP_KERNEL);
+ if (sp == NULL)
+ goto out_ctx;
+
+ ctx->state = nfs4_get_open_state(r_ino, sp);
+ if (ctx->state == NULL)
+ goto out_stateowner;
+
+ set_bit(NFS_OPEN_STATE, &ctx->state->flags);
+ memcpy(&ctx->state->open_stateid.other, &stateid->other,
+ NFS4_STATEID_OTHER_SIZE);
+ update_open_stateid(ctx->state, stateid, NULL, filep->f_mode);
+
+ nfs_file_set_open_context(filep, ctx);
+ put_nfs_open_context(ctx);
+
+ file_ra_state_init(&filep->f_ra, filep->f_mapping->host->i_mapping);
+ res = filep;
+out:
+ return res;
+out_stateowner:
+ nfs4_put_state_owner(sp);
+out_ctx:
+ put_nfs_open_context(ctx);
+out_filep:
+ fput(filep);
+ goto out;
+}
+EXPORT_SYMBOL_GPL(nfs42_ssc_open);
+void nfs42_ssc_close(struct file *filep)
+{
+ struct nfs_open_context *ctx = nfs_file_open_context(filep);
+
+ ctx->state->flags = 0;
+}
+EXPORT_SYMBOL_GPL(nfs42_ssc_close);
#endif /* CONFIG_NFS_V4_2 */

const struct file_operations nfs4_file_operations = {
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index fec6e6b..e5178b2f 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -91,7 +91,6 @@
static int _nfs4_recover_proc_open(struct nfs4_opendata *data);
static int nfs4_do_fsinfo(struct nfs_server *, struct nfs_fh *, struct nfs_fsinfo *);
static void nfs_fixup_referral_attributes(struct nfs_fattr *fattr);
-static int nfs4_proc_getattr(struct nfs_server *, struct nfs_fh *, struct nfs_fattr *, struct nfs4_label *label, struct inode *inode);
static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle, struct nfs_fattr *fattr, struct nfs4_label *label, struct inode *inode);
static int nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
struct nfs_fattr *fattr, struct iattr *sattr,
@@ -1653,7 +1652,7 @@ static void nfs_state_clear_delegation(struct nfs4_state *state)
write_sequnlock(&state->seqlock);
}

-static int update_open_stateid(struct nfs4_state *state,
+int update_open_stateid(struct nfs4_state *state,
const nfs4_stateid *open_stateid,
const nfs4_stateid *delegation,
fmode_t fmode)
@@ -3936,7 +3935,7 @@ static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle,
return nfs4_call_sync(server->client, server, &msg, &args.seq_args, &res.seq_res, 0);
}

-static int nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle,
+int nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle,
struct nfs_fattr *fattr, struct nfs4_label *label,
struct inode *inode)
{
--
1.8.3.1


2018-10-30 20:56:48

by Olga Kornievskaia

Subject: [PATCH v7 11/11] NFS: replace cross device check in copy_file_range

From: Olga Kornievskaia <[email protected]>

Add a check to disallow copy offload across file system types; both
files are expected to be of NFS4.2+ type.

Reviewed-by: Jeff Layton <[email protected]>
Reviewed-by: Matthew Wilcox <[email protected]>
Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs4file.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 989f174..69e2705 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -142,7 +142,10 @@ static ssize_t nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
if (pos_in >= i_size_read(file_inode(file_in)))
return -EINVAL;

- if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb)
+ if (file_in->f_op != &nfs4_file_operations)
+ return -EXDEV;
+
+ if ((NFS_SERVER(file_inode(file_in)))->nfs_client->cl_minorversion < 2)
return -EXDEV;

if (file_inode(file_in) == file_inode(file_out))
--
1.8.3.1


2018-10-30 20:56:49

by Olga Kornievskaia

Subject: [PATCH v7 06/11] NFS: also send OFFLOAD_CANCEL to source server

From: Olga Kornievskaia <[email protected]>

If a copy is cancelled, also send OFFLOAD_CANCEL to the source
server.

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42proc.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index bb9e799..98fe00b 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -205,12 +205,14 @@ static int handle_async_copy(struct nfs42_copy_res *res,
memcpy(&res->write_res.verifier, &copy->verf, sizeof(copy->verf));
status = -copy->error;

+out_free:
kfree(copy);
return status;
out_cancel:
nfs42_do_offload_cancel_async(dst, &copy->stateid);
- kfree(copy);
- return status;
+ if (!nfs42_files_from_same_server(src, dst))
+ nfs42_do_offload_cancel_async(src, src_stateid);
+ goto out_free;
}

static int process_copy_commit(struct file *dst, loff_t pos_dst,
--
1.8.3.1


2018-10-30 20:56:54

by Olga Kornievskaia

Subject: [PATCH v7 05/11] NFS: add ca_source_server<> to COPY

From: Olga Kornievskaia <[email protected]>

Support only one source server address: the same address that
the client and source server use.

Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42.h | 3 ++-
fs/nfs/nfs42proc.c | 26 +++++++++++++++++---------
fs/nfs/nfs42xdr.c | 12 ++++++++++--
fs/nfs/nfs4file.c | 7 ++++++-
include/linux/nfs_xdr.h | 1 +
5 files changed, 36 insertions(+), 13 deletions(-)

diff --git a/fs/nfs/nfs42.h b/fs/nfs/nfs42.h
index bbe49a3..28dcee5 100644
--- a/fs/nfs/nfs42.h
+++ b/fs/nfs/nfs42.h
@@ -15,7 +15,8 @@
/* nfs4.2proc.c */
#ifdef CONFIG_NFS_V4_2
int nfs42_proc_allocate(struct file *, loff_t, loff_t);
-ssize_t nfs42_proc_copy(struct file *, loff_t, struct file *, loff_t, size_t);
+ssize_t nfs42_proc_copy(struct file *, loff_t, struct file *, loff_t, size_t,
+ struct nl4_server *, nfs4_stateid *);
int nfs42_proc_deallocate(struct file *, loff_t, loff_t);
loff_t nfs42_proc_llseek(struct file *, loff_t, int);
int nfs42_proc_layoutstats_generic(struct nfs_server *,
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index b1c57a4..bb9e799 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -242,7 +242,9 @@ static ssize_t _nfs42_proc_copy(struct file *src,
struct file *dst,
struct nfs_lock_context *dst_lock,
struct nfs42_copy_args *args,
- struct nfs42_copy_res *res)
+ struct nfs42_copy_res *res,
+ struct nl4_server *nss,
+ nfs4_stateid *cnr_stateid)
{
struct rpc_message msg = {
.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_COPY],
@@ -256,11 +258,15 @@ static ssize_t _nfs42_proc_copy(struct file *src,
size_t count = args->count;
ssize_t status;

- status = nfs4_set_rw_stateid(&args->src_stateid, src_lock->open_context,
- src_lock, FMODE_READ);
- if (status)
- return status;
-
+ if (nss) {
+ args->cp_src = nss;
+ nfs4_stateid_copy(&args->src_stateid, cnr_stateid);
+ } else {
+ status = nfs4_set_rw_stateid(&args->src_stateid,
+ src_lock->open_context, src_lock, FMODE_READ);
+ if (status)
+ return status;
+ }
status = nfs_filemap_write_and_wait_range(file_inode(src)->i_mapping,
pos_src, pos_src + (loff_t)count - 1);
if (status)
@@ -324,8 +330,9 @@ static ssize_t _nfs42_proc_copy(struct file *src,
}

ssize_t nfs42_proc_copy(struct file *src, loff_t pos_src,
- struct file *dst, loff_t pos_dst,
- size_t count)
+ struct file *dst, loff_t pos_dst, size_t count,
+ struct nl4_server *nss,
+ nfs4_stateid *cnr_stateid)
{
struct nfs_server *server = NFS_SERVER(file_inode(dst));
struct nfs_lock_context *src_lock;
@@ -370,7 +377,8 @@ ssize_t nfs42_proc_copy(struct file *src, loff_t pos_src,
inode_lock(file_inode(dst));
err = _nfs42_proc_copy(src, src_lock,
dst, dst_lock,
- &args, &res);
+ &args, &res,
+ nss, cnr_stateid);
inode_unlock(file_inode(dst));

if (err >= 0)
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index e6e7cbf..c96c3f8 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -21,7 +21,10 @@
#define encode_copy_maxsz (op_encode_hdr_maxsz + \
XDR_QUADLEN(NFS4_STATEID_SIZE) + \
XDR_QUADLEN(NFS4_STATEID_SIZE) + \
- 2 + 2 + 2 + 1 + 1 + 1)
+ 2 + 2 + 2 + 1 + 1 + 1 +\
+ 1 + /* One cnr_source_server */\
+ 1 + /* nl4_type */ \
+ 1 + XDR_QUADLEN(NFS4_OPAQUE_LIMIT))
#define decode_copy_maxsz (op_decode_hdr_maxsz + \
NFS42_WRITE_RES_SIZE + \
1 /* cr_consecutive */ + \
@@ -186,7 +189,12 @@ static void encode_copy(struct xdr_stream *xdr,

encode_uint32(xdr, 1); /* consecutive = true */
encode_uint32(xdr, args->sync);
- encode_uint32(xdr, 0); /* src server list */
+ if (args->cp_src == NULL) { /* intra-ssc */
+ encode_uint32(xdr, 0); /* no src server list */
+ return;
+ }
+ encode_uint32(xdr, 1); /* supporting 1 server */
+ encode_nl4_server(xdr, args->cp_src);
}

static void encode_offload_cancel(struct xdr_stream *xdr,
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index beda4b3..e5c1a68 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -134,6 +134,8 @@ static ssize_t nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
size_t count, unsigned int flags)
{
struct nfs42_copy_notify_res *cn_resp = NULL;
+ struct nl4_server *nss = NULL;
+ nfs4_stateid *cnrs = NULL;
ssize_t ret;

if (pos_in >= i_size_read(file_inode(file_in)))
@@ -154,9 +156,12 @@ static ssize_t nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
ret = nfs42_proc_copy_notify(file_in, file_out, cn_resp);
if (ret)
goto out;
+ nss = &cn_resp->cnr_src;
+ cnrs = &cn_resp->cnr_stateid;
}

- ret = nfs42_proc_copy(file_in, pos_in, file_out, pos_out, count);
+ ret = nfs42_proc_copy(file_in, pos_in, file_out, pos_out, count, nss,
+ cnrs);
out:
kfree(cn_resp);
if (ret == -EAGAIN)
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index dfc59bc..3a40b17 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1400,6 +1400,7 @@ struct nfs42_copy_args {

u64 count;
bool sync;
+ struct nl4_server *cp_src;
};

struct nfs42_write_res {
--
1.8.3.1


2018-10-30 20:56:56

by Olga Kornievskaia

Subject: [PATCH v7 04/11] NFS: add COPY_NOTIFY operation

From: Olga Kornievskaia <[email protected]>

Try using the delegation stateid, then the open stateid.

Only NL4_NETADDR is supported; there is no support for NL4_NAME or NL4_URL.
Allow only one source server address to be returned for now.
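
For NL4_NETADDR the address is sent as a universal address string in
which the port is appended as two decimal bytes, matching the
"%s.%u.%u" format used by nfs42_set_netaddr() in the diff below. A
minimal userspace sketch, where model_format_uaddr() is a hypothetical
helper and the host string is assumed to be already formatted:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build a universal address "host.p1.p2", where p1 and p2 are the high
 * and low bytes of the port. Returns the string length on success, or
 * -1 if the result does not fit into buf.
 */
static int model_format_uaddr(char *buf, size_t size, const char *host,
			      unsigned short port)
{
	int n = snprintf(buf, size, "%s.%u.%u", host,
			 (unsigned int)(port >> 8),
			 (unsigned int)(port & 255));

	if (n < 0 || (size_t)n >= size)
		return -1;	/* truncated or formatting error */
	return n;
}
```

With host "192.0.2.10" and the NFS port 2049 (8 * 256 + 1) the helper
produces "192.0.2.10.8.1".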

To distinguish between a same-server copy offload ("intra") and
a copy between different servers ("inter"), check the server
owner identity.

Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42.h | 12 +++
fs/nfs/nfs42proc.c | 91 +++++++++++++++++++++++
fs/nfs/nfs42xdr.c | 181 ++++++++++++++++++++++++++++++++++++++++++++++
fs/nfs/nfs4_fs.h | 2 +
fs/nfs/nfs4client.c | 2 +-
fs/nfs/nfs4file.c | 14 ++++
fs/nfs/nfs4proc.c | 1 +
fs/nfs/nfs4xdr.c | 1 +
include/linux/nfs4.h | 1 +
include/linux/nfs_fs_sb.h | 1 +
include/linux/nfs_xdr.h | 16 ++++
11 files changed, 321 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs42.h b/fs/nfs/nfs42.h
index 19ec38f8..bbe49a3 100644
--- a/fs/nfs/nfs42.h
+++ b/fs/nfs/nfs42.h
@@ -13,6 +13,7 @@
#define PNFS_LAYOUTSTATS_MAXDEV (4)

/* nfs4.2proc.c */
+#ifdef CONFIG_NFS_V4_2
int nfs42_proc_allocate(struct file *, loff_t, loff_t);
ssize_t nfs42_proc_copy(struct file *, loff_t, struct file *, loff_t, size_t);
int nfs42_proc_deallocate(struct file *, loff_t, loff_t);
@@ -20,5 +21,16 @@
int nfs42_proc_layoutstats_generic(struct nfs_server *,
struct nfs42_layoutstat_data *);
int nfs42_proc_clone(struct file *, struct file *, loff_t, loff_t, loff_t);
+int nfs42_proc_copy_notify(struct file *, struct file *,
+ struct nfs42_copy_notify_res *);
+static inline bool nfs42_files_from_same_server(struct file *in,
+ struct file *out)
+{
+ struct nfs_client *c_in = (NFS_SERVER(file_inode(in)))->nfs_client;
+ struct nfs_client *c_out = (NFS_SERVER(file_inode(out)))->nfs_client;

+ return nfs4_check_serverowner_major_id(c_in->cl_serverowner,
+ c_out->cl_serverowner);
+}
+#endif /* CONFIG_NFS_V4_2 */
#endif /* __LINUX_FS_NFS_NFS4_2_H */
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index ac5b784..b1c57a4 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -3,6 +3,7 @@
* Copyright (c) 2014 Anna Schumaker <[email protected]>
*/
#include <linux/fs.h>
+#include <linux/sunrpc/addr.h>
#include <linux/sunrpc/sched.h>
#include <linux/nfs.h>
#include <linux/nfs3.h>
@@ -15,10 +16,30 @@
#include "pnfs.h"
#include "nfs4session.h"
#include "internal.h"
+#include "delegation.h"

#define NFSDBG_FACILITY NFSDBG_PROC
static int nfs42_do_offload_cancel_async(struct file *dst, nfs4_stateid *std);

+static void nfs42_set_netaddr(struct file *filep, struct nfs42_netaddr *naddr)
+{
+ struct nfs_client *clp = (NFS_SERVER(file_inode(filep)))->nfs_client;
+ unsigned short port = 2049;
+
+ rcu_read_lock();
+ naddr->netid_len = scnprintf(naddr->netid,
+ sizeof(naddr->netid), "%s",
+ rpc_peeraddr2str(clp->cl_rpcclient,
+ RPC_DISPLAY_NETID));
+ naddr->addr_len = scnprintf(naddr->addr,
+ sizeof(naddr->addr),
+ "%s.%u.%u",
+ rpc_peeraddr2str(clp->cl_rpcclient,
+ RPC_DISPLAY_ADDR),
+ port >> 8, port & 255);
+ rcu_read_unlock();
+}
+
static int _nfs42_proc_fallocate(struct rpc_message *msg, struct file *filep,
struct nfs_lock_context *lock, loff_t offset, loff_t len)
{
@@ -461,6 +482,76 @@ static int nfs42_do_offload_cancel_async(struct file *dst,
return status;
}

+int _nfs42_proc_copy_notify(struct file *src, struct file *dst,
+ struct nfs42_copy_notify_args *args,
+ struct nfs42_copy_notify_res *res)
+{
+ struct nfs_server *src_server = NFS_SERVER(file_inode(src));
+ struct rpc_message msg = {
+ .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_COPY_NOTIFY],
+ .rpc_argp = args,
+ .rpc_resp = res,
+ };
+ int status;
+ struct nfs_open_context *ctx;
+ struct nfs_lock_context *l_ctx;
+
+ ctx = get_nfs_open_context(nfs_file_open_context(src));
+ l_ctx = nfs_get_lock_context(ctx);
+ if (IS_ERR(l_ctx))
+ return PTR_ERR(l_ctx);
+
+ status = nfs4_set_rw_stateid(&args->cna_src_stateid, ctx, l_ctx,
+ FMODE_READ);
+ nfs_put_lock_context(l_ctx);
+ if (status)
+ return status;
+
+ status = nfs4_call_sync(src_server->client, src_server, &msg,
+ &args->cna_seq_args, &res->cnr_seq_res, 0);
+ if (status == -ENOTSUPP)
+ src_server->caps &= ~NFS_CAP_COPY_NOTIFY;
+
+ put_nfs_open_context(nfs_file_open_context(src));
+ return status;
+}
+
+int nfs42_proc_copy_notify(struct file *src, struct file *dst,
+ struct nfs42_copy_notify_res *res)
+{
+ struct nfs_server *src_server = NFS_SERVER(file_inode(src));
+ struct nfs42_copy_notify_args *args;
+ struct nfs4_exception exception = {
+ .inode = file_inode(src),
+ };
+ int status;
+
+ if (!(src_server->caps & NFS_CAP_COPY_NOTIFY))
+ return -EOPNOTSUPP;
+
+ args = kzalloc(sizeof(struct nfs42_copy_notify_args), GFP_NOFS);
+ if (args == NULL)
+ return -ENOMEM;
+
+ args->cna_src_fh = NFS_FH(file_inode(src)),
+ args->cna_dst.nl4_type = NL4_NETADDR;
+ nfs42_set_netaddr(dst, &args->cna_dst.u.nl4_addr);
+ exception.stateid = &args->cna_src_stateid;
+
+ do {
+ status = _nfs42_proc_copy_notify(src, dst, args, res);
+ if (status == -ENOTSUPP) {
+ status = -EOPNOTSUPP;
+ goto out;
+ }
+ status = nfs4_handle_exception(src_server, status, &exception);
+ } while (exception.retry);
+
+out:
+ kfree(args);
+ return status;
+}
+
static loff_t _nfs42_proc_llseek(struct file *filep,
struct nfs_lock_context *lock, loff_t offset, int whence)
{
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index 69f72ed..e6e7cbf 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -29,6 +29,16 @@
#define encode_offload_cancel_maxsz (op_encode_hdr_maxsz + \
XDR_QUADLEN(NFS4_STATEID_SIZE))
#define decode_offload_cancel_maxsz (op_decode_hdr_maxsz)
+#define encode_copy_notify_maxsz (op_encode_hdr_maxsz + \
+ XDR_QUADLEN(NFS4_STATEID_SIZE) + \
+ 1 + /* nl4_type */ \
+ 1 + XDR_QUADLEN(NFS4_OPAQUE_LIMIT))
+#define decode_copy_notify_maxsz (op_decode_hdr_maxsz + \
+ 3 + /* cnr_lease_time */\
+ XDR_QUADLEN(NFS4_STATEID_SIZE) + \
+ 1 + /* Support 1 cnr_source_server */\
+ 1 + /* nl4_type */ \
+ 1 + XDR_QUADLEN(NFS4_OPAQUE_LIMIT))
#define encode_deallocate_maxsz (op_encode_hdr_maxsz + \
encode_fallocate_maxsz)
#define decode_deallocate_maxsz (op_decode_hdr_maxsz)
@@ -84,6 +94,12 @@
#define NFS4_dec_offload_cancel_sz (compound_decode_hdr_maxsz + \
decode_putfh_maxsz + \
decode_offload_cancel_maxsz)
+#define NFS4_enc_copy_notify_sz (compound_encode_hdr_maxsz + \
+ encode_putfh_maxsz + \
+ encode_copy_notify_maxsz)
+#define NFS4_dec_copy_notify_sz (compound_decode_hdr_maxsz + \
+ decode_putfh_maxsz + \
+ decode_copy_notify_maxsz)
#define NFS4_enc_deallocate_sz (compound_encode_hdr_maxsz + \
encode_putfh_maxsz + \
encode_deallocate_maxsz + \
@@ -137,6 +153,25 @@ static void encode_allocate(struct xdr_stream *xdr,
encode_fallocate(xdr, args);
}

+static void encode_nl4_server(struct xdr_stream *xdr, const struct nl4_server *ns)
+{
+ encode_uint32(xdr, ns->nl4_type);
+ switch (ns->nl4_type) {
+ case NL4_NAME:
+ case NL4_URL:
+ encode_string(xdr, ns->u.nl4_str_sz, ns->u.nl4_str);
+ break;
+ case NL4_NETADDR:
+ encode_string(xdr, ns->u.nl4_addr.netid_len,
+ ns->u.nl4_addr.netid);
+ encode_string(xdr, ns->u.nl4_addr.addr_len,
+ ns->u.nl4_addr.addr);
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ }
+}
+
static void encode_copy(struct xdr_stream *xdr,
const struct nfs42_copy_args *args,
struct compound_hdr *hdr)
@@ -162,6 +197,15 @@ static void encode_offload_cancel(struct xdr_stream *xdr,
encode_nfs4_stateid(xdr, &args->osa_stateid);
}

+static void encode_copy_notify(struct xdr_stream *xdr,
+ const struct nfs42_copy_notify_args *args,
+ struct compound_hdr *hdr)
+{
+ encode_op_hdr(xdr, OP_COPY_NOTIFY, decode_copy_notify_maxsz, hdr);
+ encode_nfs4_stateid(xdr, &args->cna_src_stateid);
+ encode_nl4_server(xdr, &args->cna_dst);
+}
+
static void encode_deallocate(struct xdr_stream *xdr,
const struct nfs42_falloc_args *args,
struct compound_hdr *hdr)
@@ -298,6 +342,25 @@ static void nfs4_xdr_enc_offload_cancel(struct rpc_rqst *req,
}

/*
+ * Encode COPY_NOTIFY request
+ */
+static void nfs4_xdr_enc_copy_notify(struct rpc_rqst *req,
+ struct xdr_stream *xdr,
+ const void *data)
+{
+ const struct nfs42_copy_notify_args *args = data;
+ struct compound_hdr hdr = {
+ .minorversion = nfs4_xdr_minorversion(&args->cna_seq_args),
+ };
+
+ encode_compound_hdr(xdr, req, &hdr);
+ encode_sequence(xdr, &args->cna_seq_args, &hdr);
+ encode_putfh(xdr, args->cna_src_fh, &hdr);
+ encode_copy_notify(xdr, args, &hdr);
+ encode_nops(&hdr);
+}
+
+/*
* Encode DEALLOCATE request
*/
static void nfs4_xdr_enc_deallocate(struct rpc_rqst *req,
@@ -416,6 +479,58 @@ static int decode_write_response(struct xdr_stream *xdr,
return -EIO;
}

+static int decode_nl4_server(struct xdr_stream *xdr, struct nl4_server *ns)
+{
+ struct nfs42_netaddr *naddr;
+ uint32_t dummy;
+ char *dummy_str;
+ __be32 *p;
+ int status;
+
+ /* nl_type */
+ p = xdr_inline_decode(xdr, 4);
+ if (unlikely(!p))
+ return -EIO;
+ ns->nl4_type = be32_to_cpup(p);
+ switch (ns->nl4_type) {
+ case NL4_NAME:
+ case NL4_URL:
+ status = decode_opaque_inline(xdr, &dummy, &dummy_str);
+ if (unlikely(status))
+ return status;
+ if (unlikely(dummy > NFS4_OPAQUE_LIMIT))
+ return -EIO;
+ memcpy(&ns->u.nl4_str, dummy_str, dummy);
+ ns->u.nl4_str_sz = dummy;
+ break;
+ case NL4_NETADDR:
+ naddr = &ns->u.nl4_addr;
+
+ /* netid string */
+ status = decode_opaque_inline(xdr, &dummy, &dummy_str);
+ if (unlikely(status))
+ return status;
+ if (unlikely(dummy > RPCBIND_MAXNETIDLEN))
+ return -EIO;
+ naddr->netid_len = dummy;
+ memcpy(naddr->netid, dummy_str, naddr->netid_len);
+
+ /* uaddr string */
+ status = decode_opaque_inline(xdr, &dummy, &dummy_str);
+ if (unlikely(status))
+ return status;
+ if (unlikely(dummy > RPCBIND_MAXUADDRLEN))
+ return -EIO;
+ naddr->addr_len = dummy;
+ memcpy(naddr->addr, dummy_str, naddr->addr_len);
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ return -EIO;
+ }
+ return 0;
+}
+
static int decode_copy_requirements(struct xdr_stream *xdr,
struct nfs42_copy_res *res) {
__be32 *p;
@@ -458,6 +573,46 @@ static int decode_offload_cancel(struct xdr_stream *xdr,
return decode_op_hdr(xdr, OP_OFFLOAD_CANCEL);
}

+static int decode_copy_notify(struct xdr_stream *xdr,
+ struct nfs42_copy_notify_res *res)
+{
+ __be32 *p;
+ int status, count;
+
+ status = decode_op_hdr(xdr, OP_COPY_NOTIFY);
+ if (status)
+ return status;
+ /* cnr_lease_time */
+ p = xdr_inline_decode(xdr, 12);
+ if (unlikely(!p))
+ goto out_overflow;
+ p = xdr_decode_hyper(p, &res->cnr_lease_time.seconds);
+ res->cnr_lease_time.nseconds = be32_to_cpup(p);
+
+ status = decode_opaque_fixed(xdr, &res->cnr_stateid, NFS4_STATEID_SIZE);
+ if (unlikely(status))
+ goto out_overflow;
+
+ /* number of source addresses */
+ p = xdr_inline_decode(xdr, 4);
+ if (unlikely(!p))
+ goto out_overflow;
+
+ count = be32_to_cpup(p);
+ if (count > 1)
+ pr_warn("NFS: %s: nsvr %d > Supported. Use first servers\n",
+ __func__, count);
+
+ status = decode_nl4_server(xdr, &res->cnr_src);
+ if (unlikely(status))
+ goto out_overflow;
+ return 0;
+
+out_overflow:
+ print_overflow_msg(__func__, xdr);
+ return -EIO;
+}
+
static int decode_deallocate(struct xdr_stream *xdr, struct nfs42_falloc_res *res)
{
return decode_op_hdr(xdr, OP_DEALLOCATE);
@@ -585,6 +740,32 @@ static int nfs4_xdr_dec_offload_cancel(struct rpc_rqst *rqstp,
}

/*
+ * Decode COPY_NOTIFY response
+ */
+static int nfs4_xdr_dec_copy_notify(struct rpc_rqst *rqstp,
+ struct xdr_stream *xdr,
+ void *data)
+{
+ struct nfs42_copy_notify_res *res = data;
+ struct compound_hdr hdr;
+ int status;
+
+ status = decode_compound_hdr(xdr, &hdr);
+ if (status)
+ goto out;
+ status = decode_sequence(xdr, &res->cnr_seq_res, rqstp);
+ if (status)
+ goto out;
+ status = decode_putfh(xdr);
+ if (status)
+ goto out;
+ status = decode_copy_notify(xdr, res);
+
+out:
+ return status;
+}
+
+/*
* Decode DEALLOCATE request
*/
static int nfs4_xdr_dec_deallocate(struct rpc_rqst *rqstp,
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 8d59c96..7d17b31 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -460,6 +460,8 @@ int nfs41_discover_server_trunking(struct nfs_client *clp,
struct nfs_client **, struct rpc_cred *);
extern void nfs4_schedule_session_recovery(struct nfs4_session *, int);
extern void nfs41_notify_server(struct nfs_client *);
+bool nfs4_check_serverowner_major_id(struct nfs41_server_owner *o1,
+ struct nfs41_server_owner *o2);
#else
static inline void nfs4_schedule_session_recovery(struct nfs4_session *session, int err)
{
diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index 8f53455..ac00eb8 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -625,7 +625,7 @@ int nfs40_walk_client_list(struct nfs_client *new,
/*
* Returns true if the server major ids match
*/
-static bool
+bool
nfs4_check_serverowner_major_id(struct nfs41_server_owner *o1,
struct nfs41_server_owner *o2)
{
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 7838bdf..beda4b3 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -133,6 +133,7 @@ static ssize_t nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
struct file *file_out, loff_t pos_out,
size_t count, unsigned int flags)
{
+ struct nfs42_copy_notify_res *cn_resp = NULL;
ssize_t ret;

if (pos_in >= i_size_read(file_inode(file_in)))
@@ -144,7 +145,20 @@ static ssize_t nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
if (file_inode(file_in) == file_inode(file_out))
return -EINVAL;
retry:
+ if (!nfs42_files_from_same_server(file_in, file_out)) {
+ cn_resp = kzalloc(sizeof(struct nfs42_copy_notify_res),
+ GFP_NOFS);
+ if (unlikely(cn_resp == NULL))
+ return -ENOMEM;
+
+ ret = nfs42_proc_copy_notify(file_in, file_out, cn_resp);
+ if (ret)
+ goto out;
+ }
+
ret = nfs42_proc_copy(file_in, pos_in, file_out, pos_out, count);
+out:
+ kfree(cn_resp);
if (ret == -EAGAIN)
goto retry;
return ret;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index db84b4a..fec6e6b 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -9692,6 +9692,7 @@ static bool nfs4_match_stateid(const nfs4_stateid *s1,
| NFS_CAP_ALLOCATE
| NFS_CAP_COPY
| NFS_CAP_OFFLOAD_CANCEL
+ | NFS_CAP_COPY_NOTIFY
| NFS_CAP_DEALLOCATE
| NFS_CAP_SEEK
| NFS_CAP_LAYOUTSTATS
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index b7bde12..2163900 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -7790,6 +7790,7 @@ int nfs4_decode_dirent(struct xdr_stream *xdr, struct nfs_entry *entry,
PROC42(CLONE, enc_clone, dec_clone),
PROC42(COPY, enc_copy, dec_copy),
PROC42(OFFLOAD_CANCEL, enc_offload_cancel, dec_offload_cancel),
+ PROC42(COPY_NOTIFY, enc_copy_notify, dec_copy_notify),
PROC(LOOKUPP, enc_lookupp, dec_lookupp),
};

diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index 4803507..9e49a6c 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -537,6 +537,7 @@ enum {
NFSPROC4_CLNT_CLONE,
NFSPROC4_CLNT_COPY,
NFSPROC4_CLNT_OFFLOAD_CANCEL,
+ NFSPROC4_CLNT_COPY_NOTIFY,

NFSPROC4_CLNT_LOOKUPP,
};
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 0fc0b91..e5d89ff 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -261,5 +261,6 @@ struct nfs_server {
#define NFS_CAP_CLONE (1U << 23)
#define NFS_CAP_COPY (1U << 24)
#define NFS_CAP_OFFLOAD_CANCEL (1U << 25)
+#define NFS_CAP_COPY_NOTIFY (1U << 26)

#endif
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 0e01625..dfc59bc 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1428,6 +1428,22 @@ struct nfs42_offload_status_res {
int osr_status;
};

+struct nfs42_copy_notify_args {
+ struct nfs4_sequence_args cna_seq_args;
+
+ struct nfs_fh *cna_src_fh;
+ nfs4_stateid cna_src_stateid;
+ struct nl4_server cna_dst;
+};
+
+struct nfs42_copy_notify_res {
+ struct nfs4_sequence_res cnr_seq_res;
+
+ struct nfstime4 cnr_lease_time;
+ nfs4_stateid cnr_stateid;
+ struct nl4_server cnr_src;
+};
+
struct nfs42_seek_args {
struct nfs4_sequence_args seq_args;

--
1.8.3.1
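
A note on the address string that nfs42_set_netaddr() builds in the patch above: the "%s.%u.%u" format appends the port to the presentation address as two decimal octets, high byte first, so the hard-coded port 2049 becomes the suffix ".8.1". The standalone user-space sketch below illustrates only that encoding and its inverse. format_uaddr() and uaddr_port() are hypothetical names chosen for illustration, not functions from this series or from the kernel, and 192.0.2.10 is a documentation address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): append the port to a
 * presentation-format address as two decimal octets, high byte first.
 * Returns the resulting string length, or -1 if buf is too small.
 */
static int format_uaddr(char *buf, size_t buflen,
			const char *addr, unsigned short port)
{
	int n;

	assert(buf != NULL && addr != NULL);
	n = snprintf(buf, buflen, "%s.%u.%u", addr,
		     (unsigned int)(port >> 8), (unsigned int)(port & 255));
	if (n < 0 || (size_t)n >= buflen)
		return -1;
	return n;
}

/*
 * Hypothetical inverse: recover the port from the last two dotted
 * fields of the string. Returns -1 if the string is malformed.
 */
static int uaddr_port(const char *uaddr)
{
	const char *lo = strrchr(uaddr, '.');
	const char *hi;
	unsigned int h, l;

	if (lo == NULL || lo == uaddr)
		return -1;
	/* Walk back to the dot that starts the high-byte field. */
	for (hi = lo - 1; hi > uaddr && *hi != '.'; hi--)
		;
	if (*hi != '.')
		return -1;
	if (sscanf(hi, ".%u.%u", &h, &l) != 2 || h > 255 || l > 255)
		return -1;
	return (int)((h << 8) | l);
}
```

Calling format_uaddr(buf, sizeof(buf), "192.0.2.10", 2049) fills buf with "192.0.2.10.8.1", and uaddr_port() on that string gives back 2049.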


2018-10-30 21:26:58

by Amir Goldstein

Subject: Re: [PATCH v7 01/11] VFS: move cross device copy_file_range() check into filesystems

On Tue, Oct 30, 2018 at 10:56 PM Olga Kornievskaia
<[email protected]> wrote:
>
> From: Olga Kornievskaia <[email protected]>
>
> This patch makes it the responsibility of individual filesystems to
> allow or deny cross device copies. Both NFS and CIFS have operations
> for cross-server copies, and later patches will implement this feature.
>
> Note that as of this patch, the copy_file_range() function might be passed
> superblocks from different filesystem types. -EXDEV should be returned
> if cross device copies aren't supported.
>
> Reviewed-by: Amir Goldstein <[email protected]>
> Reviewed-by: Matthew Wilcox <[email protected]>
> Reviewed-by: Steve French <[email protected]>
> Reviewed-by: Jeff Layton <[email protected]>
> Signed-off-by: Olga Kornievskaia <[email protected]>
> ---
> Documentation/filesystems/porting | 7 +++++++
> fs/cifs/cifsfs.c | 3 +++
> fs/nfs/nfs4file.c | 3 +++
> fs/overlayfs/file.c | 3 +++
> fs/read_write.c | 12 +++++++-----
> 5 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
> index 7b7b845..897e1e7 100644
> --- a/Documentation/filesystems/porting
> +++ b/Documentation/filesystems/porting
> @@ -622,3 +622,10 @@ in your dentry operations instead.
> alloc_file_clone(file, flags, ops) does not affect any caller's references.
> On success you get a new struct file sharing the mount/dentry with the
> original, on failure - ERR_PTR().
> +--
> +[mandatory]
> + ->copy_file_range() may now be passed files which belong to two
> + different superblocks of the same file system type or which belong
> + to two different filesystems types all together. As before, the
> + destination's copy_file_range() is the function which is called.
> + If it cannot copy ranges from the source, it should return -EXDEV.
> diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
> index 7065426..ca8fc87 100644
> --- a/fs/cifs/cifsfs.c
> +++ b/fs/cifs/cifsfs.c
> @@ -1114,6 +1114,9 @@ static ssize_t cifs_copy_file_range(struct file *src_file, loff_t off,
> unsigned int xid = get_xid();
> ssize_t rc;
>
> + if (file_inode(src_file)->i_sb != file_inode(dst_file)->i_sb)
> + return -EXDEV;
> +
> rc = cifs_file_copychunk_range(xid, src_file, off, dst_file, destoff,
> len, flags);
> free_xid(xid);
> diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
> index 4288a6e..5a73c90 100644
> --- a/fs/nfs/nfs4file.c
> +++ b/fs/nfs/nfs4file.c
> @@ -135,6 +135,9 @@ static ssize_t nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
> {
> ssize_t ret;
>
> + if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb)
> + return -EXDEV;
> +
> if (file_inode(file_in) == file_inode(file_out))
> return -EINVAL;
> retry:
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index aeaefd2..0331e33 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -483,6 +483,9 @@ static ssize_t ovl_copy_file_range(struct file *file_in, loff_t pos_in,
> struct file *file_out, loff_t pos_out,
> size_t len, unsigned int flags)
> {
> + if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb)
> + return -EXDEV;
> +
> return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, flags,
> OVL_COPY);
> }
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 39b4a21..c5bed2e 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1575,10 +1575,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> (file_out->f_flags & O_APPEND))
> return -EBADF;
>
> - /* this could be relaxed once a method supports cross-fs copies */
> - if (inode_in->i_sb != inode_out->i_sb)
> - return -EXDEV;
> -
> if (len == 0)
> return 0;
>
> @@ -1588,7 +1584,8 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> * Try cloning first, this is supported by more file systems, and
> * more efficient if both clone and copy are supported (e.g. NFS).
> */
> - if (file_in->f_op->clone_file_range) {
> + if (inode_in->i_sb == inode_out->i_sb &&
> + file_in->f_op->clone_file_range) {
> ret = file_in->f_op->clone_file_range(file_in, pos_in,
> file_out, pos_out, len);
> if (ret == 0) {
> @@ -1604,6 +1601,11 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> goto done;
> }
>
> + if (inode_in->i_sb != inode_out->i_sb) {
> + ret = -EXDEV;
> + goto done;
> + }
> +
> ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
> len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
>

If this check were going to stay here for long, I would say it needs a TODO
comment of some sort, similar to the one that was removed, but I expect
someone, if not Olga, will pick this up shortly after merge. If nobody else
does, I probably will.

So I re-affirm my Reviewed-by.

Thanks,
Amir.
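
For context on the -EXDEV convention discussed above: a user-space caller that might be handed files on different superblocks can treat EXDEV from copy_file_range() as "not supported here" and fall back to an ordinary read()/write() loop. The sketch below is one way to do that under stated assumptions, not code from this series. copy_with_fallback() and copy_path() are hypothetical names, and treating ENOSYS and EOPNOTSUPP the same way as EXDEV is an assumption of this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy up to len bytes from fd_in to fd_out at
 * their current offsets. Tries copy_file_range() first and switches to
 * read()/write() if the kernel or filesystem declines the request.
 * Returns the number of bytes copied, or -1 with errno set.
 */
static ssize_t copy_with_fallback(int fd_in, int fd_out, size_t len)
{
	char buf[65536];
	size_t done = 0;

	assert(fd_in >= 0 && fd_out >= 0);

	/* Fast path: let the kernel (or the server) move the data. */
	while (done < len) {
		ssize_t n = copy_file_range(fd_in, NULL, fd_out, NULL,
					    len - done, 0);
		if (n > 0) {
			done += (size_t)n;
			continue;
		}
		if (n == 0)
			return (ssize_t)done;	/* end of source file */
		if (errno == EXDEV || errno == ENOSYS || errno == EOPNOTSUPP)
			break;			/* declined: fall back */
		return -1;
	}

	/* Slow path: plain read()/write() for whatever is left. */
	while (done < len) {
		size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
		ssize_t r = read(fd_in, buf, want);
		ssize_t off = 0;

		if (r < 0)
			return -1;
		if (r == 0)
			break;
		while (off < r) {
			ssize_t w = write(fd_out, buf + off, (size_t)(r - off));

			if (w < 0)
				return -1;
			off += w;
		}
		done += (size_t)r;
	}
	return (ssize_t)done;
}

/* Hypothetical convenience wrapper: copy the whole file at src to dst. */
static ssize_t copy_path(const char *src, const char *dst)
{
	struct stat st;
	ssize_t ret = -1;
	int in, out;

	in = open(src, O_RDONLY);
	if (in < 0)
		return -1;
	out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (out >= 0) {
		if (fstat(in, &st) == 0)
			ret = copy_with_fallback(in, out, (size_t)st.st_size);
		close(out);
	}
	close(in);
	return ret;
}
```

The fallback keeps the caller correct whether or not the destination filesystem implements a cross-device copy, which is the situation the porting note in this patch describes.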

2018-10-31 15:07:51

by Anna Schumaker

Subject: Re: [PATCH v7 04/11] NFS: add COPY_NOTIFY operation

Hi Olga,

On Tue, 2018-10-30 at 16:56 -0400, Olga Kornievskaia wrote:
> From: Olga Kornievskaia <[email protected]>
>
> Try using the delegation stateid, then the open stateid.
>
> Only NL4_NETADDR is supported; there is no support for NL4_NAME or NL4_URL.
> Allow only one source server address to be returned for now.
>
> To distinguish between a same-server copy offload ("intra") and
> a copy between different servers ("inter"), check the server
> owner identity.
>
> Reviewed-by: Jeff Layton <[email protected]>
> Signed-off-by: Andy Adamson <[email protected]>
> Signed-off-by: Olga Kornievskaia <[email protected]>
> ---
> fs/nfs/nfs42.h | 12 +++
> fs/nfs/nfs42proc.c | 91 +++++++++++++++++++++++
> fs/nfs/nfs42xdr.c | 181 ++++++++++++++++++++++++++++++++++++++++++++++
> fs/nfs/nfs4_fs.h | 2 +
> fs/nfs/nfs4client.c | 2 +-
> fs/nfs/nfs4file.c | 14 ++++
> fs/nfs/nfs4proc.c | 1 +
> fs/nfs/nfs4xdr.c | 1 +
> include/linux/nfs4.h | 1 +
> include/linux/nfs_fs_sb.h | 1 +
> include/linux/nfs_xdr.h | 16 ++++
> 11 files changed, 321 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfs/nfs42.h b/fs/nfs/nfs42.h
> index 19ec38f8..bbe49a3 100644
> --- a/fs/nfs/nfs42.h
> +++ b/fs/nfs/nfs42.h
> @@ -13,6 +13,7 @@
> #define PNFS_LAYOUTSTATS_MAXDEV (4)
>
> /* nfs4.2proc.c */
> +#ifdef CONFIG_NFS_V4_2
> int nfs42_proc_allocate(struct file *, loff_t, loff_t);
> ssize_t nfs42_proc_copy(struct file *, loff_t, struct file *, loff_t,
> size_t);
> int nfs42_proc_deallocate(struct file *, loff_t, loff_t);
> @@ -20,5 +21,16 @@
> int nfs42_proc_layoutstats_generic(struct nfs_server *,
> struct nfs42_layoutstat_data *);
> int nfs42_proc_clone(struct file *, struct file *, loff_t, loff_t, loff_t);
> +int nfs42_proc_copy_notify(struct file *, struct file *,
> + struct nfs42_copy_notify_res *);
> +static inline bool nfs42_files_from_same_server(struct file *in,
> + struct file *out)
> +{
> + struct nfs_client *c_in = (NFS_SERVER(file_inode(in)))->nfs_client;
> + struct nfs_client *c_out = (NFS_SERVER(file_inode(out)))->nfs_client;
>
> + return nfs4_check_serverowner_major_id(c_in->cl_serverowner,
> + c_out->cl_serverowner);
> +}
> +#endif /* CONFIG_NFS_V4_2 */
> #endif /* __LINUX_FS_NFS_NFS4_2_H */
> diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
> index ac5b784..b1c57a4 100644
> --- a/fs/nfs/nfs42proc.c
> +++ b/fs/nfs/nfs42proc.c
> @@ -3,6 +3,7 @@
> * Copyright (c) 2014 Anna Schumaker <[email protected]>
> */
> #include <linux/fs.h>
> +#include <linux/sunrpc/addr.h>
> #include <linux/sunrpc/sched.h>
> #include <linux/nfs.h>
> #include <linux/nfs3.h>
> @@ -15,10 +16,30 @@
> #include "pnfs.h"
> #include "nfs4session.h"
> #include "internal.h"
> +#include "delegation.h"
>
> #define NFSDBG_FACILITY NFSDBG_PROC
> static int nfs42_do_offload_cancel_async(struct file *dst, nfs4_stateid *std);
>
> +static void nfs42_set_netaddr(struct file *filep, struct nfs42_netaddr *naddr)
> +{
> + struct nfs_client *clp = (NFS_SERVER(file_inode(filep)))->nfs_client;
> + unsigned short port = 2049;
> +
> + rcu_read_lock();
> + naddr->netid_len = scnprintf(naddr->netid,
> + sizeof(naddr->netid), "%s",
> + rpc_peeraddr2str(clp->cl_rpcclient,
> + RPC_DISPLAY_NETID));
> + naddr->addr_len = scnprintf(naddr->addr,
> + sizeof(naddr->addr),
> + "%s.%u.%u",
> + rpc_peeraddr2str(clp->cl_rpcclient,
> + RPC_DISPLAY_ADDR),
> + port >> 8, port & 255);
> + rcu_read_unlock();
> +}
> +
> static int _nfs42_proc_fallocate(struct rpc_message *msg, struct file *filep,
> struct nfs_lock_context *lock, loff_t offset, loff_t len)
> {
> @@ -461,6 +482,76 @@ static int nfs42_do_offload_cancel_async(struct file *dst,
> return status;
> }
>
> +int _nfs42_proc_copy_notify(struct file *src, struct file *dst,
> + struct nfs42_copy_notify_args *args,
> + struct nfs42_copy_notify_res *res)
> +{
> + struct nfs_server *src_server = NFS_SERVER(file_inode(src));
> + struct rpc_message msg = {
> + .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_COPY_NOTIFY],
> + .rpc_argp = args,
> + .rpc_resp = res,
> + };
> + int status;
> + struct nfs_open_context *ctx;
> + struct nfs_lock_context *l_ctx;
> +
> + ctx = get_nfs_open_context(nfs_file_open_context(src));
> + l_ctx = nfs_get_lock_context(ctx);
> + if (IS_ERR(l_ctx))
> + return PTR_ERR(l_ctx);
> +
> + status = nfs4_set_rw_stateid(&args->cna_src_stateid, ctx, l_ctx,
> + FMODE_READ);
> + nfs_put_lock_context(l_ctx);
> + if (status)
> + return status;
> +
> + status = nfs4_call_sync(src_server->client, src_server, &msg,
> + &args->cna_seq_args, &res->cnr_seq_res, 0);
> + if (status == -ENOTSUPP)
> + src_server->caps &= ~NFS_CAP_COPY_NOTIFY;
> +
> + put_nfs_open_context(nfs_file_open_context(src));
> + return status;
> +}
> +
> +int nfs42_proc_copy_notify(struct file *src, struct file *dst,
> + struct nfs42_copy_notify_res *res)
> +{
> + struct nfs_server *src_server = NFS_SERVER(file_inode(src));
> + struct nfs42_copy_notify_args *args;
> + struct nfs4_exception exception = {
> + .inode = file_inode(src),
> + };
> + int status;
> +
> + if (!(src_server->caps & NFS_CAP_COPY_NOTIFY))
> + return -EOPNOTSUPP;
> +
> + args = kzalloc(sizeof(struct nfs42_copy_notify_args), GFP_NOFS);
> + if (args == NULL)
> + return -ENOMEM;
> +
> + args->cna_src_fh = NFS_FH(file_inode(src)),
> + args->cna_dst.nl4_type = NL4_NETADDR;
> + nfs42_set_netaddr(dst, &args->cna_dst.u.nl4_addr);
> + exception.stateid = &args->cna_src_stateid;
> +
> + do {
> + status = _nfs42_proc_copy_notify(src, dst, args, res);
> + if (status == -ENOTSUPP) {
> + status = -EOPNOTSUPP;
> + goto out;
> + }
> + status = nfs4_handle_exception(src_server, status, &exception);
> + } while (exception.retry);
> +
> +out:
> + kfree(args);
> + return status;
> +}
> +
> static loff_t _nfs42_proc_llseek(struct file *filep,
> struct nfs_lock_context *lock, loff_t offset, int whence)
> {
> diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
> index 69f72ed..e6e7cbf 100644
> --- a/fs/nfs/nfs42xdr.c
> +++ b/fs/nfs/nfs42xdr.c
> @@ -29,6 +29,16 @@
> #define encode_offload_cancel_maxsz (op_encode_hdr_maxsz + \
> XDR_QUADLEN(NFS4_STATEID_SIZE))
> #define decode_offload_cancel_maxsz (op_decode_hdr_maxsz)
> +#define encode_copy_notify_maxsz (op_encode_hdr_maxsz + \
> + XDR_QUADLEN(NFS4_STATEID_SIZE) + \
> + 1 + /* nl4_type */ \
> + 1 + XDR_QUADLEN(NFS4_OPAQUE_LIMIT))
> +#define decode_copy_notify_maxsz (op_decode_hdr_maxsz + \
> + 3 + /* cnr_lease_time */\
> + XDR_QUADLEN(NFS4_STATEID_SIZE) + \
> + 1 + /* Support 1 cnr_source_server */\
> + 1 + /* nl4_type */ \
> + 1 + XDR_QUADLEN(NFS4_OPAQUE_LIMIT))
> #define encode_deallocate_maxsz (op_encode_hdr_maxsz + \
> encode_fallocate_maxsz)
> #define decode_deallocate_maxsz (op_decode_hdr_maxsz)
> @@ -84,6 +94,12 @@
> #define NFS4_dec_offload_cancel_sz (compound_decode_hdr_maxsz + \
> decode_putfh_maxsz + \
> decode_offload_cancel_maxsz)
> +#define NFS4_enc_copy_notify_sz (compound_encode_hdr_maxsz + \
> + encode_putfh_maxsz + \
> + encode_copy_notify_maxsz)
> +#define NFS4_dec_copy_notify_sz (compound_decode_hdr_maxsz + \
> + decode_putfh_maxsz + \
> + decode_copy_notify_maxsz)
> #define NFS4_enc_deallocate_sz (compound_encode_hdr_maxsz + \
> encode_putfh_maxsz + \
> encode_deallocate_maxsz + \
> @@ -137,6 +153,25 @@ static void encode_allocate(struct xdr_stream *xdr,
> encode_fallocate(xdr, args);
> }
>
> +static void encode_nl4_server(struct xdr_stream *xdr, const struct nl4_server *ns)
> +{
> + encode_uint32(xdr, ns->nl4_type);
> + switch (ns->nl4_type) {
> + case NL4_NAME:
> + case NL4_URL:
> + encode_string(xdr, ns->u.nl4_str_sz, ns->u.nl4_str);
> + break;
> + case NL4_NETADDR:
> + encode_string(xdr, ns->u.nl4_addr.netid_len,
> + ns->u.nl4_addr.netid);
> + encode_string(xdr, ns->u.nl4_addr.addr_len,
> + ns->u.nl4_addr.addr);
> + break;
> + default:
> + WARN_ON_ONCE(1);
> + }
> +}
> +
> static void encode_copy(struct xdr_stream *xdr,
> const struct nfs42_copy_args *args,
> struct compound_hdr *hdr)
> @@ -162,6 +197,15 @@ static void encode_offload_cancel(struct xdr_stream *xdr,
> encode_nfs4_stateid(xdr, &args->osa_stateid);
> }
>
> +static void encode_copy_notify(struct xdr_stream *xdr,
> + const struct nfs42_copy_notify_args *args,
> + struct compound_hdr *hdr)
> +{
> + encode_op_hdr(xdr, OP_COPY_NOTIFY, decode_copy_notify_maxsz, hdr);
> + encode_nfs4_stateid(xdr, &args->cna_src_stateid);
> + encode_nl4_server(xdr, &args->cna_dst);
> +}
> +
> static void encode_deallocate(struct xdr_stream *xdr,
> const struct nfs42_falloc_args *args,
> struct compound_hdr *hdr)
> @@ -298,6 +342,25 @@ static void nfs4_xdr_enc_offload_cancel(struct rpc_rqst *req,
> }
>
> /*
> + * Encode COPY_NOTIFY request
> + */
> +static void nfs4_xdr_enc_copy_notify(struct rpc_rqst *req,
> + struct xdr_stream *xdr,
> + const void *data)
> +{
> + const struct nfs42_copy_notify_args *args = data;
> + struct compound_hdr hdr = {
> + .minorversion = nfs4_xdr_minorversion(&args->cna_seq_args),
> + };
> +
> + encode_compound_hdr(xdr, req, &hdr);
> + encode_sequence(xdr, &args->cna_seq_args, &hdr);
> + encode_putfh(xdr, args->cna_src_fh, &hdr);
> + encode_copy_notify(xdr, args, &hdr);
> + encode_nops(&hdr);
> +}
> +
> +/*
> * Encode DEALLOCATE request
> */
> static void nfs4_xdr_enc_deallocate(struct rpc_rqst *req,
> @@ -416,6 +479,58 @@ static int decode_write_response(struct xdr_stream *xdr,
> return -EIO;
> }
>
> +static int decode_nl4_server(struct xdr_stream *xdr, struct nl4_server *ns)
> +{
> + struct nfs42_netaddr *naddr;
> + uint32_t dummy;
> + char *dummy_str;
> + __be32 *p;
> + int status;
> +
> + /* nl_type */
> + p = xdr_inline_decode(xdr, 4);
> + if (unlikely(!p))
> + return -EIO;
> + ns->nl4_type = be32_to_cpup(p);
> + switch (ns->nl4_type) {
> + case NL4_NAME:
> + case NL4_URL:
> + status = decode_opaque_inline(xdr, &dummy, &dummy_str);
> + if (unlikely(status))
> + return status;
> + if (unlikely(dummy > NFS4_OPAQUE_LIMIT))
> + return -EIO;
> + memcpy(&ns->u.nl4_str, dummy_str, dummy);
> + ns->u.nl4_str_sz = dummy;
> + break;
> + case NL4_NETADDR:
> + naddr = &ns->u.nl4_addr;
> +
> + /* netid string */
> + status = decode_opaque_inline(xdr, &dummy, &dummy_str);
> + if (unlikely(status))
> + return status;
> + if (unlikely(dummy > RPCBIND_MAXNETIDLEN))
> + return -EIO;
> + naddr->netid_len = dummy;
> + memcpy(naddr->netid, dummy_str, naddr->netid_len);
> +
> + /* uaddr string */
> + status = decode_opaque_inline(xdr, &dummy, &dummy_str);
> + if (unlikely(status))
> + return status;
> + if (unlikely(dummy > RPCBIND_MAXUADDRLEN))
> + return -EIO;
> + naddr->addr_len = dummy;
> + memcpy(naddr->addr, dummy_str, naddr->addr_len);
> + break;
> + default:
> + WARN_ON_ONCE(1);
> + return -EIO;
> + }
> + return 0;
> +}
> +
> static int decode_copy_requirements(struct xdr_stream *xdr,
> struct nfs42_copy_res *res) {
> __be32 *p;
> @@ -458,6 +573,46 @@ static int decode_offload_cancel(struct xdr_stream *xdr,
> return decode_op_hdr(xdr, OP_OFFLOAD_CANCEL);
> }
>
> +static int decode_copy_notify(struct xdr_stream *xdr,
> + struct nfs42_copy_notify_res *res)
> +{
> + __be32 *p;
> + int status, count;
> +
> + status = decode_op_hdr(xdr, OP_COPY_NOTIFY);
> + if (status)
> + return status;
> + /* cnr_lease_time */
> + p = xdr_inline_decode(xdr, 12);
> + if (unlikely(!p))
> + goto out_overflow;
> + p = xdr_decode_hyper(p, &res->cnr_lease_time.seconds);
> + res->cnr_lease_time.nseconds = be32_to_cpup(p);
> +
> + status = decode_opaque_fixed(xdr, &res->cnr_stateid, NFS4_STATEID_SIZE);
> + if (unlikely(status))
> + goto out_overflow;
> +
> + /* number of source addresses */
> + p = xdr_inline_decode(xdr, 4);
> + if (unlikely(!p))
> + goto out_overflow;
> +
> + count = be32_to_cpup(p);
> + if (count > 1)
> + pr_warn("NFS: %s: nsvr %d > Supported. Use first servers\n",
> + __func__, count);
> +
> + status = decode_nl4_server(xdr, &res->cnr_src);
> + if (unlikely(status))
> + goto out_overflow;
> + return 0;
> +
> +out_overflow:
> + print_overflow_msg(__func__, xdr);
> + return -EIO;
> +}
> +
> static int decode_deallocate(struct xdr_stream *xdr, struct nfs42_falloc_res *res)
> {
> return decode_op_hdr(xdr, OP_DEALLOCATE);
> @@ -585,6 +740,32 @@ static int nfs4_xdr_dec_offload_cancel(struct rpc_rqst *rqstp,
> }
>
> /*
> + * Decode COPY_NOTIFY response
> + */
> +static int nfs4_xdr_dec_copy_notify(struct rpc_rqst *rqstp,
> + struct xdr_stream *xdr,
> + void *data)
> +{
> + struct nfs42_copy_notify_res *res = data;
> + struct compound_hdr hdr;
> + int status;
> +
> + status = decode_compound_hdr(xdr, &hdr);
> + if (status)
> + goto out;
> + status = decode_sequence(xdr, &res->cnr_seq_res, rqstp);
> + if (status)
> + goto out;
> + status = decode_putfh(xdr);
> + if (status)
> + goto out;
> + status = decode_copy_notify(xdr, res);
> +
> +out:
> + return status;
> +}
> +
> +/*
> * Decode DEALLOCATE request
> */
> static int nfs4_xdr_dec_deallocate(struct rpc_rqst *rqstp,
> diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
> index 8d59c96..7d17b31 100644
> --- a/fs/nfs/nfs4_fs.h
> +++ b/fs/nfs/nfs4_fs.h
> @@ -460,6 +460,8 @@ int nfs41_discover_server_trunking(struct nfs_client *clp,
> struct nfs_client **, struct rpc_cred *);
> extern void nfs4_schedule_session_recovery(struct nfs4_session *, int);
> extern void nfs41_notify_server(struct nfs_client *);
> +bool nfs4_check_serverowner_major_id(struct nfs41_server_owner *o1,
> + struct nfs41_server_owner *o2);
> #else
> static inline void nfs4_schedule_session_recovery(struct nfs4_session *session, int err)
> {
> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> index 8f53455..ac00eb8 100644
> --- a/fs/nfs/nfs4client.c
> +++ b/fs/nfs/nfs4client.c
> @@ -625,7 +625,7 @@ int nfs40_walk_client_list(struct nfs_client *new,
> /*
> * Returns true if the server major ids match
> */
> -static bool
> +bool
> nfs4_check_serverowner_major_id(struct nfs41_server_owner *o1,
> struct nfs41_server_owner *o2)
> {
> diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
> index 7838bdf..beda4b3 100644
> --- a/fs/nfs/nfs4file.c
> +++ b/fs/nfs/nfs4file.c
> @@ -133,6 +133,7 @@ static ssize_t nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
> struct file *file_out, loff_t pos_out,
> size_t count, unsigned int flags)
> {
> + struct nfs42_copy_notify_res *cn_resp = NULL;
> ssize_t ret;
>
> if (pos_in >= i_size_read(file_inode(file_in)))
> @@ -144,7 +145,20 @@ static ssize_t nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
> if (file_inode(file_in) == file_inode(file_out))
> return -EINVAL;
> retry:
> + if (!nfs42_files_from_same_server(file_in, file_out)) {

I'm seeing this crash when I try to use vfs_copy_file_range() on NFS v4.0. I
think it's because clients don't have a cl_serverowner defined in this case:

[ +0.051545] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ +0.002032] PGD 0 P4D 0
[ +0.002021] Oops: 0000 [#4] PREEMPT SMP PTI
[ +0.001980] CPU: 1 PID: 1194 Comm: nfscopy Tainted: G D 4.19.0-ANNA+ #2124
[ +0.001386] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ +0.001266] RIP: 0010:nfs4_check_serverowner_major_id+0x5/0x30 [nfsv4]
[ +0.001254] Code: ff ff 48 8b 7c 24 10 eb 95 41 bf da d8 ff ff eb da e8 ef ec
eb d3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 <8b> 57 08 31
c0 3b 56 08 75 12 48 83 c6 0c 48 83 c7 0c e8 64 06 63
[ +0.002487] RSP: 0018:ffffac0b40b77e50 EFLAGS: 00010246
[ +0.001233] RAX: ffff8eb6f45a6000 RBX: ffff8eb77738e500 RCX: 0000000000000000
[ +0.001218] RDX: ffff8eb6f45a6000 RSI: 0000000000000000 RDI: 0000000000000000
[ +0.001271] RBP: ffff8eb6f533eb00 R08: 0000000080000000 R09: ffff8eb778c10800
[ +0.000956] R10: ffff8eb77aa36f98 R11: ffff8eb77ab66320 R12: 0000000000000000
[ +0.000848] R13: 0000000080000000 R14: 0000000000000000 R15: 0000000080000000
[ +0.000880] FS: 00007f7bf84d9500(0000) GS:ffff8eb77cb00000(0000)
knlGS:0000000000000000
[ +0.000876] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000815] CR2: 0000000000000008 CR3: 00000000b3e80005 CR4: 0000000000160ee0
[ +0.000809] Call Trace:
[ +0.000803] nfs4_copy_file_range+0x8b/0x120 [nfsv4]
[ +0.000793] vfs_copy_file_range+0x135/0x360
[ +0.000768] __se_sys_copy_file_range+0xce/0x1f0
[ +0.000756] do_syscall_64+0x5b/0x170
[ +0.000765] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ +0.000728] RIP: 0033:0x7f7bf840a40d
[ +0.000718] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8
48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0
ff ff 73 01 c3 48 8b 0d 23 7a 0c 00 f7 d8 64 89 01 48
[ +0.001446] RSP: 002b:00007ffd9ee8c158 EFLAGS: 00000202 ORIG_RAX:
0000000000000146
[ +0.000771] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7bf840a40d
[ +0.000727] RDX: 0000000000000004 RSI: 0000000000000000 RDI: 0000000000000003
[ +0.000709] RBP: 00007ffd9ee8c1a0 R08: 0000000080000000 R09: 0000000000000000
[ +0.000692] R10: 0000000000000000 R11: 0000000000000202 R12: 0000561b5b15e720
[ +0.000680] R13: 00007ffd9ee8c360 R14: 0000000000000000 R15: 0000000000000000
[ +0.000667] Modules linked in: rpcsec_gss_krb5 nfsv4 nfs fscache rpcrdma
ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt
target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_uverbs ib_umad
rdma_cm ib_cm iw_cm ib_core crct10dif_pclmul crc32_pclmul cfg80211
ghash_clmulni_intel nfsd joydev mousedev psmouse aesni_intel auth_rpcgss rfkill
aes_x86_64 8021q crypto_simd nfs_acl cryptd lockd mrp input_leds glue_helper
led_class grace evdev pcspkr intel_agp i2c_piix4 intel_gtt sunrpc mac_hid
ip_tables x_tables ata_generic pata_acpi ata_piix libata scsi_mod serio_raw
atkbd libps2 i8042 floppy serio xfs virtio_gpu drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops ttm drm libcrc32c crc32c_generic crc32c_intel
virtio_balloon virtio_net net_failover failover agpgart virtio_pci virtio_blk
virtio_ring virtio
[ +0.004802] CR2: 0000000000000008
[ +0.000742] ---[ end trace 24756b969e170fa4 ]---


Thanks,
Anna
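
A note on the trace above: CR2 is 0x8 and both RDI and RSI are zero at the fault in nfs4_check_serverowner_major_id(), which is consistent with reading a field at offset 8 through a NULL owner pointer, as the report suggests. The sketch below is a user-space model of that situation, not the fix applied to the series. struct server_owner is an assumed stand-in for struct nfs41_server_owner, OPAQUE_LIMIT stands in for NFS4_OPAQUE_LIMIT, and owners_match_major_id() is a hypothetical guarded variant of the comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPAQUE_LIMIT 1024 /* stand-in for NFS4_OPAQUE_LIMIT */

/* Assumed layout: 8-byte minor id, then the major id length and bytes. */
struct server_owner {
	uint64_t minor_id;
	uint32_t major_id_sz;
	char major_id[OPAQUE_LIMIT];
};

/*
 * Hypothetical guarded comparison: a missing owner (for example on a
 * mount that never negotiated one) is reported as "no match" instead
 * of being dereferenced.
 */
static bool owners_match_major_id(const struct server_owner *o1,
				  const struct server_owner *o2)
{
	if (o1 == NULL || o2 == NULL)
		return false;

	/* Sanity bound before handing the length to memcmp(). */
	assert(o1->major_id_sz <= OPAQUE_LIMIT);

	if (o1->major_id_sz != o2->major_id_sz)
		return false;
	return memcmp(o1->major_id, o2->major_id, o1->major_id_sz) == 0;
}
```

Returning false for a missing owner is only one option. Whether an NFSv4.0 mount should count as "same server", as "different server", or be rejected before the check runs is a policy decision for the patch author.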

> + cn_resp = kzalloc(sizeof(struct nfs42_copy_notify_res),
> + GFP_NOFS);
> + if (unlikely(cn_resp == NULL))
> + return -ENOMEM;
> +
> + ret = nfs42_proc_copy_notify(file_in, file_out, cn_resp);
> + if (ret)
> + goto out;
> + }
> +
> ret = nfs42_proc_copy(file_in, pos_in, file_out, pos_out, count);
> +out:
> + kfree(cn_resp);
> if (ret == -EAGAIN)
> goto retry;
> return ret;
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index db84b4a..fec6e6b 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -9692,6 +9692,7 @@ static bool nfs4_match_stateid(const nfs4_stateid *s1,
> | NFS_CAP_ALLOCATE
> | NFS_CAP_COPY
> | NFS_CAP_OFFLOAD_CANCEL
> + | NFS_CAP_COPY_NOTIFY
> | NFS_CAP_DEALLOCATE
> | NFS_CAP_SEEK
> | NFS_CAP_LAYOUTSTATS
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index b7bde12..2163900 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -7790,6 +7790,7 @@ int nfs4_decode_dirent(struct xdr_stream *xdr, struct
> nfs_entry *entry,
> PROC42(CLONE, enc_clone, dec_clone),
> PROC42(COPY, enc_copy, dec_copy),
> PROC42(OFFLOAD_CANCEL, enc_offload_cancel, dec_offload_cancel),
> + PROC42(COPY_NOTIFY, enc_copy_notify, dec_copy_notify),
> PROC(LOOKUPP, enc_lookupp, dec_lookupp),
> };
>
> diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
> index 4803507..9e49a6c 100644
> --- a/include/linux/nfs4.h
> +++ b/include/linux/nfs4.h
> @@ -537,6 +537,7 @@ enum {
> NFSPROC4_CLNT_CLONE,
> NFSPROC4_CLNT_COPY,
> NFSPROC4_CLNT_OFFLOAD_CANCEL,
> + NFSPROC4_CLNT_COPY_NOTIFY,
>
> NFSPROC4_CLNT_LOOKUPP,
> };
> diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
> index 0fc0b91..e5d89ff 100644
> --- a/include/linux/nfs_fs_sb.h
> +++ b/include/linux/nfs_fs_sb.h
> @@ -261,5 +261,6 @@ struct nfs_server {
> #define NFS_CAP_CLONE (1U << 23)
> #define NFS_CAP_COPY (1U << 24)
> #define NFS_CAP_OFFLOAD_CANCEL (1U << 25)
> +#define NFS_CAP_COPY_NOTIFY (1U << 26)
>
> #endif
> diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> index 0e01625..dfc59bc 100644
> --- a/include/linux/nfs_xdr.h
> +++ b/include/linux/nfs_xdr.h
> @@ -1428,6 +1428,22 @@ struct nfs42_offload_status_res {
> int osr_status;
> };
>
> +struct nfs42_copy_notify_args {
> + struct nfs4_sequence_args cna_seq_args;
> +
> + struct nfs_fh *cna_src_fh;
> + nfs4_stateid cna_src_stateid;
> + struct nl4_server cna_dst;
> +};
> +
> +struct nfs42_copy_notify_res {
> + struct nfs4_sequence_res cnr_seq_res;
> +
> + struct nfstime4 cnr_lease_time;
> + nfs4_stateid cnr_stateid;
> + struct nl4_server cnr_src;
> +};
> +
> struct nfs42_seek_args {
> struct nfs4_sequence_args seq_args;
>
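
For readers following the result structure quoted above: decode_copy_notify() elsewhere in this patch reads 12 bytes for cnr_lease_time, a 64-bit seconds value followed by a 32-bit nanoseconds value, both big-endian on the wire. A minimal user-space sketch of that step, assuming a plain byte buffer instead of an xdr_stream; struct lease_time, get_be32() and decode_lease_time() are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct nfstime4 as used by cnr_lease_time. */
struct lease_time {
	uint64_t seconds;
	uint32_t nseconds;
};

/* Read one big-endian 32-bit word. */
static uint32_t get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if fewer than 12 bytes are available. */
static int decode_lease_time(const unsigned char *buf, size_t len,
			     struct lease_time *out)
{
	assert(buf != NULL && out != NULL);

	if (len < 12)
		return -1;
	/* XDR hyper: high word first, then low word. */
	out->seconds = ((uint64_t)get_be32(buf) << 32) | get_be32(buf + 4);
	out->nseconds = get_be32(buf + 8);
	return 0;
}
```

The short-buffer check plays the role of the xdr_inline_decode() failure path in the patch; the kernel code reports that case as -EIO.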

2018-10-31 15:28:45

by Olga Kornievskaia

Subject: Re: [PATCH v7 04/11] NFS: add COPY_NOTIFY operation

On Wed, Oct 31, 2018 at 11:07 AM Schumaker, Anna
<[email protected]> wrote:
>
> Hi Olga,
>
> On Tue, 2018-10-30 at 16:56 -0400, Olga Kornievskaia wrote:
> > From: Olga Kornievskaia <[email protected]>
> >
> > Try using the delegation stateid, then the open stateid.
> >
> > Only NL4_NETADDR, no support for NL4_NAME and NL4_URL.
> > Allow only one source server address to be returned for now.
> >
> > To distinguish between same server copy offload ("intra") and
> > a copy between different servers ("inter"), do a check of server
> > owner identity.
> >
> > Reviewed-by: Jeff Layton <[email protected]>
> > Signed-off-by: Andy Adamson <[email protected]>
> > Signed-off-by: Olga Kornievskaia <[email protected]>
> > ---
> > fs/nfs/nfs42.h | 12 +++
> > fs/nfs/nfs42proc.c | 91 +++++++++++++++++++++++
> > fs/nfs/nfs42xdr.c | 181
> > ++++++++++++++++++++++++++++++++++++++++++++++
> > fs/nfs/nfs4_fs.h | 2 +
> > fs/nfs/nfs4client.c | 2 +-
> > fs/nfs/nfs4file.c | 14 ++++
> > fs/nfs/nfs4proc.c | 1 +
> > fs/nfs/nfs4xdr.c | 1 +
> > include/linux/nfs4.h | 1 +
> > include/linux/nfs_fs_sb.h | 1 +
> > include/linux/nfs_xdr.h | 16 ++++
> > 11 files changed, 321 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/nfs/nfs42.h b/fs/nfs/nfs42.h
> > index 19ec38f8..bbe49a3 100644
> > --- a/fs/nfs/nfs42.h
> > +++ b/fs/nfs/nfs42.h
> > @@ -13,6 +13,7 @@
> > #define PNFS_LAYOUTSTATS_MAXDEV (4)
> >
> > /* nfs4.2proc.c */
> > +#ifdef CONFIG_NFS_V4_2
> > int nfs42_proc_allocate(struct file *, loff_t, loff_t);
> > ssize_t nfs42_proc_copy(struct file *, loff_t, struct file *, loff_t,
> > size_t);
> > int nfs42_proc_deallocate(struct file *, loff_t, loff_t);
> > @@ -20,5 +21,16 @@
> > int nfs42_proc_layoutstats_generic(struct nfs_server *,
> > struct nfs42_layoutstat_data *);
> > int nfs42_proc_clone(struct file *, struct file *, loff_t, loff_t, loff_t);
> > +int nfs42_proc_copy_notify(struct file *, struct file *,
> > + struct nfs42_copy_notify_res *);
> > +static inline bool nfs42_files_from_same_server(struct file *in,
> > + struct file *out)
> > +{
> > + struct nfs_client *c_in = (NFS_SERVER(file_inode(in)))->nfs_client;
> > + struct nfs_client *c_out = (NFS_SERVER(file_inode(out)))->nfs_client;
> >
> > + return nfs4_check_serverowner_major_id(c_in->cl_serverowner,
> > + c_out->cl_serverowner);
> > +}
> > +#endif /* CONFIG_NFS_V4_2 */
> > #endif /* __LINUX_FS_NFS_NFS4_2_H */
> > diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
> > index ac5b784..b1c57a4 100644
> > --- a/fs/nfs/nfs42proc.c
> > +++ b/fs/nfs/nfs42proc.c
> > @@ -3,6 +3,7 @@
> > * Copyright (c) 2014 Anna Schumaker <[email protected]>
> > */
> > #include <linux/fs.h>
> > +#include <linux/sunrpc/addr.h>
> > #include <linux/sunrpc/sched.h>
> > #include <linux/nfs.h>
> > #include <linux/nfs3.h>
> > @@ -15,10 +16,30 @@
> > #include "pnfs.h"
> > #include "nfs4session.h"
> > #include "internal.h"
> > +#include "delegation.h"
> >
> > #define NFSDBG_FACILITY NFSDBG_PROC
> > static int nfs42_do_offload_cancel_async(struct file *dst, nfs4_stateid
> > *std);
> >
> > +static void nfs42_set_netaddr(struct file *filep, struct nfs42_netaddr
> > *naddr)
> > +{
> > + struct nfs_client *clp = (NFS_SERVER(file_inode(filep)))->nfs_client;
> > + unsigned short port = 2049;
> > +
> > + rcu_read_lock();
> > + naddr->netid_len = scnprintf(naddr->netid,
> > + sizeof(naddr->netid), "%s",
> > + rpc_peeraddr2str(clp->cl_rpcclient,
> > + RPC_DISPLAY_NETID));
> > + naddr->addr_len = scnprintf(naddr->addr,
> > + sizeof(naddr->addr),
> > + "%s.%u.%u",
> > + rpc_peeraddr2str(clp->cl_rpcclient,
> > + RPC_DISPLAY_ADDR),
> > + port >> 8, port & 255);
> > + rcu_read_unlock();
> > +}
> > +
> > static int _nfs42_proc_fallocate(struct rpc_message *msg, struct file *filep,
> > struct nfs_lock_context *lock, loff_t offset, loff_t len)
> > {
> > @@ -461,6 +482,76 @@ static int nfs42_do_offload_cancel_async(struct file
> > *dst,
> > return status;
> > }
> >
> > +int _nfs42_proc_copy_notify(struct file *src, struct file *dst,
> > + struct nfs42_copy_notify_args *args,
> > + struct nfs42_copy_notify_res *res)
> > +{
> > + struct nfs_server *src_server = NFS_SERVER(file_inode(src));
> > + struct rpc_message msg = {
> > + .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_COPY_NOTIFY],
> > + .rpc_argp = args,
> > + .rpc_resp = res,
> > + };
> > + int status;
> > + struct nfs_open_context *ctx;
> > + struct nfs_lock_context *l_ctx;
> > +
> > + ctx = get_nfs_open_context(nfs_file_open_context(src));
> > + l_ctx = nfs_get_lock_context(ctx);
> > + if (IS_ERR(l_ctx))
> > + return PTR_ERR(l_ctx);
> > +
> > + status = nfs4_set_rw_stateid(&args->cna_src_stateid, ctx, l_ctx,
> > + FMODE_READ);
> > + nfs_put_lock_context(l_ctx);
> > + if (status)
> > + return status;
> > +
> > + status = nfs4_call_sync(src_server->client, src_server, &msg,
> > + &args->cna_seq_args, &res->cnr_seq_res, 0);
> > + if (status == -ENOTSUPP)
> > + src_server->caps &= ~NFS_CAP_COPY_NOTIFY;
> > +
> > + put_nfs_open_context(nfs_file_open_context(src));
> > + return status;
> > +}
> > +
> > +int nfs42_proc_copy_notify(struct file *src, struct file *dst,
> > + struct nfs42_copy_notify_res *res)
> > +{
> > + struct nfs_server *src_server = NFS_SERVER(file_inode(src));
> > + struct nfs42_copy_notify_args *args;
> > + struct nfs4_exception exception = {
> > + .inode = file_inode(src),
> > + };
> > + int status;
> > +
> > + if (!(src_server->caps & NFS_CAP_COPY_NOTIFY))
> > + return -EOPNOTSUPP;
> > +
> > + args = kzalloc(sizeof(struct nfs42_copy_notify_args), GFP_NOFS);
> > + if (args == NULL)
> > + return -ENOMEM;
> > +
> > + args->cna_src_fh = NFS_FH(file_inode(src)),
> > + args->cna_dst.nl4_type = NL4_NETADDR;
> > + nfs42_set_netaddr(dst, &args->cna_dst.u.nl4_addr);
> > + exception.stateid = &args->cna_src_stateid;
> > +
> > + do {
> > + status = _nfs42_proc_copy_notify(src, dst, args, res);
> > + if (status == -ENOTSUPP) {
> > + status = -EOPNOTSUPP;
> > + goto out;
> > + }
> > + status = nfs4_handle_exception(src_server, status, &exception);
> > + } while (exception.retry);
> > +
> > +out:
> > + kfree(args);
> > + return status;
> > +}
> > +
> > static loff_t _nfs42_proc_llseek(struct file *filep,
> > struct nfs_lock_context *lock, loff_t offset, int whence)
> > {
> > diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
> > index 69f72ed..e6e7cbf 100644
> > --- a/fs/nfs/nfs42xdr.c
> > +++ b/fs/nfs/nfs42xdr.c
> > @@ -29,6 +29,16 @@
> > #define encode_offload_cancel_maxsz (op_encode_hdr_maxsz + \
> > XDR_QUADLEN(NFS4_STATEID_SIZE))
> > #define decode_offload_cancel_maxsz (op_decode_hdr_maxsz)
> > +#define encode_copy_notify_maxsz (op_encode_hdr_maxsz + \
> > + XDR_QUADLEN(NFS4_STATEID_SIZE) + \
> > + 1 + /* nl4_type */ \
> > + 1 + XDR_QUADLEN(NFS4_OPAQUE_LIMIT))
> > +#define decode_copy_notify_maxsz (op_decode_hdr_maxsz + \
> > + 3 + /* cnr_lease_time */\
> > + XDR_QUADLEN(NFS4_STATEID_SIZE) + \
> > + 1 + /* Support 1 cnr_source_server */\
> > + 1 + /* nl4_type */ \
> > + 1 + XDR_QUADLEN(NFS4_OPAQUE_LIMIT))
> > #define encode_deallocate_maxsz (op_encode_hdr_maxsz + \
> > encode_fallocate_maxsz)
> > #define decode_deallocate_maxsz (op_decode_hdr_maxsz)
> > @@ -84,6 +94,12 @@
> > #define NFS4_dec_offload_cancel_sz (compound_decode_hdr_maxsz + \
> > decode_putfh_maxsz + \
> > decode_offload_cancel_maxsz)
> > +#define NFS4_enc_copy_notify_sz (compound_encode_hdr_maxsz + \
> > + encode_putfh_maxsz + \
> > + encode_copy_notify_maxsz)
> > +#define NFS4_dec_copy_notify_sz (compound_decode_hdr_maxsz + \
> > + decode_putfh_maxsz + \
> > + decode_copy_notify_maxsz)
> > #define NFS4_enc_deallocate_sz (compound_encode_hdr_maxsz + \
> > encode_putfh_maxsz + \
> > encode_deallocate_maxsz + \
> > @@ -137,6 +153,25 @@ static void encode_allocate(struct xdr_stream *xdr,
> > encode_fallocate(xdr, args);
> > }
> >
> > +static void encode_nl4_server(struct xdr_stream *xdr, const struct nl4_server
> > *ns)
> > +{
> > + encode_uint32(xdr, ns->nl4_type);
> > + switch (ns->nl4_type) {
> > + case NL4_NAME:
> > + case NL4_URL:
> > + encode_string(xdr, ns->u.nl4_str_sz, ns->u.nl4_str);
> > + break;
> > + case NL4_NETADDR:
> > + encode_string(xdr, ns->u.nl4_addr.netid_len,
> > + ns->u.nl4_addr.netid);
> > + encode_string(xdr, ns->u.nl4_addr.addr_len,
> > + ns->u.nl4_addr.addr);
> > + break;
> > + default:
> > + WARN_ON_ONCE(1);
> > + }
> > +}
> > +
> > static void encode_copy(struct xdr_stream *xdr,
> > const struct nfs42_copy_args *args,
> > struct compound_hdr *hdr)
> > @@ -162,6 +197,15 @@ static void encode_offload_cancel(struct xdr_stream *xdr,
> > encode_nfs4_stateid(xdr, &args->osa_stateid);
> > }
> >
> > +static void encode_copy_notify(struct xdr_stream *xdr,
> > + const struct nfs42_copy_notify_args *args,
> > + struct compound_hdr *hdr)
> > +{
> > + encode_op_hdr(xdr, OP_COPY_NOTIFY, decode_copy_notify_maxsz, hdr);
> > + encode_nfs4_stateid(xdr, &args->cna_src_stateid);
> > + encode_nl4_server(xdr, &args->cna_dst);
> > +}
> > +
> > static void encode_deallocate(struct xdr_stream *xdr,
> > const struct nfs42_falloc_args *args,
> > struct compound_hdr *hdr)
> > @@ -298,6 +342,25 @@ static void nfs4_xdr_enc_offload_cancel(struct rpc_rqst
> > *req,
> > }
> >
> > /*
> > + * Encode COPY_NOTIFY request
> > + */
> > +static void nfs4_xdr_enc_copy_notify(struct rpc_rqst *req,
> > + struct xdr_stream *xdr,
> > + const void *data)
> > +{
> > + const struct nfs42_copy_notify_args *args = data;
> > + struct compound_hdr hdr = {
> > + .minorversion = nfs4_xdr_minorversion(&args->cna_seq_args),
> > + };
> > +
> > + encode_compound_hdr(xdr, req, &hdr);
> > + encode_sequence(xdr, &args->cna_seq_args, &hdr);
> > + encode_putfh(xdr, args->cna_src_fh, &hdr);
> > + encode_copy_notify(xdr, args, &hdr);
> > + encode_nops(&hdr);
> > +}
> > +
> > +/*
> > * Encode DEALLOCATE request
> > */
> > static void nfs4_xdr_enc_deallocate(struct rpc_rqst *req,
> > @@ -416,6 +479,58 @@ static int decode_write_response(struct xdr_stream *xdr,
> > return -EIO;
> > }
> >
> > +static int decode_nl4_server(struct xdr_stream *xdr, struct nl4_server *ns)
> > +{
> > + struct nfs42_netaddr *naddr;
> > + uint32_t dummy;
> > + char *dummy_str;
> > + __be32 *p;
> > + int status;
> > +
> > + /* nl_type */
> > + p = xdr_inline_decode(xdr, 4);
> > + if (unlikely(!p))
> > + return -EIO;
> > + ns->nl4_type = be32_to_cpup(p);
> > + switch (ns->nl4_type) {
> > + case NL4_NAME:
> > + case NL4_URL:
> > + status = decode_opaque_inline(xdr, &dummy, &dummy_str);
> > + if (unlikely(status))
> > + return status;
> > + if (unlikely(dummy > NFS4_OPAQUE_LIMIT))
> > + return -EIO;
> > + memcpy(&ns->u.nl4_str, dummy_str, dummy);
> > + ns->u.nl4_str_sz = dummy;
> > + break;
> > + case NL4_NETADDR:
> > + naddr = &ns->u.nl4_addr;
> > +
> > + /* netid string */
> > + status = decode_opaque_inline(xdr, &dummy, &dummy_str);
> > + if (unlikely(status))
> > + return status;
> > + if (unlikely(dummy > RPCBIND_MAXNETIDLEN))
> > + return -EIO;
> > + naddr->netid_len = dummy;
> > + memcpy(naddr->netid, dummy_str, naddr->netid_len);
> > +
> > + /* uaddr string */
> > + status = decode_opaque_inline(xdr, &dummy, &dummy_str);
> > + if (unlikely(status))
> > + return status;
> > + if (unlikely(dummy > RPCBIND_MAXUADDRLEN))
> > + return -EIO;
> > + naddr->addr_len = dummy;
> > + memcpy(naddr->addr, dummy_str, naddr->addr_len);
> > + break;
> > + default:
> > + WARN_ON_ONCE(1);
> > + return -EIO;
> > + }
> > + return 0;
> > +}
> > +
> > static int decode_copy_requirements(struct xdr_stream *xdr,
> > struct nfs42_copy_res *res) {
> > __be32 *p;
> > @@ -458,6 +573,46 @@ static int decode_offload_cancel(struct xdr_stream *xdr,
> > return decode_op_hdr(xdr, OP_OFFLOAD_CANCEL);
> > }
> >
> > +static int decode_copy_notify(struct xdr_stream *xdr,
> > + struct nfs42_copy_notify_res *res)
> > +{
> > + __be32 *p;
> > + int status, count;
> > +
> > + status = decode_op_hdr(xdr, OP_COPY_NOTIFY);
> > + if (status)
> > + return status;
> > + /* cnr_lease_time */
> > + p = xdr_inline_decode(xdr, 12);
> > + if (unlikely(!p))
> > + goto out_overflow;
> > + p = xdr_decode_hyper(p, &res->cnr_lease_time.seconds);
> > + res->cnr_lease_time.nseconds = be32_to_cpup(p);
> > +
> > + status = decode_opaque_fixed(xdr, &res->cnr_stateid, NFS4_STATEID_SIZE);
> > + if (unlikely(status))
> > + goto out_overflow;
> > +
> > + /* number of source addresses */
> > + p = xdr_inline_decode(xdr, 4);
> > + if (unlikely(!p))
> > + goto out_overflow;
> > +
> > + count = be32_to_cpup(p);
> > + if (count > 1)
> > + pr_warn("NFS: %s: nsvr %d > Supported. Use first servers\n",
> > + __func__, count);
> > +
> > + status = decode_nl4_server(xdr, &res->cnr_src);
> > + if (unlikely(status))
> > + goto out_overflow;
> > + return 0;
> > +
> > +out_overflow:
> > + print_overflow_msg(__func__, xdr);
> > + return -EIO;
> > +}
> > +
> > static int decode_deallocate(struct xdr_stream *xdr, struct nfs42_falloc_res
> > *res)
> > {
> > return decode_op_hdr(xdr, OP_DEALLOCATE);
> > @@ -585,6 +740,32 @@ static int nfs4_xdr_dec_offload_cancel(struct rpc_rqst
> > *rqstp,
> > }
> >
> > /*
> > + * Decode COPY_NOTIFY response
> > + */
> > +static int nfs4_xdr_dec_copy_notify(struct rpc_rqst *rqstp,
> > + struct xdr_stream *xdr,
> > + void *data)
> > +{
> > + struct nfs42_copy_notify_res *res = data;
> > + struct compound_hdr hdr;
> > + int status;
> > +
> > + status = decode_compound_hdr(xdr, &hdr);
> > + if (status)
> > + goto out;
> > + status = decode_sequence(xdr, &res->cnr_seq_res, rqstp);
> > + if (status)
> > + goto out;
> > + status = decode_putfh(xdr);
> > + if (status)
> > + goto out;
> > + status = decode_copy_notify(xdr, res);
> > +
> > +out:
> > + return status;
> > +}
> > +
> > +/*
> > * Decode DEALLOCATE request
> > */
> > static int nfs4_xdr_dec_deallocate(struct rpc_rqst *rqstp,
> > diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
> > index 8d59c96..7d17b31 100644
> > --- a/fs/nfs/nfs4_fs.h
> > +++ b/fs/nfs/nfs4_fs.h
> > @@ -460,6 +460,8 @@ int nfs41_discover_server_trunking(struct nfs_client *clp,
> > struct nfs_client **, struct rpc_cred *);
> > extern void nfs4_schedule_session_recovery(struct nfs4_session *, int);
> > extern void nfs41_notify_server(struct nfs_client *);
> > +bool nfs4_check_serverowner_major_id(struct nfs41_server_owner *o1,
> > + struct nfs41_server_owner *o2);
> > #else
> > static inline void nfs4_schedule_session_recovery(struct nfs4_session
> > *session, int err)
> > {
> > diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> > index 8f53455..ac00eb8 100644
> > --- a/fs/nfs/nfs4client.c
> > +++ b/fs/nfs/nfs4client.c
> > @@ -625,7 +625,7 @@ int nfs40_walk_client_list(struct nfs_client *new,
> > /*
> > * Returns true if the server major ids match
> > */
> > -static bool
> > +bool
> > nfs4_check_serverowner_major_id(struct nfs41_server_owner *o1,
> > struct nfs41_server_owner *o2)
> > {
> > diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
> > index 7838bdf..beda4b3 100644
> > --- a/fs/nfs/nfs4file.c
> > +++ b/fs/nfs/nfs4file.c
> > @@ -133,6 +133,7 @@ static ssize_t nfs4_copy_file_range(struct file *file_in,
> > loff_t pos_in,
> > struct file *file_out, loff_t pos_out,
> > size_t count, unsigned int flags)
> > {
> > + struct nfs42_copy_notify_res *cn_resp = NULL;
> > ssize_t ret;
> >
> > if (pos_in >= i_size_read(file_inode(file_in)))
> > @@ -144,7 +145,20 @@ static ssize_t nfs4_copy_file_range(struct file *file_in,
> > loff_t pos_in,
> > if (file_inode(file_in) == file_inode(file_out))
> > return -EINVAL;
> > retry:
> > + if (!nfs42_files_from_same_server(file_in, file_out)) {
>
> I'm seeing this crash when I try to use vfs_copy_file_range() on NFS v4.0. I
> think it's because clients don't have a cl_serverowner defined in this case:

Thanks for the catch. Will fix it.

> [ +0.051545] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> [ +0.002032] PGD 0 P4D 0
> [ +0.002021] Oops: 0000 [#4] PREEMPT SMP PTI
> [ +0.001980] CPU: 1 PID: 1194 Comm: nfscopy Tainted: G D 4.19.0-ANNA+ #2124
> [ +0.001386] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [ +0.001266] RIP: 0010:nfs4_check_serverowner_major_id+0x5/0x30 [nfsv4]
> [ +0.001254] Code: ff ff 48 8b 7c 24 10 eb 95 41 bf da d8 ff ff eb da e8 ef ec
> eb d3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 <8b> 57 08 31
> c0 3b 56 08 75 12 48 83 c6 0c 48 83 c7 0c e8 64 06 63
> [ +0.002487] RSP: 0018:ffffac0b40b77e50 EFLAGS: 00010246
> [ +0.001233] RAX: ffff8eb6f45a6000 RBX: ffff8eb77738e500 RCX: 0000000000000000
> [ +0.001218] RDX: ffff8eb6f45a6000 RSI: 0000000000000000 RDI: 0000000000000000
> [ +0.001271] RBP: ffff8eb6f533eb00 R08: 0000000080000000 R09: ffff8eb778c10800
> [ +0.000956] R10: ffff8eb77aa36f98 R11: ffff8eb77ab66320 R12: 0000000000000000
> [ +0.000848] R13: 0000000080000000 R14: 0000000000000000 R15: 0000000080000000
> [ +0.000880] FS: 00007f7bf84d9500(0000) GS:ffff8eb77cb00000(0000)
> knlGS:0000000000000000
> [ +0.000876] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ +0.000815] CR2: 0000000000000008 CR3: 00000000b3e80005 CR4: 0000000000160ee0
> [ +0.000809] Call Trace:
> [ +0.000803] nfs4_copy_file_range+0x8b/0x120 [nfsv4]
> [ +0.000793] vfs_copy_file_range+0x135/0x360
> [ +0.000768] __se_sys_copy_file_range+0xce/0x1f0
> [ +0.000756] do_syscall_64+0x5b/0x170
> [ +0.000765] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ +0.000728] RIP: 0033:0x7f7bf840a40d
> [ +0.000718] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8
> 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0
> ff ff 73 01 c3 48 8b 0d 23 7a 0c 00 f7 d8 64 89 01 48
> [ +0.001446] RSP: 002b:00007ffd9ee8c158 EFLAGS: 00000202 ORIG_RAX:
> 0000000000000146
> [ +0.000771] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7bf840a40d
> [ +0.000727] RDX: 0000000000000004 RSI: 0000000000000000 RDI: 0000000000000003
> [ +0.000709] RBP: 00007ffd9ee8c1a0 R08: 0000000080000000 R09: 0000000000000000
> [ +0.000692] R10: 0000000000000000 R11: 0000000000000202 R12: 0000561b5b15e720
> [ +0.000680] R13: 00007ffd9ee8c360 R14: 0000000000000000 R15: 0000000000000000
> [ +0.000667] Modules linked in: rpcsec_gss_krb5 nfsv4 nfs fscache rpcrdma
> ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt
> target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_uverbs ib_umad
> rdma_cm ib_cm iw_cm ib_core crct10dif_pclmul crc32_pclmul cfg80211
> ghash_clmulni_intel nfsd joydev mousedev psmouse aesni_intel auth_rpcgss rfkill
> aes_x86_64 8021q crypto_simd nfs_acl cryptd lockd mrp input_leds glue_helper
> led_class grace evdev pcspkr intel_agp i2c_piix4 intel_gtt sunrpc mac_hid
> ip_tables x_tables ata_generic pata_acpi ata_piix libata scsi_mod serio_raw
> atkbd libps2 i8042 floppy serio xfs virtio_gpu drm_kms_helper syscopyarea
> sysfillrect sysimgblt fb_sys_fops ttm drm libcrc32c crc32c_generic crc32c_intel
> virtio_balloon virtio_net net_failover failover agpgart virtio_pci virtio_blk
> virtio_ring virtio
> [ +0.004802] CR2: 0000000000000008
> [ +0.000742] ---[ end trace 24756b969e170fa4 ]---
>
>
> Thanks,
> Anna
>
> > + cn_resp = kzalloc(sizeof(struct nfs42_copy_notify_res),
> > + GFP_NOFS);
> > + if (unlikely(cn_resp == NULL))
> > + return -ENOMEM;
> > +
> > + ret = nfs42_proc_copy_notify(file_in, file_out, cn_resp);
> > + if (ret)
> > + goto out;
> > + }
> > +
> > ret = nfs42_proc_copy(file_in, pos_in, file_out, pos_out, count);
> > +out:
> > + kfree(cn_resp);
> > if (ret == -EAGAIN)
> > goto retry;
> > return ret;
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > index db84b4a..fec6e6b 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -9692,6 +9692,7 @@ static bool nfs4_match_stateid(const nfs4_stateid *s1,
> > | NFS_CAP_ALLOCATE
> > | NFS_CAP_COPY
> > | NFS_CAP_OFFLOAD_CANCEL
> > + | NFS_CAP_COPY_NOTIFY
> > | NFS_CAP_DEALLOCATE
> > | NFS_CAP_SEEK
> > | NFS_CAP_LAYOUTSTATS
> > diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> > index b7bde12..2163900 100644
> > --- a/fs/nfs/nfs4xdr.c
> > +++ b/fs/nfs/nfs4xdr.c
> > @@ -7790,6 +7790,7 @@ int nfs4_decode_dirent(struct xdr_stream *xdr, struct
> > nfs_entry *entry,
> > PROC42(CLONE, enc_clone, dec_clone),
> > PROC42(COPY, enc_copy, dec_copy),
> > PROC42(OFFLOAD_CANCEL, enc_offload_cancel, dec_offload_cancel),
> > + PROC42(COPY_NOTIFY, enc_copy_notify, dec_copy_notify),
> > PROC(LOOKUPP, enc_lookupp, dec_lookupp),
> > };
> >
> > diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
> > index 4803507..9e49a6c 100644
> > --- a/include/linux/nfs4.h
> > +++ b/include/linux/nfs4.h
> > @@ -537,6 +537,7 @@ enum {
> > NFSPROC4_CLNT_CLONE,
> > NFSPROC4_CLNT_COPY,
> > NFSPROC4_CLNT_OFFLOAD_CANCEL,
> > + NFSPROC4_CLNT_COPY_NOTIFY,
> >
> > NFSPROC4_CLNT_LOOKUPP,
> > };
> > diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
> > index 0fc0b91..e5d89ff 100644
> > --- a/include/linux/nfs_fs_sb.h
> > +++ b/include/linux/nfs_fs_sb.h
> > @@ -261,5 +261,6 @@ struct nfs_server {
> > #define NFS_CAP_CLONE (1U << 23)
> > #define NFS_CAP_COPY (1U << 24)
> > #define NFS_CAP_OFFLOAD_CANCEL (1U << 25)
> > +#define NFS_CAP_COPY_NOTIFY (1U << 26)
> >
> > #endif
> > diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> > index 0e01625..dfc59bc 100644
> > --- a/include/linux/nfs_xdr.h
> > +++ b/include/linux/nfs_xdr.h
> > @@ -1428,6 +1428,22 @@ struct nfs42_offload_status_res {
> > int osr_status;
> > };
> >
> > +struct nfs42_copy_notify_args {
> > + struct nfs4_sequence_args cna_seq_args;
> > +
> > + struct nfs_fh *cna_src_fh;
> > + nfs4_stateid cna_src_stateid;
> > + struct nl4_server cna_dst;
> > +};
> > +
> > +struct nfs42_copy_notify_res {
> > + struct nfs4_sequence_res cnr_seq_res;
> > +
> > + struct nfstime4 cnr_lease_time;
> > + nfs4_stateid cnr_stateid;
> > + struct nl4_server cnr_src;
> > +};
> > +
> > struct nfs42_seek_args {
> > struct nfs4_sequence_args seq_args;
> >

2018-10-31 18:54:33

by Goldwyn Rodrigues

Subject: Re: [PATCH v7 01/11] VFS: move cross device copy_file_range() check into filesystems

On 16:56 30/10, Olga Kornievskaia wrote:
> From: Olga Kornievskaia <[email protected]>
>
> This patch makes it the responsibility of individual filesystems to
> allow or deny cross device copies. Both NFS and CIFS have operations
> for cross-server copies, and later patches will implement this feature.
>
> Note that as of this patch, the copy_file_range() function might be passed
> superblocks from different filesystem types. -EXDEV should be returned
> if cross device copies aren't supported.
>
> Reviewed-by: Amir Goldstein <[email protected]>
> Reviewed-by: Matthew Wilcox <[email protected]>
> Reviewed-by: Steve French <[email protected]>
> Reviewed-by: Jeff Layton <[email protected]>
> Signed-off-by: Olga Kornievskaia <[email protected]>
> ---
> Documentation/filesystems/porting | 7 +++++++
> fs/cifs/cifsfs.c | 3 +++
> fs/nfs/nfs4file.c | 3 +++
> fs/overlayfs/file.c | 3 +++
> fs/read_write.c | 12 +++++++-----
> 5 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
> index 7b7b845..897e1e7 100644
> --- a/Documentation/filesystems/porting
> +++ b/Documentation/filesystems/porting
> @@ -622,3 +622,10 @@ in your dentry operations instead.
> alloc_file_clone(file, flags, ops) does not affect any caller's references.
> On success you get a new struct file sharing the mount/dentry with the
> original, on failure - ERR_PTR().
> +--
> +[mandatory]
> + ->copy_file_range() may now be passed files which belong to two
> + different superblocks of the same file system type or which belong
> + to two different filesystem types altogether. As before, the
> + destination's copy_file_range() is the function which is called.
> + If it cannot copy ranges from the source, it should return -EXDEV.
> diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
> index 7065426..ca8fc87 100644
> --- a/fs/cifs/cifsfs.c
> +++ b/fs/cifs/cifsfs.c
> @@ -1114,6 +1114,9 @@ static ssize_t cifs_copy_file_range(struct file *src_file, loff_t off,
> unsigned int xid = get_xid();
> ssize_t rc;
>
> + if (file_inode(src_file)->i_sb != file_inode(dst_file)->i_sb)
> + return -EXDEV;
> +
> rc = cifs_file_copychunk_range(xid, src_file, off, dst_file, destoff,
> len, flags);
> free_xid(xid);
> diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
> index 4288a6e..5a73c90 100644
> --- a/fs/nfs/nfs4file.c
> +++ b/fs/nfs/nfs4file.c
> @@ -135,6 +135,9 @@ static ssize_t nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
> {
> ssize_t ret;
>
> + if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb)
> + return -EXDEV;
> +
> if (file_inode(file_in) == file_inode(file_out))
> return -EINVAL;
> retry:
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index aeaefd2..0331e33 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -483,6 +483,9 @@ static ssize_t ovl_copy_file_range(struct file *file_in, loff_t pos_in,
> struct file *file_out, loff_t pos_out,
> size_t len, unsigned int flags)
> {
> + if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb)
> + return -EXDEV;
> +
> return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, flags,
> OVL_COPY);
> }
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 39b4a21..c5bed2e 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1575,10 +1575,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> (file_out->f_flags & O_APPEND))
> return -EBADF;
>
> - /* this could be relaxed once a method supports cross-fs copies */
> - if (inode_in->i_sb != inode_out->i_sb)
> - return -EXDEV;
> -
> if (len == 0)
> return 0;
>
> @@ -1588,7 +1584,8 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> * Try cloning first, this is supported by more file systems, and
> * more efficient if both clone and copy are supported (e.g. NFS).
> */
> - if (file_in->f_op->clone_file_range) {
> + if (inode_in->i_sb == inode_out->i_sb &&
> + file_in->f_op->clone_file_range) {
> ret = file_in->f_op->clone_file_range(file_in, pos_in,
> file_out, pos_out, len);
> if (ret == 0) {
> @@ -1604,6 +1601,11 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> goto done;
> }
>
> + if (inode_in->i_sb != inode_out->i_sb) {
> + ret = -EXDEV;
> + goto done;
> + }
> +

Can we just do away with this check and let it perform splice()? The splice()
call does not require file_in and file_out to be on the same sb.

> ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
> len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
>
> --
> 1.8.3.1
>

--
Goldwyn