2018-04-13 17:03:17

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 00/13] NFS support for async intra COPY

Change the client side to always request an asynchronous copy.

If server replies to the COPY with the copy stateid, client will go
wait on the CB_OFFLOAD. To fend off the race between CB_OFFLOAD and
COPY reply, we check the list of pending callbacks before going to
wait. Client adds the copy to the global list of copy stateids for the
callback to look thru and signal the waiting copy.

When the client receives reply from the CB_OFFLOAD with some bytes and
committed how is UNSTABLE, then COMMIT is sent to the server. The results
are propagated to the VFS and application. Assuming that application
will deal with a partial result and continue from the new offset if needed.

If server errs with ERR_OFFLOAD_NOREQS the copy will be re-sent as a
synchronous COPY. If application cancels an in-flight COPY,
OFFLOAD_CANCEL is sent to the server (sending it as an async RPC in the
context of the nfsid_workqueue).

v8.
- added patch 13 to handle reboot recovery of the server or network
partition. this patch was a part of the initial "inter async" series
but never got added to the async series. without this patch if the
server goes away, thread that was doing the copy would be hanging
waiting for the callback that would never arrive.



Anna Schumaker (1):
fs: Don't copy beyond the end of the file

Olga Kornievskaia (12):
NFS CB_OFFLOAD xdr
NFS OFFLOAD_STATUS xdr
NFS OFFLOAD_STATUS op
NFS OFFLOAD_CANCEL xdr
NFS COPY xdr handle async reply
NFS add support for asynchronous COPY
NFS handle COPY reply CB_OFFLOAD call race
NFS export nfs4_async_handle_error
NFS send OFFLOAD_CANCEL when COPY killed
NFS handle COPY ERR_OFFLOAD_NO_REQS
NFS add a simple sync nfs4_proc_commit after async COPY
NFS recover from destination server reboot for copies

fs/nfs/callback.h | 12 +++
fs/nfs/callback_proc.c | 54 ++++++++++
fs/nfs/callback_xdr.c | 81 ++++++++++++++-
fs/nfs/client.c | 1 +
fs/nfs/nfs42.h | 5 +-
fs/nfs/nfs42proc.c | 251 ++++++++++++++++++++++++++++++++++++++++++++--
fs/nfs/nfs42xdr.c | 189 +++++++++++++++++++++++++++++++---
fs/nfs/nfs4_fs.h | 8 +-
fs/nfs/nfs4client.c | 15 +++
fs/nfs/nfs4file.c | 9 +-
fs/nfs/nfs4proc.c | 38 ++++++-
fs/nfs/nfs4state.c | 15 +++
fs/nfs/nfs4xdr.c | 2 +
fs/read_write.c | 3 +
include/linux/nfs4.h | 2 +
include/linux/nfs_fs.h | 11 ++
include/linux/nfs_fs_sb.h | 4 +
include/linux/nfs_xdr.h | 14 +++
18 files changed, 689 insertions(+), 25 deletions(-)

--
1.8.3.1



2018-04-13 17:03:18

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 01/13] fs: Don't copy beyond the end of the file

From: Anna Schumaker <[email protected]>

Signed-off-by: Anna Schumaker <[email protected]>
---
fs/read_write.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index f8547b8..316394f 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1543,6 +1543,9 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
if (unlikely(ret))
return ret;

+ if (pos_in >= i_size_read(inode_in))
+ return -EINVAL;
+
if (!(file_in->f_mode & FMODE_READ) ||
!(file_out->f_mode & FMODE_WRITE) ||
(file_out->f_flags & O_APPEND))
--
1.8.3.1


2018-04-13 17:03:22

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 05/13] NFS OFFLOAD_CANCEL xdr

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42xdr.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++
fs/nfs/nfs4proc.c | 1 +
fs/nfs/nfs4xdr.c | 1 +
include/linux/nfs4.h | 1 +
include/linux/nfs_fs_sb.h | 1 +
5 files changed, 72 insertions(+)

diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index 8789486..7fda85c 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -31,6 +31,9 @@
#define decode_offload_status_maxsz (op_decode_hdr_maxsz + \
2 /* osr_count */ + \
1 + 1 /* osr_complete<1> */)
+#define encode_offload_cancel_maxsz (op_encode_hdr_maxsz + \
+ XDR_QUADLEN(NFS4_STATEID_SIZE))
+#define decode_offload_cancel_maxsz (op_decode_hdr_maxsz)
#define encode_deallocate_maxsz (op_encode_hdr_maxsz + \
encode_fallocate_maxsz)
#define decode_deallocate_maxsz (op_decode_hdr_maxsz)
@@ -86,6 +89,12 @@
#define NFS4_dec_offload_status_sz (compound_decode_hdr_maxsz + \
decode_putfh_maxsz + \
decode_offload_status_maxsz)
+#define NFS4_enc_offload_cancel_sz (compound_encode_hdr_maxsz + \
+ encode_putfh_maxsz + \
+ encode_offload_cancel_maxsz)
+#define NFS4_dec_offload_cancel_sz (compound_decode_hdr_maxsz + \
+ decode_putfh_maxsz + \
+ decode_offload_cancel_maxsz)
#define NFS4_enc_deallocate_sz (compound_encode_hdr_maxsz + \
encode_putfh_maxsz + \
encode_deallocate_maxsz + \
@@ -164,6 +173,14 @@ static void encode_offload_status(struct xdr_stream *xdr,
encode_nfs4_stateid(xdr, &args->osa_stateid);
}

+static void encode_offload_cancel(struct xdr_stream *xdr,
+ const struct nfs42_offload_status_args *args,
+ struct compound_hdr *hdr)
+{
+ encode_op_hdr(xdr, OP_OFFLOAD_CANCEL, decode_offload_cancel_maxsz, hdr);
+ encode_nfs4_stateid(xdr, &args->osa_stateid);
+}
+
static void encode_deallocate(struct xdr_stream *xdr,
const struct nfs42_falloc_args *args,
struct compound_hdr *hdr)
@@ -299,6 +316,25 @@ static void nfs4_xdr_enc_offload_status(struct rpc_rqst *req,
}

/*
+ * Encode OFFLOAD_CANEL request
+ */
+static void nfs4_xdr_enc_offload_cancel(struct rpc_rqst *req,
+ struct xdr_stream *xdr,
+ const void *data)
+{
+ const struct nfs42_offload_status_args *args = data;
+ struct compound_hdr hdr = {
+ .minorversion = nfs4_xdr_minorversion(&args->osa_seq_args),
+ };
+
+ encode_compound_hdr(xdr, req, &hdr);
+ encode_sequence(xdr, &args->osa_seq_args, &hdr);
+ encode_putfh(xdr, args->osa_src_fh, &hdr);
+ encode_offload_cancel(xdr, args, &hdr);
+ encode_nops(&hdr);
+}
+
+/*
* Encode DEALLOCATE request
*/
static void nfs4_xdr_enc_deallocate(struct rpc_rqst *req,
@@ -478,6 +514,12 @@ static int decode_offload_status(struct xdr_stream *xdr,
return -EIO;
}

+static int decode_offload_cancel(struct xdr_stream *xdr,
+ struct nfs42_offload_status_res *res)
+{
+ return decode_op_hdr(xdr, OP_OFFLOAD_CANCEL);
+}
+
static int decode_deallocate(struct xdr_stream *xdr, struct nfs42_falloc_res *res)
{
return decode_op_hdr(xdr, OP_DEALLOCATE);
@@ -604,6 +646,32 @@ static int nfs4_xdr_dec_offload_status(struct rpc_rqst *rqstp,
}

/*
+ * Decode OFFLOAD_CANCEL response
+ */
+static int nfs4_xdr_dec_offload_cancel(struct rpc_rqst *rqstp,
+ struct xdr_stream *xdr,
+ void *data)
+{
+ struct nfs42_offload_status_res *res = data;
+ struct compound_hdr hdr;
+ int status;
+
+ status = decode_compound_hdr(xdr, &hdr);
+ if (status)
+ goto out;
+ status = decode_sequence(xdr, &res->osr_seq_res, rqstp);
+ if (status)
+ goto out;
+ status = decode_putfh(xdr);
+ if (status)
+ goto out;
+ status = decode_offload_cancel(xdr, res);
+
+out:
+ return status;
+}
+
+/*
* Decode DEALLOCATE request
*/
static int nfs4_xdr_dec_deallocate(struct rpc_rqst *rqstp,
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 08f12b5..46571f7a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -9496,6 +9496,7 @@ static bool nfs4_match_stateid(const nfs4_stateid *s1,
| NFS_CAP_ALLOCATE
| NFS_CAP_COPY
| NFS_CAP_OFFLOAD_STATUS
+ | NFS_CAP_OFFLOAD_CANCEL
| NFS_CAP_DEALLOCATE
| NFS_CAP_SEEK
| NFS_CAP_LAYOUTSTATS
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index ca82e16..a7d4107 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -7756,6 +7756,7 @@ int nfs4_decode_dirent(struct xdr_stream *xdr, struct nfs_entry *entry,
PROC42(CLONE, enc_clone, dec_clone),
PROC42(COPY, enc_copy, dec_copy),
PROC42(OFFLOAD_STATUS, enc_offload_status, dec_offload_status),
+ PROC42(OFFLOAD_CANCEL, enc_offload_cancel, dec_offload_cancel),
PROC(LOOKUPP, enc_lookupp, dec_lookupp),
};

diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index 47ae050..e796f56 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -528,6 +528,7 @@ enum {
NFSPROC4_CLNT_CLONE,
NFSPROC4_CLNT_COPY,
NFSPROC4_CLNT_OFFLOAD_STATUS,
+ NFSPROC4_CLNT_OFFLOAD_CANCEL,

NFSPROC4_CLNT_LOOKUPP,
};
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 1f6e9d6..6d2a5c8 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -255,5 +255,6 @@ struct nfs_server {
#define NFS_CAP_CLONE (1U << 23)
#define NFS_CAP_COPY (1U << 24)
#define NFS_CAP_OFFLOAD_STATUS (1U << 25)
+#define NFS_CAP_OFFLOAD_CANCEL (1U << 26)

#endif
--
1.8.3.1


2018-04-13 17:03:27

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 09/13] NFS export nfs4_async_handle_error

Make this function available to nfs42proc.c

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs4_fs.h | 3 +++
fs/nfs/nfs4proc.c | 2 +-
2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index b374f68..7303a6f 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -248,6 +248,9 @@ int nfs4_replace_transport(struct nfs_server *server,

/* nfs4proc.c */
extern int nfs4_handle_exception(struct nfs_server *, int, struct nfs4_exception *);
+extern int nfs4_async_handle_error(struct rpc_task *task,
+ struct nfs_server *server,
+ struct nfs4_state *state, long *timeout);
extern int nfs4_call_sync(struct rpc_clnt *, struct nfs_server *,
struct rpc_message *, struct nfs4_sequence_args *,
struct nfs4_sequence_res *, int);
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 46571f7a..dcd08ba 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -553,7 +553,7 @@ int nfs4_handle_exception(struct nfs_server *server, int errorcode, struct nfs4_
return ret;
}

-static int
+int
nfs4_async_handle_error(struct rpc_task *task, struct nfs_server *server,
struct nfs4_state *state, long *timeout)
{
--
1.8.3.1


2018-04-13 17:03:28

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 10/13] NFS send OFFLOAD_CANCEL when COPY killed

When COPY is killed by the user send OFFLOAD_CANCEL to server
processing the copy.

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42proc.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 89 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index fa4631d..2971eda 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -17,6 +17,7 @@
#include "internal.h"

#define NFSDBG_FACILITY NFSDBG_PROC
+static int nfs42_do_offload_cancel_async(struct file *dst, nfs4_stateid *std);

static int _nfs42_proc_fallocate(struct rpc_message *msg, struct file *filep,
struct nfs_lock_context *lock, loff_t offset, loff_t len)
@@ -166,10 +167,15 @@ static int handle_async_copy(struct nfs42_copy_res *res,
list_add_tail(&copy->copies, &server->ss_copies);
spin_unlock(&server->nfs_client->cl_lock);

- wait_for_completion_interruptible(&copy->completion);
+ status = wait_for_completion_interruptible(&copy->completion);
spin_lock(&server->nfs_client->cl_lock);
list_del_init(&copy->copies);
spin_unlock(&server->nfs_client->cl_lock);
+ if (status == -ERESTARTSYS) {
+ nfs42_do_offload_cancel_async(dst, &copy->stateid);
+ kfree(copy);
+ return status;
+ }
out:
res->write_res.count = copy->count;
memcpy(&res->write_res.verifier, &copy->verf, sizeof(copy->verf));
@@ -327,6 +333,88 @@ ssize_t nfs42_proc_copy(struct file *src, loff_t pos_src,
return err;
}

+struct nfs42_offloadcancel_data {
+ struct nfs_server *seq_server;
+ struct nfs42_offload_status_args args;
+ struct nfs42_offload_status_res res;
+};
+
+static void nfs42_offload_cancel_prepare(struct rpc_task *task, void *calldata)
+{
+ struct nfs42_offloadcancel_data *data = calldata;
+
+ nfs4_setup_sequence(data->seq_server->nfs_client,
+ &data->args.osa_seq_args,
+ &data->res.osr_seq_res, task);
+}
+
+static void nfs42_offload_cancel_done(struct rpc_task *task, void *calldata)
+{
+ struct nfs42_offloadcancel_data *data = calldata;
+
+ nfs41_sequence_done(task, &data->res.osr_seq_res);
+ if (task->tk_status &&
+ nfs4_async_handle_error(task, data->seq_server, NULL,
+ NULL) == -EAGAIN)
+ rpc_restart_call_prepare(task);
+}
+
+static void nfs42_free_offloadcancel_data(void *data)
+{
+ kfree(data);
+}
+
+static const struct rpc_call_ops nfs42_offload_cancel_ops = {
+ .rpc_call_prepare = nfs42_offload_cancel_prepare,
+ .rpc_call_done = nfs42_offload_cancel_done,
+ .rpc_release = nfs42_free_offloadcancel_data,
+};
+
+static int nfs42_do_offload_cancel_async(struct file *dst,
+ nfs4_stateid *stateid)
+{
+ struct nfs_server *dst_server = NFS_SERVER(file_inode(dst));
+ struct nfs42_offloadcancel_data *data = NULL;
+ struct nfs_open_context *ctx = nfs_file_open_context(dst);
+ struct rpc_task *task;
+ struct rpc_message msg = {
+ .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OFFLOAD_CANCEL],
+ .rpc_cred = ctx->cred,
+ };
+ struct rpc_task_setup task_setup_data = {
+ .rpc_client = dst_server->client,
+ .rpc_message = &msg,
+ .callback_ops = &nfs42_offload_cancel_ops,
+ .workqueue = nfsiod_workqueue,
+ .flags = RPC_TASK_ASYNC,
+ };
+ int status;
+
+ if (!(dst_server->caps & NFS_CAP_OFFLOAD_CANCEL))
+ return -EOPNOTSUPP;
+
+ data = kzalloc(sizeof(struct nfs42_offloadcancel_data), GFP_NOFS);
+ if (data == NULL)
+ return -ENOMEM;
+
+ data->seq_server = dst_server;
+ data->args.osa_src_fh = NFS_FH(file_inode(dst));
+ memcpy(&data->args.osa_stateid, stateid,
+ sizeof(data->args.osa_stateid));
+ msg.rpc_argp = &data->args;
+ msg.rpc_resp = &data->res;
+ task_setup_data.callback_data = data;
+ nfs4_init_sequence(&data->args.osa_seq_args, &data->res.osr_seq_res, 1);
+ task = rpc_run_task(&task_setup_data);
+ if (IS_ERR(task))
+ return PTR_ERR(task);
+ status = rpc_wait_for_completion_task(task);
+ if (status == -ENOTSUPP)
+ dst_server->caps &= ~NFS_CAP_OFFLOAD_CANCEL;
+ rpc_put_task(task);
+ return status;
+}
+
int _nfs42_proc_offload_status(struct file *dst, nfs4_stateid *stateid,
struct nfs42_offload_status_res *res)
{
--
1.8.3.1


2018-04-13 17:03:20

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 03/13] NFS OFFLOAD_STATUS xdr

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42xdr.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++
fs/nfs/nfs4proc.c | 1 +
fs/nfs/nfs4xdr.c | 1 +
include/linux/nfs4.h | 1 +
include/linux/nfs_fs_sb.h | 1 +
include/linux/nfs_xdr.h | 12 +++++++
6 files changed, 107 insertions(+)

diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index 5966e1e..8789486 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -26,6 +26,11 @@
NFS42_WRITE_RES_SIZE + \
1 /* cr_consecutive */ + \
1 /* cr_synchronous */)
+#define encode_offload_status_maxsz (op_encode_hdr_maxsz + \
+ XDR_QUADLEN(NFS4_STATEID_SIZE))
+#define decode_offload_status_maxsz (op_decode_hdr_maxsz + \
+ 2 /* osr_count */ + \
+ 1 + 1 /* osr_complete<1> */)
#define encode_deallocate_maxsz (op_encode_hdr_maxsz + \
encode_fallocate_maxsz)
#define decode_deallocate_maxsz (op_decode_hdr_maxsz)
@@ -75,6 +80,12 @@
decode_putfh_maxsz + \
decode_copy_maxsz + \
decode_commit_maxsz)
+#define NFS4_enc_offload_status_sz (compound_encode_hdr_maxsz + \
+ encode_putfh_maxsz + \
+ encode_offload_status_maxsz)
+#define NFS4_dec_offload_status_sz (compound_decode_hdr_maxsz + \
+ decode_putfh_maxsz + \
+ decode_offload_status_maxsz)
#define NFS4_enc_deallocate_sz (compound_encode_hdr_maxsz + \
encode_putfh_maxsz + \
encode_deallocate_maxsz + \
@@ -145,6 +156,14 @@ static void encode_copy(struct xdr_stream *xdr,
encode_uint32(xdr, 0); /* src server list */
}

+static void encode_offload_status(struct xdr_stream *xdr,
+ const struct nfs42_offload_status_args *args,
+ struct compound_hdr *hdr)
+{
+ encode_op_hdr(xdr, OP_OFFLOAD_STATUS, decode_offload_status_maxsz, hdr);
+ encode_nfs4_stateid(xdr, &args->osa_stateid);
+}
+
static void encode_deallocate(struct xdr_stream *xdr,
const struct nfs42_falloc_args *args,
struct compound_hdr *hdr)
@@ -261,6 +280,25 @@ static void nfs4_xdr_enc_copy(struct rpc_rqst *req,
}

/*
+ * Encode OFFLOAD_STATUS request
+ */
+static void nfs4_xdr_enc_offload_status(struct rpc_rqst *req,
+ struct xdr_stream *xdr,
+ const void *data)
+{
+ const struct nfs42_offload_status_args *args = data;
+ struct compound_hdr hdr = {
+ .minorversion = nfs4_xdr_minorversion(&args->osa_seq_args),
+ };
+
+ encode_compound_hdr(xdr, req, &hdr);
+ encode_sequence(xdr, &args->osa_seq_args, &hdr);
+ encode_putfh(xdr, args->osa_src_fh, &hdr);
+ encode_offload_status(xdr, args, &hdr);
+ encode_nops(&hdr);
+}
+
+/*
* Encode DEALLOCATE request
*/
static void nfs4_xdr_enc_deallocate(struct rpc_rqst *req,
@@ -413,6 +451,33 @@ static int decode_copy(struct xdr_stream *xdr, struct nfs42_copy_res *res)
return decode_copy_requirements(xdr, res);
}

+static int decode_offload_status(struct xdr_stream *xdr,
+ struct nfs42_offload_status_res *res)
+{
+ __be32 *p;
+ uint32_t count;
+ int status;
+
+ status = decode_op_hdr(xdr, OP_OFFLOAD_STATUS);
+ if (status)
+ return status;
+ p = xdr_inline_decode(xdr, 8 + 4);
+ if (unlikely(!p))
+ goto out_overflow;
+ p = xdr_decode_hyper(p, &res->osr_count);
+ count = be32_to_cpup(p++);
+ if (count) {
+ p = xdr_inline_decode(xdr, 4);
+ if (unlikely(!p))
+ goto out_overflow;
+ res->osr_status = nfs4_stat_to_errno(be32_to_cpup(p));
+ }
+ return 0;
+out_overflow:
+ print_overflow_msg(__func__, xdr);
+ return -EIO;
+}
+
static int decode_deallocate(struct xdr_stream *xdr, struct nfs42_falloc_res *res)
{
return decode_op_hdr(xdr, OP_DEALLOCATE);
@@ -513,6 +578,32 @@ static int nfs4_xdr_dec_copy(struct rpc_rqst *rqstp,
}

/*
+ * Decode OFFLOAD_STATUS response
+ */
+static int nfs4_xdr_dec_offload_status(struct rpc_rqst *rqstp,
+ struct xdr_stream *xdr,
+ void *data)
+{
+ struct nfs42_offload_status_res *res = data;
+ struct compound_hdr hdr;
+ int status;
+
+ status = decode_compound_hdr(xdr, &hdr);
+ if (status)
+ goto out;
+ status = decode_sequence(xdr, &res->osr_seq_res, rqstp);
+ if (status)
+ goto out;
+ status = decode_putfh(xdr);
+ if (status)
+ goto out;
+ status = decode_offload_status(xdr, res);
+
+out:
+ return status;
+}
+
+/*
* Decode DEALLOCATE request
*/
static int nfs4_xdr_dec_deallocate(struct rpc_rqst *rqstp,
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 47f3c27..08f12b5 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -9495,6 +9495,7 @@ static bool nfs4_match_stateid(const nfs4_stateid *s1,
| NFS_CAP_ATOMIC_OPEN_V1
| NFS_CAP_ALLOCATE
| NFS_CAP_COPY
+ | NFS_CAP_OFFLOAD_STATUS
| NFS_CAP_DEALLOCATE
| NFS_CAP_SEEK
| NFS_CAP_LAYOUTSTATS
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 65c9c41..ca82e16 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -7755,6 +7755,7 @@ int nfs4_decode_dirent(struct xdr_stream *xdr, struct nfs_entry *entry,
PROC42(LAYOUTSTATS, enc_layoutstats, dec_layoutstats),
PROC42(CLONE, enc_clone, dec_clone),
PROC42(COPY, enc_copy, dec_copy),
+ PROC42(OFFLOAD_STATUS, enc_offload_status, dec_offload_status),
PROC(LOOKUPP, enc_lookupp, dec_lookupp),
};

diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index 57ffaa2..47ae050 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -527,6 +527,7 @@ enum {
NFSPROC4_CLNT_LAYOUTSTATS,
NFSPROC4_CLNT_CLONE,
NFSPROC4_CLNT_COPY,
+ NFSPROC4_CLNT_OFFLOAD_STATUS,

NFSPROC4_CLNT_LOOKUPP,
};
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 4e735be..1f6e9d6 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -254,5 +254,6 @@ struct nfs_server {
#define NFS_CAP_LAYOUTSTATS (1U << 22)
#define NFS_CAP_CLONE (1U << 23)
#define NFS_CAP_COPY (1U << 24)
+#define NFS_CAP_OFFLOAD_STATUS (1U << 25)

#endif
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 6959968..b5ea289 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1400,6 +1400,18 @@ struct nfs42_copy_res {
struct nfs_commitres commit_res;
};

+struct nfs42_offload_status_args {
+ struct nfs4_sequence_args osa_seq_args;
+ struct nfs_fh *osa_src_fh;
+ nfs4_stateid osa_stateid;
+};
+
+struct nfs42_offload_status_res {
+ struct nfs4_sequence_res osr_seq_res;
+ uint64_t osr_count;
+ int osr_status;
+};
+
struct nfs42_seek_args {
struct nfs4_sequence_args seq_args;

--
1.8.3.1


2018-04-13 17:03:26

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 08/13] NFS handle COPY reply CB_OFFLOAD call race

It's possible that server replies back with CB_OFFLOAD call and
COPY reply at the same time such that client will process
CB_OFFLOAD before reply to COPY. For that keep a list of pending
callback stateids received and then before waiting on completion
check the pending list.

Cleanup any pending copies on the client shutdown.

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/callback_proc.c | 17 ++++++++++++++---
fs/nfs/nfs42proc.c | 22 ++++++++++++++++++++--
fs/nfs/nfs4client.c | 15 +++++++++++++++
include/linux/nfs_fs_sb.h | 1 +
4 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index e5e78e5..5f2b546 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -675,11 +675,12 @@ __be32 nfs4_callback_offload(void *data, void *dummy,
struct cb_offloadargs *args = data;
struct nfs_server *server;
struct nfs4_copy_state *copy;
+ bool found = false;

+ spin_lock(&cps->clp->cl_lock);
rcu_read_lock();
list_for_each_entry_rcu(server, &cps->clp->cl_superblocks,
client_link) {
- spin_lock(&server->nfs_client->cl_lock);
list_for_each_entry(copy, &server->ss_copies, copies) {
if (memcmp(args->coa_stateid.other,
copy->stateid.other,
@@ -687,13 +688,23 @@ __be32 nfs4_callback_offload(void *data, void *dummy,
continue;
nfs4_copy_cb_args(copy, args);
complete(&copy->completion);
- spin_unlock(&server->nfs_client->cl_lock);
+ found = true;
goto out;
}
- spin_unlock(&server->nfs_client->cl_lock);
}
out:
rcu_read_unlock();
+ if (!found) {
+ copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_NOFS);
+ if (!copy) {
+ spin_unlock(&cps->clp->cl_lock);
+ return -ENOMEM;
+ }
+ memcpy(&copy->stateid, &args->coa_stateid, NFS4_STATEID_SIZE);
+ nfs4_copy_cb_args(copy, args);
+ list_add_tail(&copy->copies, &cps->clp->pending_cb_stateids);
+ }
+ spin_unlock(&cps->clp->cl_lock);

return 0;
}
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 06f779b..fa4631d 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -138,14 +138,31 @@ static int handle_async_copy(struct nfs42_copy_res *res,
{
struct nfs4_copy_state *copy;
int status = NFS4_OK;
+ bool found_pending = false;
+
+ spin_lock(&server->nfs_client->cl_lock);
+ list_for_each_entry(copy, &server->nfs_client->pending_cb_stateids,
+ copies) {
+ if (memcmp(&res->write_res.stateid, &copy->stateid,
+ NFS4_STATEID_SIZE))
+ continue;
+ found_pending = true;
+ list_del(&copy->copies);
+ break;
+ }
+ if (found_pending) {
+ spin_unlock(&server->nfs_client->cl_lock);
+ goto out;
+ }

copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_NOFS);
- if (!copy)
+ if (!copy) {
+ spin_unlock(&server->nfs_client->cl_lock);
return -ENOMEM;
+ }
memcpy(&copy->stateid, &res->write_res.stateid, NFS4_STATEID_SIZE);
init_completion(&copy->completion);

- spin_lock(&server->nfs_client->cl_lock);
list_add_tail(&copy->copies, &server->ss_copies);
spin_unlock(&server->nfs_client->cl_lock);

@@ -153,6 +170,7 @@ static int handle_async_copy(struct nfs42_copy_res *res,
spin_lock(&server->nfs_client->cl_lock);
list_del_init(&copy->copies);
spin_unlock(&server->nfs_client->cl_lock);
+out:
res->write_res.count = copy->count;
memcpy(&res->write_res.verifier, &copy->verf, sizeof(copy->verf));
status = -copy->error;
diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index 9796314..d68ba73 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -156,9 +156,23 @@ struct rpc_clnt *
}
}

+static void
+nfs4_cleanup_callback(struct nfs_client *clp)
+{
+ struct nfs4_copy_state *cp_state;
+
+ while (!list_empty(&clp->pending_cb_stateids)) {
+ cp_state = list_entry(clp->pending_cb_stateids.next,
+ struct nfs4_copy_state, copies);
+ list_del(&cp_state->copies);
+ kfree(cp_state);
+ }
+}
+
void nfs41_shutdown_client(struct nfs_client *clp)
{
if (nfs4_has_session(clp)) {
+ nfs4_cleanup_callback(clp);
nfs4_shutdown_ds_clients(clp);
nfs4_destroy_session(clp->cl_session);
nfs4_destroy_clientid(clp);
@@ -202,6 +216,7 @@ struct nfs_client *nfs4_alloc_client(const struct nfs_client_initdata *cl_init)
#if IS_ENABLED(CONFIG_NFS_V4_1)
init_waitqueue_head(&clp->cl_lock_waitq);
#endif
+ INIT_LIST_HEAD(&clp->pending_cb_stateids);
return clp;

error:
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 4836a1c..538e2c4 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -121,6 +121,7 @@ struct nfs_client {
#endif

struct net *cl_net;
+ struct list_head pending_cb_stateids;
};

/*
--
1.8.3.1


2018-04-13 17:03:24

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 06/13] NFS COPY xdr handle async reply

If server returns async reply, it must include a callback stateid,
wr_callback_id in the write_response4.

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42xdr.c | 22 ++++++++++++----------
include/linux/nfs_xdr.h | 1 +
2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index 7fda85c..9d70892 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -427,21 +427,23 @@ static int decode_write_response(struct xdr_stream *xdr,
struct nfs42_write_res *res)
{
__be32 *p;
+ int status, count;

- p = xdr_inline_decode(xdr, 4 + 8 + 4);
+ p = xdr_inline_decode(xdr, 4);
if (unlikely(!p))
goto out_overflow;
-
- /*
- * We never use asynchronous mode, so warn if a server returns
- * a stateid.
- */
- if (unlikely(*p != 0)) {
- pr_err_once("%s: server has set unrequested "
- "asynchronous mode\n", __func__);
+ count = be32_to_cpup(p);
+ if (count > 1)
return -EREMOTEIO;
+ else if (count == 1) {
+ status = decode_opaque_fixed(xdr, &res->stateid,
+ NFS4_STATEID_SIZE);
+ if (unlikely(status))
+ goto out_overflow;
}
- p++;
+ p = xdr_inline_decode(xdr, 8 + 4);
+ if (unlikely(!p))
+ goto out_overflow;
p = xdr_decode_hyper(p, &res->count);
res->verifier.committed = be32_to_cpup(p);
return decode_verifier(xdr, &res->verifier.verifier);
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index b5ea289..4735f7d 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1388,6 +1388,7 @@ struct nfs42_copy_args {
};

struct nfs42_write_res {
+ nfs4_stateid stateid;
u64 count;
struct nfs_writeverf verifier;
};
--
1.8.3.1


2018-04-13 17:03:31

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 13/13] NFS recover from destination server reboot for copies

Mark the destination state to indicate a server-side copy is
happening. On detecting a reboot and recovering open state check
if any state is engaged in a server-side copy, if so, find the
copy and mark it and then signal the waiting thread. Upon wakeup,
if copy was marked then propage EAGAIN to the nfsd_copy_file_range
and restart the copy from scratch.

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42proc.c | 16 +++++++++++++---
fs/nfs/nfs4_fs.h | 3 +++
fs/nfs/nfs4file.c | 9 +++++++--
fs/nfs/nfs4state.c | 15 +++++++++++++++
include/linux/nfs_fs.h | 2 ++
5 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 94b654a..305284d 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -140,6 +140,7 @@ static int handle_async_copy(struct nfs42_copy_res *res,
struct nfs4_copy_state *copy;
int status = NFS4_OK;
bool found_pending = false;
+ struct nfs_open_context *ctx = nfs_file_open_context(dst);

spin_lock(&server->nfs_client->cl_lock);
list_for_each_entry(copy, &server->nfs_client->pending_cb_stateids,
@@ -163,6 +164,7 @@ static int handle_async_copy(struct nfs42_copy_res *res,
}
memcpy(&copy->stateid, &res->write_res.stateid, NFS4_STATEID_SIZE);
init_completion(&copy->completion);
+ copy->parent_state = ctx->state;

list_add_tail(&copy->copies, &server->ss_copies);
spin_unlock(&server->nfs_client->cl_lock);
@@ -172,9 +174,10 @@ static int handle_async_copy(struct nfs42_copy_res *res,
list_del_init(&copy->copies);
spin_unlock(&server->nfs_client->cl_lock);
if (status == -ERESTARTSYS) {
- nfs42_do_offload_cancel_async(dst, &copy->stateid);
- kfree(copy);
- return status;
+ goto out_cancel;
+ } else if (copy->flags) {
+ status = -EAGAIN;
+ goto out_cancel;
}
out:
res->write_res.count = copy->count;
@@ -183,6 +186,10 @@ static int handle_async_copy(struct nfs42_copy_res *res,

kfree(copy);
return status;
+out_cancel:
+ nfs42_do_offload_cancel_async(dst, &copy->stateid);
+ kfree(copy);
+ return status;
}

static int process_copy_commit(struct file *dst, loff_t pos_dst,
@@ -254,6 +261,9 @@ static ssize_t _nfs42_proc_copy(struct file *src,
if (!res->commit_res.verf)
return -ENOMEM;
}
+ set_bit(NFS_CLNT_DST_SSC_COPY_STATE,
+ &dst_lock->open_context->state->flags);
+
status = nfs4_call_sync(server->client, server, &msg,
&args->seq_args, &res->seq_res, 0);
if (status == -ENOTSUPP)
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 6f113f0..83d252a 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -163,6 +163,9 @@ enum {
NFS_STATE_RECOVERY_FAILED, /* OPEN stateid state recovery failed */
NFS_STATE_MAY_NOTIFY_LOCK, /* server may CB_NOTIFY_LOCK */
NFS_STATE_CHANGE_WAIT, /* A state changing operation is outstanding */
+#ifdef CONFIG_NFS_V4_2
+ NFS_CLNT_DST_SSC_COPY_STATE, /* dst server open state on client*/
+#endif /* CONFIG_NFS_V4_2 */
};

struct nfs4_state {
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 6b3b372..eaef5fa 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -133,10 +133,15 @@ static ssize_t nfs4_copy_file_range(struct file *file_in, loff_t pos_in,
struct file *file_out, loff_t pos_out,
size_t count, unsigned int flags)
{
+ ssize_t ret;
+
if (file_inode(file_in) == file_inode(file_out))
return -EINVAL;
-
- return nfs42_proc_copy(file_in, pos_in, file_out, pos_out, count);
+retry:
+ ret = nfs42_proc_copy(file_in, pos_in, file_out, pos_out, count);
+ if (ret == -EAGAIN)
+ goto retry;
+ return ret;
}

static loff_t nfs4_file_llseek(struct file *filep, loff_t offset, int whence)
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 91a4d4e..041bb50 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1584,6 +1584,21 @@ static int nfs4_reclaim_open_state(struct nfs4_state_owner *sp, const struct nfs
&state->flags);
nfs4_put_open_state(state);
spin_lock(&sp->so_lock);
+#ifdef CONFIG_NFS_V4_2
+ if (test_bit(NFS_CLNT_DST_SSC_COPY_STATE, &state->flags)) {
+ struct nfs4_copy_state *copy;
+
+ spin_lock(&sp->so_server->nfs_client->cl_lock);
+ list_for_each_entry(copy, &sp->so_server->ss_copies, copies) {
+ if (memcmp(&state->stateid.other, &copy->parent_state->stateid.other, NFS4_STATEID_SIZE))
+ continue;
+ copy->flags = 1;
+ complete(&copy->completion);
+ break;
+ }
+ spin_unlock(&sp->so_server->nfs_client->cl_lock);
+ }
+#endif /* CONFIG_NFS_V4_2 */
goto restart;
}
}
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index e1c1f82..93a5c9f 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -192,6 +192,8 @@ struct nfs4_copy_state {
uint64_t count;
struct nfs_writeverf verf;
int error;
+ int flags;
+ struct nfs4_state *parent_state;
};

/*
--
1.8.3.1


2018-04-13 17:03:24

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 07/13] NFS add support for asynchronous COPY

Change xdr to always send COPY asynchronously.

Keep the list copies send in a list under a server structure.
Once copy is sent, it waits on a completion structure that will
be signalled by the callback thread that receives CB_OFFLOAD.

If CB_OFFLOAD returned an error and even if it returned partial
bytes, ignore them (as we can't commit without a verifier to
match) and return an error.

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/callback_proc.c | 38 +++++++++++++++++++++++++++++++-
fs/nfs/client.c | 1 +
fs/nfs/nfs42proc.c | 55 ++++++++++++++++++++++++++++++++++++++++++-----
fs/nfs/nfs42xdr.c | 8 ++++---
include/linux/nfs_fs.h | 9 ++++++++
include/linux/nfs_fs_sb.h | 1 +
include/linux/nfs_xdr.h | 1 +
7 files changed, 104 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 27c7ba3..e5e78e5 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -656,9 +656,45 @@ __be32 nfs4_callback_notify_lock(void *argp, void *resp,
}
#endif /* CONFIG_NFS_V4_1 */
#ifdef CONFIG_NFS_V4_2
-__be32 nfs4_callback_offload(void *args, void *dummy,
+static void nfs4_copy_cb_args(struct nfs4_copy_state *cp_state,
+ struct cb_offloadargs *args)
+{
+ cp_state->count = args->wr_count;
+ cp_state->error = args->error;
+ if (!args->error) {
+ cp_state->verf.committed = args->wr_writeverf.committed;
+ memcpy(&cp_state->verf.verifier.data[0],
+ &args->wr_writeverf.verifier.data[0],
+ NFS4_VERIFIER_SIZE);
+ }
+}
+
+__be32 nfs4_callback_offload(void *data, void *dummy,
struct cb_process_state *cps)
{
+ struct cb_offloadargs *args = data;
+ struct nfs_server *server;
+ struct nfs4_copy_state *copy;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(server, &cps->clp->cl_superblocks,
+ client_link) {
+ spin_lock(&server->nfs_client->cl_lock);
+ list_for_each_entry(copy, &server->ss_copies, copies) {
+ if (memcmp(args->coa_stateid.other,
+ copy->stateid.other,
+ sizeof(args->coa_stateid.other)))
+ continue;
+ nfs4_copy_cb_args(copy, args);
+ complete(&copy->completion);
+ spin_unlock(&server->nfs_client->cl_lock);
+ goto out;
+ }
+ spin_unlock(&server->nfs_client->cl_lock);
+ }
+out:
+ rcu_read_unlock();
+
return 0;
}
#endif /* CONFIG_NFS_V4_2 */
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index b9129e2..09649f5 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -886,6 +886,7 @@ struct nfs_server *nfs_alloc_server(void)
INIT_LIST_HEAD(&server->delegations);
INIT_LIST_HEAD(&server->layouts);
INIT_LIST_HEAD(&server->state_owners_lru);
+ INIT_LIST_HEAD(&server->ss_copies);

atomic_set(&server->active, 0);

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index c78e235..06f779b 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -130,6 +130,37 @@ int nfs42_proc_deallocate(struct file *filep, loff_t offset, loff_t len)
return err;
}

+static int handle_async_copy(struct nfs42_copy_res *res,
+ struct nfs_server *server,
+ struct file *src,
+ struct file *dst,
+ nfs4_stateid *src_stateid)
+{
+ struct nfs4_copy_state *copy;
+ int status = NFS4_OK;
+
+ copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_NOFS);
+ if (!copy)
+ return -ENOMEM;
+ memcpy(&copy->stateid, &res->write_res.stateid, NFS4_STATEID_SIZE);
+ init_completion(&copy->completion);
+
+ spin_lock(&server->nfs_client->cl_lock);
+ list_add_tail(&copy->copies, &server->ss_copies);
+ spin_unlock(&server->nfs_client->cl_lock);
+
+ wait_for_completion_interruptible(&copy->completion);
+ spin_lock(&server->nfs_client->cl_lock);
+ list_del_init(&copy->copies);
+ spin_unlock(&server->nfs_client->cl_lock);
+ res->write_res.count = copy->count;
+ memcpy(&res->write_res.verifier, &copy->verf, sizeof(copy->verf));
+ status = -copy->error;
+
+ kfree(copy);
+ return status;
+}
+
static ssize_t _nfs42_proc_copy(struct file *src,
struct nfs_lock_context *src_lock,
struct file *dst,
@@ -168,9 +199,13 @@ static ssize_t _nfs42_proc_copy(struct file *src,
if (status)
return status;

- res->commit_res.verf = kzalloc(sizeof(struct nfs_writeverf), GFP_NOFS);
- if (!res->commit_res.verf)
- return -ENOMEM;
+ res->commit_res.verf = NULL;
+ if (args->sync) {
+ res->commit_res.verf =
+ kzalloc(sizeof(struct nfs_writeverf), GFP_NOFS);
+ if (!res->commit_res.verf)
+ return -ENOMEM;
+ }
status = nfs4_call_sync(server->client, server, &msg,
&args->seq_args, &res->seq_res, 0);
if (status == -ENOTSUPP)
@@ -178,18 +213,27 @@ static ssize_t _nfs42_proc_copy(struct file *src,
if (status)
goto out;

- if (nfs_write_verifier_cmp(&res->write_res.verifier.verifier,
+ if (args->sync &&
+ nfs_write_verifier_cmp(&res->write_res.verifier.verifier,
&res->commit_res.verf->verifier)) {
status = -EAGAIN;
goto out;
}

+ if (!res->synchronous) {
+ status = handle_async_copy(res, server, src, dst,
+ &args->src_stateid);
+ if (status)
+ return status;
+ }
+
truncate_pagecache_range(dst_inode, pos_dst,
pos_dst + res->write_res.count);

status = res->write_res.count;
out:
- kfree(res->commit_res.verf);
+ if (args->sync)
+ kfree(res->commit_res.verf);
return status;
}

@@ -206,6 +250,7 @@ ssize_t nfs42_proc_copy(struct file *src, loff_t pos_src,
.dst_fh = NFS_FH(file_inode(dst)),
.dst_pos = pos_dst,
.count = count,
+ .sync = false,
};
struct nfs42_copy_res res;
struct nfs4_exception src_exception = {
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index 9d70892..bc0bae6 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -161,7 +161,7 @@ static void encode_copy(struct xdr_stream *xdr,
encode_uint64(xdr, args->count);

encode_uint32(xdr, 1); /* consecutive = true */
- encode_uint32(xdr, 1); /* synchronous = true */
+ encode_uint32(xdr, args->sync);
encode_uint32(xdr, 0); /* src server list */
}

@@ -292,7 +292,8 @@ static void nfs4_xdr_enc_copy(struct rpc_rqst *req,
encode_savefh(xdr, &hdr);
encode_putfh(xdr, args->dst_fh, &hdr);
encode_copy(xdr, args, &hdr);
- encode_copy_commit(xdr, args, &hdr);
+ if (args->sync)
+ encode_copy_commit(xdr, args, &hdr);
encode_nops(&hdr);
}

@@ -616,7 +617,8 @@ static int nfs4_xdr_dec_copy(struct rpc_rqst *rqstp,
status = decode_copy(xdr, res);
if (status)
goto out;
- status = decode_commit(xdr, &res->commit_res);
+ if (res->commit_res.verf)
+ status = decode_commit(xdr, &res->commit_res);
out:
return status;
}
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 38187c6..e1c1f82 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -185,6 +185,15 @@ struct nfs_inode {
struct inode vfs_inode;
};

+struct nfs4_copy_state {
+ struct list_head copies;
+ nfs4_stateid stateid;
+ struct completion completion;
+ uint64_t count;
+ struct nfs_writeverf verf;
+ int error;
+};
+
/*
* Access bit flags
*/
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 6d2a5c8..4836a1c 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -208,6 +208,7 @@ struct nfs_server {
struct list_head state_owners_lru;
struct list_head layouts;
struct list_head delegations;
+ struct list_head ss_copies;

unsigned long mig_gen;
unsigned long mig_status;
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 4735f7d..ef66fce 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1385,6 +1385,7 @@ struct nfs42_copy_args {
u64 dst_pos;

u64 count;
+ bool sync;
};

struct nfs42_write_res {
--
1.8.3.1


2018-04-13 17:03:21

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 04/13] NFS OFFLOAD_STATUS op

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42.h | 5 ++++-
fs/nfs/nfs42proc.c | 43 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs42.h b/fs/nfs/nfs42.h
index 19ec38f8..93e9d19 100644
--- a/fs/nfs/nfs42.h
+++ b/fs/nfs/nfs42.h
@@ -12,6 +12,7 @@
*/
#define PNFS_LAYOUTSTATS_MAXDEV (4)

+#if defined(CONFIG_NFS_V4_2)
/* nfs4.2proc.c */
int nfs42_proc_allocate(struct file *, loff_t, loff_t);
ssize_t nfs42_proc_copy(struct file *, loff_t, struct file *, loff_t, size_t);
@@ -20,5 +21,7 @@
int nfs42_proc_layoutstats_generic(struct nfs_server *,
struct nfs42_layoutstat_data *);
int nfs42_proc_clone(struct file *, struct file *, loff_t, loff_t, loff_t);
-
+int nfs42_proc_offload_status(struct file *, nfs4_stateid *,
+ struct nfs42_offload_status_res *);
+#endif /* CONFIG_NFS_V4_2) */
#endif /* __LINUX_FS_NFS_NFS4_2_H */
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 9c37444..c78e235 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -264,6 +264,49 @@ ssize_t nfs42_proc_copy(struct file *src, loff_t pos_src,
return err;
}

+int _nfs42_proc_offload_status(struct file *dst, nfs4_stateid *stateid,
+ struct nfs42_offload_status_res *res)
+{
+ struct nfs42_offload_status_args args = {
+ .osa_src_fh = NFS_FH(file_inode(dst)),
+ };
+ struct nfs_server *dst_server = NFS_SERVER(file_inode(dst));
+ struct rpc_message msg = {
+ .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OFFLOAD_STATUS],
+ .rpc_resp = res,
+ .rpc_argp = &args,
+ };
+ int status;
+
+ memcpy(&args.osa_stateid, stateid, sizeof(args.osa_stateid));
+ status = nfs4_call_sync(dst_server->client, dst_server, &msg,
+ &args.osa_seq_args, &res->osr_seq_res, 0);
+ if (status == -ENOTSUPP)
+ dst_server->caps &= ~NFS_CAP_OFFLOAD_STATUS;
+
+ return status;
+}
+
+int nfs42_proc_offload_status(struct file *dst, nfs4_stateid *stateid,
+ struct nfs42_offload_status_res *res)
+{
+ struct nfs_server *dst_server = NFS_SERVER(file_inode(dst));
+ struct nfs4_exception exception = { };
+ int status;
+
+ if (!(dst_server->caps & NFS_CAP_OFFLOAD_STATUS))
+ return -EOPNOTSUPP;
+
+ do {
+ status = _nfs42_proc_offload_status(dst, stateid, res);
+ if (status == -ENOTSUPP)
+ return -EOPNOTSUPP;
+ status = nfs4_handle_exception(dst_server, status, &exception);
+ } while (exception.retry);
+
+ return status;
+}
+
static loff_t _nfs42_proc_llseek(struct file *filep,
struct nfs_lock_context *lock, loff_t offset, int whence)
{
--
1.8.3.1


2018-04-13 17:03:19

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 02/13] NFS CB_OFFLOAD xdr

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/callback.h | 12 ++++++++
fs/nfs/callback_proc.c | 7 +++++
fs/nfs/callback_xdr.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
index a20a0bc..8f34daf 100644
--- a/fs/nfs/callback.h
+++ b/fs/nfs/callback.h
@@ -184,6 +184,18 @@ struct cb_notify_lock_args {
extern __be32 nfs4_callback_notify_lock(void *argp, void *resp,
struct cb_process_state *cps);
#endif /* CONFIG_NFS_V4_1 */
+#ifdef CONFIG_NFS_V4_2
+struct cb_offloadargs {
+ struct nfs_fh coa_fh;
+ nfs4_stateid coa_stateid;
+ uint32_t error;
+ uint64_t wr_count;
+ struct nfs_writeverf wr_writeverf;
+};
+
+extern __be32 nfs4_callback_offload(void *args, void *dummy,
+ struct cb_process_state *cps);
+#endif /* CONFIG_NFS_V4_2 */
extern int check_gss_callback_principal(struct nfs_client *, struct svc_rqst *);
extern __be32 nfs4_callback_getattr(void *argp, void *resp,
struct cb_process_state *cps);
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index a50d781..27c7ba3 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -655,3 +655,10 @@ __be32 nfs4_callback_notify_lock(void *argp, void *resp,
return htonl(NFS4_OK);
}
#endif /* CONFIG_NFS_V4_1 */
+#ifdef CONFIG_NFS_V4_2
+__be32 nfs4_callback_offload(void *args, void *dummy,
+ struct cb_process_state *cps)
+{
+ return 0;
+}
+#endif /* CONFIG_NFS_V4_2 */
diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
index 123c069..3a96a45 100644
--- a/fs/nfs/callback_xdr.c
+++ b/fs/nfs/callback_xdr.c
@@ -38,6 +38,9 @@
#define CB_OP_RECALLSLOT_RES_MAXSZ (CB_OP_HDR_RES_MAXSZ)
#define CB_OP_NOTIFY_LOCK_RES_MAXSZ (CB_OP_HDR_RES_MAXSZ)
#endif /* CONFIG_NFS_V4_1 */
+#ifdef CONFIG_NFS_V4_2
+#define CB_OP_OFFLOAD_RES_MAXSZ (CB_OP_HDR_RES_MAXSZ)
+#endif /* CONFIG_NFS_V4_2 */

#define NFSDBG_FACILITY NFSDBG_CALLBACK

@@ -527,7 +530,73 @@ static __be32 decode_notify_lock_args(struct svc_rqst *rqstp,
}

#endif /* CONFIG_NFS_V4_1 */
+#ifdef CONFIG_NFS_V4_2
+static __be32 decode_write_response(struct xdr_stream *xdr,
+ struct cb_offloadargs *args)
+{
+ __be32 *p;
+ __be32 dummy;
+
+ /* skip the always zero field */
+ p = read_buf(xdr, 4);
+ if (unlikely(!p))
+ goto out;
+ dummy = ntohl(*p++);
+
+ /* decode count, stable_how, verifier */
+ p = xdr_inline_decode(xdr, 8 + 4);
+ if (unlikely(!p))
+ goto out;
+ p = xdr_decode_hyper(p, &args->wr_count);
+ args->wr_writeverf.committed = be32_to_cpup(p);
+ p = xdr_inline_decode(xdr, NFS4_VERIFIER_SIZE);
+ if (likely(p)) {
+ memcpy(&args->wr_writeverf.verifier.data[0], p,
+ NFS4_VERIFIER_SIZE);
+ return 0;
+ }
+out:
+ return htonl(NFS4ERR_RESOURCE);
+}
+
+static __be32 decode_offload_args(struct svc_rqst *rqstp,
+ struct xdr_stream *xdr,
+ void *data)
+{
+ struct cb_offloadargs *args = data;
+ __be32 *p;
+ __be32 status;
+
+ /* decode fh */
+ status = decode_fh(xdr, &args->coa_fh);
+ if (unlikely(status != 0))
+ return status;
+
+ /* decode stateid */
+ status = decode_stateid(xdr, &args->coa_stateid);
+ if (unlikely(status != 0))
+ return status;

+ /* decode status */
+ p = read_buf(xdr, 4);
+ if (unlikely(!p))
+ goto out;
+ args->error = ntohl(*p++);
+ if (!args->error) {
+ status = decode_write_response(xdr, args);
+ if (unlikely(status != 0))
+ return status;
+ } else {
+ p = xdr_inline_decode(xdr, 8);
+ if (unlikely(!p))
+ goto out;
+ p = xdr_decode_hyper(p, &args->wr_count);
+ }
+ return 0;
+out:
+ return htonl(NFS4ERR_RESOURCE);
+}
+#endif /* CONFIG_NFS_V4_2 */
static __be32 encode_string(struct xdr_stream *xdr, unsigned int len, const char *str)
{
if (unlikely(xdr_stream_encode_opaque(xdr, str, len) < 0))
@@ -794,7 +863,10 @@ static void nfs4_cb_free_slot(struct cb_process_state *cps)
if (status != htonl(NFS4ERR_OP_ILLEGAL))
return status;

- if (op_nr == OP_CB_OFFLOAD)
+ if (op_nr == OP_CB_OFFLOAD) {
+ *op = &callback_ops[op_nr];
+ return htonl(NFS_OK);
+ } else
return htonl(NFS4ERR_NOTSUPP);
return htonl(NFS4ERR_OP_ILLEGAL);
}
@@ -990,6 +1062,13 @@ static __be32 nfs4_callback_compound(struct svc_rqst *rqstp)
.res_maxsize = CB_OP_NOTIFY_LOCK_RES_MAXSZ,
},
#endif /* CONFIG_NFS_V4_1 */
+#ifdef CONFIG_NFS_V4_2
+ [OP_CB_OFFLOAD] = {
+ .process_op = nfs4_callback_offload,
+ .decode_args = decode_offload_args,
+ .res_maxsize = CB_OP_OFFLOAD_RES_MAXSZ,
+ },
+#endif /* CONFIG_NFS_V4_2 */
};

/*
--
1.8.3.1


2018-04-13 17:03:29

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 11/13] NFS handle COPY ERR_OFFLOAD_NO_REQS

If client sent async COPY and server replied with
ERR_OFFLOAD_NO_REQS, client should retry with a synchronous copy.

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42proc.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 2971eda..fbd1f8a 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -316,7 +316,11 @@ ssize_t nfs42_proc_copy(struct file *src, loff_t pos_src,
if (err == -ENOTSUPP) {
err = -EOPNOTSUPP;
break;
- } if (err == -EAGAIN) {
+ } else if (err == -EAGAIN) {
+ dst_exception.retry = 1;
+ continue;
+ } else if (err == -NFS4ERR_OFFLOAD_NO_REQS && !args.sync) {
+ args.sync = true;
dst_exception.retry = 1;
continue;
}
--
1.8.3.1


2018-04-13 17:03:30

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH v8 12/13] NFS add a simple sync nfs4_proc_commit after async COPY

A COPY with unstable write data needs a simple sync commit.
Filehandle value is gotten as a part of the inner loop so in
case of a reboot retry it should get the new value.

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs42proc.c | 31 +++++++++++++++++++++++++++++++
fs/nfs/nfs4_fs.h | 2 +-
fs/nfs/nfs4proc.c | 34 ++++++++++++++++++++++++++++++++++
3 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index fbd1f8a..94b654a 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -185,6 +185,30 @@ static int handle_async_copy(struct nfs42_copy_res *res,
return status;
}

+static int process_copy_commit(struct file *dst, loff_t pos_dst,
+ struct nfs42_copy_res *res)
+{
+ struct nfs_commitres cres;
+ int status = -ENOMEM;
+
+ cres.verf = kzalloc(sizeof(struct nfs_writeverf), GFP_NOFS);
+ if (!cres.verf)
+ goto out;
+
+ status = nfs4_proc_commit(dst, pos_dst, res->write_res.count, &cres);
+ if (status)
+ goto out_free;
+ if (nfs_write_verifier_cmp(&res->write_res.verifier.verifier,
+ &cres.verf->verifier)) {
+ dprintk("commit verf differs from copy verf\n");
+ status = -EAGAIN;
+ }
+out_free:
+ kfree(cres.verf);
+out:
+ return status;
+}
+
static ssize_t _nfs42_proc_copy(struct file *src,
struct nfs_lock_context *src_lock,
struct file *dst,
@@ -251,6 +275,13 @@ static ssize_t _nfs42_proc_copy(struct file *src,
return status;
}

+ if ((!res->synchronous || !args->sync) &&
+ res->write_res.verifier.committed != NFS_FILE_SYNC) {
+ status = process_copy_commit(dst, pos_dst, res);
+ if (status)
+ return status;
+ }
+
truncate_pagecache_range(dst_inode, pos_dst,
pos_dst + res->write_res.count);

diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 7303a6f..6f113f0 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -483,7 +483,7 @@ extern int nfs4_sequence_done(struct rpc_task *task,
struct nfs4_sequence_res *res);

extern void nfs4_free_lock_state(struct nfs_server *server, struct nfs4_lock_state *lsp);
-
+extern int nfs4_proc_commit(struct file *dst, __u64 offset, __u32 count, struct nfs_commitres *res);
extern const nfs4_stateid zero_stateid;
extern const nfs4_stateid invalid_stateid;

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index dcd08ba..c8b554a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -4972,6 +4972,40 @@ static void nfs4_proc_commit_setup(struct nfs_commit_data *data, struct rpc_mess
nfs4_init_sequence(&data->args.seq_args, &data->res.seq_res, 1);
}

+static int _nfs4_proc_commit(struct file *dst, struct nfs_commitargs *args,
+ struct nfs_commitres *res)
+{
+ struct inode *dst_inode = file_inode(dst);
+ struct nfs_server *server = NFS_SERVER(dst_inode);
+ struct rpc_message msg = {
+ .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_COMMIT],
+ .rpc_argp = args,
+ .rpc_resp = res,
+ };
+
+ args->fh = NFS_FH(dst_inode);
+ return nfs4_call_sync(server->client, server, &msg,
+ &args->seq_args, &res->seq_res, 1);
+}
+
+int nfs4_proc_commit(struct file *dst, __u64 offset, __u32 count, struct nfs_commitres *res)
+{
+ struct nfs_commitargs args = {
+ .offset = offset,
+ .count = count,
+ };
+ struct nfs_server *dst_server = NFS_SERVER(file_inode(dst));
+ struct nfs4_exception exception = { };
+ int status;
+
+ do {
+ status = _nfs4_proc_commit(dst, &args, res);
+ status = nfs4_handle_exception(dst_server, status, &exception);
+ } while (exception.retry);
+
+ return status;
+}
+
struct nfs4_renewdata {
struct nfs_client *client;
unsigned long timestamp;
--
1.8.3.1


2018-04-20 17:30:19

by Anna Schumaker

[permalink] [raw]
Subject: Re: [PATCH v8 00/13] NFS support for async intra COPY

Hi Olga,

On 04/13/2018 01:03 PM, Olga Kornievskaia wrote:
> Change the client side to always request an asynchronous copy.
>
> If server replies to the COPY with the copy stateid, client will go
> wait on the CB_OFFLOAD. To fend off the race between CB_OFFLOAD and
> COPY reply, we check the list of pending callbacks before going to
> wait. Client adds the copy to the global list of copy stateids for the
> callback to look thru and signal the waiting copy.
>
> When the client receives reply from the CB_OFFLOAD with some bytes and
> committed how is UNSTABLE, then COMMIT is sent to the server. The results
> are propagated to the VFS and application. Assuming that application
> will deal with a partial result and continue from the new offset if needed.
>
> If server errs with ERR_OFFLOAD_NOREQS the copy will be re-sent as a
> synchronous COPY. If application cancels an in-flight COPY,
> OFFLOAD_CANCEL is sent to the server (sending it as an async RPC in the
> context of the nfsid_workqueue).
>
> v8.
> - added patch 13 to handle reboot recovery of the server or network
> partition. this patch was a part of the initial "inter async" series
> but never got added to the async series. without this patch if the
> server goes away, thread that was doing the copy would be hanging
> waiting for the callback that would never arrive.
>
>
>
> Anna Schumaker (1):
> fs: Don't copy beyond the end of the file
>
> Olga Kornievskaia (12):
> NFS CB_OFFLOAD xdr
> NFS OFFLOAD_STATUS xdr
> NFS OFFLOAD_STATUS op

I'm still not sold on adding in offload status on the client without a way to use it. Can you rearrange this series so offload status is at the very end? That way we can add it in later, if there is ever an interface for copy statuses.

Anna

> NFS OFFLOAD_CANCEL xdr
> NFS COPY xdr handle async reply
> NFS add support for asynchronous COPY
> NFS handle COPY reply CB_OFFLOAD call race
> NFS export nfs4_async_handle_error
> NFS send OFFLOAD_CANCEL when COPY killed
> NFS handle COPY ERR_OFFLOAD_NO_REQS
> NFS add a simple sync nfs4_proc_commit after async COPY
> NFS recover from destination server reboot for copies
>
> fs/nfs/callback.h | 12 +++
> fs/nfs/callback_proc.c | 54 ++++++++++
> fs/nfs/callback_xdr.c | 81 ++++++++++++++-
> fs/nfs/client.c | 1 +
> fs/nfs/nfs42.h | 5 +-
> fs/nfs/nfs42proc.c | 251 ++++++++++++++++++++++++++++++++++++++++++++--
> fs/nfs/nfs42xdr.c | 189 +++++++++++++++++++++++++++++++---
> fs/nfs/nfs4_fs.h | 8 +-
> fs/nfs/nfs4client.c | 15 +++
> fs/nfs/nfs4file.c | 9 +-
> fs/nfs/nfs4proc.c | 38 ++++++-
> fs/nfs/nfs4state.c | 15 +++
> fs/nfs/nfs4xdr.c | 2 +
> fs/read_write.c | 3 +
> include/linux/nfs4.h | 2 +
> include/linux/nfs_fs.h | 11 ++
> include/linux/nfs_fs_sb.h | 4 +
> include/linux/nfs_xdr.h | 14 +++
> 18 files changed, 689 insertions(+), 25 deletions(-)
>