2022-05-16 21:45:16

by Anna Schumaker

[permalink] [raw]
Subject: [PATCH v1 0/5] NFS: Improvements for the NFSv4.2 READ_PLUS

From: Anna Schumaker <[email protected]>

Previously, decoding was a one step process that expanded holes as they
were seen in the buffer. This had a few undesireable side effects:

1) An extra READ_PLUS call was often needed to fetch any data shifted
off the end of the buffer when the last two segments are a HOLE
followed by DATA
2) We shifted the entire remaining buffer for each hole, meaning some
segments would get moved multiple times during one decode pass.

These patches attempt to address this by turning READ_PLUS decoding into
a two-step process. First, we build up a list of each segment returned
by the server, then we walk the list in reverse and move each segment
directly to their target offset. This cuts out all the extra copying,
and means we won't lose any data off of the end of the reply.

The results of my performance testing can be found here:
https://wiki.linux-nfs.org/wiki/index.php/Read_Plus_May_2022

Between these patches and the corresponding server patches, I'm seeing a
several second decrease in the amount of time that a READ_PLUS call
takes to complete even on the worst case test where pages alternate
between hole and data segments.

I also optimistically remove the CONFIG_NFS_V4_2_READ_PLUS Kconfig
option now that the known performance and correctness issues have been
resolved, but I would also be fine with changing it to default to 'y' if
there are objections to entirely dropping the option.

Please note that these patches rely on the xdr_stream_move_segment()
function added as the first patch of the corresponding server patches.

Thoughts?
Anna


Anna Schumaker (5):
SUNRPC: Add a function for directly setting the xdr page len
SUNRPC: Add a function for zeroing out a portion of an xdr_stream
NFS: Replace the READ_PLUS decoding code
SUNRPC: Remove xdr_align_data() and xdr_expand_hole()
NFS: Remove the CONFIG_NFS_V4_2_READ_PLUS Kconfig option

fs/nfs/Kconfig | 9 --
fs/nfs/nfs42xdr.c | 168 +++++++++++++++++++------------------
fs/nfs/nfs4proc.c | 2 +-
include/linux/sunrpc/xdr.h | 5 +-
net/sunrpc/xdr.c | 111 +++++++++++-------------
5 files changed, 140 insertions(+), 155 deletions(-)

--
2.36.1



2022-05-16 22:59:29

by Anna Schumaker

[permalink] [raw]
Subject: [PATCH v1 2/5] SUNRPC: Add a function for zeroing out a portion of an xdr_stream

From: Anna Schumaker <[email protected]>

This will be used during READ_PLUS decoding for handling HOLE segments.

Signed-off-by: Anna Schumaker <[email protected]>
---
include/linux/sunrpc/xdr.h | 2 ++
net/sunrpc/xdr.c | 23 +++++++++++++++++++++++
2 files changed, 25 insertions(+)

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index 607340fa1fd4..d632fd170bf6 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -269,6 +269,8 @@ extern bool xdr_stream_subsegment(struct xdr_stream *xdr, struct xdr_buf *subbuf
unsigned int len);
extern unsigned int xdr_stream_move_segment(struct xdr_stream *xdr, unsigned int offset,
unsigned int target, unsigned int length);
+extern unsigned int xdr_stream_zero(struct xdr_stream *xdr, unsigned int offset,
+ unsigned int length);

/**
* xdr_set_scratch_buffer - Attach a scratch buffer for decoding data.
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index ff36a20aab89..7c7c4d360950 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -1795,6 +1795,29 @@ void xdr_buf_trim_head(struct xdr_buf *buf, unsigned int len)
}
EXPORT_SYMBOL_GPL(xdr_buf_trim_head);

+/**
+ * xdr_stream_zero - zero out a portion of an xdr_stream
+ * @xdr: an xdr_stream to zero out
+ * @offset: the starting point in the stream
+ * @length: the number of bytes to zero
+ */
+unsigned int xdr_stream_zero(struct xdr_stream *xdr, unsigned int offset,
+ unsigned int length)
+{
+ struct xdr_buf buf;
+
+ if (xdr_buf_subsegment(xdr->buf, &buf, offset, length) < 0)
+ return 0;
+ if (buf.head[0].iov_len)
+ xdr_buf_iov_zero(buf.head, 0, buf.head[0].iov_len);
+ if (buf.page_len > 0)
+ xdr_buf_pages_zero(&buf, 0, buf.page_len);
+ if (buf.tail[0].iov_len)
+ xdr_buf_iov_zero(buf.tail, 0, buf.tail[0].iov_len);
+ return length;
+}
+EXPORT_SYMBOL_GPL(xdr_stream_zero);
+
/**
* xdr_buf_trim - lop at most "len" bytes off the end of "buf"
* @buf: buf to be trimmed
--
2.36.1


2022-05-17 00:58:11

by Anna Schumaker

[permalink] [raw]
Subject: [PATCH v1 5/5] NFS: Remove the CONFIG_NFS_V4_2_READ_PLUS Kconfig option

From: Anna Schumaker <[email protected]>

This option was added due to a few failing xfstests and performance
issues that were found during testing. Both of these have now been
addressed, meaning it should be okay to remove the option.

Signed-off-by: Anna Schumaker <[email protected]>
---
fs/nfs/Kconfig | 9 ---------
fs/nfs/nfs4proc.c | 2 +-
2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig
index 14a72224b657..5dcd84ce1c0c 100644
--- a/fs/nfs/Kconfig
+++ b/fs/nfs/Kconfig
@@ -205,12 +205,3 @@ config NFS_DISABLE_UDP_SUPPORT
Choose Y here to disable the use of NFS over UDP. NFS over UDP
on modern networks (1Gb+) can lead to data corruption caused by
fragmentation during high loads.
-
-config NFS_V4_2_READ_PLUS
- bool "NFS: Enable support for the NFSv4.2 READ_PLUS operation"
- depends on NFS_V4_2
- default n
- help
- This is intended for developers only. The READ_PLUS operation has
- been shown to have issues under specific conditions and should not
- be used in production.
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index a79f66432bd3..114ae7673e9a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5436,7 +5436,7 @@ static int nfs4_read_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
nfs4_read_done_cb(task, hdr);
}

-#if defined CONFIG_NFS_V4_2 && defined CONFIG_NFS_V4_2_READ_PLUS
+#if defined CONFIG_NFS_V4_2
static void nfs42_read_plus_support(struct nfs_pgio_header *hdr,
struct rpc_message *msg)
{
--
2.36.1


2022-05-17 02:00:51

by Anna Schumaker

[permalink] [raw]
Subject: [PATCH v1 4/5] SUNRPC: Remove xdr_align_data() and xdr_expand_hole()

From: Anna Schumaker <[email protected]>

These functions are no longer needed now that the NFS client places data
and hole segments directly.

Signed-off-by: Anna Schumaker <[email protected]>
---
include/linux/sunrpc/xdr.h | 2 --
net/sunrpc/xdr.c | 66 --------------------------------------
2 files changed, 68 deletions(-)

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index d632fd170bf6..05a1a8b459b0 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -263,8 +263,6 @@ extern unsigned int xdr_read_pages(struct xdr_stream *xdr, unsigned int len);
extern void xdr_enter_page(struct xdr_stream *xdr, unsigned int len);
extern int xdr_process_buf(const struct xdr_buf *buf, unsigned int offset, unsigned int len, int (*actor)(struct scatterlist *, void *), void *data);
extern void xdr_set_pagelen(struct xdr_stream *, unsigned int len);
-extern unsigned int xdr_align_data(struct xdr_stream *, unsigned int offset, unsigned int length);
-extern unsigned int xdr_expand_hole(struct xdr_stream *, unsigned int offset, unsigned int length);
extern bool xdr_stream_subsegment(struct xdr_stream *xdr, struct xdr_buf *subbuf,
unsigned int len);
extern unsigned int xdr_stream_move_segment(struct xdr_stream *xdr, unsigned int offset,
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 7c7c4d360950..49f98c95d1a7 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -1539,72 +1539,6 @@ void xdr_set_pagelen(struct xdr_stream *xdr, unsigned int len)
}
EXPORT_SYMBOL_GPL(xdr_set_pagelen);

-unsigned int xdr_align_data(struct xdr_stream *xdr, unsigned int offset,
- unsigned int length)
-{
- struct xdr_buf *buf = xdr->buf;
- unsigned int from, bytes, len;
- unsigned int shift;
-
- xdr_realign_pages(xdr);
- from = xdr_page_pos(xdr);
-
- if (from >= buf->page_len + buf->tail->iov_len)
- return 0;
- if (from + buf->head->iov_len >= buf->len)
- return 0;
-
- len = buf->len - buf->head->iov_len;
-
- /* We only shift data left! */
- if (WARN_ONCE(from < offset, "SUNRPC: misaligned data src=%u dst=%u\n",
- from, offset))
- return 0;
- if (WARN_ONCE(offset > buf->page_len,
- "SUNRPC: buffer overflow. offset=%u, page_len=%u\n",
- offset, buf->page_len))
- return 0;
-
- /* Move page data to the left */
- shift = from - offset;
- xdr_buf_pages_shift_left(buf, from, len, shift);
-
- bytes = xdr_stream_remaining(xdr);
- if (length > bytes)
- length = bytes;
- bytes -= length;
-
- xdr->buf->len -= shift;
- xdr_set_page(xdr, offset + length, bytes);
- return length;
-}
-EXPORT_SYMBOL_GPL(xdr_align_data);
-
-unsigned int xdr_expand_hole(struct xdr_stream *xdr, unsigned int offset,
- unsigned int length)
-{
- struct xdr_buf *buf = xdr->buf;
- unsigned int from, to, shift;
-
- xdr_realign_pages(xdr);
- from = xdr_page_pos(xdr);
- to = xdr_align_size(offset + length);
-
- /* Could the hole be behind us? */
- if (to > from) {
- unsigned int buflen = buf->len - buf->head->iov_len;
- shift = to - from;
- xdr_buf_try_expand(buf, shift);
- xdr_buf_pages_shift_right(buf, from, buflen, shift);
- xdr_set_page(xdr, to, xdr_stream_remaining(xdr));
- } else if (to != from)
- xdr_align_data(xdr, to, 0);
- xdr_buf_pages_zero(buf, offset, length);
-
- return length;
-}
-EXPORT_SYMBOL_GPL(xdr_expand_hole);
-
/**
* xdr_enter_page - decode data from the XDR page
* @xdr: pointer to xdr_stream struct
--
2.36.1


2022-05-17 03:39:48

by Anna Schumaker

[permalink] [raw]
Subject: [PATCH v1 1/5] SUNRPC: Add a function for directly setting the xdr page len

From: Anna Schumaker <[email protected]>

We need to do this step during READ_PLUS decoding so that we know pages
are the right length and any extra data has been preserved in the tail.

Signed-off-by: Anna Schumaker <[email protected]>
---
include/linux/sunrpc/xdr.h | 1 +
net/sunrpc/xdr.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 31 insertions(+)

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index b375b284afbe..607340fa1fd4 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -262,6 +262,7 @@ extern __be32 *xdr_inline_decode(struct xdr_stream *xdr, size_t nbytes);
extern unsigned int xdr_read_pages(struct xdr_stream *xdr, unsigned int len);
extern void xdr_enter_page(struct xdr_stream *xdr, unsigned int len);
extern int xdr_process_buf(const struct xdr_buf *buf, unsigned int offset, unsigned int len, int (*actor)(struct scatterlist *, void *), void *data);
+extern void xdr_set_pagelen(struct xdr_stream *, unsigned int len);
extern unsigned int xdr_align_data(struct xdr_stream *, unsigned int offset, unsigned int length);
extern unsigned int xdr_expand_hole(struct xdr_stream *, unsigned int offset, unsigned int length);
extern bool xdr_stream_subsegment(struct xdr_stream *xdr, struct xdr_buf *subbuf,
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index d71c90552fa2..ff36a20aab89 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -1509,6 +1509,36 @@ unsigned int xdr_read_pages(struct xdr_stream *xdr, unsigned int len)
}
EXPORT_SYMBOL_GPL(xdr_read_pages);

+/**
+ * xdr_set_pagelen - Sets the length of the XDR pages
+ * @xdr: pointer to xdr_stream struct
+ * @len: new length of the XDR page data
+ *
+ * Either grows or shrinks the length of the xdr pages by setting pagelen to
+ * @len bytes. When shrinking, any extra data is moved into buf->tail, whereas
+ * when growing any data beyond the current pointer is moved into the tail.
+ *
+ * Returns True if the operation was successful, and False otherwise.
+ */
+void xdr_set_pagelen(struct xdr_stream *xdr, unsigned int len)
+{
+ struct xdr_buf *buf = xdr->buf;
+ size_t remaining = xdr_stream_remaining(xdr);
+ size_t base = 0;
+
+ if (len < buf->page_len) {
+ base = buf->page_len - len;
+ xdr_shrink_pagelen(buf, len);
+ } else {
+ xdr_buf_head_shift_right(buf, xdr_stream_pos(xdr),
+ buf->page_len, remaining);
+ if (len > buf->page_len)
+ xdr_buf_try_expand(buf, len - buf->page_len);
+ }
+ xdr_set_tail_base(xdr, base, remaining);
+}
+EXPORT_SYMBOL_GPL(xdr_set_pagelen);
+
unsigned int xdr_align_data(struct xdr_stream *xdr, unsigned int offset,
unsigned int length)
{
--
2.36.1