2022-07-15 18:57:22

by Anna Schumaker

Subject: [PATCH v3 0/6] NFSD: Improvements for the NFSv4.2 READ_PLUS operation

From: Anna Schumaker <[email protected]>

The main motivation for this patchset is fixing generic/091 and
generic/263 with READ_PLUS. These tests appear to be failing due to
files getting modified in the middle of reply encoding. Attempts to lock
the file for the entire encode result in a deadlock, since llseek() and
read() both need the file lock.

The solution is to read everything from disk at once, and then check if
each buffer page is all zeroes or not. As a bonus, this lets us support
READ_PLUS hole segments on filesystems that don't track sparse files.
This also solves the performance issues I hit when testing with btrfs
on a virtual machine.
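
The read-then-scan approach described above can be illustrated with a small userspace sketch. This is not the code from patch 6/6: the names `SKETCH_PAGE_SIZE`, `struct segment`, `page_is_zero()` and `classify_segments()` are hypothetical, and the sketch assumes the whole read result is already in one contiguous buffer. It walks the buffer one page at a time, treats an all-zero page as a hole, and merges neighbouring pages of the same kind into one segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

enum seg_type { SEG_DATA, SEG_HOLE };

/* Hypothetical description of one run of pages in the read result. */
struct segment {
	enum seg_type type;
	size_t offset;	/* byte offset into the buffer */
	size_t length;	/* length of the run in bytes */
};

/* Return true if every byte in page[0..len) is zero. */
static bool page_is_zero(const unsigned char *page, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (page[i])
			return false;
	return true;
}

/*
 * Split buf[0..len) into alternating DATA and HOLE segments, one page at
 * a time. Adjacent pages of the same kind are merged. Returns the number
 * of segments written; scanning stops early if segs[] fills up.
 */
static size_t classify_segments(const unsigned char *buf, size_t len,
				struct segment *segs, size_t max_segs)
{
	size_t nsegs = 0;

	assert(segs != NULL || max_segs == 0);

	for (size_t off = 0; off < len; off += SKETCH_PAGE_SIZE) {
		size_t chunk = len - off < SKETCH_PAGE_SIZE ?
			       len - off : SKETCH_PAGE_SIZE;
		enum seg_type type = page_is_zero(buf + off, chunk) ?
				     SEG_HOLE : SEG_DATA;

		/* Extend the previous segment if the kind did not change. */
		if (nsegs > 0 && segs[nsegs - 1].type == type) {
			segs[nsegs - 1].length += chunk;
			continue;
		}
		if (nsegs == max_segs)
			break;
		segs[nsegs].type = type;
		segs[nsegs].offset = off;
		segs[nsegs].length = chunk;
		nsegs++;
	}
	return nsegs;
}
```

Because the classification looks only at the bytes that were read, it does not depend on the filesystem reporting holes, which is why the same scan works on filesystems that don't track sparse files.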

I created a wiki page with the results of my performance testing here:
https://wiki.linux-nfs.org/wiki/index.php/Read_Plus_May_2022

These patches should probably go in before the related client changes,
since the client will also be changed to use the
xdr_stream_move_subsegment() function.

Changed in v3:
- Respond to as many of Chuck's comments as possible

Changed in v2:
- Update to v5.19-rc6
- Rename xdr_stream_move_segment() -> xdr_stream_move_subsegment()

Thoughts?
Anna


Anna Schumaker (6):
  SUNRPC: Introduce xdr_stream_move_subsegment()
  SUNRPC: Introduce xdr_encode_double()
  SUNRPC: Introduce xdr_buf_trim_head()
  SUNRPC: Introduce xdr_buf_nth_page_address()
  SUNRPC: Export xdr_buf_pagecount()
  NFSD: Repeal and replace the READ_PLUS implementation

 fs/nfsd/nfs4xdr.c          | 219 ++++++++++++++++++++-----------------
 include/linux/sunrpc/xdr.h |   6 +
 net/sunrpc/xdr.c           | 102 +++++++++++++++++
 3 files changed, 227 insertions(+), 100 deletions(-)

--
2.37.1


2022-07-15 18:57:22

by Anna Schumaker

Subject: [PATCH v3 3/6] SUNRPC: Introduce xdr_buf_trim_head()

From: Anna Schumaker <[email protected]>

The READ_PLUS operation uses a 32-bit length field for encoding a DATA
segment, but a 64-bit length field for encoding a HOLE segment. When
setting up our reply buffer, we need to reserve enough space to encode
a HOLE segment before reading the file data. If the first segment turns
out to be DATA, this function trims the unused space off the end of the
buffer's head.

Signed-off-by: Anna Schumaker <[email protected]>
---
 include/linux/sunrpc/xdr.h |  1 +
 net/sunrpc/xdr.c           | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index e26047d474b2..bdaf048edde0 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -191,6 +191,7 @@ xdr_adjust_iovec(struct kvec *iov, __be32 *p)
extern void xdr_shift_buf(struct xdr_buf *, size_t);
extern void xdr_buf_from_iov(const struct kvec *, struct xdr_buf *);
extern int xdr_buf_subsegment(const struct xdr_buf *, struct xdr_buf *, unsigned int, unsigned int);
+extern void xdr_buf_trim_head(struct xdr_buf *, unsigned int);
extern void xdr_buf_trim(struct xdr_buf *, unsigned int);
extern int read_bytes_from_xdr_buf(const struct xdr_buf *, unsigned int, void *, unsigned int);
extern int write_bytes_to_xdr_buf(const struct xdr_buf *, unsigned int, void *, unsigned int);
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 63d9cdc989da..37956a274f81 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -1739,6 +1739,23 @@ unsigned int xdr_stream_move_subsegment(struct xdr_stream *xdr, unsigned int off
}
EXPORT_SYMBOL_GPL(xdr_stream_move_subsegment);

+/**
+ * xdr_buf_trim_head - lop at most "len" bytes off the end of "buf"->head
+ * @buf: buf to be trimmed
+ * @len: number of bytes to reduce "buf"->head by
+ *
+ * Trim an xdr_buf->head by the given number of bytes by fixing up the lengths.
+ * Note that it's possible that we'll trim less than that amount if the
+ * xdr_buf->head is too small.
+ */
+void xdr_buf_trim_head(struct xdr_buf *buf, unsigned int len)
+{
+	size_t trim = min_t(size_t, buf->head[0].iov_len, len);
+	buf->head[0].iov_len -= trim;
+	buf->len -= trim;
+}
+EXPORT_SYMBOL_GPL(xdr_buf_trim_head);
+
/**
* xdr_buf_trim - lop at most "len" bytes off the end of "buf"
* @buf: buf to be trimmed
--
2.37.1
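
The length bookkeeping done by xdr_buf_trim_head() in the patch above can be modelled outside the kernel. The following is a minimal userspace sketch, not kernel code: `struct sketch_kvec`, `struct sketch_xdr_buf` and `sketch_trim_head()` are hypothetical stand-ins that keep only the fields the patch touches, and the return value is added here for illustration (the kernel function returns void).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvec. */
struct sketch_kvec {
	void *iov_base;
	size_t iov_len;
};

/* Hypothetical stand-in for struct xdr_buf, reduced to head[] and len. */
struct sketch_xdr_buf {
	struct sketch_kvec head[1];
	unsigned int len;	/* total length of the buffer in bytes */
};

/*
 * Mirror of the logic in the patch: shrink head[0] by at most "len"
 * bytes and reduce the total length by the same amount. Returns the
 * number of bytes actually trimmed (illustration only).
 */
static size_t sketch_trim_head(struct sketch_xdr_buf *buf, unsigned int len)
{
	size_t trim;

	assert(buf != NULL);

	/* Never trim more than the head currently holds. */
	trim = buf->head[0].iov_len < len ? buf->head[0].iov_len : len;
	buf->head[0].iov_len -= trim;
	buf->len -= (unsigned int)trim;
	return trim;
}
```

As a usage assumption, a caller that reserved room for a 64-bit HOLE length but ends up encoding a DATA segment with a 32-bit length would trim the 4-byte difference, for example `sketch_trim_head(&buf, 4)`. A request larger than the head is clamped, which matches the note in the kernel-doc that less than the requested amount may be trimmed.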