2020-09-28 17:10:35

by Anna Schumaker

Subject: [PATCH v6 00/10] NFS: Add support for the v4.2 READ_PLUS operation

From: Anna Schumaker <[email protected]>

These patches add client support for the READ_PLUS operation, where the
server breaks a read reply into a series of "data" and "hole" segments
instead of returning one flat blob of file data.
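
As a rough mental model of what the client does with such a reply, here
is a self-contained userspace sketch (not the kernel implementation; the
real decoding lives in fs/nfs/nfs42xdr.c and net/sunrpc/xdr.c below):

#include <stdint.h>
#include <string.h>

enum seg_type { SEG_DATA, SEG_HOLE }; /* models NFS4_CONTENT_DATA/_HOLE */

struct segment {
	enum seg_type type;
	uint64_t offset;   /* file offset this segment describes */
	uint64_t length;   /* bytes of data, or size of the hole */
	const void *data;  /* payload for SEG_DATA, NULL for holes */
};

static void apply_segments(char *dst, uint64_t read_offset,
			   const struct segment *segs, int nsegs)
{
	int i;

	for (i = 0; i < nsegs; i++) {
		char *where = dst + (segs[i].offset - read_offset);

		if (segs[i].type == SEG_DATA)
			memcpy(where, segs[i].data, segs[i].length);
		else
			memset(where, 0, segs[i].length); /* zero-fill hole */
	}
}

The win is that a hole segment is just an (offset, length) pair on the
wire, so sparse ranges get zero-filled locally instead of transferred.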

Changes since v5:
- Make sure we disable READ_PLUS over RDMA, at Chuck's request
- Update to v5.9-rc7

Here are the results of some performance tests I ran on some lab
machines. I tested by reading various 2G files from a few different underlying
filesystems and across several NFS versions, using the `vmtouch` utility
to make sure files were only cached when we wanted them to be. In addition
to 100% data and 100% hole cases, I also tested with files that alternate
between data and hole segments. These files have either 4K, 8K, 16K, or 32K
segment sizes and start with either data or hole segments. So the file
mixed-4d has a 4K segment size beginning with a data segment, while mixed-32h
has 32K segments beginning with a hole. Each line below shows the total read
time, throughput, kernel (sys) time, and client CPU percentage, with an
"Uncached" row for a cold server cache and a "Cached" row for when the file
is already in the server's page cache.
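
For reference, a file with that layout can be generated with something
like the sketch below (illustrative only; the filename, pattern byte,
and parameters are assumptions, not the exact tooling behind these
numbers):

#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	const size_t seg = 4096;       /* 4K segments, i.e. "mixed-4*" */
	const int start_with_data = 1; /* 1 = mixed-4d, 0 = mixed-4h */
	const off_t size = 2LL * 1024 * 1024 * 1024;
	char *buf = malloc(seg);
	int fd = open("mixed-4d", O_CREAT | O_TRUNC | O_WRONLY, 0644);
	off_t off;

	if (fd < 0 || !buf)
		return 1;
	memset(buf, 0xaa, seg);
	for (off = 0; off < size; off += seg) {
		/* even-numbered segments are data iff we start with data */
		int is_data = (((off / seg) % 2) == 0) == start_with_data;

		if (is_data && pwrite(fd, buf, seg, off) != (ssize_t)seg)
			return 1;
		/* hole segments are simply never written */
	}
	if (ftruncate(fd, size) < 0) /* extend past a trailing hole */
		return 1;
	return close(fd);
}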

I added some extra data collection (client CPU percentage and sys time),
but the extra columns mean I couldn't figure out a way to break this down
into a concise side-by-side table. I cut out the v3 and v4.0 performance
numbers to get the size down, but I kept v4.1 for comparison because it
exercises the same code path that v4.2 uses when READ_PLUS is disabled.


Read Plus Results (ext4):
data
:... v4.1 ... Uncached ... 20.540 s, 105 MB/s, 0.65 s kern, 3% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.70 s kern, 3% cpu
:... v4.2 ... Uncached ... 20.605 s, 104 MB/s, 0.65 s kern, 3% cpu
:....... Cached ..... 18.253 s, 118 MB/s, 0.67 s kern, 3% cpu
hole
:... v4.1 ... Uncached ... 18.255 s, 118 MB/s, 0.72 s kern, 3% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.72 s kern, 3% cpu
:... v4.2 ... Uncached ... 0.847 s, 2.5 GB/s, 0.73 s kern, 86% cpu
:....... Cached ..... 0.845 s, 2.5 GB/s, 0.72 s kern, 85% cpu
mixed-4d
:... v4.1 ... Uncached ... 54.691 s, 39 MB/s, 0.75 s kern, 1% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.71 s kern, 3% cpu
:... v4.2 ... Uncached ... 51.587 s, 42 MB/s, 0.75 s kern, 1% cpu
:....... Cached ..... 9.215 s, 233 MB/s, 0.67 s kern, 7% cpu
mixed-8d
:... v4.1 ... Uncached ... 37.072 s, 58 MB/s, 0.67 s kern, 1% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.71 s kern, 3% cpu
:... v4.2 ... Uncached ... 33.259 s, 65 MB/s, 0.68 s kern, 2% cpu
:....... Cached ..... 9.172 s, 234 MB/s, 0.67 s kern, 7% cpu
mixed-16d
:... v4.1 ... Uncached ... 27.138 s, 79 MB/s, 0.73 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.71 s kern, 3% cpu
:... v4.2 ... Uncached ... 23.042 s, 93 MB/s, 0.73 s kern, 3% cpu
:....... Cached ..... 9.150 s, 235 MB/s, 0.66 s kern, 7% cpu
mixed-32d
:... v4.1 ... Uncached ... 25.326 s, 85 MB/s, 0.68 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.70 s kern, 3% cpu
:... v4.2 ... Uncached ... 21.125 s, 102 MB/s, 0.69 s kern, 3% cpu
:....... Cached ..... 9.140 s, 235 MB/s, 0.67 s kern, 7% cpu
mixed-4h
:... v4.1 ... Uncached ... 58.317 s, 37 MB/s, 0.75 s kern, 1% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.70 s kern, 3% cpu
:... v4.2 ... Uncached ... 51.878 s, 41 MB/s, 0.74 s kern, 1% cpu
:....... Cached ..... 9.215 s, 233 MB/s, 0.68 s kern, 7% cpu
mixed-8h
:... v4.1 ... Uncached ... 36.855 s, 58 MB/s, 0.68 s kern, 1% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.72 s kern, 3% cpu
:... v4.2 ... Uncached ... 29.457 s, 73 MB/s, 0.68 s kern, 2% cpu
:....... Cached ..... 9.172 s, 234 MB/s, 0.67 s kern, 7% cpu
mixed-16h
:... v4.1 ... Uncached ... 26.460 s, 81 MB/s, 0.74 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.71 s kern, 3% cpu
:... v4.2 ... Uncached ... 19.587 s, 110 MB/s, 0.74 s kern, 3% cpu
:....... Cached ..... 9.150 s, 235 MB/s, 0.67 s kern, 7% cpu
mixed-32h
:... v4.1 ... Uncached ... 25.495 s, 84 MB/s, 0.69 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.65 s kern, 3% cpu
:... v4.2 ... Uncached ... 17.634 s, 122 MB/s, 0.69 s kern, 3% cpu
:....... Cached ..... 9.140 s, 235 MB/s, 0.68 s kern, 7% cpu



Read Plus Results (xfs):
data
:... v4.1 ... Uncached ... 20.230 s, 106 MB/s, 0.65 s kern, 3% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.68 s kern, 3% cpu
:... v4.2 ... Uncached ... 20.724 s, 104 MB/s, 0.65 s kern, 3% cpu
:....... Cached ..... 18.253 s, 118 MB/s, 0.67 s kern, 3% cpu
hole
:... v4.1 ... Uncached ... 18.255 s, 118 MB/s, 0.68 s kern, 3% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.69 s kern, 3% cpu
:... v4.2 ... Uncached ... 0.904 s, 2.4 GB/s, 0.72 s kern, 79% cpu
:....... Cached ..... 0.908 s, 2.4 GB/s, 0.73 s kern, 80% cpu
mixed-4d
:... v4.1 ... Uncached ... 57.553 s, 37 MB/s, 0.77 s kern, 1% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.70 s kern, 3% cpu
:... v4.2 ... Uncached ... 37.162 s, 58 MB/s, 0.73 s kern, 1% cpu
:....... Cached ..... 9.215 s, 233 MB/s, 0.67 s kern, 7% cpu
mixed-8d
:... v4.1 ... Uncached ... 36.754 s, 58 MB/s, 0.69 s kern, 1% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.68 s kern, 3% cpu
:... v4.2 ... Uncached ... 24.454 s, 88 MB/s, 0.69 s kern, 2% cpu
:....... Cached ..... 9.172 s, 234 MB/s, 0.66 s kern, 7% cpu
mixed-16d
:... v4.1 ... Uncached ... 27.156 s, 79 MB/s, 0.73 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.71 s kern, 3% cpu
:... v4.2 ... Uncached ... 22.934 s, 94 MB/s, 0.72 s kern, 3% cpu
:....... Cached ..... 9.150 s, 235 MB/s, 0.68 s kern, 7% cpu
mixed-32d
:... v4.1 ... Uncached ... 27.849 s, 77 MB/s, 0.68 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.72 s kern, 3% cpu
:... v4.2 ... Uncached ... 23.670 s, 91 MB/s, 0.67 s kern, 2% cpu
:....... Cached ..... 9.139 s, 235 MB/s, 0.64 s kern, 7% cpu
mixed-4h
:... v4.1 ... Uncached ... 57.639 s, 37 MB/s, 0.72 s kern, 1% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.69 s kern, 3% cpu
:... v4.2 ... Uncached ... 35.503 s, 61 MB/s, 0.72 s kern, 2% cpu
:....... Cached ..... 9.215 s, 233 MB/s, 0.66 s kern, 7% cpu
mixed-8h
:... v4.1 ... Uncached ... 37.044 s, 58 MB/s, 0.71 s kern, 1% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.68 s kern, 3% cpu
:... v4.2 ... Uncached ... 23.779 s, 90 MB/s, 0.69 s kern, 2% cpu
:....... Cached ..... 9.172 s, 234 MB/s, 0.65 s kern, 7% cpu
mixed-16h
:... v4.1 ... Uncached ... 27.167 s, 79 MB/s, 0.73 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.67 s kern, 3% cpu
:... v4.2 ... Uncached ... 19.088 s, 113 MB/s, 0.75 s kern, 3% cpu
:....... Cached ..... 9.159 s, 234 MB/s, 0.66 s kern, 7% cpu
mixed-32h
:... v4.1 ... Uncached ... 27.592 s, 78 MB/s, 0.71 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.68 s kern, 3% cpu
:... v4.2 ... Uncached ... 19.682 s, 109 MB/s, 0.67 s kern, 3% cpu
:....... Cached ..... 9.140 s, 235 MB/s, 0.67 s kern, 7% cpu



Read Plus Results (btrfs):
data
:... v4.1 ... Uncached ... 21.317 s, 101 MB/s, 0.63 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.67 s kern, 3% cpu
:... v4.2 ... Uncached ... 28.665 s, 75 MB/s, 0.65 s kern, 2% cpu
:....... Cached ..... 18.253 s, 118 MB/s, 0.66 s kern, 3% cpu
hole
:... v4.1 ... Uncached ... 18.256 s, 118 MB/s, 0.70 s kern, 3% cpu
: :....... Cached ..... 18.254 s, 118 MB/s, 0.73 s kern, 4% cpu
:... v4.2 ... Uncached ... 0.851 s, 2.5 GB/s, 0.72 s kern, 84% cpu
:....... Cached ..... 0.847 s, 2.5 GB/s, 0.73 s kern, 86% cpu
mixed-4d
:... v4.1 ... Uncached ... 56.857 s, 38 MB/s, 0.76 s kern, 1% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.72 s kern, 3% cpu
:... v4.2 ... Uncached ... 54.455 s, 39 MB/s, 0.73 s kern, 1% cpu
:....... Cached ..... 9.215 s, 233 MB/s, 0.68 s kern, 7% cpu
mixed-8d
:... v4.1 ... Uncached ... 36.641 s, 59 MB/s, 0.68 s kern, 1% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.70 s kern, 3% cpu
:... v4.2 ... Uncached ... 33.205 s, 65 MB/s, 0.67 s kern, 2% cpu
:....... Cached ..... 9.172 s, 234 MB/s, 0.65 s kern, 7% cpu
mixed-16d
:... v4.1 ... Uncached ... 28.653 s, 75 MB/s, 0.72 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.70 s kern, 3% cpu
:... v4.2 ... Uncached ... 25.748 s, 83 MB/s, 0.71 s kern, 2% cpu
:....... Cached ..... 9.150 s, 235 MB/s, 0.64 s kern, 7% cpu
mixed-32d
:... v4.1 ... Uncached ... 28.886 s, 74 MB/s, 0.67 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.71 s kern, 3% cpu
:... v4.2 ... Uncached ... 24.724 s, 87 MB/s, 0.74 s kern, 2% cpu
:....... Cached ..... 9.140 s, 235 MB/s, 0.63 s kern, 6% cpu
mixed-4h
:... v4.1 ... Uncached ... 52.181 s, 41 MB/s, 0.73 s kern, 1% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.66 s kern, 3% cpu
:... v4.2 ... Uncached ... 150.341 s, 14 MB/s, 0.72 s kern, 0% cpu
:....... Cached ..... 9.216 s, 233 MB/s, 0.63 s kern, 6% cpu
mixed-8h
:... v4.1 ... Uncached ... 36.945 s, 58 MB/s, 0.68 s kern, 1% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.65 s kern, 3% cpu
:... v4.2 ... Uncached ... 79.781 s, 27 MB/s, 0.68 s kern, 0% cpu
:....... Cached ..... 9.172 s, 234 MB/s, 0.66 s kern, 7% cpu
mixed-16h
:... v4.1 ... Uncached ... 28.651 s, 75 MB/s, 0.73 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.66 s kern, 3% cpu
:... v4.2 ... Uncached ... 47.428 s, 45 MB/s, 0.71 s kern, 1% cpu
:....... Cached ..... 9.150 s, 235 MB/s, 0.67 s kern, 7% cpu
mixed-32h
:... v4.1 ... Uncached ... 28.618 s, 75 MB/s, 0.69 s kern, 2% cpu
: :....... Cached ..... 18.252 s, 118 MB/s, 0.70 s kern, 3% cpu
:... v4.2 ... Uncached ... 38.813 s, 55 MB/s, 0.67 s kern, 1% cpu
:....... Cached ..... 9.140 s, 235 MB/s, 0.61 s kern, 6% cpu



Thoughts?
Anna


Anna Schumaker (10):
SUNRPC: Split out a function for setting current page
SUNRPC: Implement an xdr_page_pos() function
NFS: Use xdr_page_pos() in NFSv4 decode_getacl()
NFS: Add READ_PLUS data segment support
SUNRPC: Split out xdr_realign_pages() from xdr_align_pages()
SUNRPC: Split out _shift_data_right_tail()
SUNRPC: Add the ability to expand holes in data pages
NFS: Add READ_PLUS hole segment decoding
SUNRPC: Add an xdr_align_data() function
NFS: Decode a full READ_PLUS reply

fs/nfs/nfs42xdr.c | 167 ++++++++++++++++++++
fs/nfs/nfs4client.c | 2 +
fs/nfs/nfs4proc.c | 43 +++++-
fs/nfs/nfs4xdr.c | 7 +-
include/linux/nfs4.h | 2 +-
include/linux/nfs_fs_sb.h | 1 +
include/linux/nfs_xdr.h | 2 +-
include/linux/sunrpc/xdr.h | 3 +
net/sunrpc/xdr.c | 309 ++++++++++++++++++++++++++++++++-----
9 files changed, 488 insertions(+), 48 deletions(-)

--
2.28.0


2020-09-28 17:10:39

by Anna Schumaker

Subject: [PATCH v6 09/10] SUNRPC: Add an xdr_align_data() function

From: Anna Schumaker <[email protected]>

For now, this function simply aligns the data at the beginning of the
pages. This can eventually be expanded to shift data to the correct
offsets when we're ready.
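
For context, the call pattern the decoder adopts once the final patch
lands looks roughly like this (a sketch of the caller, where res->count
is the running count of bytes decoded so far):

	/* Shift the next data segment so it starts at the offset we
	 * have decoded so far, then account for the bytes that actually
	 * landed in the page data. */
	recvd = xdr_align_data(xdr, res->count, count);
	res->count += recvd;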

Signed-off-by: Anna Schumaker <[email protected]>
---
include/linux/sunrpc/xdr.h | 1 +
net/sunrpc/xdr.c | 121 +++++++++++++++++++++++++++++++++++++
2 files changed, 122 insertions(+)

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index 36a81c29542e..9548d075e06d 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -252,6 +252,7 @@ extern __be32 *xdr_inline_decode(struct xdr_stream *xdr, size_t nbytes);
extern unsigned int xdr_read_pages(struct xdr_stream *xdr, unsigned int len);
extern void xdr_enter_page(struct xdr_stream *xdr, unsigned int len);
extern int xdr_process_buf(struct xdr_buf *buf, unsigned int offset, unsigned int len, int (*actor)(struct scatterlist *, void *), void *data);
+extern uint64_t xdr_align_data(struct xdr_stream *, uint64_t, uint32_t);
extern uint64_t xdr_expand_hole(struct xdr_stream *, uint64_t, uint64_t);

/**
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 24baf052e6e6..e799cbfe6b5a 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -19,6 +19,9 @@
#include <linux/bvec.h>
#include <trace/events/sunrpc.h>

+static void _copy_to_pages(struct page **, size_t, const char *, size_t);
+
+
/*
* XDR functions for basic NFS types
*/
@@ -201,6 +204,88 @@ EXPORT_SYMBOL_GPL(xdr_inline_pages);
* Helper routines for doing 'memmove' like operations on a struct xdr_buf
*/

+/**
+ * _shift_data_left_pages
+ * @pages: vector of pages containing both the source and dest memory area.
+ * @pgto_base: page vector address of destination
+ * @pgfrom_base: page vector address of source
+ * @len: number of bytes to copy
+ *
+ * Note: the addresses pgto_base and pgfrom_base are both calculated in
+ * the same way:
+ * if a memory area starts at byte 'base' in page 'pages[i]',
+ * then its address is given as (i << PAGE_SHIFT) + base
+ * Also note: pgto_base must be < pgfrom_base, but the memory areas
+ * they point to may overlap.
+ */
+static void
+_shift_data_left_pages(struct page **pages, size_t pgto_base,
+ size_t pgfrom_base, size_t len)
+{
+ struct page **pgfrom, **pgto;
+ char *vfrom, *vto;
+ size_t copy;
+
+ BUG_ON(pgfrom_base <= pgto_base);
+
+ pgto = pages + (pgto_base >> PAGE_SHIFT);
+ pgfrom = pages + (pgfrom_base >> PAGE_SHIFT);
+
+ pgto_base &= ~PAGE_MASK;
+ pgfrom_base &= ~PAGE_MASK;
+
+ do {
+ if (pgto_base >= PAGE_SIZE) {
+ pgto_base = 0;
+ pgto++;
+ }
+ if (pgfrom_base >= PAGE_SIZE) {
+ pgfrom_base = 0;
+ pgfrom++;
+ }
+
+ copy = len;
+ if (copy > (PAGE_SIZE - pgto_base))
+ copy = PAGE_SIZE - pgto_base;
+ if (copy > (PAGE_SIZE - pgfrom_base))
+ copy = PAGE_SIZE - pgfrom_base;
+
+ vto = kmap_atomic(*pgto);
+ if (*pgto != *pgfrom) {
+ vfrom = kmap_atomic(*pgfrom);
+ memcpy(vto + pgto_base, vfrom + pgfrom_base, copy);
+ kunmap_atomic(vfrom);
+ } else
+ memmove(vto + pgto_base, vto + pgfrom_base, copy);
+ flush_dcache_page(*pgto);
+ kunmap_atomic(vto);
+
+ pgto_base += copy;
+ pgfrom_base += copy;
+
+ } while ((len -= copy) != 0);
+}
+
+static void
+_shift_data_left_tail(struct xdr_buf *buf, unsigned int pgto, size_t len)
+{
+ struct kvec *tail = buf->tail;
+
+ if (len > tail->iov_len)
+ len = tail->iov_len;
+
+ _copy_to_pages(buf->pages,
+ buf->page_base + pgto,
+ (char *)tail->iov_base,
+ len);
+ tail->iov_len -= len;
+
+ if (tail->iov_len > 0)
+ memmove((char *)tail->iov_base,
+ tail->iov_base + len,
+ tail->iov_len);
+}
+
/**
* _shift_data_right_pages
* @pages: vector of pages containing both the source and dest memory area.
@@ -1177,6 +1262,42 @@ unsigned int xdr_read_pages(struct xdr_stream *xdr, unsigned int len)
}
EXPORT_SYMBOL_GPL(xdr_read_pages);

+uint64_t xdr_align_data(struct xdr_stream *xdr, uint64_t offset, uint32_t length)
+{
+ struct xdr_buf *buf = xdr->buf;
+ unsigned int from, bytes;
+ unsigned int shift = 0;
+
+ if ((offset + length) < offset ||
+ (offset + length) > buf->page_len)
+ length = buf->page_len - offset;
+
+ xdr_realign_pages(xdr);
+ from = xdr_page_pos(xdr);
+ bytes = xdr->nwords << 2;
+ if (length < bytes)
+ bytes = length;
+
+ /* Move page data to the left */
+ if (from > offset) {
+ shift = min_t(unsigned int, bytes, buf->page_len - from);
+ _shift_data_left_pages(buf->pages,
+ buf->page_base + offset,
+ buf->page_base + from,
+ shift);
+ bytes -= shift;
+
+ /* Move tail data into the pages, if necessary */
+ if (bytes > 0)
+ _shift_data_left_tail(buf, offset + shift, bytes);
+ }
+
+ xdr->nwords -= XDR_QUADLEN(length);
+ xdr_set_page(xdr, from + length, PAGE_SIZE);
+ return length;
+}
+EXPORT_SYMBOL_GPL(xdr_align_data);
+
uint64_t xdr_expand_hole(struct xdr_stream *xdr, uint64_t offset, uint64_t length)
{
struct xdr_buf *buf = xdr->buf;
--
2.28.0

2020-09-28 17:11:16

by Anna Schumaker

Subject: [PATCH v6 01/10] SUNRPC: Split out a function for setting current page

From: Anna Schumaker <[email protected]>

I'm going to need this bit of code in a few places for READ_PLUS
decoding, so let's make it a helper function.

Signed-off-by: Anna Schumaker <[email protected]>
---
net/sunrpc/xdr.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 6dfe5dc8b35f..c62b0882c0d8 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -870,6 +870,13 @@ static int xdr_set_page_base(struct xdr_stream *xdr,
return 0;
}

+static void xdr_set_page(struct xdr_stream *xdr, unsigned int base,
+ unsigned int len)
+{
+ if (xdr_set_page_base(xdr, base, len) < 0)
+ xdr_set_iov(xdr, xdr->buf->tail, xdr->nwords << 2);
+}
+
static void xdr_set_next_page(struct xdr_stream *xdr)
{
unsigned int newbase;
@@ -877,8 +884,7 @@ static void xdr_set_next_page(struct xdr_stream *xdr)
newbase = (1 + xdr->page_ptr - xdr->buf->pages) << PAGE_SHIFT;
newbase -= xdr->buf->page_base;

- if (xdr_set_page_base(xdr, newbase, PAGE_SIZE) < 0)
- xdr_set_iov(xdr, xdr->buf->tail, xdr->nwords << 2);
+ xdr_set_page(xdr, newbase, PAGE_SIZE);
}

static bool xdr_set_next_buffer(struct xdr_stream *xdr)
@@ -886,8 +892,7 @@ static bool xdr_set_next_buffer(struct xdr_stream *xdr)
if (xdr->page_ptr != NULL)
xdr_set_next_page(xdr);
else if (xdr->iov == xdr->buf->head) {
- if (xdr_set_page_base(xdr, 0, PAGE_SIZE) < 0)
- xdr_set_iov(xdr, xdr->buf->tail, xdr->nwords << 2);
+ xdr_set_page(xdr, 0, PAGE_SIZE);
}
return xdr->p != xdr->end;
}
--
2.28.0

2020-09-28 17:11:16

by Anna Schumaker

Subject: [PATCH v6 06/10] SUNRPC: Split out _shift_data_right_tail()

From: Anna Schumaker <[email protected]>

xdr_shrink_pagelen() is very similar to what we need for hole expansion,
so split out the common code into its own function that both callers can
use.

Signed-off-by: Anna Schumaker <[email protected]>
---
net/sunrpc/xdr.c | 68 +++++++++++++++++++++++++++++-------------------
1 file changed, 41 insertions(+), 27 deletions(-)

diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 70efb35c119e..d8c9555c6f2b 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -266,6 +266,46 @@ _shift_data_right_pages(struct page **pages, size_t pgto_base,
} while ((len -= copy) != 0);
}

+static unsigned int
+_shift_data_right_tail(struct xdr_buf *buf, unsigned int pgfrom, size_t len)
+{
+ struct kvec *tail = buf->tail;
+ unsigned int tailbuf_len;
+ unsigned int result = 0;
+ size_t copy;
+
+ tailbuf_len = buf->buflen - buf->head->iov_len - buf->page_len;
+
+ /* Shift the tail first */
+ if (tailbuf_len != 0) {
+ unsigned int free_space = tailbuf_len - tail->iov_len;
+
+ if (len < free_space)
+ free_space = len;
+ if (len > free_space)
+ len = free_space;
+
+ tail->iov_len += free_space;
+ copy = len;
+
+ if (tail->iov_len > len) {
+ char *p = (char *)tail->iov_base + len;
+ memmove(p, tail->iov_base, tail->iov_len - free_space);
+ result += tail->iov_len - free_space;
+ } else
+ copy = tail->iov_len;
+
+ /* Copy from the inlined pages into the tail */
+ _copy_from_pages((char *)tail->iov_base,
+ buf->pages,
+ buf->page_base + pgfrom,
+ copy);
+ result += copy;
+ }
+
+ return result;
+}
+
/**
* _copy_to_pages
* @pages: array of pages
@@ -446,39 +486,13 @@ xdr_shrink_bufhead(struct xdr_buf *buf, size_t len)
static unsigned int
xdr_shrink_pagelen(struct xdr_buf *buf, size_t len)
{
- struct kvec *tail;
- size_t copy;
unsigned int pglen = buf->page_len;
- unsigned int tailbuf_len;
unsigned int result;

- result = 0;
- tail = buf->tail;
if (len > buf->page_len)
len = buf-> page_len;
- tailbuf_len = buf->buflen - buf->head->iov_len - buf->page_len;

- /* Shift the tail first */
- if (tailbuf_len != 0) {
- unsigned int free_space = tailbuf_len - tail->iov_len;
-
- if (len < free_space)
- free_space = len;
- tail->iov_len += free_space;
-
- copy = len;
- if (tail->iov_len > len) {
- char *p = (char *)tail->iov_base + len;
- memmove(p, tail->iov_base, tail->iov_len - len);
- result += tail->iov_len - len;
- } else
- copy = tail->iov_len;
- /* Copy from the inlined pages into the tail */
- _copy_from_pages((char *)tail->iov_base,
- buf->pages, buf->page_base + pglen - len,
- copy);
- result += copy;
- }
+ result = _shift_data_right_tail(buf, pglen - len, len);
buf->page_len -= len;
buf->buflen -= len;
/* Have we truncated the message? */
--
2.28.0

2020-09-28 17:11:17

by Anna Schumaker

Subject: [PATCH v6 04/10] NFS: Add READ_PLUS data segment support

From: Anna Schumaker <[email protected]>

This patch adds client support for decoding a single NFS4_CONTENT_DATA
segment returned by the server. This is the simplest implementation
possible, since it does not account for any hole segments in the reply.
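
For reference, the reply this decodes looks like the following on the
wire (per RFC 7862; annotated sketch of the single-segment case):

	/* READ_PLUS reply with one data segment:
	 *   rpr_eof         4 bytes   bool
	 *   segment count   4 bytes   == 1 here
	 *   type            4 bytes   NFS4_CONTENT_DATA
	 *   di_offset       8 bytes   hyper
	 *   data<>          4-byte length + payload, read into pages
	 */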

Signed-off-by: Anna Schumaker <[email protected]>

---
v6: Disable READ_PLUS on RDMA
v5: Fix up for the xattr patches
---
fs/nfs/nfs42xdr.c | 141 ++++++++++++++++++++++++++++++++++++++
fs/nfs/nfs4client.c | 2 +
fs/nfs/nfs4proc.c | 43 +++++++++++-
fs/nfs/nfs4xdr.c | 1 +
include/linux/nfs4.h | 2 +-
include/linux/nfs_fs_sb.h | 1 +
include/linux/nfs_xdr.h | 2 +-
7 files changed, 187 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index cc50085e151c..930b4ca212c1 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -45,6 +45,15 @@
#define encode_deallocate_maxsz (op_encode_hdr_maxsz + \
encode_fallocate_maxsz)
#define decode_deallocate_maxsz (op_decode_hdr_maxsz)
+#define encode_read_plus_maxsz (op_encode_hdr_maxsz + \
+ encode_stateid_maxsz + 3)
+#define NFS42_READ_PLUS_SEGMENT_SIZE (1 /* data_content4 */ + \
+ 2 /* data_info4.di_offset */ + \
+ 2 /* data_info4.di_length */)
+#define decode_read_plus_maxsz (op_decode_hdr_maxsz + \
+ 1 /* rpr_eof */ + \
+ 1 /* rpr_contents count */ + \
+ NFS42_READ_PLUS_SEGMENT_SIZE)
#define encode_seek_maxsz (op_encode_hdr_maxsz + \
encode_stateid_maxsz + \
2 /* offset */ + \
@@ -128,6 +137,14 @@
decode_putfh_maxsz + \
decode_deallocate_maxsz + \
decode_getattr_maxsz)
+#define NFS4_enc_read_plus_sz (compound_encode_hdr_maxsz + \
+ encode_sequence_maxsz + \
+ encode_putfh_maxsz + \
+ encode_read_plus_maxsz)
+#define NFS4_dec_read_plus_sz (compound_decode_hdr_maxsz + \
+ decode_sequence_maxsz + \
+ decode_putfh_maxsz + \
+ decode_read_plus_maxsz)
#define NFS4_enc_seek_sz (compound_encode_hdr_maxsz + \
encode_sequence_maxsz + \
encode_putfh_maxsz + \
@@ -324,6 +341,16 @@ static void encode_deallocate(struct xdr_stream *xdr,
encode_fallocate(xdr, args);
}

+static void encode_read_plus(struct xdr_stream *xdr,
+ const struct nfs_pgio_args *args,
+ struct compound_hdr *hdr)
+{
+ encode_op_hdr(xdr, OP_READ_PLUS, decode_read_plus_maxsz, hdr);
+ encode_nfs4_stateid(xdr, &args->stateid);
+ encode_uint64(xdr, args->offset);
+ encode_uint32(xdr, args->count);
+}
+
static void encode_seek(struct xdr_stream *xdr,
const struct nfs42_seek_args *args,
struct compound_hdr *hdr)
@@ -722,6 +749,28 @@ static void nfs4_xdr_enc_deallocate(struct rpc_rqst *req,
encode_nops(&hdr);
}

+/*
+ * Encode READ_PLUS request
+ */
+static void nfs4_xdr_enc_read_plus(struct rpc_rqst *req,
+ struct xdr_stream *xdr,
+ const void *data)
+{
+ const struct nfs_pgio_args *args = data;
+ struct compound_hdr hdr = {
+ .minorversion = nfs4_xdr_minorversion(&args->seq_args),
+ };
+
+ encode_compound_hdr(xdr, req, &hdr);
+ encode_sequence(xdr, &args->seq_args, &hdr);
+ encode_putfh(xdr, args->fh, &hdr);
+ encode_read_plus(xdr, args, &hdr);
+
+ rpc_prepare_reply_pages(req, args->pages, args->pgbase,
+ args->count, hdr.replen);
+ encode_nops(&hdr);
+}
+
/*
* Encode SEEK request
*/
@@ -970,6 +1019,71 @@ static int decode_deallocate(struct xdr_stream *xdr, struct nfs42_falloc_res *re
return decode_op_hdr(xdr, OP_DEALLOCATE);
}

+static int decode_read_plus_data(struct xdr_stream *xdr, struct nfs_pgio_res *res,
+ uint32_t *eof)
+{
+ uint32_t count, recvd;
+ uint64_t offset;
+ __be32 *p;
+
+ p = xdr_inline_decode(xdr, 8 + 4);
+ if (unlikely(!p))
+ return -EIO;
+
+ p = xdr_decode_hyper(p, &offset);
+ count = be32_to_cpup(p);
+ recvd = xdr_read_pages(xdr, count);
+ res->count += recvd;
+
+ if (count > recvd) {
+ dprintk("NFS: server cheating in read reply: "
+ "count %u > recvd %u\n", count, recvd);
+ *eof = 0;
+ return 1;
+ }
+
+ return 0;
+}
+
+static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
+{
+ uint32_t eof, segments, type;
+ int status;
+ __be32 *p;
+
+ status = decode_op_hdr(xdr, OP_READ_PLUS);
+ if (status)
+ return status;
+
+ p = xdr_inline_decode(xdr, 4 + 4);
+ if (unlikely(!p))
+ return -EIO;
+
+ eof = be32_to_cpup(p++);
+ segments = be32_to_cpup(p++);
+ if (segments == 0)
+ goto out;
+
+ p = xdr_inline_decode(xdr, 4);
+ if (unlikely(!p))
+ return -EIO;
+
+ type = be32_to_cpup(p++);
+ if (type == NFS4_CONTENT_DATA)
+ status = decode_read_plus_data(xdr, res, &eof);
+ else
+ return -EINVAL;
+
+ if (status)
+ return status;
+ if (segments > 1)
+ eof = 0;
+
+out:
+ res->eof = eof;
+ return 0;
+}
+
static int decode_seek(struct xdr_stream *xdr, struct nfs42_seek_res *res)
{
int status;
@@ -1146,6 +1260,33 @@ static int nfs4_xdr_dec_deallocate(struct rpc_rqst *rqstp,
return status;
}

+/*
+ * Decode READ_PLUS request
+ */
+static int nfs4_xdr_dec_read_plus(struct rpc_rqst *rqstp,
+ struct xdr_stream *xdr,
+ void *data)
+{
+ struct nfs_pgio_res *res = data;
+ struct compound_hdr hdr;
+ int status;
+
+ status = decode_compound_hdr(xdr, &hdr);
+ if (status)
+ goto out;
+ status = decode_sequence(xdr, &res->seq_res, rqstp);
+ if (status)
+ goto out;
+ status = decode_putfh(xdr);
+ if (status)
+ goto out;
+ status = decode_read_plus(xdr, res);
+ if (!status)
+ status = res->count;
+out:
+ return status;
+}
+
/*
* Decode SEEK request
*/
diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index daacc78a3d48..be7915c861ce 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -1045,6 +1045,8 @@ static int nfs4_server_common_setup(struct nfs_server *server,
server->caps |= server->nfs_client->cl_mvops->init_caps;
if (server->flags & NFS_MOUNT_NORDIRPLUS)
server->caps &= ~NFS_CAP_READDIRPLUS;
+ if (server->nfs_client->cl_proto == XPRT_TRANSPORT_RDMA)
+ server->caps &= ~NFS_CAP_READ_PLUS;
/*
* Don't use NFS uid/gid mapping if we're using AUTH_SYS or lower
* authentication.
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 542961ffa529..1077e913aab6 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -70,6 +70,10 @@

#include "nfs4trace.h"

+#ifdef CONFIG_NFS_V4_2
+#include "nfs42.h"
+#endif /* CONFIG_NFS_V4_2 */
+
#define NFSDBG_FACILITY NFSDBG_PROC

#define NFS4_BITMASK_SZ 3
@@ -5259,28 +5263,60 @@ static bool nfs4_read_stateid_changed(struct rpc_task *task,
return true;
}

+static bool nfs4_read_plus_not_supported(struct rpc_task *task,
+ struct nfs_pgio_header *hdr)
+{
+ struct nfs_server *server = NFS_SERVER(hdr->inode);
+ struct rpc_message *msg = &task->tk_msg;
+
+ if (msg->rpc_proc == &nfs4_procedures[NFSPROC4_CLNT_READ_PLUS] &&
+ server->caps & NFS_CAP_READ_PLUS && task->tk_status == -ENOTSUPP) {
+ server->caps &= ~NFS_CAP_READ_PLUS;
+ msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ];
+ rpc_restart_call_prepare(task);
+ return true;
+ }
+ return false;
+}
+
static int nfs4_read_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
{
-
dprintk("--> %s\n", __func__);

if (!nfs4_sequence_done(task, &hdr->res.seq_res))
return -EAGAIN;
if (nfs4_read_stateid_changed(task, &hdr->args))
return -EAGAIN;
+ if (nfs4_read_plus_not_supported(task, hdr))
+ return -EAGAIN;
if (task->tk_status > 0)
nfs_invalidate_atime(hdr->inode);
return hdr->pgio_done_cb ? hdr->pgio_done_cb(task, hdr) :
nfs4_read_done_cb(task, hdr);
}

+#ifdef CONFIG_NFS_V4_2
+static void nfs42_read_plus_support(struct nfs_server *server, struct rpc_message *msg)
+{
+ if (server->caps & NFS_CAP_READ_PLUS)
+ msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ_PLUS];
+ else
+ msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ];
+}
+#else
+static void nfs42_read_plus_support(struct nfs_server *server, struct rpc_message *msg)
+{
+ msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ];
+}
+#endif /* CONFIG_NFS_V4_2 */
+
static void nfs4_proc_read_setup(struct nfs_pgio_header *hdr,
struct rpc_message *msg)
{
hdr->timestamp = jiffies;
if (!hdr->pgio_done_cb)
hdr->pgio_done_cb = nfs4_read_done_cb;
- msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ];
+ nfs42_read_plus_support(NFS_SERVER(hdr->inode), msg);
nfs4_init_sequence(&hdr->args.seq_args, &hdr->res.seq_res, 0, 0);
}

@@ -10202,7 +10238,8 @@ static const struct nfs4_minor_version_ops nfs_v4_2_minor_ops = {
| NFS_CAP_SEEK
| NFS_CAP_LAYOUTSTATS
| NFS_CAP_CLONE
- | NFS_CAP_LAYOUTERROR,
+ | NFS_CAP_LAYOUTERROR
+ | NFS_CAP_READ_PLUS,
.init_client = nfs41_init_client,
.shutdown_client = nfs41_shutdown_client,
.match_stateid = nfs41_match_stateid,
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 3336ea3407a0..c6dbfcae7517 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -7615,6 +7615,7 @@ const struct rpc_procinfo nfs4_procedures[] = {
PROC42(SETXATTR, enc_setxattr, dec_setxattr),
PROC42(LISTXATTRS, enc_listxattrs, dec_listxattrs),
PROC42(REMOVEXATTR, enc_removexattr, dec_removexattr),
+ PROC42(READ_PLUS, enc_read_plus, dec_read_plus),
};

static unsigned int nfs_version4_counts[ARRAY_SIZE(nfs4_procedures)];
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index b8360be141da..9dc7eeac924f 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -551,13 +551,13 @@ enum {

NFSPROC4_CLNT_LOOKUPP,
NFSPROC4_CLNT_LAYOUTERROR,
-
NFSPROC4_CLNT_COPY_NOTIFY,

NFSPROC4_CLNT_GETXATTR,
NFSPROC4_CLNT_SETXATTR,
NFSPROC4_CLNT_LISTXATTRS,
NFSPROC4_CLNT_REMOVEXATTR,
+ NFSPROC4_CLNT_READ_PLUS,
};

/* nfs41 types */
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 7eae72a8762e..38e60ec742df 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -287,5 +287,6 @@ struct nfs_server {
#define NFS_CAP_LAYOUTERROR (1U << 26)
#define NFS_CAP_COPY_NOTIFY (1U << 27)
#define NFS_CAP_XATTR (1U << 28)
+#define NFS_CAP_READ_PLUS (1U << 29)

#endif
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 0599efd57eb9..d63cb862d58e 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -657,7 +657,7 @@ struct nfs_pgio_args {
struct nfs_pgio_res {
struct nfs4_sequence_res seq_res;
struct nfs_fattr * fattr;
- __u32 count;
+ __u64 count;
__u32 op_status;
union {
struct {
--
2.28.0

2020-09-28 17:11:18

by Anna Schumaker

Subject: [PATCH v6 08/10] NFS: Add READ_PLUS hole segment decoding

From: Anna Schumaker <[email protected]>

We keep things simple for now by only decoding a single hole or data
segment returned by the server, even if it returned more to us.
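
For reference, a hole segment carries no payload at all (per RFC 7862):

	/* NFS4_CONTENT_HOLE segment on the wire:
	 *   di_offset   8 bytes   hyper
	 *   di_length   8 bytes   hyper
	 * No data follows; the client zero-fills the range locally
	 * via xdr_expand_hole().
	 */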

Signed-off-by: Anna Schumaker <[email protected]>
---
fs/nfs/nfs42xdr.c | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index 930b4ca212c1..9720fedd2e57 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -53,7 +53,7 @@
#define decode_read_plus_maxsz (op_decode_hdr_maxsz + \
1 /* rpr_eof */ + \
1 /* rpr_contents count */ + \
- NFS42_READ_PLUS_SEGMENT_SIZE)
+ 2 * NFS42_READ_PLUS_SEGMENT_SIZE)
#define encode_seek_maxsz (op_encode_hdr_maxsz + \
encode_stateid_maxsz + \
2 /* offset */ + \
@@ -1045,6 +1045,28 @@ static int decode_read_plus_data(struct xdr_stream *xdr, struct nfs_pgio_res *re
return 0;
}

+static int decode_read_plus_hole(struct xdr_stream *xdr, struct nfs_pgio_res *res,
+ uint32_t *eof)
+{
+ uint64_t offset, length, recvd;
+ __be32 *p;
+
+ p = xdr_inline_decode(xdr, 8 + 8);
+ if (unlikely(!p))
+ return -EIO;
+
+ p = xdr_decode_hyper(p, &offset);
+ p = xdr_decode_hyper(p, &length);
+ recvd = xdr_expand_hole(xdr, 0, length);
+ res->count += recvd;
+
+ if (recvd < length) {
+ *eof = 0;
+ return 1;
+ }
+ return 0;
+}
+
static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
{
uint32_t eof, segments, type;
@@ -1071,6 +1093,8 @@ static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
type = be32_to_cpup(p++);
if (type == NFS4_CONTENT_DATA)
status = decode_read_plus_data(xdr, res, &eof);
+ else if (type == NFS4_CONTENT_HOLE)
+ status = decode_read_plus_hole(xdr, res, &eof);
else
return -EINVAL;

--
2.28.0

2020-09-28 17:11:21

by Anna Schumaker

Subject: [PATCH v6 10/10] NFS: Decode a full READ_PLUS reply

From: Anna Schumaker <[email protected]>

Decode multiple hole and data segments sent by the server, placing
each segment directly where it needs to go in the xdr pages.

Signed-off-by: Anna Schumaker <[email protected]>
---
fs/nfs/nfs42xdr.c | 36 +++++++++++++++++++-----------------
1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index 9720fedd2e57..0dc31ad2362e 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -1032,7 +1032,7 @@ static int decode_read_plus_data(struct xdr_stream *xdr, struct nfs_pgio_res *re

p = xdr_decode_hyper(p, &offset);
count = be32_to_cpup(p);
- recvd = xdr_read_pages(xdr, count);
+ recvd = xdr_align_data(xdr, res->count, count);
res->count += recvd;

if (count > recvd) {
@@ -1057,7 +1057,7 @@ static int decode_read_plus_hole(struct xdr_stream *xdr, struct nfs_pgio_res *re

p = xdr_decode_hyper(p, &offset);
p = xdr_decode_hyper(p, &length);
- recvd = xdr_expand_hole(xdr, 0, length);
+ recvd = xdr_expand_hole(xdr, res->count, length);
res->count += recvd;

if (recvd < length) {
@@ -1070,7 +1070,7 @@ static int decode_read_plus_hole(struct xdr_stream *xdr, struct nfs_pgio_res *re
static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
{
uint32_t eof, segments, type;
- int status;
+ int status, i;
__be32 *p;

status = decode_op_hdr(xdr, OP_READ_PLUS);
@@ -1086,22 +1086,24 @@ static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
if (segments == 0)
goto out;

- p = xdr_inline_decode(xdr, 4);
- if (unlikely(!p))
- return -EIO;
+ for (i = 0; i < segments; i++) {
+ p = xdr_inline_decode(xdr, 4);
+ if (unlikely(!p))
+ return -EIO;

- type = be32_to_cpup(p++);
- if (type == NFS4_CONTENT_DATA)
- status = decode_read_plus_data(xdr, res, &eof);
- else if (type == NFS4_CONTENT_HOLE)
- status = decode_read_plus_hole(xdr, res, &eof);
- else
- return -EINVAL;
+ type = be32_to_cpup(p++);
+ if (type == NFS4_CONTENT_DATA)
+ status = decode_read_plus_data(xdr, res, &eof);
+ else if (type == NFS4_CONTENT_HOLE)
+ status = decode_read_plus_hole(xdr, res, &eof);
+ else
+ return -EINVAL;

- if (status)
- return status;
- if (segments > 1)
- eof = 0;
+ if (status < 0)
+ return status;
+ if (status > 0)
+ break;
+ }

out:
res->eof = eof;
--
2.28.0

2020-09-28 17:11:37

by Anna Schumaker

Subject: [PATCH v6 02/10] SUNRPC: Implement an xdr_page_pos() function

From: Anna Schumaker <[email protected]>

I'll need this for READ_PLUS to help figure out the offset where the
page data starts, but it might also be useful for other things.
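
To make the arithmetic concrete: an xdr_buf is a head kvec, then page
data, then a tail kvec, and the stream position counts from the start
of the head. A toy userspace model of the helper (a sketch, not kernel
code):

#include <assert.h>
#include <stddef.h>

struct toy_xdr_buf {
	size_t head_len; /* head[0].iov_len */
	size_t page_len; /* bytes of page data */
	size_t tail_len; /* tail[0].iov_len */
};

/* stream_pos counts from the start of the head kvec, so the offset
 * into the page data is just the position minus the head length. */
static size_t toy_page_pos(const struct toy_xdr_buf *buf, size_t stream_pos)
{
	assert(stream_pos >= buf->head_len);
	return stream_pos - buf->head_len;
}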

Signed-off-by: Anna Schumaker <[email protected]>
---
include/linux/sunrpc/xdr.h | 1 +
net/sunrpc/xdr.c | 13 +++++++++++++
2 files changed, 14 insertions(+)

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index 6613d96a3029..026edbd041d5 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -242,6 +242,7 @@ extern int xdr_restrict_buflen(struct xdr_stream *xdr, int newbuflen);
extern void xdr_write_pages(struct xdr_stream *xdr, struct page **pages,
unsigned int base, unsigned int len);
extern unsigned int xdr_stream_pos(const struct xdr_stream *xdr);
+extern unsigned int xdr_page_pos(const struct xdr_stream *xdr);
extern void xdr_init_decode(struct xdr_stream *xdr, struct xdr_buf *buf,
__be32 *p, struct rpc_rqst *rqst);
extern void xdr_init_decode_pages(struct xdr_stream *xdr, struct xdr_buf *buf,
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index c62b0882c0d8..8d29450fdce5 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -505,6 +505,19 @@ unsigned int xdr_stream_pos(const struct xdr_stream *xdr)
}
EXPORT_SYMBOL_GPL(xdr_stream_pos);

+/**
+ * xdr_page_pos - Return the current offset from the start of the xdr pages
+ * @xdr: pointer to struct xdr_stream
+ */
+unsigned int xdr_page_pos(const struct xdr_stream *xdr)
+{
+ unsigned int pos = xdr_stream_pos(xdr);
+
+ WARN_ON(pos < xdr->buf->head[0].iov_len);
+ return pos - xdr->buf->head[0].iov_len;
+}
+EXPORT_SYMBOL_GPL(xdr_page_pos);
+
/**
* xdr_init_encode - Initialize a struct xdr_stream for sending data.
* @xdr: pointer to xdr_stream struct
--
2.28.0