2021-07-12 18:58:18

by Chuck Lever

Subject: [PATCH v4 0/3] Bulk-release pages during NFSD read splice

As with the page allocation side, I'm trying to reduce the average
number of times NFSD invokes the page allocation and release APIs,
both because they can be expensive and because the page allocator is
a resource shared amongst all nfsd threads, so access to it is at
least partially serialized. This small series tackles a code path
that is frequently invoked when NFSD handles READ operations on
local filesystems that support splice (i.e., most of the popular
ones).
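
The concept in miniature, as a hedged sketch (the two demo_* helpers
are invented for illustration and appear nowhere in the series):
releasing n pages one at a time costs n put_page() calls, while
accruing them in a pagevec costs roughly one bulk release_pages()
call per PAGEVEC_SIZE (15) pages.

#include <linux/mm.h>
#include <linux/pagevec.h>

/* Unbatched: one put_page() per page. */
static void demo_release_each(struct page **pages, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		put_page(pages[i]);
}

/* Batched: accrue pages in a pagevec; __pagevec_release() hands
 * each full batch to release_pages() in a single call.
 */
static void demo_release_batched(struct page **pages, unsigned int n)
{
	struct pagevec pvec;
	unsigned int i;

	pagevec_init(&pvec);
	for (i = 0; i < n; i++) {
		if (!pagevec_space(&pvec))
			__pagevec_release(&pvec);
		pagevec_add(&pvec, pages[i]);
	}
	pagevec_release(&pvec);	/* drain the remainder */
}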

Changes since v3:
- Mark patches 1 and 3 as Reviewed-by: Neil Brown
- Convert bare release_pages() calls to pagevec_release()
- Release accrued pages when each RPC retires

---

Chuck Lever (3):
NFSD: Clean up splice actor
SUNRPC: Add svc_rqst_replace_page() API
NFSD: Batch release pages during splice read


 fs/nfsd/vfs.c              | 20 +++++---------------
 include/linux/sunrpc/svc.h |  4 ++++
 net/sunrpc/svc.c           | 21 +++++++++++++++++++++
 net/sunrpc/svc_xprt.c      |  3 +++
 4 files changed, 33 insertions(+), 15 deletions(-)

--
Chuck Lever


2021-07-12 18:58:21

by Chuck Lever

Subject: [PATCH v4 1/3] NFSD: Clean up splice actor

A few useful observations:

- The value in @size is never modified.

- splice_desc.len is an unsigned int, and so is xdr_buf.page_len.
  An implicit cast to size_t is unnecessary.

- The computation of .page_len is the same in all three arms
  of the "if" statement, so hoist it out to make it clear that
  the operation is an unconditional invariant.

The resulting function is 18 bytes shorter on my system (-Os).
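
For reference, the actor as it reads with this patch applied
(reconstructed from the hunk below). Note that the first arm's
".page_len = size" assignment folds safely into the hoisted "+=",
because .page_len is known to be zero in that arm:

static int
nfsd_splice_actor(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
		  struct splice_desc *sd)
{
	struct svc_rqst *rqstp = sd->u.data;
	struct page **pp = rqstp->rq_next_page;
	struct page *page = buf->page;

	if (rqstp->rq_res.page_len == 0) {
		get_page(page);
		put_page(*rqstp->rq_next_page);
		*(rqstp->rq_next_page++) = page;
		rqstp->rq_res.page_base = buf->offset;
	} else if (page != pp[-1]) {
		get_page(page);
		if (*rqstp->rq_next_page)
			put_page(*rqstp->rq_next_page);
		*(rqstp->rq_next_page++) = page;
	}
	rqstp->rq_res.page_len += sd->len;

	return sd->len;
}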

Signed-off-by: Chuck Lever <[email protected]>
Reviewed-by: NeilBrown <[email protected]>
---
fs/nfsd/vfs.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index a224a5e23cc1..46a6d9fce3d2 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -847,26 +847,21 @@ nfsd_splice_actor(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
 	struct svc_rqst *rqstp = sd->u.data;
 	struct page **pp = rqstp->rq_next_page;
 	struct page *page = buf->page;
-	size_t size;
-
-	size = sd->len;
 
 	if (rqstp->rq_res.page_len == 0) {
 		get_page(page);
 		put_page(*rqstp->rq_next_page);
 		*(rqstp->rq_next_page++) = page;
 		rqstp->rq_res.page_base = buf->offset;
-		rqstp->rq_res.page_len = size;
 	} else if (page != pp[-1]) {
 		get_page(page);
 		if (*rqstp->rq_next_page)
 			put_page(*rqstp->rq_next_page);
 		*(rqstp->rq_next_page++) = page;
-		rqstp->rq_res.page_len += size;
-	} else
-		rqstp->rq_res.page_len += size;
+	}
+	rqstp->rq_res.page_len += sd->len;
 
-	return size;
+	return sd->len;
 }
 
 static int nfsd_direct_splice_actor(struct pipe_inode_info *pipe,


2021-07-12 18:59:11

by Chuck Lever

Subject: [PATCH v4 3/3] NFSD: Batch release pages during splice read

Large splice reads call put_page() repeatedly. put_page() is
relatively expensive to call, so replace the open-coded page
replacement sequence with the new svc_rqst_replace_page() helper,
which batches page releases to amortize that cost.
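
With patch 1 already applied, the body of nfsd_splice_actor()
reduces to the following (reconstructed from this patch's hunk
together with patch 1):

	if (rqstp->rq_res.page_len == 0) {
		svc_rqst_replace_page(rqstp, page);
		rqstp->rq_res.page_base = buf->offset;
	} else if (page != pp[-1]) {
		svc_rqst_replace_page(rqstp, page);
	}
	rqstp->rq_res.page_len += sd->len;

	return sd->len;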

Signed-off-by: Chuck Lever <[email protected]>
Reviewed-by: NeilBrown <[email protected]>
---
fs/nfsd/vfs.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 46a6d9fce3d2..7732a384f949 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -849,15 +849,10 @@ nfsd_splice_actor(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
 	struct page *page = buf->page;
 
 	if (rqstp->rq_res.page_len == 0) {
-		get_page(page);
-		put_page(*rqstp->rq_next_page);
-		*(rqstp->rq_next_page++) = page;
+		svc_rqst_replace_page(rqstp, page);
 		rqstp->rq_res.page_base = buf->offset;
 	} else if (page != pp[-1]) {
-		get_page(page);
-		if (*rqstp->rq_next_page)
-			put_page(*rqstp->rq_next_page);
-		*(rqstp->rq_next_page++) = page;
+		svc_rqst_replace_page(rqstp, page);
 	}
 	rqstp->rq_res.page_len += sd->len;
 


2021-07-12 19:00:28

by Chuck Lever

Subject: [PATCH v4 2/3] SUNRPC: Add svc_rqst_replace_page() API

Replacing a page in rq_pages[] requires a get_page(), which is a
bus-locked operation, and a put_page(), which can be even more
costly.

To reduce the cost of replacing a page in rq_pages[], batch the
put_page() operations by collecting "freed" pages in a pagevec,
and then release those pages when the pagevec is full. This
pagevec is also emptied when each RPC completes.
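
As a condensed walk-through of how the three hunks below cooperate
(the wrapper function here is only for illustration): rq_pvec starts
empty when the request's pages are allocated, accrues displaced
pages in batches of PAGEVEC_SIZE (15), and is drained when the RPC
retires.

#include <linux/pagevec.h>
#include <linux/sunrpc/svc.h>

static void rq_pvec_walkthrough(struct svc_rqst *rqstp,
				struct page *old_page)
{
	/* svc_alloc_arg(): each RPC starts with an empty pagevec */
	pagevec_init(&rqstp->rq_pvec);

	/* svc_rqst_replace_page(): defer the put_page() of each
	 * displaced page; a full pagevec triggers one bulk
	 * release_pages() call for all 15 entries
	 */
	if (!pagevec_space(&rqstp->rq_pvec))
		__pagevec_release(&rqstp->rq_pvec);
	pagevec_add(&rqstp->rq_pvec, old_page);

	/* svc_xprt_release(): as the RPC retires, release any pages
	 * still queued (a no-op when the pagevec is empty)
	 */
	pagevec_release(&rqstp->rq_pvec);
}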

Signed-off-by: Chuck Lever <[email protected]>
---
 include/linux/sunrpc/svc.h |  4 ++++
 net/sunrpc/svc.c           | 21 +++++++++++++++++++++
 net/sunrpc/svc_xprt.c      |  3 +++
 3 files changed, 28 insertions(+)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index e91d51ea028b..ab9afbf0a0d8 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -19,6 +19,7 @@
 #include <linux/sunrpc/svcauth.h>
 #include <linux/wait.h>
 #include <linux/mm.h>
+#include <linux/pagevec.h>
 
 /* statistics for svc_pool structures */
 struct svc_pool_stats {
@@ -256,6 +257,7 @@ struct svc_rqst {
 	struct page *		*rq_next_page; /* next reply page to use */
 	struct page *		*rq_page_end;  /* one past the last page */
 
+	struct pagevec		rq_pvec;
 	struct kvec		rq_vec[RPCSVC_MAXPAGES]; /* generally useful.. */
 	struct bio_vec		rq_bvec[RPCSVC_MAXPAGES];
 
@@ -502,6 +504,8 @@ struct svc_rqst *svc_rqst_alloc(struct svc_serv *serv,
 					struct svc_pool *pool, int node);
 struct svc_rqst *svc_prepare_thread(struct svc_serv *serv,
 					struct svc_pool *pool, int node);
+void		   svc_rqst_replace_page(struct svc_rqst *rqstp,
+					 struct page *page);
 void		   svc_rqst_free(struct svc_rqst *);
 void		   svc_exit_thread(struct svc_rqst *);
 unsigned int	   svc_pool_map_get(void);
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 0de918cb3d90..d2d412d43827 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -838,6 +838,27 @@ svc_set_num_threads_sync(struct svc_serv *serv, struct svc_pool *pool, int nrser
 }
 EXPORT_SYMBOL_GPL(svc_set_num_threads_sync);
 
+/**
+ * svc_rqst_replace_page - Replace one page in rq_pages[]
+ * @rqstp: svc_rqst with pages to replace
+ * @page: replacement page
+ *
+ * When replacing a page in rq_pages, batch the release of the
+ * replaced pages to avoid hammering the page allocator.
+ */
+void svc_rqst_replace_page(struct svc_rqst *rqstp, struct page *page)
+{
+	if (*rqstp->rq_next_page) {
+		if (!pagevec_space(&rqstp->rq_pvec))
+			__pagevec_release(&rqstp->rq_pvec);
+		pagevec_add(&rqstp->rq_pvec, *rqstp->rq_next_page);
+	}
+
+	get_page(page);
+	*(rqstp->rq_next_page++) = page;
+}
+EXPORT_SYMBOL_GPL(svc_rqst_replace_page);
+
 /*
  * Called from a server thread as it's exiting. Caller must hold the "service
  * mutex" for the service.
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index d66a8e44a1ae..682058a5ec13 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -539,6 +539,7 @@ static void svc_xprt_release(struct svc_rqst *rqstp)
 	kfree(rqstp->rq_deferred);
 	rqstp->rq_deferred = NULL;
 
+	pagevec_release(&rqstp->rq_pvec);
 	svc_free_res_pages(rqstp);
 	rqstp->rq_res.page_len = 0;
 	rqstp->rq_res.page_base = 0;
@@ -664,6 +665,8 @@ static int svc_alloc_arg(struct svc_rqst *rqstp)
 	struct xdr_buf *arg = &rqstp->rq_arg;
 	unsigned long pages, filled;
 
+	pagevec_init(&rqstp->rq_pvec);
+
 	pages = (serv->sv_max_mesg + 2 * PAGE_SIZE) >> PAGE_SHIFT;
 	if (pages > RPCSVC_MAXPAGES) {
 		pr_warn_once("svc: warning: pages=%lu > RPCSVC_MAXPAGES=%lu\n",