2023-11-17 22:14:27

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v2 0/4] Eliminate the RQ_SPLICE_OK flag

The server's splice read path is used by the majority of exported
filesystems. At last month's bake-a-thon, we tossed around some
ideas about how to improve benchmarking and test coverage of the NFS
server's vectored-read path, which is a fallback.

One way to do this would be to expose a switch that can be set by
test harnesses to disable splice reads.

As an initial step, hoist RQ_SPLICE_OK out of the RPC layer. Later,
I'll add a netlink command to as a switch between "use splice if
possible" and "always use vectored reads". (I don't want to collide
with the work Lorenzo is doing).

Changes since v1:
- Address "undefined reference to `svcauth_gss_flavor'"

---

Chuck Lever (4):
SUNRPC: Add a server-side API for retrieving an RPC's pseudoflavor
NFSD: Replace RQ_SPLICE_OK in nfsd_read()
NFSD: Modify NFSv4 to use nfsd_read_splice_ok()
SUNRPC: Remove RQ_SPLICE_OK


fs/nfsd/nfs4proc.c | 7 +++++--
fs/nfsd/nfs4xdr.c | 13 ++++++++-----
fs/nfsd/vfs.c | 26 +++++++++++++++++++++++++-
fs/nfsd/vfs.h | 1 +
fs/nfsd/xdr4.h | 1 +
include/linux/sunrpc/svc.h | 2 --
include/linux/sunrpc/svcauth.h | 7 ++++++-
include/trace/events/sunrpc.h | 1 -
net/sunrpc/auth_gss/svcauth_gss.c | 16 ++++++----------
net/sunrpc/svc.c | 2 --
net/sunrpc/svcauth.c | 16 ++++++++++++++++
11 files changed, 68 insertions(+), 24 deletions(-)

--
Chuck Lever


2023-11-17 22:14:37

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v2 1/4] SUNRPC: Add a server-side API for retrieving an RPC's pseudoflavor

From: Chuck Lever <[email protected]>

NFSD will use this new API to determine whether nfsd_splice_read is
safe to use. This avoids the need to add a dependency to NFSD for
CONFIG_SUNRPC_GSS.

Signed-off-by: Chuck Lever <[email protected]>
---
include/linux/sunrpc/svcauth.h | 7 ++++++-
net/sunrpc/auth_gss/svcauth_gss.c | 6 ++++++
net/sunrpc/svcauth.c | 16 ++++++++++++++++
3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/include/linux/sunrpc/svcauth.h b/include/linux/sunrpc/svcauth.h
index 6f90203edbf8..61c455f1e1f5 100644
--- a/include/linux/sunrpc/svcauth.h
+++ b/include/linux/sunrpc/svcauth.h
@@ -131,8 +131,11 @@ enum svc_auth_status {
* This call releases a domain.
*
* set_client()
- * Givens a pending request (struct svc_rqst), finds and assigns
+ * Given a pending request (struct svc_rqst), finds and assigns
* an appropriate 'auth_domain' as the client.
+ *
+ * pseudoflavor()
+ * Returns RPC_AUTH pseudoflavor in use by @rqstp.
*/
struct auth_ops {
char * name;
@@ -143,11 +146,13 @@ struct auth_ops {
int (*release)(struct svc_rqst *rqstp);
void (*domain_release)(struct auth_domain *dom);
enum svc_auth_status (*set_client)(struct svc_rqst *rqstp);
+ rpc_authflavor_t (*pseudoflavor)(struct svc_rqst *rqstp);
};

struct svc_xprt;

extern enum svc_auth_status svc_authenticate(struct svc_rqst *rqstp);
+extern rpc_authflavor_t svc_auth_flavor(struct svc_rqst *rqstp);
extern int svc_authorise(struct svc_rqst *rqstp);
extern enum svc_auth_status svc_set_client(struct svc_rqst *rqstp);
extern int svc_auth_register(rpc_authflavor_t flavor, struct auth_ops *aops);
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 18734e70c5dd..104d9a320142 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -2014,6 +2014,11 @@ svcauth_gss_domain_release(struct auth_domain *dom)
call_rcu(&dom->rcu_head, svcauth_gss_domain_release_rcu);
}

+static rpc_authflavor_t svcauth_gss_pseudoflavor(struct svc_rqst *rqstp)
+{
+ return svcauth_gss_flavor(rqstp->rq_gssclient);
+}
+
static struct auth_ops svcauthops_gss = {
.name = "rpcsec_gss",
.owner = THIS_MODULE,
@@ -2022,6 +2027,7 @@ static struct auth_ops svcauthops_gss = {
.release = svcauth_gss_release,
.domain_release = svcauth_gss_domain_release,
.set_client = svcauth_gss_set_client,
+ .pseudoflavor = svcauth_gss_pseudoflavor,
};

static int rsi_cache_create_net(struct net *net)
diff --git a/net/sunrpc/svcauth.c b/net/sunrpc/svcauth.c
index aa4429d0b810..1619211f0960 100644
--- a/net/sunrpc/svcauth.c
+++ b/net/sunrpc/svcauth.c
@@ -160,6 +160,22 @@ svc_auth_unregister(rpc_authflavor_t flavor)
}
EXPORT_SYMBOL_GPL(svc_auth_unregister);

+/**
+ * svc_auth_flavor - return RPC transaction's RPC_AUTH flavor
+ * @rqstp: RPC transaction context
+ *
+ * Returns an RPC flavor or GSS pseudoflavor.
+ */
+rpc_authflavor_t svc_auth_flavor(struct svc_rqst *rqstp)
+{
+ struct auth_ops *aops = rqstp->rq_authop;
+
+ if (!aops->pseudoflavor)
+ return aops->flavour;
+ return aops->pseudoflavor(rqstp);
+}
+EXPORT_SYMBOL_GPL(svc_auth_flavor);
+
/**************************************************
* 'auth_domains' are stored in a hash table indexed by name.
* When the last reference to an 'auth_domain' is dropped,


2023-11-17 22:14:42

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v2 2/4] NFSD: Replace RQ_SPLICE_OK in nfsd_read()

From: Chuck Lever <[email protected]>

RQ_SPLICE_OK is a bit of a layering violation. Also, a subsequent
patch is going to provide a mechanism for always disabling splice
reads.

Splicing is an issue only for NFS READs, so refactor nfsd_read() to
check the auth type directly instead of relying on an rq_flag
setting.

The new helper will be added into the NFSv4 read path in a
subsequent patch.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/vfs.c | 26 +++++++++++++++++++++++++-
fs/nfsd/vfs.h | 1 +
2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index fbbea7498f02..0ea7f533f816 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1216,6 +1216,30 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfsd_file *nf,
return nfserr;
}

+/**
+ * nfsd_read_splice_ok - check if spliced reading is supported
+ * @rqstp: RPC transaction context
+ *
+ * Return values:
+ * %true: nfsd_splice_read() may be used
+ * %false: nfsd_splice_read() must not be used
+ *
+ * NFS READ normally uses splice to send data in-place. However the
+ * data in cache can change after the reply's MIC is computed but
+ * before the RPC reply is sent. To prevent the client from
+ * rejecting the server-computed MIC in this somewhat rare case, do
+ * not use splice with the GSS integrity and privacy services.
+ */
+bool nfsd_read_splice_ok(struct svc_rqst *rqstp)
+{
+ switch (svc_auth_flavor(rqstp)) {
+ case RPC_AUTH_GSS_KRB5I:
+ case RPC_AUTH_GSS_KRB5P:
+ return false;
+ }
+ return true;
+}
+
/**
* nfsd_read - Read data from a file
* @rqstp: RPC transaction context
@@ -1245,7 +1269,7 @@ __be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
return err;

file = nf->nf_file;
- if (file->f_op->splice_read && test_bit(RQ_SPLICE_OK, &rqstp->rq_flags))
+ if (file->f_op->splice_read && nfsd_read_splice_ok(rqstp))
err = nfsd_splice_read(rqstp, fhp, file, offset, count, eof);
else
err = nfsd_iter_read(rqstp, fhp, file, offset, count, 0, eof);
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index e3c29596f4df..702fbc4483bf 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -114,6 +114,7 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
struct file *file, loff_t offset,
unsigned long *count, unsigned int base,
u32 *eof);
+bool nfsd_read_splice_ok(struct svc_rqst *rqstp);
__be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
loff_t offset, unsigned long *count,
u32 *eof);


2023-11-17 22:14:48

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v2 3/4] NFSD: Modify NFSv4 to use nfsd_read_splice_ok()

From: Chuck Lever <[email protected]>

Avoid the use of an atomic bitop, and prepare for adding a run-time
switch for using splice reads.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4proc.c | 7 +++++--
fs/nfsd/nfs4xdr.c | 13 ++++++++-----
fs/nfsd/xdr4.h | 1 +
3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 6f2d4aa4970d..14712fa08f76 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -970,8 +970,11 @@ nfsd4_read(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
* To ensure proper ordering, we therefore turn off zero copy if
* the client wants us to do more in this compound:
*/
- if (!nfsd4_last_compound_op(rqstp))
- clear_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
+ if (!nfsd4_last_compound_op(rqstp)) {
+ struct nfsd4_compoundargs *argp = rqstp->rq_argp;
+
+ argp->splice_ok = false;
+ }

/* check stateid */
status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index ec4ed6206df1..ea7c8e32d3ba 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2524,8 +2524,9 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
svc_reserve(argp->rqstp, max_reply + readbytes);
argp->rqstp->rq_cachetype = cachethis ? RC_REPLBUFF : RC_NOCACHE;

+ argp->splice_ok = nfsd_read_splice_ok(argp->rqstp);
if (readcount > 1 || max_reply > PAGE_SIZE - auth_slack)
- clear_bit(RQ_SPLICE_OK, &argp->rqstp->rq_flags);
+ argp->splice_ok = false;

return true;
}
@@ -4378,12 +4379,13 @@ static __be32
nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
union nfsd4_op_u *u)
{
+ struct nfsd4_compoundargs *argp = resp->rqstp->rq_argp;
struct nfsd4_read *read = &u->read;
- bool splice_ok = test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags);
- unsigned long maxcount;
struct xdr_stream *xdr = resp->xdr;
- struct file *file;
int starting_len = xdr->buf->len;
+ bool splice_ok = argp->splice_ok;
+ unsigned long maxcount;
+ struct file *file;
__be32 *p;

if (nfserr)
@@ -5204,9 +5206,10 @@ static __be32
nfsd4_encode_read_plus_data(struct nfsd4_compoundres *resp,
struct nfsd4_read *read)
{
- bool splice_ok = test_bit(RQ_SPLICE_OK, &resp->rqstp->rq_flags);
+ struct nfsd4_compoundargs *argp = resp->rqstp->rq_argp;
struct file *file = read->rd_nf->nf_file;
struct xdr_stream *xdr = resp->xdr;
+ bool splice_ok = argp->splice_ok;
unsigned long maxcount;
__be32 nfserr, *p;

diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 80e859dc84d8..415516c1b27e 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -840,6 +840,7 @@ struct nfsd4_compoundargs {
u32 minorversion;
u32 client_opcnt;
u32 opcnt;
+ bool splice_ok;
struct nfsd4_op *ops;
struct nfsd4_op iops[8];
};


2023-11-17 22:14:54

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v2 4/4] SUNRPC: Remove RQ_SPLICE_OK

From: Chuck Lever <[email protected]>

This flag is no longer used.

Signed-off-by: Chuck Lever <[email protected]>
---
include/linux/sunrpc/svc.h | 2 --
include/trace/events/sunrpc.h | 1 -
net/sunrpc/auth_gss/svcauth_gss.c | 10 ----------
net/sunrpc/svc.c | 2 --
4 files changed, 15 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index b10f987509cc..544fcfe07479 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -260,8 +260,6 @@ enum {
RQ_LOCAL, /* local request */
RQ_USEDEFERRAL, /* use deferral */
RQ_DROPME, /* drop current reply */
- RQ_SPLICE_OK, /* turned off in gss privacy to prevent
- * encrypting page cache pages */
RQ_VICTIM, /* Have agreed to shut down */
RQ_DATA, /* request has data */
};
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 337c90787fb1..cdd3a45e6003 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -1675,7 +1675,6 @@ DEFINE_SVCXDRBUF_EVENT(sendto);
svc_rqst_flag(LOCAL) \
svc_rqst_flag(USEDEFERRAL) \
svc_rqst_flag(DROPME) \
- svc_rqst_flag(SPLICE_OK) \
svc_rqst_flag(VICTIM) \
svc_rqst_flag_end(DATA)

diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 104d9a320142..24de94184700 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -866,14 +866,6 @@ svcauth_gss_unwrap_integ(struct svc_rqst *rqstp, u32 seq, struct gss_ctx *ctx)
struct xdr_buf databody_integ;
struct xdr_netobj checksum;

- /* NFS READ normally uses splice to send data in-place. However
- * the data in cache can change after the reply's MIC is computed
- * but before the RPC reply is sent. To prevent the client from
- * rejecting the server-computed MIC in this somewhat rare case,
- * do not use splice with the GSS integrity service.
- */
- clear_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
-
/* Did we already verify the signature on the original pass through? */
if (rqstp->rq_deferred)
return 0;
@@ -948,8 +940,6 @@ svcauth_gss_unwrap_priv(struct svc_rqst *rqstp, u32 seq, struct gss_ctx *ctx)
struct xdr_buf *buf = xdr->buf;
unsigned int saved_len;

- clear_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
-
if (xdr_stream_decode_u32(xdr, &len) < 0)
goto unwrap_failed;
if (rqstp->rq_deferred) {
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 3f2ea7a0496f..fa4e23fa0e09 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1305,8 +1305,6 @@ svc_process_common(struct svc_rqst *rqstp)
int rc;
__be32 *p;

- /* Will be turned off by GSS integrity and privacy services */
- set_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
/* Will be turned off only when NFSv4 Sessions are used */
set_bit(RQ_USEDEFERRAL, &rqstp->rq_flags);
clear_bit(RQ_DROPME, &rqstp->rq_flags);


2023-11-27 20:22:11

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Eliminate the RQ_SPLICE_OK flag

On Fri, 2023-11-17 at 17:14 -0500, Chuck Lever wrote:
> The server's splice read path is used by the majority of exported
> filesystems. At last month's bake-a-thon, we tossed around some
> ideas about how to improve benchmarking and test coverage of the NFS
> server's vectored-read path, which is a fallback.
>
> One way to do this would be to expose a switch that can be set by
> test harnesses to disable splice reads.
>
> As an initial step, hoist RQ_SPLICE_OK out of the RPC layer. Later,
> I'll add a netlink command to as a switch between "use splice if
> possible" and "always use vectored reads". (I don't want to collide
> with the work Lorenzo is doing).
>
> Changes since v1:
> - Address "undefined reference to `svcauth_gss_flavor'"
>
> ---
>
> Chuck Lever (4):
> SUNRPC: Add a server-side API for retrieving an RPC's pseudoflavor
> NFSD: Replace RQ_SPLICE_OK in nfsd_read()
> NFSD: Modify NFSv4 to use nfsd_read_splice_ok()
> SUNRPC: Remove RQ_SPLICE_OK
>
>
> fs/nfsd/nfs4proc.c | 7 +++++--
> fs/nfsd/nfs4xdr.c | 13 ++++++++-----
> fs/nfsd/vfs.c | 26 +++++++++++++++++++++++++-
> fs/nfsd/vfs.h | 1 +
> fs/nfsd/xdr4.h | 1 +
> include/linux/sunrpc/svc.h | 2 --
> include/linux/sunrpc/svcauth.h | 7 ++++++-
> include/trace/events/sunrpc.h | 1 -
> net/sunrpc/auth_gss/svcauth_gss.c | 16 ++++++----------
> net/sunrpc/svc.c | 2 --
> net/sunrpc/svcauth.c | 16 ++++++++++++++++
> 11 files changed, 68 insertions(+), 24 deletions(-)
>
> --
> Chuck Lever
>

Reviewed-by: Jeff Layton <[email protected]>