2019-04-02 23:16:59

by Trond Myklebust

Subject: [PATCH 0/6] Allow containerised knfsd to set supported NFS versions

The current knfsd implementation is unable to support containers that
want to run different combinations of NFS versions. This is because of
the way we define which versions are supported: by directly editing the
global version table in 'nfsd_program'.
The following patch set modifies the method used to define version
information by moving some of the server RPC request initialisation
down into knfsd itself, allowing it to manage the version-specific
initialisation.

Note that we might want to consider a few follow-ups to this patchset
to get rid of some of the version-specific flags (e.g. vs_hidden) that
currently need to be managed in the generic SUNRPC server code on behalf
of just one or two RPC programs. These flags can easily be replaced by
custom RPC request initialisers.

Trond Myklebust (6):
SUNRPC/nfs: Fix return value for nfs4_callback_compound()
SUNRPC: Add a callback to initialise server requests
SUNRPC: Clean up generic dispatcher code
SUNRPC: Allow further customisation of RPC program registration
nfsd: Add custom rpcbind callbacks for knfsd
nfsd: Allow containers to set supported nfs versions

fs/lockd/svc.c | 4 +-
fs/nfs/callback.c | 2 +
fs/nfs/callback_xdr.c | 2 +-
fs/nfsd/netns.h | 8 +
fs/nfsd/nfs4proc.c | 3 +-
fs/nfsd/nfsctl.c | 25 +--
fs/nfsd/nfsd.h | 8 +-
fs/nfsd/nfssvc.c | 251 +++++++++++++++++++++++++-----
include/linux/sunrpc/svc.h | 33 ++++
net/sunrpc/svc.c | 310 ++++++++++++++++++++++++-------------
10 files changed, 480 insertions(+), 166 deletions(-)

--
2.20.1



2019-04-02 23:17:00

by Trond Myklebust

Subject: [PATCH 1/6] SUNRPC/nfs: Fix return value for nfs4_callback_compound()

RPC server procedures are normally expected to return a __be32 encoded
status value of type 'enum rpc_accept_stat'. However, at least one function
wants to return an authentication status of type 'enum rpc_auth_stat'
in the case where authentication fails.
This patch adds svc_return_autherr(), which flags the request so that the
generic server code can tell the two kinds of status value apart.
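
To illustrate why a separate flag helps: in RFC 5531 the accept status
PROG_UNAVAIL and the auth status AUTH_BADCRED are both encoded as 1, so the
bare return value cannot say which enum it belongs to. The following is a
minimal userspace sketch of the flag-based approach, not the kernel code. All
autherr_* names and the constant values are hypothetical stand-ins, and plain
bit operations replace the kernel's atomic set_bit()/test_and_clear_bit().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions (illustrative values). */
enum { AUTHERR_FLAG_BIT = 8 };                  /* models RQ_AUTHERR */
enum { AUTHERR_AUTH_OK = 0, AUTHERR_BADCRED = 1 };

struct autherr_rqst {
	unsigned long flags;                    /* models rq_flags */
};

/* Called by a procedure: mark the status as an auth error and pass it on. */
static uint32_t autherr_return(struct autherr_rqst *rq, uint32_t auth_err)
{
	assert(rq != NULL);
	rq->flags |= 1UL << AUTHERR_FLAG_BIT;
	return auth_err;
}

/*
 * Called by the generic code: if the flag is set, clear it and report the
 * status as an auth error; otherwise report "no auth error".
 */
static uint32_t autherr_get(struct autherr_rqst *rq, uint32_t stat)
{
	unsigned long bit = 1UL << AUTHERR_FLAG_BIT;

	assert(rq != NULL);
	if (rq->flags & bit) {
		rq->flags &= ~bit;
		return stat;
	}
	return AUTHERR_AUTH_OK;
}
```

A status of 1 returned without the flag is treated as an ordinary accept
status, while the same value returned through autherr_return() is routed to
the auth-failure path.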

Fixes: a4e187d83d88 ("NFS: Don't drop CB requests with invalid principals")
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/callback_xdr.c | 2 +-
include/linux/sunrpc/svc.h | 2 ++
net/sunrpc/svc.c | 27 ++++++++++++++++++++++-----
3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
index 06233bfa6d73..73a5a5ea2976 100644
--- a/fs/nfs/callback_xdr.c
+++ b/fs/nfs/callback_xdr.c
@@ -983,7 +983,7 @@ static __be32 nfs4_callback_compound(struct svc_rqst *rqstp)

out_invalidcred:
pr_warn_ratelimited("NFS: NFSv4 callback contains invalid cred\n");
- return rpc_autherr_badcred;
+ return svc_return_autherr(rqstp, rpc_autherr_badcred);
}

/*
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index e52385340b3b..7ff12c9dbeaf 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -271,6 +271,7 @@ struct svc_rqst {
#define RQ_VICTIM (5) /* about to be shut down */
#define RQ_BUSY (6) /* request is busy */
#define RQ_DATA (7) /* request has data */
+#define RQ_AUTHERR (8) /* Request status is auth error */
unsigned long rq_flags; /* flags field */
ktime_t rq_qtime; /* enqueue time */

@@ -504,6 +505,7 @@ unsigned int svc_fill_write_vector(struct svc_rqst *rqstp,
char *svc_fill_symlink_pathname(struct svc_rqst *rqstp,
struct kvec *first, void *p,
size_t total);
+__be32 svc_return_autherr(struct svc_rqst *rqstp, __be32 auth_err);

#define RPC_MAX_ADDRBUFLEN (63U)

diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index dbd19697ee38..3d5dd6b86652 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1144,6 +1144,22 @@ void svc_printk(struct svc_rqst *rqstp, const char *fmt, ...)
static __printf(2,3) void svc_printk(struct svc_rqst *rqstp, const char *fmt, ...) {}
#endif

+__be32
+svc_return_autherr(struct svc_rqst *rqstp, __be32 auth_err)
+{
+ set_bit(RQ_AUTHERR, &rqstp->rq_flags);
+ return auth_err;
+}
+EXPORT_SYMBOL_GPL(svc_return_autherr);
+
+static __be32
+svc_get_autherr(struct svc_rqst *rqstp, __be32 *statp)
+{
+ if (test_and_clear_bit(RQ_AUTHERR, &rqstp->rq_flags))
+ return *statp;
+ return rpc_auth_ok;
+}
+
/*
* Common routine for processing the RPC request.
*/
@@ -1290,11 +1306,9 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
procp->pc_release(rqstp);
goto dropit;
}
- if (*statp == rpc_autherr_badcred) {
- if (procp->pc_release)
- procp->pc_release(rqstp);
- goto err_bad_auth;
- }
+ auth_stat = svc_get_autherr(rqstp, statp);
+ if (auth_stat != rpc_auth_ok)
+ goto err_release_bad_auth;
if (*statp == rpc_success && procp->pc_encode &&
!procp->pc_encode(rqstp, resv->iov_base + resv->iov_len)) {
dprintk("svc: failed to encode reply\n");
@@ -1351,6 +1365,9 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
svc_putnl(resv, 2);
goto sendit;

+err_release_bad_auth:
+ if (procp->pc_release)
+ procp->pc_release(rqstp);
err_bad_auth:
dprintk("svc: authentication failed (%d)\n", ntohl(auth_stat));
serv->sv_stats->rpcbadauth++;
--
2.20.1


2019-04-02 23:17:01

by Trond Myklebust

Subject: [PATCH 2/6] SUNRPC: Add a callback to initialise server requests

Add a callback to help initialise server requests before they are
processed. This will allow us to clean up the NFS server version
support, and to make it container safe.
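
As a rough illustration of the callback contract, here is a userspace sketch
of a per-program initialiser that validates the version and procedure numbers
and, on a version mismatch, reports the supported range back to the caller.
The initreq_* types and names are hypothetical simplifications, and the sketch
leaves out the congestion-control check, the statistics and the buffer setup
that the real svc_generic_init_request() in this patch performs.

```c
#include <assert.h>
#include <stddef.h>

/* Accept status values as defined by RFC 5531. */
enum initreq_stat {
	INITREQ_SUCCESS = 0,
	INITREQ_PROG_MISMATCH = 2,
	INITREQ_PROC_UNAVAIL = 3
};

struct initreq_version {
	unsigned int nproc;                     /* number of procedures */
};

struct initreq_program {
	unsigned int nvers;                     /* entries in vers[] */
	const struct initreq_version *const *vers; /* NULL slot = unsupported */
	unsigned int lovers, hivers;            /* advertised version range */
};

/* Filled in only when the result is INITREQ_PROG_MISMATCH. */
struct initreq_info {
	unsigned int lovers, hivers;
};

static enum initreq_stat
initreq_generic(const struct initreq_program *prog, unsigned int vers,
		unsigned int proc, struct initreq_info *ret)
{
	const struct initreq_version *v = NULL;

	assert(prog != NULL && ret != NULL);
	if (vers < prog->nvers)
		v = prog->vers[vers];
	if (v == NULL) {
		ret->lovers = prog->lovers;
		ret->hivers = prog->hivers;
		return INITREQ_PROG_MISMATCH;
	}
	if (proc >= v->nproc)
		return INITREQ_PROC_UNAVAIL;
	return INITREQ_SUCCESS;
}
```

Because the version range travels back through the info structure instead of
being read from the program by the caller, a program-specific initialiser is
free to report a different range for each request.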

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/lockd/svc.c | 3 +-
fs/nfs/callback.c | 1 +
fs/nfsd/nfssvc.c | 2 +
include/linux/sunrpc/svc.h | 16 +++++
net/sunrpc/svc.c | 125 ++++++++++++++++++++++++-------------
5 files changed, 103 insertions(+), 44 deletions(-)

diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index 346ed161756d..75415b21efda 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -807,5 +807,6 @@ static struct svc_program nlmsvc_program = {
.pg_name = "lockd", /* service name */
.pg_class = "nfsd", /* share authentication with nfsd */
.pg_stats = &nlmsvc_stats, /* stats table */
- .pg_authenticate = &lockd_authenticate /* export authentication */
+ .pg_authenticate = &lockd_authenticate, /* export authentication */
+ .pg_init_request = svc_generic_init_request,
};
diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index 0b602a39dd71..a9510374bad7 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -457,4 +457,5 @@ static struct svc_program nfs4_callback_program = {
.pg_class = "nfs", /* authentication class */
.pg_stats = &nfs4_callback_stats,
.pg_authenticate = nfs_callback_authenticate,
+ .pg_init_request = svc_generic_init_request,
};
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 89cb484f1cfb..e26762e84798 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -86,6 +86,7 @@ static struct svc_program nfsd_acl_program = {
.pg_class = "nfsd",
.pg_stats = &nfsd_acl_svcstats,
.pg_authenticate = &svc_set_client,
+ .pg_init_request = svc_generic_init_request,
};

static struct svc_stat nfsd_acl_svcstats = {
@@ -118,6 +119,7 @@ struct svc_program nfsd_program = {
.pg_class = "nfsd", /* authentication class */
.pg_stats = &nfsd_svcstats, /* version table */
.pg_authenticate = &svc_set_client, /* export authentication */
+ .pg_init_request = svc_generic_init_request,

};

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 7ff12c9dbeaf..f43d5765acff 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -383,6 +383,16 @@ struct svc_deferred_req {
__be32 args[0];
};

+struct svc_process_info {
+ union {
+ int (*dispatch)(struct svc_rqst *, __be32 *);
+ struct {
+ unsigned int lovers;
+ unsigned int hivers;
+ } mismatch;
+ };
+};
+
/*
* List of RPC programs on the same transport endpoint
*/
@@ -397,6 +407,9 @@ struct svc_program {
char * pg_class; /* class name: services sharing authentication */
struct svc_stat * pg_stats; /* rpc statistics */
int (*pg_authenticate)(struct svc_rqst *);
+ __be32 (*pg_init_request)(struct svc_rqst *,
+ const struct svc_program *,
+ struct svc_process_info *);
};

/*
@@ -506,6 +519,9 @@ char *svc_fill_symlink_pathname(struct svc_rqst *rqstp,
struct kvec *first, void *p,
size_t total);
__be32 svc_return_autherr(struct svc_rqst *rqstp, __be32 auth_err);
+__be32 svc_generic_init_request(struct svc_rqst *rqstp,
+ const struct svc_program *progp,
+ struct svc_process_info *procinfo);

#define RPC_MAX_ADDRBUFLEN (63U)

diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 3d5dd6b86652..5f87c0f1e4e0 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1160,6 +1160,65 @@ svc_get_autherr(struct svc_rqst *rqstp, __be32 *statp)
return rpc_auth_ok;
}

+__be32
+svc_generic_init_request(struct svc_rqst *rqstp,
+ const struct svc_program *progp,
+ struct svc_process_info *ret)
+{
+ const struct svc_version *versp = NULL; /* compiler food */
+ const struct svc_procedure *procp = NULL;
+
+ if (rqstp->rq_vers >= progp->pg_nvers )
+ goto err_bad_vers;
+ versp = progp->pg_vers[rqstp->rq_vers];
+ if (!versp)
+ goto err_bad_vers;
+
+ /*
+ * Some protocol versions (namely NFSv4) require some form of
+ * congestion control. (See RFC 7530 section 3.1 paragraph 2)
+ * In other words, UDP is not allowed. We mark those when setting
+ * up the svc_xprt, and verify that here.
+ *
+ * The spec is not very clear about what error should be returned
+ * when someone tries to access a server that is listening on UDP
+ * for lower versions. RPC_PROG_MISMATCH seems to be the closest
+ * fit.
+ */
+ if (versp->vs_need_cong_ctrl && rqstp->rq_xprt &&
+ !test_bit(XPT_CONG_CTRL, &rqstp->rq_xprt->xpt_flags))
+ goto err_bad_vers;
+
+ if (rqstp->rq_proc >= versp->vs_nproc)
+ goto err_bad_proc;
+ rqstp->rq_procinfo = procp = &versp->vs_proc[rqstp->rq_proc];
+ if (!procp)
+ goto err_bad_proc;
+
+ /* Initialize storage for argp and resp */
+ memset(rqstp->rq_argp, 0, procp->pc_argsize);
+ memset(rqstp->rq_resp, 0, procp->pc_ressize);
+
+ /* un-reserve some of the out-queue now that we have a
+ * better idea of reply size
+ */
+ if (procp->pc_xdrressize)
+ svc_reserve_auth(rqstp, procp->pc_xdrressize<<2);
+
+ /* Bump per-procedure stats counter */
+ versp->vs_count[rqstp->rq_proc]++;
+
+ ret->dispatch = versp->vs_dispatch;
+ return rpc_success;
+err_bad_vers:
+ ret->mismatch.lovers = progp->pg_lovers;
+ ret->mismatch.hivers = progp->pg_hivers;
+ return rpc_prog_mismatch;
+err_bad_proc:
+ return rpc_proc_unavail;
+}
+EXPORT_SYMBOL_GPL(svc_generic_init_request);
+
/*
* Common routine for processing the RPC request.
*/
@@ -1167,11 +1226,11 @@ static int
svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
{
struct svc_program *progp;
- const struct svc_version *versp = NULL; /* compiler food */
const struct svc_procedure *procp = NULL;
struct svc_serv *serv = rqstp->rq_server;
+ struct svc_process_info process;
__be32 *statp;
- u32 prog, vers, proc;
+ u32 prog, vers;
__be32 auth_stat, rpc_stat;
int auth_res;
__be32 *reply_statp;
@@ -1203,8 +1262,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
svc_putnl(resv, 0); /* ACCEPT */

rqstp->rq_prog = prog = svc_getnl(argv); /* program number */
- rqstp->rq_vers = vers = svc_getnl(argv); /* version number */
- rqstp->rq_proc = proc = svc_getnl(argv); /* procedure number */
+ rqstp->rq_vers = svc_getnl(argv); /* version number */
+ rqstp->rq_proc = svc_getnl(argv); /* procedure number */

for (progp = serv->sv_program; progp; progp = progp->pg_next)
if (prog == progp->pg_prog)
@@ -1242,29 +1301,22 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
if (progp == NULL)
goto err_bad_prog;

- if (vers >= progp->pg_nvers ||
- !(versp = progp->pg_vers[vers]))
- goto err_bad_vers;
-
- /*
- * Some protocol versions (namely NFSv4) require some form of
- * congestion control. (See RFC 7530 section 3.1 paragraph 2)
- * In other words, UDP is not allowed. We mark those when setting
- * up the svc_xprt, and verify that here.
- *
- * The spec is not very clear about what error should be returned
- * when someone tries to access a server that is listening on UDP
- * for lower versions. RPC_PROG_MISMATCH seems to be the closest
- * fit.
- */
- if (versp->vs_need_cong_ctrl && rqstp->rq_xprt &&
- !test_bit(XPT_CONG_CTRL, &rqstp->rq_xprt->xpt_flags))
+ rpc_stat = progp->pg_init_request(rqstp, progp, &process);
+ switch (rpc_stat) {
+ case rpc_success:
+ break;
+ case rpc_prog_unavail:
+ goto err_bad_prog;
+ case rpc_prog_mismatch:
goto err_bad_vers;
+ case rpc_proc_unavail:
+ goto err_bad_proc;
+ }

- procp = versp->vs_proc + proc;
- if (proc >= versp->vs_nproc || !procp->pc_func)
+ procp = rqstp->rq_procinfo;
+ /* Should this check go into the dispatcher? */
+ if (!procp || !procp->pc_func)
goto err_bad_proc;
- rqstp->rq_procinfo = procp;

/* Syntactic check complete */
serv->sv_stats->rpccnt++;
@@ -1274,21 +1326,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
statp = resv->iov_base +resv->iov_len;
svc_putnl(resv, RPC_SUCCESS);

- /* Bump per-procedure stats counter */
- versp->vs_count[proc]++;
-
- /* Initialize storage for argp and resp */
- memset(rqstp->rq_argp, 0, procp->pc_argsize);
- memset(rqstp->rq_resp, 0, procp->pc_ressize);
-
- /* un-reserve some of the out-queue now that we have a
- * better idea of reply size
- */
- if (procp->pc_xdrressize)
- svc_reserve_auth(rqstp, procp->pc_xdrressize<<2);
-
/* Call the function that processes the request. */
- if (!versp->vs_dispatch) {
+ if (!process.dispatch) {
/*
* Decode arguments
* XXX: why do we ignore the return value?
@@ -1317,7 +1356,7 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
}
} else {
dprintk("svc: calling dispatcher\n");
- if (!versp->vs_dispatch(rqstp, statp)) {
+ if (!process.dispatch(rqstp, statp)) {
/* Release reply info */
if (procp->pc_release)
procp->pc_release(rqstp);
@@ -1386,16 +1425,16 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)

err_bad_vers:
svc_printk(rqstp, "unknown version (%d for prog %d, %s)\n",
- vers, prog, progp->pg_name);
+ rqstp->rq_vers, rqstp->rq_prog, progp->pg_name);

serv->sv_stats->rpcbadfmt++;
svc_putnl(resv, RPC_PROG_MISMATCH);
- svc_putnl(resv, progp->pg_lovers);
- svc_putnl(resv, progp->pg_hivers);
+ svc_putnl(resv, process.mismatch.lovers);
+ svc_putnl(resv, process.mismatch.hivers);
goto sendit;

err_bad_proc:
- svc_printk(rqstp, "unknown procedure (%d)\n", proc);
+ svc_printk(rqstp, "unknown procedure (%d)\n", rqstp->rq_proc);

serv->sv_stats->rpcbadfmt++;
svc_putnl(resv, RPC_PROC_UNAVAIL);
--
2.20.1


2019-04-02 23:17:02

by Trond Myklebust

Subject: [PATCH 3/6] SUNRPC: Clean up generic dispatcher code

Simplify the generic server dispatcher.
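
The return convention of the factored-out dispatcher can be sketched in
userspace as follows: 0 means "drop the reply", 1 means "send a reply whose
status is in *statp". The disp_* names, the callback signatures and the
drop-reply value are hypothetical simplifications; the real
svc_generic_dispatch() also checks the RQ_DROPME and RQ_AUTHERR flags, which
are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum disp_stat {
	DISP_SUCCESS = 0,
	DISP_GARBAGE_ARGS = 4,                  /* RFC 5531 GARBAGE_ARGS */
	DISP_SYSTEM_ERR = 5,                    /* RFC 5531 SYSTEM_ERR */
	DISP_DROP_REPLY = 30000                 /* illustrative internal value */
};

struct disp_proc {
	bool (*decode)(void);                   /* optional argument decoder */
	int (*func)(void);                      /* the procedure itself */
	bool (*encode)(void);                   /* optional result encoder */
};

/* Returns 0 if the reply must be dropped, 1 if a reply should be sent. */
static int disp_generic(const struct disp_proc *p, int *statp)
{
	assert(p != NULL && p->func != NULL && statp != NULL);
	if (p->decode && !p->decode()) {
		*statp = DISP_GARBAGE_ARGS;
		return 1;
	}
	*statp = p->func();
	if (*statp == DISP_DROP_REPLY)
		return 0;
	if (*statp != DISP_SUCCESS)
		return 1;
	if (p->encode && !p->encode())
		*statp = DISP_SYSTEM_ERR;
	return 1;
}

/* Sample callbacks for experimenting with the sketch. */
static bool disp_xdr_ok(void) { return true; }
static bool disp_xdr_fail(void) { return false; }
static int disp_func_ok(void) { return DISP_SUCCESS; }
static int disp_func_drop(void) { return DISP_DROP_REPLY; }
```

With this shape the caller needs only one release-and-drop path, shared by the
generic dispatcher and any program-specific one.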

Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/svc.c | 75 ++++++++++++++++++++++++++++++------------------
1 file changed, 47 insertions(+), 28 deletions(-)

diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 5f87c0f1e4e0..eb6c7cef40de 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1160,6 +1160,45 @@ svc_get_autherr(struct svc_rqst *rqstp, __be32 *statp)
return rpc_auth_ok;
}

+static int
+svc_generic_dispatch(struct svc_rqst *rqstp, __be32 *statp)
+{
+ struct kvec *argv = &rqstp->rq_arg.head[0];
+ struct kvec *resv = &rqstp->rq_res.head[0];
+ const struct svc_procedure *procp = rqstp->rq_procinfo;
+
+ /*
+ * Decode arguments
+ * XXX: why do we ignore the return value?
+ */
+ if (procp->pc_decode &&
+ !procp->pc_decode(rqstp, argv->iov_base)) {
+ *statp = rpc_garbage_args;
+ return 1;
+ }
+
+ *statp = procp->pc_func(rqstp);
+
+ if (*statp == rpc_drop_reply ||
+ test_bit(RQ_DROPME, &rqstp->rq_flags))
+ return 0;
+
+ if (test_bit(RQ_AUTHERR, &rqstp->rq_flags))
+ return 1;
+
+ if (*statp != rpc_success)
+ return 1;
+
+ /* Encode reply */
+ if (procp->pc_encode &&
+ !procp->pc_encode(rqstp, resv->iov_base + resv->iov_len)) {
+ dprintk("svc: failed to encode reply\n");
+ /* serv->sv_stats->rpcsystemerr++; */
+ *statp = rpc_system_err;
+ }
+ return 1;
+}
+
__be32
svc_generic_init_request(struct svc_rqst *rqstp,
const struct svc_program *progp,
@@ -1328,40 +1367,17 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)

/* Call the function that processes the request. */
if (!process.dispatch) {
- /*
- * Decode arguments
- * XXX: why do we ignore the return value?
- */
- if (procp->pc_decode &&
- !procp->pc_decode(rqstp, argv->iov_base))
+ if (!svc_generic_dispatch(rqstp, statp))
+ goto release_dropit;
+ if (*statp == rpc_garbage_args)
goto err_garbage;
-
- *statp = procp->pc_func(rqstp);
-
- /* Encode reply */
- if (*statp == rpc_drop_reply ||
- test_bit(RQ_DROPME, &rqstp->rq_flags)) {
- if (procp->pc_release)
- procp->pc_release(rqstp);
- goto dropit;
- }
auth_stat = svc_get_autherr(rqstp, statp);
if (auth_stat != rpc_auth_ok)
goto err_release_bad_auth;
- if (*statp == rpc_success && procp->pc_encode &&
- !procp->pc_encode(rqstp, resv->iov_base + resv->iov_len)) {
- dprintk("svc: failed to encode reply\n");
- /* serv->sv_stats->rpcsystemerr++; */
- *statp = rpc_system_err;
- }
} else {
dprintk("svc: calling dispatcher\n");
- if (!process.dispatch(rqstp, statp)) {
- /* Release reply info */
- if (procp->pc_release)
- procp->pc_release(rqstp);
- goto dropit;
- }
+ if (!process.dispatch(rqstp, statp))
+ goto release_dropit; /* Release reply info */
}

/* Check RPC status result */
@@ -1380,6 +1396,9 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
goto close;
return 1; /* Caller can now send it */

+release_dropit:
+ if (procp->pc_release)
+ procp->pc_release(rqstp);
dropit:
svc_authorise(rqstp); /* doesn't hurt to call this twice */
dprintk("svc: svc_process dropit\n");
--
2.20.1


2019-04-02 23:17:04

by Trond Myklebust

Subject: [PATCH 4/6] SUNRPC: Allow further customisation of RPC program registration

Add a callback to allow customisation of the rpcbind registration.
When clients have the ability to turn on and off version support,
we want to allow them to also prevent registration of those
versions with the rpc portmapper.
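
The filtering done by the generic registration callback can be modelled in
userspace roughly like this. The rpcb_* names are hypothetical, the registrar
is passed in as a function pointer instead of calling __svc_register(), and
the fake registrar exists only so that the sketch can be exercised without a
portmapper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rpcb_proto { RPCB_TCP, RPCB_UDP };

struct rpcb_version {
	bool hidden;                            /* models vs_hidden */
	bool need_cong_ctrl;                    /* models vs_need_cong_ctrl */
	bool rpcb_optnl;                        /* models vs_rpcb_optnl */
};

typedef int (*rpcb_register_fn)(unsigned int version, enum rpcb_proto proto);

/* Returns 0 when nothing had to be registered or failure is tolerated. */
static int rpcb_generic_set(const struct rpcb_version *vers,
			    unsigned int version, enum rpcb_proto proto,
			    rpcb_register_fn reg)
{
	int error;

	assert(reg != NULL);
	if (vers == NULL || vers->hidden)
		return 0;
	/* Versions that need congestion control are never offered on UDP. */
	if (vers->need_cong_ctrl && proto == RPCB_UDP)
		return 0;
	error = reg(version, proto);
	return vers->rpcb_optnl ? 0 : error;
}

/* Fake registrar: counts calls and returns a configurable result. */
static int rpcb_fake_calls;
static int rpcb_fake_result;

static int rpcb_fake_register(unsigned int version, enum rpcb_proto proto)
{
	(void)version;
	(void)proto;
	rpcb_fake_calls++;
	return rpcb_fake_result;
}
```

A program-specific callback can apply its own test first (for example,
whether the version is enabled at all) and then delegate to the generic one.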

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/lockd/svc.c | 1 +
fs/nfs/callback.c | 1 +
fs/nfsd/nfssvc.c | 3 +-
include/linux/sunrpc/svc.h | 15 +++++++
net/sunrpc/svc.c | 85 ++++++++++++++++++++++++--------------
5 files changed, 73 insertions(+), 32 deletions(-)

diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index 75415b21efda..96bb74c919f9 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -809,4 +809,5 @@ static struct svc_program nlmsvc_program = {
.pg_stats = &nlmsvc_stats, /* stats table */
.pg_authenticate = &lockd_authenticate, /* export authentication */
.pg_init_request = svc_generic_init_request,
+ .pg_rpcbind_set = svc_generic_rpcbind_set,
};
diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index a9510374bad7..15c9575e0e7a 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -458,4 +458,5 @@ static struct svc_program nfs4_callback_program = {
.pg_stats = &nfs4_callback_stats,
.pg_authenticate = nfs_callback_authenticate,
.pg_init_request = svc_generic_init_request,
+ .pg_rpcbind_set = svc_generic_rpcbind_set,
};
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index e26762e84798..6a52400c85e0 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -87,6 +87,7 @@ static struct svc_program nfsd_acl_program = {
.pg_stats = &nfsd_acl_svcstats,
.pg_authenticate = &svc_set_client,
.pg_init_request = svc_generic_init_request,
+ .pg_rpcbind_set = svc_generic_rpcbind_set,
};

static struct svc_stat nfsd_acl_svcstats = {
@@ -120,7 +121,7 @@ struct svc_program nfsd_program = {
.pg_stats = &nfsd_svcstats, /* version table */
.pg_authenticate = &svc_set_client, /* export authentication */
.pg_init_request = svc_generic_init_request,
-
+ .pg_rpcbind_set = svc_generic_rpcbind_set,
};

static bool nfsd_supported_minorversions[NFSD_SUPPORTED_MINOR_VERSION + 1] = {
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index f43d5765acff..1afe38eb33f7 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -410,6 +410,11 @@ struct svc_program {
__be32 (*pg_init_request)(struct svc_rqst *,
const struct svc_program *,
struct svc_process_info *);
+ int (*pg_rpcbind_set)(struct net *net,
+ const struct svc_program *,
+ u32 version, int family,
+ unsigned short proto,
+ unsigned short port);
};

/*
@@ -522,6 +527,16 @@ __be32 svc_return_autherr(struct svc_rqst *rqstp, __be32 auth_err);
__be32 svc_generic_init_request(struct svc_rqst *rqstp,
const struct svc_program *progp,
struct svc_process_info *procinfo);
+int svc_generic_rpcbind_set(struct net *net,
+ const struct svc_program *progp,
+ u32 version, int family,
+ unsigned short proto,
+ unsigned short port);
+int svc_rpcbind_set_version(struct net *net,
+ const struct svc_program *progp,
+ u32 version, int family,
+ unsigned short proto,
+ unsigned short port);

#define RPC_MAX_ADDRBUFLEN (63U)

diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index eb6c7cef40de..d74604d3e1cf 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -993,6 +993,58 @@ static int __svc_register(struct net *net, const char *progname,
return error;
}

+int svc_rpcbind_set_version(struct net *net,
+ const struct svc_program *progp,
+ u32 version, int family,
+ unsigned short proto,
+ unsigned short port)
+{
+ dprintk("svc: svc_register(%sv%d, %s, %u, %u)\n",
+ progp->pg_name, version,
+ proto == IPPROTO_UDP? "udp" : "tcp",
+ port, family);
+
+ return __svc_register(net, progp->pg_name, progp->pg_prog,
+ version, family, proto, port);
+
+}
+EXPORT_SYMBOL_GPL(svc_rpcbind_set_version);
+
+int svc_generic_rpcbind_set(struct net *net,
+ const struct svc_program *progp,
+ u32 version, int family,
+ unsigned short proto,
+ unsigned short port)
+{
+ const struct svc_version *vers = progp->pg_vers[version];
+ int error;
+
+ if (vers == NULL)
+ return 0;
+
+ if (vers->vs_hidden) {
+ dprintk("svc: svc_register(%sv%d, %s, %u, %u)"
+ " (but not telling portmap)\n",
+ progp->pg_name, version,
+ proto == IPPROTO_UDP? "udp" : "tcp",
+ port, family);
+ return 0;
+ }
+
+ /*
+ * Don't register a UDP port if we need congestion
+ * control.
+ */
+ if (vers->vs_need_cong_ctrl && proto == IPPROTO_UDP)
+ return 0;
+
+ error = svc_rpcbind_set_version(net, progp, version,
+ family, proto, port);
+
+ return (vers->vs_rpcb_optnl) ? 0 : error;
+}
+EXPORT_SYMBOL_GPL(svc_generic_rpcbind_set);
+
/**
* svc_register - register an RPC service with the local portmapper
* @serv: svc_serv struct for the service to register
@@ -1008,7 +1060,6 @@ int svc_register(const struct svc_serv *serv, struct net *net,
const unsigned short port)
{
struct svc_program *progp;
- const struct svc_version *vers;
unsigned int i;
int error = 0;

@@ -1018,37 +1069,9 @@ int svc_register(const struct svc_serv *serv, struct net *net,

for (progp = serv->sv_program; progp; progp = progp->pg_next) {
for (i = 0; i < progp->pg_nvers; i++) {
- vers = progp->pg_vers[i];
- if (vers == NULL)
- continue;
-
- dprintk("svc: svc_register(%sv%d, %s, %u, %u)%s\n",
- progp->pg_name,
- i,
- proto == IPPROTO_UDP? "udp" : "tcp",
- port,
- family,
- vers->vs_hidden ?
- " (but not telling portmap)" : "");
-
- if (vers->vs_hidden)
- continue;
-
- /*
- * Don't register a UDP port if we need congestion
- * control.
- */
- if (vers->vs_need_cong_ctrl && proto == IPPROTO_UDP)
- continue;
-
- error = __svc_register(net, progp->pg_name, progp->pg_prog,
- i, family, proto, port);
-
- if (vers->vs_rpcb_optnl) {
- error = 0;
- continue;
- }

+ error = progp->pg_rpcbind_set(net, progp, i,
+ family, proto, port);
if (error < 0) {
printk(KERN_WARNING "svc: failed to register "
"%sv%u RPC service (errno %d).\n",
--
2.20.1


2019-04-02 23:17:04

by Trond Myklebust

Subject: [PATCH 5/6] nfsd: Add custom rpcbind callbacks for knfsd

Add custom rpcbind callbacks in preparation for the knfsd
per-container version feature.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/nfssvc.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 6a52400c85e0..c6649464bc4b 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -32,6 +32,16 @@

extern struct svc_program nfsd_program;
static int nfsd(void *vrqstp);
+static int nfsd_acl_rpcbind_set(struct net *,
+ const struct svc_program *,
+ u32, int,
+ unsigned short,
+ unsigned short);
+static int nfsd_rpcbind_set(struct net *,
+ const struct svc_program *,
+ u32, int,
+ unsigned short,
+ unsigned short);

/*
* nfsd_mutex protects nn->nfsd_serv -- both the pointer itself and the members
@@ -87,7 +97,7 @@ static struct svc_program nfsd_acl_program = {
.pg_stats = &nfsd_acl_svcstats,
.pg_authenticate = &svc_set_client,
.pg_init_request = svc_generic_init_request,
- .pg_rpcbind_set = svc_generic_rpcbind_set,
+ .pg_rpcbind_set = nfsd_acl_rpcbind_set,
};

static struct svc_stat nfsd_acl_svcstats = {
@@ -121,7 +131,7 @@ struct svc_program nfsd_program = {
.pg_stats = &nfsd_svcstats, /* version table */
.pg_authenticate = &svc_set_client, /* export authentication */
.pg_init_request = svc_generic_init_request,
- .pg_rpcbind_set = svc_generic_rpcbind_set,
+ .pg_rpcbind_set = nfsd_rpcbind_set,
};

static bool nfsd_supported_minorversions[NFSD_SUPPORTED_MINOR_VERSION + 1] = {
@@ -130,6 +140,14 @@ static bool nfsd_supported_minorversions[NFSD_SUPPORTED_MINOR_VERSION + 1] = {
[2] = 1,
};

+static bool
+nfsd_support_acl_version(int vers)
+{
+ if (vers >= NFSD_ACL_MINVERS && vers < NFSD_ACL_NRVERS)
+ return nfsd_acl_version[vers] != NULL;
+ return false;
+}
+
int nfsd_vers(int vers, enum vers_op change)
{
if (vers < NFSD_MINVERS || vers >= NFSD_NRVERS)
@@ -670,6 +688,28 @@ nfsd_svc(int nrservs, struct net *net)
return error;
}

+static int
+nfsd_acl_rpcbind_set(struct net *net, const struct svc_program *progp,
+ u32 version, int family, unsigned short proto,
+ unsigned short port)
+{
+ if (!nfsd_support_acl_version(version) ||
+ !nfsd_vers(version, NFSD_TEST))
+ return 0;
+ return svc_generic_rpcbind_set(net, progp, version, family,
+ proto, port);
+}
+
+static int
+nfsd_rpcbind_set(struct net *net, const struct svc_program *progp,
+ u32 version, int family, unsigned short proto,
+ unsigned short port)
+{
+ if (!nfsd_vers(version, NFSD_TEST))
+ return 0;
+ return svc_generic_rpcbind_set(net, progp, version, family,
+ proto, port);
+}

/*
* This is the NFS server kernel thread
--
2.20.1


2019-04-02 23:17:06

by Trond Myklebust

Subject: [PATCH 6/6] nfsd: Allow containers to set supported nfs versions

Support use of the --nfs-version/--no-nfs-version arguments to rpc.nfsd
in containers.
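
The general idea of keeping the version switches per network namespace,
rather than in one global table, can be sketched in userspace as follows. This
is not the patch's implementation: the netvers_* names are hypothetical, and
the sketch assumes that a NULL table means "every compiled-in version is
enabled" and that the table is allocated on the first change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NETVERS_MINVERS 2                       /* hypothetical: v2..v4 built in */
#define NETVERS_NRVERS  5

struct netvers_net {
	bool *versions;                         /* NULL until first change */
};

/* Is 'vers' enabled in this namespace? */
static bool netvers_test(const struct netvers_net *nn, unsigned int vers)
{
	assert(nn != NULL);
	if (vers < NETVERS_MINVERS || vers >= NETVERS_NRVERS)
		return false;
	if (nn->versions == NULL)
		return true;                    /* assumed default: all enabled */
	return nn->versions[vers];
}

/* Enable or disable 'vers' in this namespace only; returns 0 or -1. */
static int netvers_set(struct netvers_net *nn, unsigned int vers, bool enable)
{
	unsigned int i;

	assert(nn != NULL);
	if (vers < NETVERS_MINVERS || vers >= NETVERS_NRVERS)
		return -1;
	if (nn->versions == NULL) {
		nn->versions = calloc(NETVERS_NRVERS, sizeof(*nn->versions));
		if (nn->versions == NULL)
			return -1;
		for (i = NETVERS_MINVERS; i < NETVERS_NRVERS; i++)
			nn->versions[i] = true;
	}
	nn->versions[vers] = enable;
	return 0;
}

/* Release the per-namespace table when the namespace goes away. */
static void netvers_free(struct netvers_net *nn)
{
	assert(nn != NULL);
	free(nn->versions);
	nn->versions = NULL;
}
```

Two namespaces then hold independent tables, so disabling a version in one
container leaves the others untouched.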

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/netns.h | 8 ++
fs/nfsd/nfs4proc.c | 3 +-
fs/nfsd/nfsctl.c | 25 +++---
fs/nfsd/nfsd.h | 8 +-
fs/nfsd/nfssvc.c | 216 +++++++++++++++++++++++++++++++++++----------
5 files changed, 198 insertions(+), 62 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 32cb8c027483..68c0162751eb 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -131,10 +131,18 @@ struct nfsd_net {
u32 s2s_cp_cl_id;
struct idr s2s_cp_stateids;
spinlock_t s2s_cp_lock;
+
+ /*
+ * Version information
+ */
+ bool *nfsd_versions;
+ bool *nfsd4_minorversions;
};

/* Simple check to find out if a given net was properly initialized */
#define nfsd_netns_ready(nn) ((nn)->sessionid_hashtbl)

+extern void nfsd_netns_free_versions(struct nfsd_net *nn);
+
extern unsigned int nfsd_net_id;
#endif /* __NFSD_NETNS_H__ */
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 0cfd257ffdaf..b0ad72d701b1 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1926,6 +1926,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
struct nfsd4_compound_state *cstate = &resp->cstate;
struct svc_fh *current_fh = &cstate->current_fh;
struct svc_fh *save_fh = &cstate->save_fh;
+ struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
__be32 status;

svcxdr_init_encode(rqstp, resp);
@@ -1948,7 +1949,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
* According to RFC3010, this takes precedence over all other errors.
*/
status = nfserr_minor_vers_mismatch;
- if (nfsd_minorversion(args->minorversion, NFSD_TEST) <= 0)
+ if (nfsd_minorversion(nn, args->minorversion, NFSD_TEST) <= 0)
goto out;
status = nfserr_resource;
if (args->opcnt > NFSD_MAX_OPS_PER_COMPOUND)
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index f2feb2d11bae..2dc5a73cc464 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -537,14 +537,14 @@ static ssize_t write_pool_threads(struct file *file, char *buf, size_t size)
}

static ssize_t
-nfsd_print_version_support(char *buf, int remaining, const char *sep,
- unsigned vers, int minor)
+nfsd_print_version_support(struct nfsd_net *nn, char *buf, int remaining,
+ const char *sep, unsigned vers, int minor)
{
const char *format = minor < 0 ? "%s%c%u" : "%s%c%u.%u";
- bool supported = !!nfsd_vers(vers, NFSD_TEST);
+ bool supported = !!nfsd_vers(nn, vers, NFSD_TEST);

if (vers == 4 && minor >= 0 &&
- !nfsd_minorversion(minor, NFSD_TEST))
+ !nfsd_minorversion(nn, minor, NFSD_TEST))
supported = false;
if (minor == 0 && supported)
/*
@@ -599,20 +599,20 @@ static ssize_t __write_versions(struct file *file, char *buf, size_t size)
switch(num) {
case 2:
case 3:
- nfsd_vers(num, cmd);
+ nfsd_vers(nn, num, cmd);
break;
case 4:
if (*minorp == '.') {
- if (nfsd_minorversion(minor, cmd) < 0)
+ if (nfsd_minorversion(nn, minor, cmd) < 0)
return -EINVAL;
- } else if ((cmd == NFSD_SET) != nfsd_vers(num, NFSD_TEST)) {
+ } else if ((cmd == NFSD_SET) != nfsd_vers(nn, num, NFSD_TEST)) {
/*
* Either we have +4 and no minors are enabled,
* or we have -4 and at least one minor is enabled.
* In either case, propagate 'cmd' to all minors.
*/
minor = 0;
- while (nfsd_minorversion(minor, cmd) >= 0)
+ while (nfsd_minorversion(nn, minor, cmd) >= 0)
minor++;
}
break;
@@ -624,7 +624,7 @@ static ssize_t __write_versions(struct file *file, char *buf, size_t size)
/* If all get turned off, turn them back on, as
* having no versions is BAD
*/
- nfsd_reset_versions();
+ nfsd_reset_versions(nn);
}

/* Now write current state into reply buffer */
@@ -633,12 +633,12 @@ static ssize_t __write_versions(struct file *file, char *buf, size_t size)
remaining = SIMPLE_TRANSACTION_LIMIT;
for (num=2 ; num <= 4 ; num++) {
int minor;
- if (!nfsd_vers(num, NFSD_AVAIL))
+ if (!nfsd_vers(nn, num, NFSD_AVAIL))
continue;

minor = -1;
do {
- len = nfsd_print_version_support(buf, remaining,
+ len = nfsd_print_version_support(nn, buf, remaining,
sep, num, minor);
if (len >= remaining)
goto out;
@@ -1239,6 +1239,8 @@ static __net_init int nfsd_init_net(struct net *net)
retval = nfsd_idmap_init(net);
if (retval)
goto out_idmap_error;
+ nn->nfsd_versions = NULL;
+ nn->nfsd4_minorversions = NULL;
nn->nfsd4_lease = 90; /* default lease time */
nn->nfsd4_grace = 90;
nn->somebody_reclaimed = false;
@@ -1260,6 +1262,7 @@ static __net_exit void nfsd_exit_net(struct net *net)
{
nfsd_idmap_shutdown(net);
nfsd_export_shutdown(net);
+ nfsd_netns_free_versions(net_generic(net, nfsd_net_id));
}

static struct pernet_operations nfsd_net_ops = {
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index 066899929863..6bae2554b2b2 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -98,10 +98,12 @@ extern const struct svc_version nfsd_acl_version3;
#endif
#endif

+struct nfsd_net;
+
enum vers_op {NFSD_SET, NFSD_CLEAR, NFSD_TEST, NFSD_AVAIL };
-int nfsd_vers(int vers, enum vers_op change);
-int nfsd_minorversion(u32 minorversion, enum vers_op change);
-void nfsd_reset_versions(void);
+int nfsd_vers(struct nfsd_net *nn, int vers, enum vers_op change);
+int nfsd_minorversion(struct nfsd_net *nn, u32 minorversion, enum vers_op change);
+void nfsd_reset_versions(struct nfsd_net *nn);
int nfsd_create_serv(struct net *net);

extern int nfsd_max_blksize;
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index c6649464bc4b..16fd157e7651 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -42,6 +42,12 @@ static int nfsd_rpcbind_set(struct net *,
u32, int,
unsigned short,
unsigned short);
+static __be32 nfsd_acl_init_request(struct svc_rqst *,
+ const struct svc_program *,
+ struct svc_process_info *);
+static __be32 nfsd_init_request(struct svc_rqst *,
+ const struct svc_program *,
+ struct svc_process_info *);

/*
* nfsd_mutex protects nn->nfsd_serv -- both the pointer itself and the members
@@ -96,7 +102,7 @@ static struct svc_program nfsd_acl_program = {
.pg_class = "nfsd",
.pg_stats = &nfsd_acl_svcstats,
.pg_authenticate = &svc_set_client,
- .pg_init_request = svc_generic_init_request,
+ .pg_init_request = nfsd_acl_init_request,
.pg_rpcbind_set = nfsd_acl_rpcbind_set,
};

@@ -117,7 +123,6 @@ static const struct svc_version *nfsd_version[] = {

#define NFSD_MINVERS 2
#define NFSD_NRVERS ARRAY_SIZE(nfsd_version)
-static const struct svc_version *nfsd_versions[NFSD_NRVERS];

struct svc_program nfsd_program = {
#if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
@@ -125,21 +130,15 @@ struct svc_program nfsd_program = {
#endif
.pg_prog = NFS_PROGRAM, /* program number */
.pg_nvers = NFSD_NRVERS, /* nr of entries in nfsd_version */
- .pg_vers = nfsd_versions, /* version table */
+ .pg_vers = nfsd_version, /* version table */
.pg_name = "nfsd", /* program name */
.pg_class = "nfsd", /* authentication class */
.pg_stats = &nfsd_svcstats, /* version table */
.pg_authenticate = &svc_set_client, /* export authentication */
- .pg_init_request = svc_generic_init_request,
+ .pg_init_request = nfsd_init_request,
.pg_rpcbind_set = nfsd_rpcbind_set,
};

-static bool nfsd_supported_minorversions[NFSD_SUPPORTED_MINOR_VERSION + 1] = {
- [0] = 1,
- [1] = 1,
- [2] = 1,
-};
-
static bool
nfsd_support_acl_version(int vers)
{
@@ -148,63 +147,127 @@ nfsd_support_acl_version(int vers)
return false;
}

-int nfsd_vers(int vers, enum vers_op change)
+static bool
+nfsd_support_version(int vers)
+{
+ if (vers >= NFSD_MINVERS && vers < NFSD_NRVERS)
+ return nfsd_version[vers] != NULL;
+ return false;
+}
+
+static bool *
+nfsd_alloc_versions(void)
+{
+ bool *vers = kmalloc_array(NFSD_NRVERS, sizeof(bool), GFP_KERNEL);
+ unsigned i;
+
+ if (vers) {
+ /* All compiled versions are enabled by default */
+ for (i = 0; i < NFSD_NRVERS; i++)
+ vers[i] = nfsd_support_version(i);
+ }
+ return vers;
+}
+
+static bool *
+nfsd_alloc_minorversions(void)
+{
+ bool *vers = kmalloc_array(NFSD_SUPPORTED_MINOR_VERSION + 1,
+ sizeof(bool), GFP_KERNEL);
+ unsigned i;
+
+ if (vers) {
+ /* All minor versions are enabled by default */
+ for (i = 0; i <= NFSD_SUPPORTED_MINOR_VERSION; i++)
+ vers[i] = nfsd_support_version(4);
+ }
+ return vers;
+}
+
+void
+nfsd_netns_free_versions(struct nfsd_net *nn)
+{
+ kfree(nn->nfsd_versions);
+ kfree(nn->nfsd4_minorversions);
+ nn->nfsd_versions = NULL;
+ nn->nfsd4_minorversions = NULL;
+}
+
+static void
+nfsd_netns_init_versions(struct nfsd_net *nn)
+{
+ if (!nn->nfsd_versions) {
+ nn->nfsd_versions = nfsd_alloc_versions();
+ nn->nfsd4_minorversions = nfsd_alloc_minorversions();
+ if (!nn->nfsd_versions || !nn->nfsd4_minorversions)
+ nfsd_netns_free_versions(nn);
+ }
+}
+
+int nfsd_vers(struct nfsd_net *nn, int vers, enum vers_op change)
{
if (vers < NFSD_MINVERS || vers >= NFSD_NRVERS)
return 0;
switch(change) {
case NFSD_SET:
- nfsd_versions[vers] = nfsd_version[vers];
-#if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
- if (vers < NFSD_ACL_NRVERS)
- nfsd_acl_versions[vers] = nfsd_acl_version[vers];
-#endif
+ if (nn->nfsd_versions)
+ nn->nfsd_versions[vers] = nfsd_support_version(vers);
break;
case NFSD_CLEAR:
- nfsd_versions[vers] = NULL;
-#if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
- if (vers < NFSD_ACL_NRVERS)
- nfsd_acl_versions[vers] = NULL;
-#endif
+ nfsd_netns_init_versions(nn);
+ if (nn->nfsd_versions)
+ nn->nfsd_versions[vers] = false;
break;
case NFSD_TEST:
- return nfsd_versions[vers] != NULL;
+ if (nn->nfsd_versions)
+ return nn->nfsd_versions[vers];
+ /* Fallthrough */
case NFSD_AVAIL:
- return nfsd_version[vers] != NULL;
+ return nfsd_support_version(vers);
}
return 0;
}

static void
-nfsd_adjust_nfsd_versions4(void)
+nfsd_adjust_nfsd_versions4(struct nfsd_net *nn)
{
unsigned i;

for (i = 0; i <= NFSD_SUPPORTED_MINOR_VERSION; i++) {
- if (nfsd_supported_minorversions[i])
+ if (nn->nfsd4_minorversions[i])
return;
}
- nfsd_vers(4, NFSD_CLEAR);
+ nfsd_vers(nn, 4, NFSD_CLEAR);
}

-int nfsd_minorversion(u32 minorversion, enum vers_op change)
+int nfsd_minorversion(struct nfsd_net *nn, u32 minorversion, enum vers_op change)
{
if (minorversion > NFSD_SUPPORTED_MINOR_VERSION &&
change != NFSD_AVAIL)
return -1;
+
switch(change) {
case NFSD_SET:
- nfsd_supported_minorversions[minorversion] = true;
- nfsd_vers(4, NFSD_SET);
+ if (nn->nfsd4_minorversions) {
+ nfsd_vers(nn, 4, NFSD_SET);
+ nn->nfsd4_minorversions[minorversion] =
+ nfsd_vers(nn, 4, NFSD_TEST);
+ }
break;
case NFSD_CLEAR:
- nfsd_supported_minorversions[minorversion] = false;
- nfsd_adjust_nfsd_versions4();
+ nfsd_netns_init_versions(nn);
+ if (nn->nfsd4_minorversions) {
+ nn->nfsd4_minorversions[minorversion] = false;
+ nfsd_adjust_nfsd_versions4(nn);
+ }
break;
case NFSD_TEST:
- return nfsd_supported_minorversions[minorversion];
+ if (nn->nfsd4_minorversions)
+ return nn->nfsd4_minorversions[minorversion];
+ return nfsd_vers(nn, 4, NFSD_TEST);
case NFSD_AVAIL:
- return minorversion <= NFSD_SUPPORTED_MINOR_VERSION;
+ return minorversion <= NFSD_SUPPORTED_MINOR_VERSION &&
+ nfsd_vers(nn, 4, NFSD_AVAIL);
}
return 0;
}
@@ -286,13 +349,9 @@ static void nfsd_shutdown_generic(void)
nfsd_racache_shutdown();
}

-static bool nfsd_needs_lockd(void)
+static bool nfsd_needs_lockd(struct nfsd_net *nn)
{
-#if defined(CONFIG_NFSD_V3)
- return (nfsd_versions[2] != NULL) || (nfsd_versions[3] != NULL);
-#else
- return (nfsd_versions[2] != NULL);
-#endif
+ return nfsd_vers(nn, 2, NFSD_TEST) || nfsd_vers(nn, 3, NFSD_TEST);
}

static int nfsd_startup_net(int nrservs, struct net *net)
@@ -310,7 +369,7 @@ static int nfsd_startup_net(int nrservs, struct net *net)
if (ret)
goto out_socks;

- if (nfsd_needs_lockd() && !nn->lockd_up) {
+ if (nfsd_needs_lockd(nn) && !nn->lockd_up) {
ret = lockd_up(net);
if (ret)
goto out_socks;
@@ -443,20 +502,20 @@ static void nfsd_last_thread(struct svc_serv *serv, struct net *net)
nfsd_export_flush(net);
}

-void nfsd_reset_versions(void)
+void nfsd_reset_versions(struct nfsd_net *nn)
{
int i;

for (i = 0; i < NFSD_NRVERS; i++)
- if (nfsd_vers(i, NFSD_TEST))
+ if (nfsd_vers(nn, i, NFSD_TEST))
return;

for (i = 0; i < NFSD_NRVERS; i++)
if (i != 4)
- nfsd_vers(i, NFSD_SET);
+ nfsd_vers(nn, i, NFSD_SET);
else {
int minor = 0;
- while (nfsd_minorversion(minor, NFSD_SET) >= 0)
+ while (nfsd_minorversion(nn, minor, NFSD_SET) >= 0)
minor++;
}
}
@@ -524,7 +583,7 @@ int nfsd_create_serv(struct net *net)
}
if (nfsd_max_blksize == 0)
nfsd_max_blksize = nfsd_get_default_max_blksize();
- nfsd_reset_versions();
+ nfsd_reset_versions(nn);
nn->nfsd_serv = svc_create_pooled(&nfsd_program, nfsd_max_blksize,
&nfsd_thread_sv_ops);
if (nn->nfsd_serv == NULL)
@@ -694,7 +753,7 @@ nfsd_acl_rpcbind_set(struct net *net, const struct svc_program *progp,
unsigned short port)
{
if (!nfsd_support_acl_version(version) ||
- !nfsd_vers(version, NFSD_TEST))
+ !nfsd_vers(net_generic(net, nfsd_net_id), version, NFSD_TEST))
return 0;
return svc_generic_rpcbind_set(net, progp, version, family,
proto, port);
@@ -705,12 +764,75 @@ nfsd_rpcbind_set(struct net *net, const struct svc_program *progp,
u32 version, int family, unsigned short proto,
unsigned short port)
{
- if (!nfsd_vers(version, NFSD_TEST))
+ if (!nfsd_vers(net_generic(net, nfsd_net_id), version, NFSD_TEST))
return 0;
return svc_generic_rpcbind_set(net, progp, version, family,
proto, port);
}

+static __be32
+nfsd_acl_init_request(struct svc_rqst *rqstp,
+ const struct svc_program *progp,
+ struct svc_process_info *ret)
+{
+ struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
+ int i;
+
+ if (likely(nfsd_support_acl_version(rqstp->rq_vers) &&
+ nfsd_vers(nn, rqstp->rq_vers, NFSD_TEST)))
+ return svc_generic_init_request(rqstp, progp, ret);
+
+ ret->mismatch.lovers = NFSD_ACL_NRVERS;
+ for (i = NFSD_ACL_MINVERS; i < NFSD_ACL_NRVERS; i++) {
+ if (nfsd_support_acl_version(i) &&
+ nfsd_vers(nn, i, NFSD_TEST)) {
+ ret->mismatch.lovers = i;
+ break;
+ }
+ }
+ if (ret->mismatch.lovers == NFSD_ACL_NRVERS)
+ return rpc_prog_unavail;
+ ret->mismatch.hivers = NFSD_ACL_MINVERS;
+ for (i = NFSD_ACL_NRVERS - 1; i >= NFSD_ACL_MINVERS; i--) {
+ if (nfsd_support_acl_version(i) &&
+ nfsd_vers(nn, i, NFSD_TEST)) {
+ ret->mismatch.hivers = i;
+ break;
+ }
+ }
+ return rpc_prog_mismatch;
+}
+
+static __be32
+nfsd_init_request(struct svc_rqst *rqstp,
+ const struct svc_program *progp,
+ struct svc_process_info *ret)
+{
+ struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
+ int i;
+
+ if (likely(nfsd_vers(nn, rqstp->rq_vers, NFSD_TEST)))
+ return svc_generic_init_request(rqstp, progp, ret);
+
+ ret->mismatch.lovers = NFSD_NRVERS;
+ for (i = NFSD_MINVERS; i < NFSD_NRVERS; i++) {
+ if (nfsd_vers(nn, i, NFSD_TEST)) {
+ ret->mismatch.lovers = i;
+ break;
+ }
+ }
+ if (ret->mismatch.lovers == NFSD_NRVERS)
+ return rpc_prog_unavail;
+ ret->mismatch.hivers = NFSD_MINVERS;
+ for (i = NFSD_NRVERS - 1; i >= NFSD_MINVERS; i--) {
+ if (nfsd_vers(nn, i, NFSD_TEST)) {
+ ret->mismatch.hivers = i;
+ break;
+ }
+ }
+ return rpc_prog_mismatch;
+}
+
/*
* This is the NFS server kernel thread
*/
--
2.20.1


2019-04-03 00:36:25

by J. Bruce Fields

Subject: Re: [PATCH 0/6] Allow containerised knfsd to set supported NFS versions

On Tue, Apr 02, 2019 at 04:14:42PM -0700, Trond Myklebust wrote:
> The current knfsd implementation is unable to support containers that
> want to run different combinations of NFS versions. This is because of
> the way we define which versions are supported: by directly editing the
> global version table in 'nfsd_program'.
> The following patch set modifies the method used to define version
> information by moving some of the server RPC request initialisation
> down into knfsd itself, allowing it to manage the version-specific
> initialisation.

Sounds fine, though I'm curious what the motivation is.

I assume the current behavior is that setting versions from any
container sets them globally (rather than erroring out or something)--so
is there any way someone can detect whether they've got the new behavior
other than trying it and seeing what the result is in another container?

--b.

>
> Note that we might want to consider a few follow ups to this patchset
> to get rid of some of the version-specific flags (e.g. vs_hidden) that
> currently need to be managed in the generic SUNRPC server code on behalf
> of just one or two RPC programs. These flags can easily be replaced by
> custom RPC request initialisers.
>
> Trond Myklebust (6):
> SUNRPC/nfs: Fix return value for nfs4_callback_compound()
> SUNRPC: Add a callback to initialise server requests
> SUNRPC: Clean up generic dispatcher code
> SUNRPC: Allow further customisation of RPC program registration
> nfsd: Add custom rpcbind callbacks for knfsd
> nfsd: Allow containers to set supported nfs versions
>
> fs/lockd/svc.c | 4 +-
> fs/nfs/callback.c | 2 +
> fs/nfs/callback_xdr.c | 2 +-
> fs/nfsd/netns.h | 8 +
> fs/nfsd/nfs4proc.c | 3 +-
> fs/nfsd/nfsctl.c | 25 +--
> fs/nfsd/nfsd.h | 8 +-
> fs/nfsd/nfssvc.c | 251 +++++++++++++++++++++++++-----
> include/linux/sunrpc/svc.h | 33 ++++
> net/sunrpc/svc.c | 310 ++++++++++++++++++++++++-------------
> 10 files changed, 480 insertions(+), 166 deletions(-)
>
> --
> 2.20.1
>

2019-04-03 02:07:51

by Trond Myklebust

Subject: Re: [PATCH 0/6] Allow containerised knfsd to set supported NFS versions

On Tue, 2019-04-02 at 20:36 -0400, J. Bruce Fields wrote:
> On Tue, Apr 02, 2019 at 04:14:42PM -0700, Trond Myklebust wrote:
> > The current knfsd implementation is unable to support containers that
> > want to run different combinations of NFS versions. This is because of
> > the way we define which versions are supported: by directly editing the
> > global version table in 'nfsd_program'.
> > The following patch set modifies the method used to define version
> > information by moving some of the server RPC request initialisation
> > down into knfsd itself, allowing it to manage the version-specific
> > initialisation.
>
> Sounds fine, though I'm curious what the motivation is.
>
> I assume the current behavior is that setting versions from any
> container sets them globally (rather than erroring out or something)--so
> is there any way someone can detect whether they've got the new behavior
> other than trying it and seeing what the result is in another container?
>

The main motivation is to allow knfsd to work in a container, just like
it would in a VM.

We have an application where we want to export two different sets of
filesystems on the same host, but we want to export the first set of
filesystems as NFSv3 only, and the second as NFSv3+NFSv4.x. We could
potentially do this by extending the export file syntax, but why do
that when you can achieve the same result just by fixing containers to
work as expected?
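
The per-container behaviour described here can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel implementation: struct ns_state, compiled_in[] and the ns_*() helpers are hypothetical names. The one point taken from the patch is that each namespace's table is allocated lazily on the first "clear", with tests falling back to the compiled-in versions until then.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MINVERS 2
#define NRVERS  5	/* table indexed by major version, 0..4 */

/* Hypothetical stand-in for the compiled-in version table. */
static const bool compiled_in[NRVERS] = { false, false, true, true, true };

/* Hypothetical stand-in for the per-namespace state. */
struct ns_state {
	bool *versions;	/* NULL until a version is first cleared */
};

static bool version_compiled(int vers)
{
	return vers >= MINVERS && vers < NRVERS && compiled_in[vers];
}

/* Allocate the table on demand; every compiled-in version starts enabled. */
static void ns_init_versions(struct ns_state *ns)
{
	int i;

	assert(ns != NULL);
	if (ns->versions)
		return;
	ns->versions = malloc(NRVERS * sizeof(bool));
	if (!ns->versions)
		return;
	for (i = 0; i < NRVERS; i++)
		ns->versions[i] = version_compiled(i);
}

/* Disable one version in this namespace only. */
static void ns_clear_version(struct ns_state *ns, int vers)
{
	if (vers < MINVERS || vers >= NRVERS)
		return;
	ns_init_versions(ns);
	if (ns->versions)
		ns->versions[vers] = false;
}

/* Without a private table, fall back to what was compiled in. */
static bool ns_test_version(const struct ns_state *ns, int vers)
{
	if (vers < MINVERS || vers >= NRVERS)
		return false;
	if (ns->versions)
		return ns->versions[vers];
	return version_compiled(vers);
}

static void ns_free_versions(struct ns_state *ns)
{
	free(ns->versions);
	ns->versions = NULL;
}
```

In this model, one ns_state with versions 2 and 4 cleared behaves as an NFSv3-only server, while a second ns_state with only version 2 cleared still answers for both v3 and v4. Neither affects the other, because no shared table is modified.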

> --b.
>
> > Note that we might want to consider a few follow ups to this patchset
> > to get rid of some of the version-specific flags (e.g. vs_hidden) that
> > currently need to be managed in the generic SUNRPC server code on behalf
> > of just one or two RPC programs. These flags can easily be replaced by
> > custom RPC request initialisers.
> >
> > Trond Myklebust (6):
> > SUNRPC/nfs: Fix return value for nfs4_callback_compound()
> > SUNRPC: Add a callback to initialise server requests
> > SUNRPC: Clean up generic dispatcher code
> > SUNRPC: Allow further customisation of RPC program registration
> > nfsd: Add custom rpcbind callbacks for knfsd
> > nfsd: Allow containers to set supported nfs versions
> >
> > fs/lockd/svc.c | 4 +-
> > fs/nfs/callback.c | 2 +
> > fs/nfs/callback_xdr.c | 2 +-
> > fs/nfsd/netns.h | 8 +
> > fs/nfsd/nfs4proc.c | 3 +-
> > fs/nfsd/nfsctl.c | 25 +--
> > fs/nfsd/nfsd.h | 8 +-
> > fs/nfsd/nfssvc.c | 251 +++++++++++++++++++++++++-----
> > include/linux/sunrpc/svc.h | 33 ++++
> > net/sunrpc/svc.c | 310 ++++++++++++++++++++++++-------------
> > 10 files changed, 480 insertions(+), 166 deletions(-)
> >
> > --
> > 2.20.1
> >
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]