2024-04-15 19:51:02

by Lorenzo Bianconi

[permalink] [raw]
Subject: [PATCH v8 0/6] convert write_threads, write_version and write_ports to netlink commands

Introduce write_threads, write_version and write_ports netlink
commands similar to the ones available through the procfs.

Changes since v7:
- add gracetime and leasetime to threads-{set,get} command
- rely on nla_nest_start instead of nla_nest_start_noflag
Changes since v6:
- add the capability to pass sockaddr from userspace through listener-set
command
- rebase on top of nfsd-next
Changes since v5:
- for write_ports and write_version commands, userspace is expected to provide
a NFS listeners/supported versions list it want to enable (all the other
ports/versions will be disabled).
- fix comments
- rebase on top of nfsd-next
Changes since v4:
- rebase on top of nfsd-next tree
Changes since v3:
- drop write_maxconn and write_maxblksize for the moment
- add write_version and write_ports commands
Changes since v2:
- use u32 to store nthreads in nfsd_nl_threads_set_doit
- rename server-attr in control-plane in nfsd.yaml specs
Changes since v1:
- remove write_v4_end_grace command
- add write_maxblksize and write_maxconn netlink commands

This patch can be tested with user-space tool reported below:
https://patchwork.kernel.org/project/linux-nfs/cover/[email protected]/

Jeff Layton (2):
nfsd: move nfsd_mutex handling into nfsd_svc callers
SUNRPC: add a new svc_find_listener helper

Lorenzo Bianconi (4):
NFSD: convert write_threads to netlink command
NFSD: add write_version to netlink command
SUNRPC: introduce svc_xprt_create_from_sa utility routine
NFSD: add listener-{set,get} netlink command

Documentation/netlink/specs/nfsd.yaml | 104 ++++++
fs/nfsd/netlink.c | 65 ++++
fs/nfsd/netlink.h | 10 +
fs/nfsd/netns.h | 1 +
fs/nfsd/nfsctl.c | 476 ++++++++++++++++++++++++++
fs/nfsd/nfssvc.c | 7 +-
include/linux/sunrpc/svc_xprt.h | 5 +
include/uapi/linux/nfsd_netlink.h | 46 +++
net/sunrpc/svc_xprt.c | 167 +++++----
9 files changed, 819 insertions(+), 62 deletions(-)

--
2.44.0



2024-04-15 19:51:20

by Lorenzo Bianconi

[permalink] [raw]
Subject: [PATCH v8 1/6] nfsd: move nfsd_mutex handling into nfsd_svc callers

From: Jeff Layton <[email protected]>

Currently nfsd_svc holds the nfsd_mutex over the whole function. For
some of the later netlink patches though, we want to do some other
things to the server before starting it. Move the mutex handling into
the callers.

Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
---
fs/nfsd/nfsctl.c | 2 ++
fs/nfsd/nfssvc.c | 4 ++--
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 93c87587e646..f2e442d7fe16 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -406,7 +406,9 @@ static ssize_t write_threads(struct file *file, char *buf, size_t size)
if (newthreads < 0)
return -EINVAL;
trace_nfsd_ctl_threads(net, newthreads);
+ mutex_lock(&nfsd_mutex);
rv = nfsd_svc(newthreads, net, file->f_cred);
+ mutex_unlock(&nfsd_mutex);
if (rv < 0)
return rv;
} else
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index c0d17b92b249..ca193f7ff0e1 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -775,7 +775,8 @@ nfsd_svc(int nrservs, struct net *net, const struct cred *cred)
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
struct svc_serv *serv;

- mutex_lock(&nfsd_mutex);
+ lockdep_assert_held(&nfsd_mutex);
+
dprintk("nfsd: creating service\n");

nrservs = max(nrservs, 0);
@@ -804,7 +805,6 @@ nfsd_svc(int nrservs, struct net *net, const struct cred *cred)
if (serv->sv_nrthreads == 0)
nfsd_destroy_serv(net);
out:
- mutex_unlock(&nfsd_mutex);
return error;
}

--
2.44.0


2024-04-15 19:51:40

by Lorenzo Bianconi

[permalink] [raw]
Subject: [PATCH v8 3/6] NFSD: add write_version to netlink command

Introduce write_version netlink command through a "declarative" interface.
This patch introduces a change in behavior since for version-set userspace
is expected to provide a NFS major/minor version list it wants to enable
while all the other ones will be disabled. (procfs write_version
command implements imperative interface where the admin writes +3/-3 to
enable/disable a single version.

Reviewed-by: Jeff Layton <[email protected]>
Tested-by: Jeff Layton <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
---
Documentation/netlink/specs/nfsd.yaml | 37 +++++++
fs/nfsd/netlink.c | 24 +++++
fs/nfsd/netlink.h | 5 +
fs/nfsd/netns.h | 1 +
fs/nfsd/nfsctl.c | 150 ++++++++++++++++++++++++++
fs/nfsd/nfssvc.c | 3 +-
include/uapi/linux/nfsd_netlink.h | 18 ++++
7 files changed, 236 insertions(+), 2 deletions(-)

diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
index cbe6c5fd6c4d..0396e8b3ea1f 100644
--- a/Documentation/netlink/specs/nfsd.yaml
+++ b/Documentation/netlink/specs/nfsd.yaml
@@ -74,6 +74,26 @@ attribute-sets:
-
name: leasetime
type: u32
+ -
+ name: version
+ attributes:
+ -
+ name: major
+ type: u32
+ -
+ name: minor
+ type: u32
+ -
+ name: enabled
+ type: flag
+ -
+ name: server-proto
+ attributes:
+ -
+ name: version
+ type: nest
+ nested-attributes: version
+ multi-attr: true

operations:
list:
@@ -120,3 +140,20 @@ operations:
- threads
- gracetime
- leasetime
+ -
+ name: version-set
+ doc: set nfs enabled versions
+ attribute-set: server-proto
+ flags: [ admin-perm ]
+ do:
+ request:
+ attributes:
+ - version
+ -
+ name: version-get
+ doc: get nfs enabled versions
+ attribute-set: server-proto
+ do:
+ reply:
+ attributes:
+ - version
diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
index 20a646af0324..bf5df9597288 100644
--- a/fs/nfsd/netlink.c
+++ b/fs/nfsd/netlink.c
@@ -10,6 +10,13 @@

#include <uapi/linux/nfsd_netlink.h>

+/* Common nested types */
+const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1] = {
+ [NFSD_A_VERSION_MAJOR] = { .type = NLA_U32, },
+ [NFSD_A_VERSION_MINOR] = { .type = NLA_U32, },
+ [NFSD_A_VERSION_ENABLED] = { .type = NLA_FLAG, },
+};
+
/* NFSD_CMD_THREADS_SET - do */
static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
[NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
@@ -17,6 +24,11 @@ static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_L
[NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
};

+/* NFSD_CMD_VERSION_SET - do */
+static const struct nla_policy nfsd_version_set_nl_policy[NFSD_A_SERVER_PROTO_VERSION + 1] = {
+ [NFSD_A_SERVER_PROTO_VERSION] = NLA_POLICY_NESTED(nfsd_version_nl_policy),
+};
+
/* Ops table for nfsd */
static const struct genl_split_ops nfsd_nl_ops[] = {
{
@@ -38,6 +50,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
.doit = nfsd_nl_threads_get_doit,
.flags = GENL_CMD_CAP_DO,
},
+ {
+ .cmd = NFSD_CMD_VERSION_SET,
+ .doit = nfsd_nl_version_set_doit,
+ .policy = nfsd_version_set_nl_policy,
+ .maxattr = NFSD_A_SERVER_PROTO_VERSION,
+ .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
+ },
+ {
+ .cmd = NFSD_CMD_VERSION_GET,
+ .doit = nfsd_nl_version_get_doit,
+ .flags = GENL_CMD_CAP_DO,
+ },
};

struct genl_family nfsd_nl_family __ro_after_init = {
diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
index 4137fac477e4..c7c0da275481 100644
--- a/fs/nfsd/netlink.h
+++ b/fs/nfsd/netlink.h
@@ -11,6 +11,9 @@

#include <uapi/linux/nfsd_netlink.h>

+/* Common nested types */
+extern const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1];
+
int nfsd_nl_rpc_status_get_start(struct netlink_callback *cb);
int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);

@@ -18,6 +21,8 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
struct netlink_callback *cb);
int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
+int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info);
+int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info);

extern struct genl_family nfsd_nl_family;

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index d4be519b5734..14ec15656320 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -218,6 +218,7 @@ struct nfsd_net {
/* Simple check to find out if a given net was properly initialized */
#define nfsd_netns_ready(nn) ((nn)->sessionid_hashtbl)

+extern bool nfsd_support_version(int vers);
extern void nfsd_netns_free_versions(struct nfsd_net *nn);

extern unsigned int nfsd_net_id;
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 38a5df03981b..2c8929ef79e9 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1757,6 +1757,156 @@ int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
return err;
}

+/**
+ * nfsd_nl_version_set_doit - set the nfs enabled versions
+ * @skb: reply buffer
+ * @info: netlink metadata and command arguments
+ *
+ * Return 0 on success or a negative errno.
+ */
+int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info)
+{
+ const struct nlattr *attr;
+ struct nfsd_net *nn;
+ int i, rem;
+
+ if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_PROTO_VERSION))
+ return -EINVAL;
+
+ mutex_lock(&nfsd_mutex);
+
+ nn = net_generic(genl_info_net(info), nfsd_net_id);
+ if (nn->nfsd_serv) {
+ mutex_unlock(&nfsd_mutex);
+ return -EBUSY;
+ }
+
+ /* clear current supported versions. */
+ nfsd_vers(nn, 2, NFSD_CLEAR);
+ nfsd_vers(nn, 3, NFSD_CLEAR);
+ for (i = 0; i <= NFSD_SUPPORTED_MINOR_VERSION; i++)
+ nfsd_minorversion(nn, i, NFSD_CLEAR);
+
+ nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
+ struct nlattr *tb[NFSD_A_VERSION_MAX + 1];
+ u32 major, minor = 0;
+ bool enabled;
+
+ if (nla_type(attr) != NFSD_A_SERVER_PROTO_VERSION)
+ continue;
+
+ if (nla_parse_nested(tb, NFSD_A_VERSION_MAX, attr,
+ nfsd_version_nl_policy, info->extack) < 0)
+ continue;
+
+ if (!tb[NFSD_A_VERSION_MAJOR])
+ continue;
+
+ major = nla_get_u32(tb[NFSD_A_VERSION_MAJOR]);
+ if (tb[NFSD_A_VERSION_MINOR])
+ minor = nla_get_u32(tb[NFSD_A_VERSION_MINOR]);
+
+ enabled = nla_get_flag(tb[NFSD_A_VERSION_ENABLED]);
+
+ switch (major) {
+ case 4:
+ nfsd_minorversion(nn, minor, enabled ? NFSD_SET : NFSD_CLEAR);
+ break;
+ case 3:
+ case 2:
+ if (!minor)
+ nfsd_vers(nn, major, enabled ? NFSD_SET : NFSD_CLEAR);
+ break;
+ default:
+ break;
+ }
+ }
+
+ mutex_unlock(&nfsd_mutex);
+
+ return 0;
+}
+
+/**
+ * nfsd_nl_version_get_doit - get the nfs enabled versions
+ * @skb: reply buffer
+ * @info: netlink metadata and command arguments
+ *
+ * Return 0 on success or a negative errno.
+ */
+int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info)
+{
+ struct nfsd_net *nn;
+ int i, err;
+ void *hdr;
+
+ skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
+ if (!skb)
+ return -ENOMEM;
+
+ hdr = genlmsg_iput(skb, info);
+ if (!hdr) {
+ err = -EMSGSIZE;
+ goto err_free_msg;
+ }
+
+ mutex_lock(&nfsd_mutex);
+ nn = net_generic(genl_info_net(info), nfsd_net_id);
+
+ for (i = 2; i <= 4; i++) {
+ int j;
+
+ for (j = 0; j <= NFSD_SUPPORTED_MINOR_VERSION; j++) {
+ struct nlattr *attr;
+
+ /* Don't record any versions the kernel doesn't have
+ * compiled in
+ */
+ if (!nfsd_support_version(i))
+ continue;
+
+ /* NFSv{2,3} does not support minor numbers */
+ if (i < 4 && j)
+ continue;
+
+ attr = nla_nest_start(skb,
+ NFSD_A_SERVER_PROTO_VERSION);
+ if (!attr) {
+ err = -EINVAL;
+ goto err_nfsd_unlock;
+ }
+
+ if (nla_put_u32(skb, NFSD_A_VERSION_MAJOR, i) ||
+ nla_put_u32(skb, NFSD_A_VERSION_MINOR, j)) {
+ err = -EINVAL;
+ goto err_nfsd_unlock;
+ }
+
+ /* Set the enabled flag if the version is enabled */
+ if (nfsd_vers(nn, i, NFSD_TEST) &&
+ (i < 4 || nfsd_minorversion(nn, j, NFSD_TEST)) &&
+ nla_put_flag(skb, NFSD_A_VERSION_ENABLED)) {
+ err = -EINVAL;
+ goto err_nfsd_unlock;
+ }
+
+ nla_nest_end(skb, attr);
+ }
+ }
+
+ mutex_unlock(&nfsd_mutex);
+ genlmsg_end(skb, hdr);
+
+ return genlmsg_reply(skb, info);
+
+err_nfsd_unlock:
+ mutex_unlock(&nfsd_mutex);
+err_free_msg:
+ nlmsg_free(skb);
+
+ return err;
+}
+
/**
* nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
* @net: a freshly-created network namespace
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index ca193f7ff0e1..4fc91f50138a 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -133,8 +133,7 @@ struct svc_program nfsd_program = {
.pg_rpcbind_set = nfsd_rpcbind_set,
};

-static bool
-nfsd_support_version(int vers)
+bool nfsd_support_version(int vers)
{
if (vers >= NFSD_MINVERS && vers < NFSD_NRVERS)
return nfsd_version[vers] != NULL;
diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
index ccc78a5ee650..8a0a2b344923 100644
--- a/include/uapi/linux/nfsd_netlink.h
+++ b/include/uapi/linux/nfsd_netlink.h
@@ -38,10 +38,28 @@ enum {
NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
};

+enum {
+ NFSD_A_VERSION_MAJOR = 1,
+ NFSD_A_VERSION_MINOR,
+ NFSD_A_VERSION_ENABLED,
+
+ __NFSD_A_VERSION_MAX,
+ NFSD_A_VERSION_MAX = (__NFSD_A_VERSION_MAX - 1)
+};
+
+enum {
+ NFSD_A_SERVER_PROTO_VERSION = 1,
+
+ __NFSD_A_SERVER_PROTO_MAX,
+ NFSD_A_SERVER_PROTO_MAX = (__NFSD_A_SERVER_PROTO_MAX - 1)
+};
+
enum {
NFSD_CMD_RPC_STATUS_GET = 1,
NFSD_CMD_THREADS_SET,
NFSD_CMD_THREADS_GET,
+ NFSD_CMD_VERSION_SET,
+ NFSD_CMD_VERSION_GET,

__NFSD_CMD_MAX,
NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
--
2.44.0


2024-04-15 19:52:46

by Lorenzo Bianconi

[permalink] [raw]
Subject: [PATCH v8 5/6] SUNRPC: add a new svc_find_listener helper

From: Jeff Layton <[email protected]>

svc_find_listener will return the transport instance pointer for the
endpoint accepting connections/peer traffic from the specified transport
class, and matching sockaddr.

Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
---
include/linux/sunrpc/svc_xprt.h | 2 ++
net/sunrpc/svc_xprt.c | 34 +++++++++++++++++++++++++++++++++
2 files changed, 36 insertions(+)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 0d9b10dbe07d..0981e35a9fed 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -150,6 +150,8 @@ void svc_xprt_copy_addrs(struct svc_rqst *rqstp, struct svc_xprt *xprt);
void svc_xprt_close(struct svc_xprt *xprt);
int svc_port_is_privileged(struct sockaddr *sin);
int svc_print_xprts(char *buf, int maxlen);
+struct svc_xprt *svc_find_listener(struct svc_serv *serv, const char *xcl_name,
+ struct net *net, const struct sockaddr *sa);
struct svc_xprt *svc_find_xprt(struct svc_serv *serv, const char *xcl_name,
struct net *net, const sa_family_t af,
const unsigned short port);
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 463fe544ae28..34a3626c56b1 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -1276,6 +1276,40 @@ static struct svc_deferred_req *svc_deferred_dequeue(struct svc_xprt *xprt)
return dr;
}

+/**
+ * svc_find_listener - find an RPC transport instance
+ * @serv: pointer to svc_serv to search
+ * @xcl_name: C string containing transport's class name
+ * @net: owner net pointer
+ * @sa: sockaddr containing address
+ *
+ * Return the transport instance pointer for the endpoint accepting
+ * connections/peer traffic from the specified transport class,
+ * and matching sockaddr.
+ */
+struct svc_xprt *svc_find_listener(struct svc_serv *serv, const char *xcl_name,
+ struct net *net, const struct sockaddr *sa)
+{
+ struct svc_xprt *xprt;
+ struct svc_xprt *found = NULL;
+
+ spin_lock_bh(&serv->sv_lock);
+ list_for_each_entry(xprt, &serv->sv_permsocks, xpt_list) {
+ if (xprt->xpt_net != net)
+ continue;
+ if (strcmp(xprt->xpt_class->xcl_name, xcl_name))
+ continue;
+ if (!rpc_cmp_addr_port(sa, (struct sockaddr *)&xprt->xpt_local))
+ continue;
+ found = xprt;
+ svc_xprt_get(xprt);
+ break;
+ }
+ spin_unlock_bh(&serv->sv_lock);
+ return found;
+}
+EXPORT_SYMBOL_GPL(svc_find_listener);
+
/**
* svc_find_xprt - find an RPC transport instance
* @serv: pointer to svc_serv to search
--
2.44.0


2024-04-15 19:54:38

by Lorenzo Bianconi

[permalink] [raw]
Subject: [PATCH v8 6/6] NFSD: add listener-{set,get} netlink command

Introduce write_ports netlink command. For listener-set, userspace is
expected to provide a NFS listeners list it wants enabled. All other
sockets will be closed.

Reviewed-by: Jeff Layton <[email protected]>
Co-developed-by: Jeff Layton <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
---
Documentation/netlink/specs/nfsd.yaml | 34 ++++
fs/nfsd/netlink.c | 22 +++
fs/nfsd/netlink.h | 3 +
fs/nfsd/nfsctl.c | 220 ++++++++++++++++++++++++++
include/uapi/linux/nfsd_netlink.h | 17 ++
5 files changed, 296 insertions(+)

diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
index 0396e8b3ea1f..e5c1f9186fe8 100644
--- a/Documentation/netlink/specs/nfsd.yaml
+++ b/Documentation/netlink/specs/nfsd.yaml
@@ -94,6 +94,23 @@ attribute-sets:
type: nest
nested-attributes: version
multi-attr: true
+ -
+ name: sock
+ attributes:
+ -
+ name: addr
+ type: binary
+ -
+ name: transport-name
+ type: string
+ -
+ name: server-sock
+ attributes:
+ -
+ name: addr
+ type: nest
+ nested-attributes: sock
+ multi-attr: true

operations:
list:
@@ -157,3 +174,20 @@ operations:
reply:
attributes:
- version
+ -
+ name: listener-set
+ doc: set nfs running sockets
+ attribute-set: server-sock
+ flags: [ admin-perm ]
+ do:
+ request:
+ attributes:
+ - addr
+ -
+ name: listener-get
+ doc: get nfs running listeners
+ attribute-set: server-sock
+ do:
+ reply:
+ attributes:
+ - addr
diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
index bf5df9597288..9450d691dae8 100644
--- a/fs/nfsd/netlink.c
+++ b/fs/nfsd/netlink.c
@@ -11,6 +11,11 @@
#include <uapi/linux/nfsd_netlink.h>

/* Common nested types */
+const struct nla_policy nfsd_sock_nl_policy[NFSD_A_SOCK_TRANSPORT_NAME + 1] = {
+ [NFSD_A_SOCK_ADDR] = { .type = NLA_BINARY, },
+ [NFSD_A_SOCK_TRANSPORT_NAME] = { .type = NLA_NUL_STRING, },
+};
+
const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1] = {
[NFSD_A_VERSION_MAJOR] = { .type = NLA_U32, },
[NFSD_A_VERSION_MINOR] = { .type = NLA_U32, },
@@ -29,6 +34,11 @@ static const struct nla_policy nfsd_version_set_nl_policy[NFSD_A_SERVER_PROTO_VE
[NFSD_A_SERVER_PROTO_VERSION] = NLA_POLICY_NESTED(nfsd_version_nl_policy),
};

+/* NFSD_CMD_LISTENER_SET - do */
+static const struct nla_policy nfsd_listener_set_nl_policy[NFSD_A_SERVER_SOCK_ADDR + 1] = {
+ [NFSD_A_SERVER_SOCK_ADDR] = NLA_POLICY_NESTED(nfsd_sock_nl_policy),
+};
+
/* Ops table for nfsd */
static const struct genl_split_ops nfsd_nl_ops[] = {
{
@@ -62,6 +72,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
.doit = nfsd_nl_version_get_doit,
.flags = GENL_CMD_CAP_DO,
},
+ {
+ .cmd = NFSD_CMD_LISTENER_SET,
+ .doit = nfsd_nl_listener_set_doit,
+ .policy = nfsd_listener_set_nl_policy,
+ .maxattr = NFSD_A_SERVER_SOCK_ADDR,
+ .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
+ },
+ {
+ .cmd = NFSD_CMD_LISTENER_GET,
+ .doit = nfsd_nl_listener_get_doit,
+ .flags = GENL_CMD_CAP_DO,
+ },
};

struct genl_family nfsd_nl_family __ro_after_init = {
diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
index c7c0da275481..e3724637d64d 100644
--- a/fs/nfsd/netlink.h
+++ b/fs/nfsd/netlink.h
@@ -12,6 +12,7 @@
#include <uapi/linux/nfsd_netlink.h>

/* Common nested types */
+extern const struct nla_policy nfsd_sock_nl_policy[NFSD_A_SOCK_TRANSPORT_NAME + 1];
extern const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1];

int nfsd_nl_rpc_status_get_start(struct netlink_callback *cb);
@@ -23,6 +24,8 @@ int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info);
int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info);
+int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info);
+int nfsd_nl_listener_get_doit(struct sk_buff *skb, struct genl_info *info);

extern struct genl_family nfsd_nl_family;

diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 2c8929ef79e9..2e8534113ce4 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1907,6 +1907,226 @@ int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info)
return err;
}

+/**
+ * nfsd_nl_listener_set_doit - set the nfs running sockets
+ * @skb: reply buffer
+ * @info: netlink metadata and command arguments
+ *
+ * Return 0 on success or a negative errno.
+ */
+int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
+{
+ struct net *net = genl_info_net(info);
+ struct svc_xprt *xprt, *tmp;
+ const struct nlattr *attr;
+ struct svc_serv *serv;
+ LIST_HEAD(permsocks);
+ struct nfsd_net *nn;
+ int err, rem;
+
+ mutex_lock(&nfsd_mutex);
+
+ err = nfsd_create_serv(net);
+ if (err) {
+ mutex_unlock(&nfsd_mutex);
+ return err;
+ }
+
+ nn = net_generic(net, nfsd_net_id);
+ serv = nn->nfsd_serv;
+
+ spin_lock_bh(&serv->sv_lock);
+
+ /* Move all of the old listener sockets to a temp list */
+ list_splice_init(&serv->sv_permsocks, &permsocks);
+
+ /*
+ * Walk the list of server_socks from userland and move any that match
+ * back to sv_permsocks
+ */
+ nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
+ struct nlattr *tb[NFSD_A_SOCK_MAX + 1];
+ const char *xcl_name;
+ struct sockaddr *sa;
+
+ if (nla_type(attr) != NFSD_A_SERVER_SOCK_ADDR)
+ continue;
+
+ if (nla_parse_nested(tb, NFSD_A_SOCK_MAX, attr,
+ nfsd_sock_nl_policy, info->extack) < 0)
+ continue;
+
+ if (!tb[NFSD_A_SOCK_ADDR] || !tb[NFSD_A_SOCK_TRANSPORT_NAME])
+ continue;
+
+ if (nla_len(tb[NFSD_A_SOCK_ADDR]) < sizeof(*sa))
+ continue;
+
+ xcl_name = nla_data(tb[NFSD_A_SOCK_TRANSPORT_NAME]);
+ sa = nla_data(tb[NFSD_A_SOCK_ADDR]);
+
+ /* Put back any matching sockets */
+ list_for_each_entry_safe(xprt, tmp, &permsocks, xpt_list) {
+ /* This shouldn't be possible */
+ if (WARN_ON_ONCE(xprt->xpt_net != net)) {
+ list_move(&xprt->xpt_list, &serv->sv_permsocks);
+ continue;
+ }
+
+ /* If everything matches, put it back */
+ if (!strcmp(xprt->xpt_class->xcl_name, xcl_name) &&
+ rpc_cmp_addr_port(sa, (struct sockaddr *)&xprt->xpt_local)) {
+ list_move(&xprt->xpt_list, &serv->sv_permsocks);
+ break;
+ }
+ }
+ }
+
+ /* For now, no removing old sockets while server is running */
+ if (serv->sv_nrthreads && !list_empty(&permsocks)) {
+ list_splice_init(&permsocks, &serv->sv_permsocks);
+ spin_unlock_bh(&serv->sv_lock);
+ err = -EBUSY;
+ goto out_unlock_mtx;
+ }
+
+ /* Close the remaining sockets on the permsocks list */
+ while (!list_empty(&permsocks)) {
+ xprt = list_first_entry(&permsocks, struct svc_xprt, xpt_list);
+ list_move(&xprt->xpt_list, &serv->sv_permsocks);
+
+ /*
+ * Newly-created sockets are born with the BUSY bit set. Clear
+ * it if there are no threads, since nothing can pick it up
+ * in that case.
+ */
+ if (!serv->sv_nrthreads)
+ clear_bit(XPT_BUSY, &xprt->xpt_flags);
+
+ set_bit(XPT_CLOSE, &xprt->xpt_flags);
+ spin_unlock_bh(&serv->sv_lock);
+ svc_xprt_close(xprt);
+ spin_lock_bh(&serv->sv_lock);
+ }
+
+ spin_unlock_bh(&serv->sv_lock);
+
+ /* walk list of addrs again, open any that still don't exist */
+ nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
+ struct nlattr *tb[NFSD_A_SOCK_MAX + 1];
+ const char *xcl_name;
+ struct sockaddr *sa;
+ int ret;
+
+ if (nla_type(attr) != NFSD_A_SERVER_SOCK_ADDR)
+ continue;
+
+ if (nla_parse_nested(tb, NFSD_A_SOCK_MAX, attr,
+ nfsd_sock_nl_policy, info->extack) < 0)
+ continue;
+
+ if (!tb[NFSD_A_SOCK_ADDR] || !tb[NFSD_A_SOCK_TRANSPORT_NAME])
+ continue;
+
+ if (nla_len(tb[NFSD_A_SOCK_ADDR]) < sizeof(*sa))
+ continue;
+
+ xcl_name = nla_data(tb[NFSD_A_SOCK_TRANSPORT_NAME]);
+ sa = nla_data(tb[NFSD_A_SOCK_ADDR]);
+
+ xprt = svc_find_listener(serv, xcl_name, net, sa);
+ if (xprt) {
+ svc_xprt_put(xprt);
+ continue;
+ }
+
+ ret = svc_xprt_create_from_sa(serv, xcl_name, net, sa,
+ SVC_SOCK_ANONYMOUS,
+ get_current_cred());
+ /* always save the latest error */
+ if (ret < 0)
+ err = ret;
+ }
+
+ if (!serv->sv_nrthreads && list_empty(&nn->nfsd_serv->sv_permsocks))
+ nfsd_destroy_serv(net);
+
+out_unlock_mtx:
+ mutex_unlock(&nfsd_mutex);
+
+ return err;
+}
+
+/**
+ * nfsd_nl_listener_get_doit - get the nfs running listeners
+ * @skb: reply buffer
+ * @info: netlink metadata and command arguments
+ *
+ * Return 0 on success or a negative errno.
+ */
+int nfsd_nl_listener_get_doit(struct sk_buff *skb, struct genl_info *info)
+{
+ struct svc_xprt *xprt;
+ struct svc_serv *serv;
+ struct nfsd_net *nn;
+ void *hdr;
+ int err;
+
+ skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
+ if (!skb)
+ return -ENOMEM;
+
+ hdr = genlmsg_iput(skb, info);
+ if (!hdr) {
+ err = -EMSGSIZE;
+ goto err_free_msg;
+ }
+
+ mutex_lock(&nfsd_mutex);
+ nn = net_generic(genl_info_net(info), nfsd_net_id);
+
+ /* no nfs server? Just send empty socket list */
+ if (!nn->nfsd_serv)
+ goto out_unlock_mtx;
+
+ serv = nn->nfsd_serv;
+ spin_lock_bh(&serv->sv_lock);
+ list_for_each_entry(xprt, &serv->sv_permsocks, xpt_list) {
+ struct nlattr *attr;
+
+ attr = nla_nest_start(skb, NFSD_A_SERVER_SOCK_ADDR);
+ if (!attr) {
+ err = -EINVAL;
+ goto err_serv_unlock;
+ }
+
+ if (nla_put_string(skb, NFSD_A_SOCK_TRANSPORT_NAME,
+ xprt->xpt_class->xcl_name) ||
+ nla_put(skb, NFSD_A_SOCK_ADDR,
+ sizeof(struct sockaddr_storage),
+ &xprt->xpt_local)) {
+ err = -EINVAL;
+ goto err_serv_unlock;
+ }
+
+ nla_nest_end(skb, attr);
+ }
+ spin_unlock_bh(&serv->sv_lock);
+out_unlock_mtx:
+ mutex_unlock(&nfsd_mutex);
+ genlmsg_end(skb, hdr);
+
+ return genlmsg_reply(skb, info);
+
+err_serv_unlock:
+ spin_unlock_bh(&serv->sv_lock);
+ mutex_unlock(&nfsd_mutex);
+err_free_msg:
+ nlmsg_free(skb);
+
+ return err;
+}
+
/**
* nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
* @net: a freshly-created network namespace
diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
index 8a0a2b344923..ca9900f9d86f 100644
--- a/include/uapi/linux/nfsd_netlink.h
+++ b/include/uapi/linux/nfsd_netlink.h
@@ -54,12 +54,29 @@ enum {
NFSD_A_SERVER_PROTO_MAX = (__NFSD_A_SERVER_PROTO_MAX - 1)
};

+enum {
+ NFSD_A_SOCK_ADDR = 1,
+ NFSD_A_SOCK_TRANSPORT_NAME,
+
+ __NFSD_A_SOCK_MAX,
+ NFSD_A_SOCK_MAX = (__NFSD_A_SOCK_MAX - 1)
+};
+
+enum {
+ NFSD_A_SERVER_SOCK_ADDR = 1,
+
+ __NFSD_A_SERVER_SOCK_MAX,
+ NFSD_A_SERVER_SOCK_MAX = (__NFSD_A_SERVER_SOCK_MAX - 1)
+};
+
enum {
NFSD_CMD_RPC_STATUS_GET = 1,
NFSD_CMD_THREADS_SET,
NFSD_CMD_THREADS_GET,
NFSD_CMD_VERSION_SET,
NFSD_CMD_VERSION_GET,
+ NFSD_CMD_LISTENER_SET,
+ NFSD_CMD_LISTENER_GET,

__NFSD_CMD_MAX,
NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
--
2.44.0


2024-04-15 20:01:43

by Lorenzo Bianconi

[permalink] [raw]
Subject: [PATCH v8 2/6] NFSD: convert write_threads to netlink command

Introduce write_threads netlink command similar to the one available
through the procfs.

Tested-by: Jeff Layton <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Co-developed-by: Jeff Layton <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
---
Documentation/netlink/specs/nfsd.yaml | 33 ++++++++
fs/nfsd/netlink.c | 19 +++++
fs/nfsd/netlink.h | 2 +
fs/nfsd/nfsctl.c | 104 ++++++++++++++++++++++++++
include/uapi/linux/nfsd_netlink.h | 11 +++
5 files changed, 169 insertions(+)

diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
index 05acc73e2e33..cbe6c5fd6c4d 100644
--- a/Documentation/netlink/specs/nfsd.yaml
+++ b/Documentation/netlink/specs/nfsd.yaml
@@ -62,6 +62,18 @@ attribute-sets:
name: compound-ops
type: u32
multi-attr: true
+ -
+ name: server-worker
+ attributes:
+ -
+ name: threads
+ type: u32
+ -
+ name: gracetime
+ type: u32
+ -
+ name: leasetime
+ type: u32

operations:
list:
@@ -87,3 +99,24 @@ operations:
- sport
- dport
- compound-ops
+ -
+ name: threads-set
+ doc: set the number of running threads
+ attribute-set: server-worker
+ flags: [ admin-perm ]
+ do:
+ request:
+ attributes:
+ - threads
+ - gracetime
+ - leasetime
+ -
+ name: threads-get
+ doc: get the number of running threads
+ attribute-set: server-worker
+ do:
+ reply:
+ attributes:
+ - threads
+ - gracetime
+ - leasetime
diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
index 0e1d635ec5f9..20a646af0324 100644
--- a/fs/nfsd/netlink.c
+++ b/fs/nfsd/netlink.c
@@ -10,6 +10,13 @@

#include <uapi/linux/nfsd_netlink.h>

+/* NFSD_CMD_THREADS_SET - do */
+static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
+ [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
+ [NFSD_A_SERVER_WORKER_GRACETIME] = { .type = NLA_U32, },
+ [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
+};
+
/* Ops table for nfsd */
static const struct genl_split_ops nfsd_nl_ops[] = {
{
@@ -19,6 +26,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
.done = nfsd_nl_rpc_status_get_done,
.flags = GENL_CMD_CAP_DUMP,
},
+ {
+ .cmd = NFSD_CMD_THREADS_SET,
+ .doit = nfsd_nl_threads_set_doit,
+ .policy = nfsd_threads_set_nl_policy,
+ .maxattr = NFSD_A_SERVER_WORKER_LEASETIME,
+ .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
+ },
+ {
+ .cmd = NFSD_CMD_THREADS_GET,
+ .doit = nfsd_nl_threads_get_doit,
+ .flags = GENL_CMD_CAP_DO,
+ },
};

struct genl_family nfsd_nl_family __ro_after_init = {
diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
index d83dd6bdee92..4137fac477e4 100644
--- a/fs/nfsd/netlink.h
+++ b/fs/nfsd/netlink.h
@@ -16,6 +16,8 @@ int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);

int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
struct netlink_callback *cb);
+int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
+int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);

extern struct genl_family nfsd_nl_family;

diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index f2e442d7fe16..38a5df03981b 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1653,6 +1653,110 @@ int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb)
return 0;
}

+/**
+ * nfsd_nl_threads_set_doit - set the number of running threads
+ * @skb: reply buffer
+ * @info: netlink metadata and command arguments
+ *
+ * Return 0 on success or a negative errno.
+ */
+int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info)
+{
+ struct net *net = genl_info_net(info);
+ struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+ int ret = -EBUSY;
+ u32 nthreads;
+
+ if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_WORKER_THREADS))
+ return -EINVAL;
+
+ nthreads = nla_get_u32(info->attrs[NFSD_A_SERVER_WORKER_THREADS]);
+
+ mutex_lock(&nfsd_mutex);
+ if (info->attrs[NFSD_A_SERVER_WORKER_GRACETIME] ||
+ info->attrs[NFSD_A_SERVER_WORKER_LEASETIME]) {
+ const struct nlattr *attr;
+
+ if (nn->nfsd_serv && nn->nfsd_serv->sv_nrthreads)
+ goto out_unlock;
+
+ ret = -EINVAL;
+ attr = info->attrs[NFSD_A_SERVER_WORKER_GRACETIME];
+ if (attr) {
+ u32 gracetime = nla_get_u32(attr);
+
+ if (gracetime < 10 || gracetime > 3600)
+ goto out_unlock;
+
+ nn->nfsd4_grace = gracetime;
+ }
+
+ attr = info->attrs[NFSD_A_SERVER_WORKER_LEASETIME];
+ if (attr) {
+ u32 leasetime = nla_get_u32(attr);
+
+ if (leasetime < 10 || leasetime > 3600)
+ goto out_unlock;
+
+ nn->nfsd4_lease = leasetime;
+ }
+ }
+
+ ret = nfsd_svc(nthreads, net, get_current_cred());
+out_unlock:
+ mutex_unlock(&nfsd_mutex);
+
+ return ret == nthreads ? 0 : ret;
+}
+
+/**
+ * nfsd_nl_threads_get_doit - get the number of running threads
+ * @skb: reply buffer
+ * @info: netlink metadata and command arguments
+ *
+ * Return 0 on success or a negative errno.
+ */
+int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
+{
+ struct net *net = genl_info_net(info);
+ struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+ void *hdr;
+ int err;
+
+ skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
+ if (!skb)
+ return -ENOMEM;
+
+ hdr = genlmsg_iput(skb, info);
+ if (!hdr) {
+ err = -EMSGSIZE;
+ goto err_free_msg;
+ }
+
+ mutex_lock(&nfsd_mutex);
+ err = nla_put_u32(skb, NFSD_A_SERVER_WORKER_GRACETIME,
+ nn->nfsd4_grace) ||
+ nla_put_u32(skb, NFSD_A_SERVER_WORKER_LEASETIME,
+ nn->nfsd4_lease) ||
+ nla_put_u32(skb, NFSD_A_SERVER_WORKER_THREADS,
+ nn->nfsd_serv ? nn->nfsd_serv->sv_nrthreads : 0);
+ mutex_unlock(&nfsd_mutex);
+
+ if (err) {
+ err = -EINVAL;
+ goto err_free_msg;
+ }
+
+ genlmsg_end(skb, hdr);
+
+ return genlmsg_reply(skb, info);
+
+err_free_msg:
+ nlmsg_free(skb);
+
+ return err;
+}
+
/**
* nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
* @net: a freshly-created network namespace
diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
index 3cd044edee5d..ccc78a5ee650 100644
--- a/include/uapi/linux/nfsd_netlink.h
+++ b/include/uapi/linux/nfsd_netlink.h
@@ -29,8 +29,19 @@ enum {
NFSD_A_RPC_STATUS_MAX = (__NFSD_A_RPC_STATUS_MAX - 1)
};

+enum {
+ NFSD_A_SERVER_WORKER_THREADS = 1,
+ NFSD_A_SERVER_WORKER_GRACETIME,
+ NFSD_A_SERVER_WORKER_LEASETIME,
+
+ __NFSD_A_SERVER_WORKER_MAX,
+ NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
+};
+
enum {
NFSD_CMD_RPC_STATUS_GET = 1,
+ NFSD_CMD_THREADS_SET,
+ NFSD_CMD_THREADS_GET,

__NFSD_CMD_MAX,
NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
--
2.44.0


2024-04-15 20:05:46

by Lorenzo Bianconi

[permalink] [raw]
Subject: [PATCH v8 4/6] SUNRPC: introduce svc_xprt_create_from_sa utility routine

Add svc_xprt_create_from_sa utility routine and refactor
svc_xprt_create() codebase in order to introduce the capability to
create a svc port from socket address.

Reviewed-by: Jeff Layton <[email protected]>
Tested-by: Jeff Layton <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
---
include/linux/sunrpc/svc_xprt.h | 3 +
net/sunrpc/svc_xprt.c | 133 ++++++++++++++++++--------------
2 files changed, 78 insertions(+), 58 deletions(-)

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index 8e20cd60e2e7..0d9b10dbe07d 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -135,6 +135,9 @@ int svc_reg_xprt_class(struct svc_xprt_class *);
void svc_unreg_xprt_class(struct svc_xprt_class *);
void svc_xprt_init(struct net *, struct svc_xprt_class *, struct svc_xprt *,
struct svc_serv *);
+int svc_xprt_create_from_sa(struct svc_serv *serv, const char *xprt_name,
+ struct net *net, struct sockaddr *sap,
+ int flags, const struct cred *cred);
int svc_xprt_create(struct svc_serv *serv, const char *xprt_name,
struct net *net, const int family,
const unsigned short port, int flags,
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index b4a85a227bd7..463fe544ae28 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -211,51 +211,6 @@ void svc_xprt_init(struct net *net, struct svc_xprt_class *xcl,
}
EXPORT_SYMBOL_GPL(svc_xprt_init);

-static struct svc_xprt *__svc_xpo_create(struct svc_xprt_class *xcl,
- struct svc_serv *serv,
- struct net *net,
- const int family,
- const unsigned short port,
- int flags)
-{
- struct sockaddr_in sin = {
- .sin_family = AF_INET,
- .sin_addr.s_addr = htonl(INADDR_ANY),
- .sin_port = htons(port),
- };
-#if IS_ENABLED(CONFIG_IPV6)
- struct sockaddr_in6 sin6 = {
- .sin6_family = AF_INET6,
- .sin6_addr = IN6ADDR_ANY_INIT,
- .sin6_port = htons(port),
- };
-#endif
- struct svc_xprt *xprt;
- struct sockaddr *sap;
- size_t len;
-
- switch (family) {
- case PF_INET:
- sap = (struct sockaddr *)&sin;
- len = sizeof(sin);
- break;
-#if IS_ENABLED(CONFIG_IPV6)
- case PF_INET6:
- sap = (struct sockaddr *)&sin6;
- len = sizeof(sin6);
- break;
-#endif
- default:
- return ERR_PTR(-EAFNOSUPPORT);
- }
-
- xprt = xcl->xcl_ops->xpo_create(serv, net, sap, len, flags);
- if (IS_ERR(xprt))
- trace_svc_xprt_create_err(serv->sv_program->pg_name,
- xcl->xcl_name, sap, len, xprt);
- return xprt;
-}
-
/**
* svc_xprt_received - start next receiver thread
* @xprt: controlling transport
@@ -294,9 +249,8 @@ void svc_add_new_perm_xprt(struct svc_serv *serv, struct svc_xprt *new)
}

static int _svc_xprt_create(struct svc_serv *serv, const char *xprt_name,
- struct net *net, const int family,
- const unsigned short port, int flags,
- const struct cred *cred)
+ struct net *net, struct sockaddr *sap,
+ size_t len, int flags, const struct cred *cred)
{
struct svc_xprt_class *xcl;

@@ -312,8 +266,11 @@ static int _svc_xprt_create(struct svc_serv *serv, const char *xprt_name,
goto err;

spin_unlock(&svc_xprt_class_lock);
- newxprt = __svc_xpo_create(xcl, serv, net, family, port, flags);
+ newxprt = xcl->xcl_ops->xpo_create(serv, net, sap, len, flags);
if (IS_ERR(newxprt)) {
+ trace_svc_xprt_create_err(serv->sv_program->pg_name,
+ xcl->xcl_name, sap, len,
+ newxprt);
module_put(xcl->xcl_owner);
return PTR_ERR(newxprt);
}
@@ -329,6 +286,48 @@ static int _svc_xprt_create(struct svc_serv *serv, const char *xprt_name,
return -EPROTONOSUPPORT;
}

+/**
+ * svc_xprt_create_from_sa - Add a new listener to @serv from socket address
+ * @serv: target RPC service
+ * @xprt_name: transport class name
+ * @net: network namespace
+ * @sap: socket address pointer
+ * @flags: SVC_SOCK flags
+ * @cred: credential to bind to this transport
+ *
+ * Return local xprt port on success or %-EPROTONOSUPPORT on failure
+ */
+int svc_xprt_create_from_sa(struct svc_serv *serv, const char *xprt_name,
+ struct net *net, struct sockaddr *sap,
+ int flags, const struct cred *cred)
+{
+ size_t len;
+ int err;
+
+ switch (sap->sa_family) {
+ case AF_INET:
+ len = sizeof(struct sockaddr_in);
+ break;
+#if IS_ENABLED(CONFIG_IPV6)
+ case AF_INET6:
+ len = sizeof(struct sockaddr_in6);
+ break;
+#endif
+ default:
+ return -EAFNOSUPPORT;
+ }
+
+ err = _svc_xprt_create(serv, xprt_name, net, sap, len, flags, cred);
+ if (err == -EPROTONOSUPPORT) {
+ request_module("svc%s", xprt_name);
+ err = _svc_xprt_create(serv, xprt_name, net, sap, len, flags,
+ cred);
+ }
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(svc_xprt_create_from_sa);
+
/**
* svc_xprt_create - Add a new listener to @serv
* @serv: target RPC service
@@ -339,23 +338,41 @@ static int _svc_xprt_create(struct svc_serv *serv, const char *xprt_name,
* @flags: SVC_SOCK flags
* @cred: credential to bind to this transport
*
- * Return values:
- * %0: New listener added successfully
- * %-EPROTONOSUPPORT: Requested transport type not supported
+ * Return local xprt port on success or %-EPROTONOSUPPORT on failure
*/
int svc_xprt_create(struct svc_serv *serv, const char *xprt_name,
struct net *net, const int family,
const unsigned short port, int flags,
const struct cred *cred)
{
- int err;
+ struct sockaddr_in sin = {
+ .sin_family = AF_INET,
+ .sin_addr.s_addr = htonl(INADDR_ANY),
+ .sin_port = htons(port),
+ };
+#if IS_ENABLED(CONFIG_IPV6)
+ struct sockaddr_in6 sin6 = {
+ .sin6_family = AF_INET6,
+ .sin6_addr = IN6ADDR_ANY_INIT,
+ .sin6_port = htons(port),
+ };
+#endif
+ struct sockaddr *sap;

- err = _svc_xprt_create(serv, xprt_name, net, family, port, flags, cred);
- if (err == -EPROTONOSUPPORT) {
- request_module("svc%s", xprt_name);
- err = _svc_xprt_create(serv, xprt_name, net, family, port, flags, cred);
+ switch (family) {
+ case PF_INET:
+ sap = (struct sockaddr *)&sin;
+ break;
+#if IS_ENABLED(CONFIG_IPV6)
+ case PF_INET6:
+ sap = (struct sockaddr *)&sin6;
+ break;
+#endif
+ default:
+ return -EAFNOSUPPORT;
}
- return err;
+
+ return svc_xprt_create_from_sa(serv, xprt_name, net, sap, flags, cred);
}
EXPORT_SYMBOL_GPL(svc_xprt_create);

--
2.44.0


2024-04-16 03:16:52

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] NFSD: add write_version to netlink command

On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> Introduce write_version netlink command through a "declarative" interface.
> This patch introduces a change in behavior since for version-set userspace
> is expected to provide a NFS major/minor version list it wants to enable
> while all the other ones will be disabled. (procfs write_version
> command implements imperative interface where the admin writes +3/-3 to
> enable/disable a single version.

It seems a little weird to me that the interface always disables all
version, but then also allows individual versions to be disabled.

Would it be reasonable to simply ignore the "enabled" flag when setting
version, and just enable all versions listed??

Or maybe only enable those with the flag, and don't disable those
without the flag?

Those don't necessarily seem much better - but the current behaviour
still seems odd.

Also for getting the version, the doc says:

doc: get nfs enabled versions

which I don't think it quite right. The code reports all supported
versions, and indicates which of those are enabled. So maybe:

doc: get enabled status for all supported versions

I think that fact that it actually lists all supported versions is
useful and worth making explicit.

Thanks,
NeilBrown


>
> Reviewed-by: Jeff Layton <[email protected]>
> Tested-by: Jeff Layton <[email protected]>
> Signed-off-by: Lorenzo Bianconi <[email protected]>
> ---
> Documentation/netlink/specs/nfsd.yaml | 37 +++++++
> fs/nfsd/netlink.c | 24 +++++
> fs/nfsd/netlink.h | 5 +
> fs/nfsd/netns.h | 1 +
> fs/nfsd/nfsctl.c | 150 ++++++++++++++++++++++++++
> fs/nfsd/nfssvc.c | 3 +-
> include/uapi/linux/nfsd_netlink.h | 18 ++++
> 7 files changed, 236 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> index cbe6c5fd6c4d..0396e8b3ea1f 100644
> --- a/Documentation/netlink/specs/nfsd.yaml
> +++ b/Documentation/netlink/specs/nfsd.yaml
> @@ -74,6 +74,26 @@ attribute-sets:
> -
> name: leasetime
> type: u32
> + -
> + name: version
> + attributes:
> + -
> + name: major
> + type: u32
> + -
> + name: minor
> + type: u32
> + -
> + name: enabled
> + type: flag
> + -
> + name: server-proto
> + attributes:
> + -
> + name: version
> + type: nest
> + nested-attributes: version
> + multi-attr: true
>
> operations:
> list:
> @@ -120,3 +140,20 @@ operations:
> - threads
> - gracetime
> - leasetime
> + -
> + name: version-set
> + doc: set nfs enabled versions
> + attribute-set: server-proto
> + flags: [ admin-perm ]
> + do:
> + request:
> + attributes:
> + - version
> + -
> + name: version-get
> + doc: get nfs enabled versions
> + attribute-set: server-proto
> + do:
> + reply:
> + attributes:
> + - version
> diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
> index 20a646af0324..bf5df9597288 100644
> --- a/fs/nfsd/netlink.c
> +++ b/fs/nfsd/netlink.c
> @@ -10,6 +10,13 @@
>
> #include <uapi/linux/nfsd_netlink.h>
>
> +/* Common nested types */
> +const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1] = {
> + [NFSD_A_VERSION_MAJOR] = { .type = NLA_U32, },
> + [NFSD_A_VERSION_MINOR] = { .type = NLA_U32, },
> + [NFSD_A_VERSION_ENABLED] = { .type = NLA_FLAG, },
> +};
> +
> /* NFSD_CMD_THREADS_SET - do */
> static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
> [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
> @@ -17,6 +24,11 @@ static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_L
> [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
> };
>
> +/* NFSD_CMD_VERSION_SET - do */
> +static const struct nla_policy nfsd_version_set_nl_policy[NFSD_A_SERVER_PROTO_VERSION + 1] = {
> + [NFSD_A_SERVER_PROTO_VERSION] = NLA_POLICY_NESTED(nfsd_version_nl_policy),
> +};
> +
> /* Ops table for nfsd */
> static const struct genl_split_ops nfsd_nl_ops[] = {
> {
> @@ -38,6 +50,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
> .doit = nfsd_nl_threads_get_doit,
> .flags = GENL_CMD_CAP_DO,
> },
> + {
> + .cmd = NFSD_CMD_VERSION_SET,
> + .doit = nfsd_nl_version_set_doit,
> + .policy = nfsd_version_set_nl_policy,
> + .maxattr = NFSD_A_SERVER_PROTO_VERSION,
> + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> + },
> + {
> + .cmd = NFSD_CMD_VERSION_GET,
> + .doit = nfsd_nl_version_get_doit,
> + .flags = GENL_CMD_CAP_DO,
> + },
> };
>
> struct genl_family nfsd_nl_family __ro_after_init = {
> diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
> index 4137fac477e4..c7c0da275481 100644
> --- a/fs/nfsd/netlink.h
> +++ b/fs/nfsd/netlink.h
> @@ -11,6 +11,9 @@
>
> #include <uapi/linux/nfsd_netlink.h>
>
> +/* Common nested types */
> +extern const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1];
> +
> int nfsd_nl_rpc_status_get_start(struct netlink_callback *cb);
> int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
>
> @@ -18,6 +21,8 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> struct netlink_callback *cb);
> int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
> int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
> +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info);
> +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info);
>
> extern struct genl_family nfsd_nl_family;
>
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index d4be519b5734..14ec15656320 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -218,6 +218,7 @@ struct nfsd_net {
> /* Simple check to find out if a given net was properly initialized */
> #define nfsd_netns_ready(nn) ((nn)->sessionid_hashtbl)
>
> +extern bool nfsd_support_version(int vers);
> extern void nfsd_netns_free_versions(struct nfsd_net *nn);
>
> extern unsigned int nfsd_net_id;
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 38a5df03981b..2c8929ef79e9 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -1757,6 +1757,156 @@ int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
> return err;
> }
>
> +/**
> + * nfsd_nl_version_set_doit - set the nfs enabled versions
> + * @skb: reply buffer
> + * @info: netlink metadata and command arguments
> + *
> + * Return 0 on success or a negative errno.
> + */
> +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info)
> +{
> + const struct nlattr *attr;
> + struct nfsd_net *nn;
> + int i, rem;
> +
> + if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_PROTO_VERSION))
> + return -EINVAL;
> +
> + mutex_lock(&nfsd_mutex);
> +
> + nn = net_generic(genl_info_net(info), nfsd_net_id);
> + if (nn->nfsd_serv) {
> + mutex_unlock(&nfsd_mutex);
> + return -EBUSY;
> + }
> +
> + /* clear current supported versions. */
> + nfsd_vers(nn, 2, NFSD_CLEAR);
> + nfsd_vers(nn, 3, NFSD_CLEAR);
> + for (i = 0; i <= NFSD_SUPPORTED_MINOR_VERSION; i++)
> + nfsd_minorversion(nn, i, NFSD_CLEAR);
> +
> + nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
> + struct nlattr *tb[NFSD_A_VERSION_MAX + 1];
> + u32 major, minor = 0;
> + bool enabled;
> +
> + if (nla_type(attr) != NFSD_A_SERVER_PROTO_VERSION)
> + continue;
> +
> + if (nla_parse_nested(tb, NFSD_A_VERSION_MAX, attr,
> + nfsd_version_nl_policy, info->extack) < 0)
> + continue;
> +
> + if (!tb[NFSD_A_VERSION_MAJOR])
> + continue;
> +
> + major = nla_get_u32(tb[NFSD_A_VERSION_MAJOR]);
> + if (tb[NFSD_A_VERSION_MINOR])
> + minor = nla_get_u32(tb[NFSD_A_VERSION_MINOR]);
> +
> + enabled = nla_get_flag(tb[NFSD_A_VERSION_ENABLED]);
> +
> + switch (major) {
> + case 4:
> + nfsd_minorversion(nn, minor, enabled ? NFSD_SET : NFSD_CLEAR);
> + break;
> + case 3:
> + case 2:
> + if (!minor)
> + nfsd_vers(nn, major, enabled ? NFSD_SET : NFSD_CLEAR);
> + break;
> + default:
> + break;
> + }
> + }
> +
> + mutex_unlock(&nfsd_mutex);
> +
> + return 0;
> +}
> +
> +/**
> + * nfsd_nl_version_get_doit - get the nfs enabled versions
> + * @skb: reply buffer
> + * @info: netlink metadata and command arguments
> + *
> + * Return 0 on success or a negative errno.
> + */
> +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info)
> +{
> + struct nfsd_net *nn;
> + int i, err;
> + void *hdr;
> +
> + skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> + if (!skb)
> + return -ENOMEM;
> +
> + hdr = genlmsg_iput(skb, info);
> + if (!hdr) {
> + err = -EMSGSIZE;
> + goto err_free_msg;
> + }
> +
> + mutex_lock(&nfsd_mutex);
> + nn = net_generic(genl_info_net(info), nfsd_net_id);
> +
> + for (i = 2; i <= 4; i++) {
> + int j;
> +
> + for (j = 0; j <= NFSD_SUPPORTED_MINOR_VERSION; j++) {
> + struct nlattr *attr;
> +
> + /* Don't record any versions the kernel doesn't have
> + * compiled in
> + */
> + if (!nfsd_support_version(i))
> + continue;
> +
> + /* NFSv{2,3} does not support minor numbers */
> + if (i < 4 && j)
> + continue;
> +
> + attr = nla_nest_start(skb,
> + NFSD_A_SERVER_PROTO_VERSION);
> + if (!attr) {
> + err = -EINVAL;
> + goto err_nfsd_unlock;
> + }
> +
> + if (nla_put_u32(skb, NFSD_A_VERSION_MAJOR, i) ||
> + nla_put_u32(skb, NFSD_A_VERSION_MINOR, j)) {
> + err = -EINVAL;
> + goto err_nfsd_unlock;
> + }
> +
> + /* Set the enabled flag if the version is enabled */
> + if (nfsd_vers(nn, i, NFSD_TEST) &&
> + (i < 4 || nfsd_minorversion(nn, j, NFSD_TEST)) &&
> + nla_put_flag(skb, NFSD_A_VERSION_ENABLED)) {
> + err = -EINVAL;
> + goto err_nfsd_unlock;
> + }
> +
> + nla_nest_end(skb, attr);
> + }
> + }
> +
> + mutex_unlock(&nfsd_mutex);
> + genlmsg_end(skb, hdr);
> +
> + return genlmsg_reply(skb, info);
> +
> +err_nfsd_unlock:
> + mutex_unlock(&nfsd_mutex);
> +err_free_msg:
> + nlmsg_free(skb);
> +
> + return err;
> +}
> +
> /**
> * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> * @net: a freshly-created network namespace
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index ca193f7ff0e1..4fc91f50138a 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -133,8 +133,7 @@ struct svc_program nfsd_program = {
> .pg_rpcbind_set = nfsd_rpcbind_set,
> };
>
> -static bool
> -nfsd_support_version(int vers)
> +bool nfsd_support_version(int vers)
> {
> if (vers >= NFSD_MINVERS && vers < NFSD_NRVERS)
> return nfsd_version[vers] != NULL;
> diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
> index ccc78a5ee650..8a0a2b344923 100644
> --- a/include/uapi/linux/nfsd_netlink.h
> +++ b/include/uapi/linux/nfsd_netlink.h
> @@ -38,10 +38,28 @@ enum {
> NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
> };
>
> +enum {
> + NFSD_A_VERSION_MAJOR = 1,
> + NFSD_A_VERSION_MINOR,
> + NFSD_A_VERSION_ENABLED,
> +
> + __NFSD_A_VERSION_MAX,
> + NFSD_A_VERSION_MAX = (__NFSD_A_VERSION_MAX - 1)
> +};
> +
> +enum {
> + NFSD_A_SERVER_PROTO_VERSION = 1,
> +
> + __NFSD_A_SERVER_PROTO_MAX,
> + NFSD_A_SERVER_PROTO_MAX = (__NFSD_A_SERVER_PROTO_MAX - 1)
> +};
> +
> enum {
> NFSD_CMD_RPC_STATUS_GET = 1,
> NFSD_CMD_THREADS_SET,
> NFSD_CMD_THREADS_GET,
> + NFSD_CMD_VERSION_SET,
> + NFSD_CMD_VERSION_GET,
>
> __NFSD_CMD_MAX,
> NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> --
> 2.44.0
>
>


2024-04-16 10:26:59

by Jeffrey Layton

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] NFSD: add write_version to netlink command

On Tue, 2024-04-16 at 13:16 +1000, NeilBrown wrote:
> On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> > Introduce write_version netlink command through a "declarative" interface.
> > This patch introduces a change in behavior since for version-set userspace
> > is expected to provide a NFS major/minor version list it wants to enable
> > while all the other ones will be disabled. (procfs write_version
> > command implements imperative interface where the admin writes +3/-3 to
> > enable/disable a single version.
>
> It seems a little weird to me that the interface always disables all
> version, but then also allows individual versions to be disabled.
>
> Would it be reasonable to simply ignore the "enabled" flag when setting
> version, and just enable all versions listed??
>
> Or maybe only enable those with the flag, and don't disable those
> without the flag?
>
> Those don't necessarily seem much better - but the current behaviour
> still seems odd.
>

I think it makes sense.

We disable all versions, and enable any that have the "enabled" flag set
in the call from userland. Userland technically needn't send down the
versions that are disabled in the call, but the current userland program
does.

I worry about imperative interfaces that might only say -- "enable v4.1,
but disable v3" and leave the others in their current state. That
requires that both the kernel and userland keep state about what
versions are currently enabled and disabled, and it's possible to get
that wrong.

My thinking was that by using a declarative interface like this, we
eliminate ambiguity in how these interfaces are supposed to work. The
client sends down an entire version map in one fell swoop, and we know
exactly what the result should look like.

Note that you can enable or disable just a single version with the
userland tool, but this way all of that complexity stays in userland.

> Also for getting the version, the doc says:
>
> doc: get nfs enabled versions
>
> which I don't think it quite right. The code reports all supported
> versions, and indicates which of those are enabled. So maybe:
>
> doc: get enabled status for all supported versions
>
>
> I think that fact that it actually lists all supported versions is
> useful and worth making explicit.
>
>

Agreed. We should make that change before merging anything.

> >
> > Reviewed-by: Jeff Layton <[email protected]>
> > Tested-by: Jeff Layton <[email protected]>
> > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > ---
> > Documentation/netlink/specs/nfsd.yaml | 37 +++++++
> > fs/nfsd/netlink.c | 24 +++++
> > fs/nfsd/netlink.h | 5 +
> > fs/nfsd/netns.h | 1 +
> > fs/nfsd/nfsctl.c | 150 ++++++++++++++++++++++++++
> > fs/nfsd/nfssvc.c | 3 +-
> > include/uapi/linux/nfsd_netlink.h | 18 ++++
> > 7 files changed, 236 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> > index cbe6c5fd6c4d..0396e8b3ea1f 100644
> > --- a/Documentation/netlink/specs/nfsd.yaml
> > +++ b/Documentation/netlink/specs/nfsd.yaml
> > @@ -74,6 +74,26 @@ attribute-sets:
> > -
> > name: leasetime
> > type: u32
> > + -
> > + name: version
> > + attributes:
> > + -
> > + name: major
> > + type: u32
> > + -
> > + name: minor
> > + type: u32
> > + -
> > + name: enabled
> > + type: flag
> > + -
> > + name: server-proto
> > + attributes:
> > + -
> > + name: version
> > + type: nest
> > + nested-attributes: version
> > + multi-attr: true
> >
> > operations:
> > list:
> > @@ -120,3 +140,20 @@ operations:
> > - threads
> > - gracetime
> > - leasetime
> > + -
> > + name: version-set
> > + doc: set nfs enabled versions
> > + attribute-set: server-proto
> > + flags: [ admin-perm ]
> > + do:
> > + request:
> > + attributes:
> > + - version
> > + -
> > + name: version-get
> > + doc: get nfs enabled versions
> > + attribute-set: server-proto
> > + do:
> > + reply:
> > + attributes:
> > + - version
> > diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
> > index 20a646af0324..bf5df9597288 100644
> > --- a/fs/nfsd/netlink.c
> > +++ b/fs/nfsd/netlink.c
> > @@ -10,6 +10,13 @@
> >
> > #include <uapi/linux/nfsd_netlink.h>
> >
> > +/* Common nested types */
> > +const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1] = {
> > + [NFSD_A_VERSION_MAJOR] = { .type = NLA_U32, },
> > + [NFSD_A_VERSION_MINOR] = { .type = NLA_U32, },
> > + [NFSD_A_VERSION_ENABLED] = { .type = NLA_FLAG, },
> > +};
> > +
> > /* NFSD_CMD_THREADS_SET - do */
> > static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
> > [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
> > @@ -17,6 +24,11 @@ static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_L
> > [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
> > };
> >
> > +/* NFSD_CMD_VERSION_SET - do */
> > +static const struct nla_policy nfsd_version_set_nl_policy[NFSD_A_SERVER_PROTO_VERSION + 1] = {
> > + [NFSD_A_SERVER_PROTO_VERSION] = NLA_POLICY_NESTED(nfsd_version_nl_policy),
> > +};
> > +
> > /* Ops table for nfsd */
> > static const struct genl_split_ops nfsd_nl_ops[] = {
> > {
> > @@ -38,6 +50,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
> > .doit = nfsd_nl_threads_get_doit,
> > .flags = GENL_CMD_CAP_DO,
> > },
> > + {
> > + .cmd = NFSD_CMD_VERSION_SET,
> > + .doit = nfsd_nl_version_set_doit,
> > + .policy = nfsd_version_set_nl_policy,
> > + .maxattr = NFSD_A_SERVER_PROTO_VERSION,
> > + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> > + },
> > + {
> > + .cmd = NFSD_CMD_VERSION_GET,
> > + .doit = nfsd_nl_version_get_doit,
> > + .flags = GENL_CMD_CAP_DO,
> > + },
> > };
> >
> > struct genl_family nfsd_nl_family __ro_after_init = {
> > diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
> > index 4137fac477e4..c7c0da275481 100644
> > --- a/fs/nfsd/netlink.h
> > +++ b/fs/nfsd/netlink.h
> > @@ -11,6 +11,9 @@
> >
> > #include <uapi/linux/nfsd_netlink.h>
> >
> > +/* Common nested types */
> > +extern const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1];
> > +
> > int nfsd_nl_rpc_status_get_start(struct netlink_callback *cb);
> > int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
> >
> > @@ -18,6 +21,8 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > struct netlink_callback *cb);
> > int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
> > int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
> > +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info);
> > +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info);
> >
> > extern struct genl_family nfsd_nl_family;
> >
> > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > index d4be519b5734..14ec15656320 100644
> > --- a/fs/nfsd/netns.h
> > +++ b/fs/nfsd/netns.h
> > @@ -218,6 +218,7 @@ struct nfsd_net {
> > /* Simple check to find out if a given net was properly initialized */
> > #define nfsd_netns_ready(nn) ((nn)->sessionid_hashtbl)
> >
> > +extern bool nfsd_support_version(int vers);
> > extern void nfsd_netns_free_versions(struct nfsd_net *nn);
> >
> > extern unsigned int nfsd_net_id;
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index 38a5df03981b..2c8929ef79e9 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -1757,6 +1757,156 @@ int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
> > return err;
> > }
> >
> > +/**
> > + * nfsd_nl_version_set_doit - set the nfs enabled versions
> > + * @skb: reply buffer
> > + * @info: netlink metadata and command arguments
> > + *
> > + * Return 0 on success or a negative errno.
> > + */
> > +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info)
> > +{
> > + const struct nlattr *attr;
> > + struct nfsd_net *nn;
> > + int i, rem;
> > +
> > + if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_PROTO_VERSION))
> > + return -EINVAL;
> > +
> > + mutex_lock(&nfsd_mutex);
> > +
> > + nn = net_generic(genl_info_net(info), nfsd_net_id);
> > + if (nn->nfsd_serv) {
> > + mutex_unlock(&nfsd_mutex);
> > + return -EBUSY;
> > + }
> > +
> > + /* clear current supported versions. */
> > + nfsd_vers(nn, 2, NFSD_CLEAR);
> > + nfsd_vers(nn, 3, NFSD_CLEAR);
> > + for (i = 0; i <= NFSD_SUPPORTED_MINOR_VERSION; i++)
> > + nfsd_minorversion(nn, i, NFSD_CLEAR);
> > +
> > + nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
> > + struct nlattr *tb[NFSD_A_VERSION_MAX + 1];
> > + u32 major, minor = 0;
> > + bool enabled;
> > +
> > + if (nla_type(attr) != NFSD_A_SERVER_PROTO_VERSION)
> > + continue;
> > +
> > + if (nla_parse_nested(tb, NFSD_A_VERSION_MAX, attr,
> > + nfsd_version_nl_policy, info->extack) < 0)
> > + continue;
> > +
> > + if (!tb[NFSD_A_VERSION_MAJOR])
> > + continue;
> > +
> > + major = nla_get_u32(tb[NFSD_A_VERSION_MAJOR]);
> > + if (tb[NFSD_A_VERSION_MINOR])
> > + minor = nla_get_u32(tb[NFSD_A_VERSION_MINOR]);
> > +
> > + enabled = nla_get_flag(tb[NFSD_A_VERSION_ENABLED]);
> > +
> > + switch (major) {
> > + case 4:
> > + nfsd_minorversion(nn, minor, enabled ? NFSD_SET : NFSD_CLEAR);
> > + break;
> > + case 3:
> > + case 2:
> > + if (!minor)
> > + nfsd_vers(nn, major, enabled ? NFSD_SET : NFSD_CLEAR);
> > + break;
> > + default:
> > + break;
> > + }
> > + }
> > +
> > + mutex_unlock(&nfsd_mutex);
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * nfsd_nl_version_get_doit - get the nfs enabled versions
> > + * @skb: reply buffer
> > + * @info: netlink metadata and command arguments
> > + *
> > + * Return 0 on success or a negative errno.
> > + */
> > +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info)
> > +{
> > + struct nfsd_net *nn;
> > + int i, err;
> > + void *hdr;
> > +
> > + skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> > + if (!skb)
> > + return -ENOMEM;
> > +
> > + hdr = genlmsg_iput(skb, info);
> > + if (!hdr) {
> > + err = -EMSGSIZE;
> > + goto err_free_msg;
> > + }
> > +
> > + mutex_lock(&nfsd_mutex);
> > + nn = net_generic(genl_info_net(info), nfsd_net_id);
> > +
> > + for (i = 2; i <= 4; i++) {
> > + int j;
> > +
> > + for (j = 0; j <= NFSD_SUPPORTED_MINOR_VERSION; j++) {
> > + struct nlattr *attr;
> > +
> > + /* Don't record any versions the kernel doesn't have
> > + * compiled in
> > + */
> > + if (!nfsd_support_version(i))
> > + continue;
> > +
> > + /* NFSv{2,3} does not support minor numbers */
> > + if (i < 4 && j)
> > + continue;
> > +
> > + attr = nla_nest_start(skb,
> > + NFSD_A_SERVER_PROTO_VERSION);
> > + if (!attr) {
> > + err = -EINVAL;
> > + goto err_nfsd_unlock;
> > + }
> > +
> > + if (nla_put_u32(skb, NFSD_A_VERSION_MAJOR, i) ||
> > + nla_put_u32(skb, NFSD_A_VERSION_MINOR, j)) {
> > + err = -EINVAL;
> > + goto err_nfsd_unlock;
> > + }
> > +
> > + /* Set the enabled flag if the version is enabled */
> > + if (nfsd_vers(nn, i, NFSD_TEST) &&
> > + (i < 4 || nfsd_minorversion(nn, j, NFSD_TEST)) &&
> > + nla_put_flag(skb, NFSD_A_VERSION_ENABLED)) {
> > + err = -EINVAL;
> > + goto err_nfsd_unlock;
> > + }
> > +
> > + nla_nest_end(skb, attr);
> > + }
> > + }
> > +
> > + mutex_unlock(&nfsd_mutex);
> > + genlmsg_end(skb, hdr);
> > +
> > + return genlmsg_reply(skb, info);
> > +
> > +err_nfsd_unlock:
> > + mutex_unlock(&nfsd_mutex);
> > +err_free_msg:
> > + nlmsg_free(skb);
> > +
> > + return err;
> > +}
> > +
> > /**
> > * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> > * @net: a freshly-created network namespace
> > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> > index ca193f7ff0e1..4fc91f50138a 100644
> > --- a/fs/nfsd/nfssvc.c
> > +++ b/fs/nfsd/nfssvc.c
> > @@ -133,8 +133,7 @@ struct svc_program nfsd_program = {
> > .pg_rpcbind_set = nfsd_rpcbind_set,
> > };
> >
> > -static bool
> > -nfsd_support_version(int vers)
> > +bool nfsd_support_version(int vers)
> > {
> > if (vers >= NFSD_MINVERS && vers < NFSD_NRVERS)
> > return nfsd_version[vers] != NULL;
> > diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
> > index ccc78a5ee650..8a0a2b344923 100644
> > --- a/include/uapi/linux/nfsd_netlink.h
> > +++ b/include/uapi/linux/nfsd_netlink.h
> > @@ -38,10 +38,28 @@ enum {
> > NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
> > };
> >
> > +enum {
> > + NFSD_A_VERSION_MAJOR = 1,
> > + NFSD_A_VERSION_MINOR,
> > + NFSD_A_VERSION_ENABLED,
> > +
> > + __NFSD_A_VERSION_MAX,
> > + NFSD_A_VERSION_MAX = (__NFSD_A_VERSION_MAX - 1)
> > +};
> > +
> > +enum {
> > + NFSD_A_SERVER_PROTO_VERSION = 1,
> > +
> > + __NFSD_A_SERVER_PROTO_MAX,
> > + NFSD_A_SERVER_PROTO_MAX = (__NFSD_A_SERVER_PROTO_MAX - 1)
> > +};
> > +
> > enum {
> > NFSD_CMD_RPC_STATUS_GET = 1,
> > NFSD_CMD_THREADS_SET,
> > NFSD_CMD_THREADS_GET,
> > + NFSD_CMD_VERSION_SET,
> > + NFSD_CMD_VERSION_GET,
> >
> > __NFSD_CMD_MAX,
> > NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> > --
> > 2.44.0
> >
> >
>

--
Jeff Layton <[email protected]>

2024-04-16 19:06:09

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v8 0/6] convert write_threads, write_version and write_ports to netlink commands

On Mon, Apr 15, 2024 at 09:44:33PM +0200, Lorenzo Bianconi wrote:
> Introduce write_threads, write_version and write_ports netlink
> commands similar to the ones available through the procfs.
>
> Changes since v7:
> - add gracetime and leasetime to threads-{set,get} command
> - rely on nla_nest_start instead of nla_nest_start_noflag
> Changes since v6:
> - add the capability to pass sockaddr from userspace through listener-set
> command
> - rebase on top of nfsd-next
> Changes since v5:
> - for write_ports and write_version commands, userspace is expected to provide
> a NFS listeners/supported versions list it want to enable (all the other
> ports/versions will be disabled).
> - fix comments
> - rebase on top of nfsd-next
> Changes since v4:
> - rebase on top of nfsd-next tree
> Changes since v3:
> - drop write_maxconn and write_maxblksize for the moment
> - add write_version and write_ports commands
> Changes since v2:
> - use u32 to store nthreads in nfsd_nl_threads_set_doit
> - rename server-attr in control-plane in nfsd.yaml specs
> Changes since v1:
> - remove write_v4_end_grace command
> - add write_maxblksize and write_maxconn netlink commands
>
> This patch can be tested with user-space tool reported below:
> https://patchwork.kernel.org/project/linux-nfs/cover/[email protected]/
>
> Jeff Layton (2):
> nfsd: move nfsd_mutex handling into nfsd_svc callers
> SUNRPC: add a new svc_find_listener helper
>
> Lorenzo Bianconi (4):
> NFSD: convert write_threads to netlink command
> NFSD: add write_version to netlink command
> SUNRPC: introduce svc_xprt_create_from_sa utility routine
> NFSD: add listener-{set,get} netlink command
>
> Documentation/netlink/specs/nfsd.yaml | 104 ++++++
> fs/nfsd/netlink.c | 65 ++++
> fs/nfsd/netlink.h | 10 +
> fs/nfsd/netns.h | 1 +
> fs/nfsd/nfsctl.c | 476 ++++++++++++++++++++++++++
> fs/nfsd/nfssvc.c | 7 +-
> include/linux/sunrpc/svc_xprt.h | 5 +
> include/uapi/linux/nfsd_netlink.h | 46 +++
> net/sunrpc/svc_xprt.c | 167 +++++----
> 9 files changed, 819 insertions(+), 62 deletions(-)
>
> --
> 2.44.0
>

I'd like to take these for v6.10, just to get something we can start
working on together in the same body of source code. I've been
waiting for a degree of consensus between you three... Hopefully
this week or early next?


--
Chuck Lever

2024-04-16 21:48:37

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] NFSD: add write_version to netlink command

On Tue, 16 Apr 2024, Jeff Layton wrote:
> On Tue, 2024-04-16 at 13:16 +1000, NeilBrown wrote:
> > On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> > > Introduce write_version netlink command through a "declarative" interface.
> > > This patch introduces a change in behavior since for version-set userspace
> > > is expected to provide a NFS major/minor version list it wants to enable
> > > while all the other ones will be disabled. (procfs write_version
> > > command implements imperative interface where the admin writes +3/-3 to
> > > enable/disable a single version.
> >
> > It seems a little weird to me that the interface always disables all
> > version, but then also allows individual versions to be disabled.
> >
> > Would it be reasonable to simply ignore the "enabled" flag when setting
> > version, and just enable all versions listed??
> >
> > Or maybe only enable those with the flag, and don't disable those
> > without the flag?
> >
> > Those don't necessarily seem much better - but the current behaviour
> > still seems odd.
> >
>
> I think it makes sense.
>
> We disable all versions, and enable any that have the "enabled" flag set
> in the call from userland. Userland technically needn't send down the
> versions that are disabled in the call, but the current userland program
> does.
>
> I worry about imperative interfaces that might only say -- "enable v4.1,
> but disable v3" and leave the others in their current state. That
> requires that both the kernel and userland keep state about what
> versions are currently enabled and disabled, and it's possible to get
> that wrong.

I understand and support your aversion for imperative interfaces.
But this interface, as currently implemented, looks somewhat imperative.
The message sent to the kernel could enable some interfaces and then
disable them. I know that isn't the intent, but it is what the code
supports. Hence "weird".

We could add code to make that sort of thing impossible, but there isn't
much point. Better to make it syntactically impossible.

Realistically there will never be NFSv4.3 as there is no need - new
features can be added incrementally. So we could just pass an array of
5 active flags: 2,3,4.0,4.1,4.2. I suspect you wouldn't like that and
I'm not sure that I do either. A "read" would return the same array
with 3 possible states: unavailable, disabled, enabled. (Maybe the
array could be variable length so 5.0 could be added one day...).

I haven't managed to come up with anything *better* than the current
proposal and I don't want to stand in its way, but I wanted to highlight
that it looks weird - as much imperative as declarative - in case
someone else might be able to come up with a better alternative.

Thanks,
NeilBrown


>
> My thinking was that by using a declarative interface like this, we
> eliminate ambiguity in how these interfaces are supposed to work. The
> client sends down an entire version map in one fell swoop, and we know
> exactly what the result should look like.
>
> Note that you can enable or disable just a single version with the
> userland tool, but this way all of that complexity stays in userland.
>
> > Also for getting the version, the doc says:
> >
> > doc: get nfs enabled versions
> >
> > which I don't think it quite right. The code reports all supported
> > versions, and indicates which of those are enabled. So maybe:
> >
> > doc: get enabled status for all supported versions
> >
> >
> > I think that fact that it actually lists all supported versions is
> > useful and worth making explicit.
> >
> >
>
> Agreed. We should make that change before merging anything.
>
> > >
> > > Reviewed-by: Jeff Layton <[email protected]>
> > > Tested-by: Jeff Layton <[email protected]>
> > > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > > ---
> > > Documentation/netlink/specs/nfsd.yaml | 37 +++++++
> > > fs/nfsd/netlink.c | 24 +++++
> > > fs/nfsd/netlink.h | 5 +
> > > fs/nfsd/netns.h | 1 +
> > > fs/nfsd/nfsctl.c | 150 ++++++++++++++++++++++++++
> > > fs/nfsd/nfssvc.c | 3 +-
> > > include/uapi/linux/nfsd_netlink.h | 18 ++++
> > > 7 files changed, 236 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> > > index cbe6c5fd6c4d..0396e8b3ea1f 100644
> > > --- a/Documentation/netlink/specs/nfsd.yaml
> > > +++ b/Documentation/netlink/specs/nfsd.yaml
> > > @@ -74,6 +74,26 @@ attribute-sets:
> > > -
> > > name: leasetime
> > > type: u32
> > > + -
> > > + name: version
> > > + attributes:
> > > + -
> > > + name: major
> > > + type: u32
> > > + -
> > > + name: minor
> > > + type: u32
> > > + -
> > > + name: enabled
> > > + type: flag
> > > + -
> > > + name: server-proto
> > > + attributes:
> > > + -
> > > + name: version
> > > + type: nest
> > > + nested-attributes: version
> > > + multi-attr: true
> > >
> > > operations:
> > > list:
> > > @@ -120,3 +140,20 @@ operations:
> > > - threads
> > > - gracetime
> > > - leasetime
> > > + -
> > > + name: version-set
> > > + doc: set nfs enabled versions
> > > + attribute-set: server-proto
> > > + flags: [ admin-perm ]
> > > + do:
> > > + request:
> > > + attributes:
> > > + - version
> > > + -
> > > + name: version-get
> > > + doc: get nfs enabled versions
> > > + attribute-set: server-proto
> > > + do:
> > > + reply:
> > > + attributes:
> > > + - version
> > > diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
> > > index 20a646af0324..bf5df9597288 100644
> > > --- a/fs/nfsd/netlink.c
> > > +++ b/fs/nfsd/netlink.c
> > > @@ -10,6 +10,13 @@
> > >
> > > #include <uapi/linux/nfsd_netlink.h>
> > >
> > > +/* Common nested types */
> > > +const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1] = {
> > > + [NFSD_A_VERSION_MAJOR] = { .type = NLA_U32, },
> > > + [NFSD_A_VERSION_MINOR] = { .type = NLA_U32, },
> > > + [NFSD_A_VERSION_ENABLED] = { .type = NLA_FLAG, },
> > > +};
> > > +
> > > /* NFSD_CMD_THREADS_SET - do */
> > > static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
> > > [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
> > > @@ -17,6 +24,11 @@ static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_L
> > > [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
> > > };
> > >
> > > +/* NFSD_CMD_VERSION_SET - do */
> > > +static const struct nla_policy nfsd_version_set_nl_policy[NFSD_A_SERVER_PROTO_VERSION + 1] = {
> > > + [NFSD_A_SERVER_PROTO_VERSION] = NLA_POLICY_NESTED(nfsd_version_nl_policy),
> > > +};
> > > +
> > > /* Ops table for nfsd */
> > > static const struct genl_split_ops nfsd_nl_ops[] = {
> > > {
> > > @@ -38,6 +50,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
> > > .doit = nfsd_nl_threads_get_doit,
> > > .flags = GENL_CMD_CAP_DO,
> > > },
> > > + {
> > > + .cmd = NFSD_CMD_VERSION_SET,
> > > + .doit = nfsd_nl_version_set_doit,
> > > + .policy = nfsd_version_set_nl_policy,
> > > + .maxattr = NFSD_A_SERVER_PROTO_VERSION,
> > > + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> > > + },
> > > + {
> > > + .cmd = NFSD_CMD_VERSION_GET,
> > > + .doit = nfsd_nl_version_get_doit,
> > > + .flags = GENL_CMD_CAP_DO,
> > > + },
> > > };
> > >
> > > struct genl_family nfsd_nl_family __ro_after_init = {
> > > diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
> > > index 4137fac477e4..c7c0da275481 100644
> > > --- a/fs/nfsd/netlink.h
> > > +++ b/fs/nfsd/netlink.h
> > > @@ -11,6 +11,9 @@
> > >
> > > #include <uapi/linux/nfsd_netlink.h>
> > >
> > > +/* Common nested types */
> > > +extern const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1];
> > > +
> > > int nfsd_nl_rpc_status_get_start(struct netlink_callback *cb);
> > > int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
> > >
> > > @@ -18,6 +21,8 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > > struct netlink_callback *cb);
> > > int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
> > > int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
> > > +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info);
> > > +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info);
> > >
> > > extern struct genl_family nfsd_nl_family;
> > >
> > > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > > index d4be519b5734..14ec15656320 100644
> > > --- a/fs/nfsd/netns.h
> > > +++ b/fs/nfsd/netns.h
> > > @@ -218,6 +218,7 @@ struct nfsd_net {
> > > /* Simple check to find out if a given net was properly initialized */
> > > #define nfsd_netns_ready(nn) ((nn)->sessionid_hashtbl)
> > >
> > > +extern bool nfsd_support_version(int vers);
> > > extern void nfsd_netns_free_versions(struct nfsd_net *nn);
> > >
> > > extern unsigned int nfsd_net_id;
> > > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > > index 38a5df03981b..2c8929ef79e9 100644
> > > --- a/fs/nfsd/nfsctl.c
> > > +++ b/fs/nfsd/nfsctl.c
> > > @@ -1757,6 +1757,156 @@ int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
> > > return err;
> > > }
> > >
> > > +/**
> > > + * nfsd_nl_version_set_doit - set the nfs enabled versions
> > > + * @skb: reply buffer
> > > + * @info: netlink metadata and command arguments
> > > + *
> > > + * Return 0 on success or a negative errno.
> > > + */
> > > +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info)
> > > +{
> > > + const struct nlattr *attr;
> > > + struct nfsd_net *nn;
> > > + int i, rem;
> > > +
> > > + if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_PROTO_VERSION))
> > > + return -EINVAL;
> > > +
> > > + mutex_lock(&nfsd_mutex);
> > > +
> > > + nn = net_generic(genl_info_net(info), nfsd_net_id);
> > > + if (nn->nfsd_serv) {
> > > + mutex_unlock(&nfsd_mutex);
> > > + return -EBUSY;
> > > + }
> > > +
> > > + /* clear current supported versions. */
> > > + nfsd_vers(nn, 2, NFSD_CLEAR);
> > > + nfsd_vers(nn, 3, NFSD_CLEAR);
> > > + for (i = 0; i <= NFSD_SUPPORTED_MINOR_VERSION; i++)
> > > + nfsd_minorversion(nn, i, NFSD_CLEAR);
> > > +
> > > + nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
> > > + struct nlattr *tb[NFSD_A_VERSION_MAX + 1];
> > > + u32 major, minor = 0;
> > > + bool enabled;
> > > +
> > > + if (nla_type(attr) != NFSD_A_SERVER_PROTO_VERSION)
> > > + continue;
> > > +
> > > + if (nla_parse_nested(tb, NFSD_A_VERSION_MAX, attr,
> > > + nfsd_version_nl_policy, info->extack) < 0)
> > > + continue;
> > > +
> > > + if (!tb[NFSD_A_VERSION_MAJOR])
> > > + continue;
> > > +
> > > + major = nla_get_u32(tb[NFSD_A_VERSION_MAJOR]);
> > > + if (tb[NFSD_A_VERSION_MINOR])
> > > + minor = nla_get_u32(tb[NFSD_A_VERSION_MINOR]);
> > > +
> > > + enabled = nla_get_flag(tb[NFSD_A_VERSION_ENABLED]);
> > > +
> > > + switch (major) {
> > > + case 4:
> > > + nfsd_minorversion(nn, minor, enabled ? NFSD_SET : NFSD_CLEAR);
> > > + break;
> > > + case 3:
> > > + case 2:
> > > + if (!minor)
> > > + nfsd_vers(nn, major, enabled ? NFSD_SET : NFSD_CLEAR);
> > > + break;
> > > + default:
> > > + break;
> > > + }
> > > + }
> > > +
> > > + mutex_unlock(&nfsd_mutex);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +/**
> > > + * nfsd_nl_version_get_doit - get the nfs enabled versions
> > > + * @skb: reply buffer
> > > + * @info: netlink metadata and command arguments
> > > + *
> > > + * Return 0 on success or a negative errno.
> > > + */
> > > +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info)
> > > +{
> > > + struct nfsd_net *nn;
> > > + int i, err;
> > > + void *hdr;
> > > +
> > > + skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> > > + if (!skb)
> > > + return -ENOMEM;
> > > +
> > > + hdr = genlmsg_iput(skb, info);
> > > + if (!hdr) {
> > > + err = -EMSGSIZE;
> > > + goto err_free_msg;
> > > + }
> > > +
> > > + mutex_lock(&nfsd_mutex);
> > > + nn = net_generic(genl_info_net(info), nfsd_net_id);
> > > +
> > > + for (i = 2; i <= 4; i++) {
> > > + int j;
> > > +
> > > + for (j = 0; j <= NFSD_SUPPORTED_MINOR_VERSION; j++) {
> > > + struct nlattr *attr;
> > > +
> > > + /* Don't record any versions the kernel doesn't have
> > > + * compiled in
> > > + */
> > > + if (!nfsd_support_version(i))
> > > + continue;
> > > +
> > > + /* NFSv{2,3} does not support minor numbers */
> > > + if (i < 4 && j)
> > > + continue;
> > > +
> > > + attr = nla_nest_start(skb,
> > > + NFSD_A_SERVER_PROTO_VERSION);
> > > + if (!attr) {
> > > + err = -EINVAL;
> > > + goto err_nfsd_unlock;
> > > + }
> > > +
> > > + if (nla_put_u32(skb, NFSD_A_VERSION_MAJOR, i) ||
> > > + nla_put_u32(skb, NFSD_A_VERSION_MINOR, j)) {
> > > + err = -EINVAL;
> > > + goto err_nfsd_unlock;
> > > + }
> > > +
> > > + /* Set the enabled flag if the version is enabled */
> > > + if (nfsd_vers(nn, i, NFSD_TEST) &&
> > > + (i < 4 || nfsd_minorversion(nn, j, NFSD_TEST)) &&
> > > + nla_put_flag(skb, NFSD_A_VERSION_ENABLED)) {
> > > + err = -EINVAL;
> > > + goto err_nfsd_unlock;
> > > + }
> > > +
> > > + nla_nest_end(skb, attr);
> > > + }
> > > + }
> > > +
> > > + mutex_unlock(&nfsd_mutex);
> > > + genlmsg_end(skb, hdr);
> > > +
> > > + return genlmsg_reply(skb, info);
> > > +
> > > +err_nfsd_unlock:
> > > + mutex_unlock(&nfsd_mutex);
> > > +err_free_msg:
> > > + nlmsg_free(skb);
> > > +
> > > + return err;
> > > +}
> > > +
> > > /**
> > > * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> > > * @net: a freshly-created network namespace
> > > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> > > index ca193f7ff0e1..4fc91f50138a 100644
> > > --- a/fs/nfsd/nfssvc.c
> > > +++ b/fs/nfsd/nfssvc.c
> > > @@ -133,8 +133,7 @@ struct svc_program nfsd_program = {
> > > .pg_rpcbind_set = nfsd_rpcbind_set,
> > > };
> > >
> > > -static bool
> > > -nfsd_support_version(int vers)
> > > +bool nfsd_support_version(int vers)
> > > {
> > > if (vers >= NFSD_MINVERS && vers < NFSD_NRVERS)
> > > return nfsd_version[vers] != NULL;
> > > diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
> > > index ccc78a5ee650..8a0a2b344923 100644
> > > --- a/include/uapi/linux/nfsd_netlink.h
> > > +++ b/include/uapi/linux/nfsd_netlink.h
> > > @@ -38,10 +38,28 @@ enum {
> > > NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
> > > };
> > >
> > > +enum {
> > > + NFSD_A_VERSION_MAJOR = 1,
> > > + NFSD_A_VERSION_MINOR,
> > > + NFSD_A_VERSION_ENABLED,
> > > +
> > > + __NFSD_A_VERSION_MAX,
> > > + NFSD_A_VERSION_MAX = (__NFSD_A_VERSION_MAX - 1)
> > > +};
> > > +
> > > +enum {
> > > + NFSD_A_SERVER_PROTO_VERSION = 1,
> > > +
> > > + __NFSD_A_SERVER_PROTO_MAX,
> > > + NFSD_A_SERVER_PROTO_MAX = (__NFSD_A_SERVER_PROTO_MAX - 1)
> > > +};
> > > +
> > > enum {
> > > NFSD_CMD_RPC_STATUS_GET = 1,
> > > NFSD_CMD_THREADS_SET,
> > > NFSD_CMD_THREADS_GET,
> > > + NFSD_CMD_VERSION_SET,
> > > + NFSD_CMD_VERSION_GET,
> > >
> > > __NFSD_CMD_MAX,
> > > NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> > > --
> > > 2.44.0
> > >
> > >
> >
>
> --
> Jeff Layton <[email protected]>
>


2024-04-16 21:55:41

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v8 2/6] NFSD: convert write_threads to netlink command

On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> Introduce write_threads netlink command similar to the one available
> through the procfs.

I think this should support write_pool_threads too.
i.e. the number of threads should be an array. If it is a singleton,
the it does write_threads. If larger it does write_pool_threads.
I don't think we want to add a separate command later for pool_threads.

NeilBrown


>
> Tested-by: Jeff Layton <[email protected]>
> Reviewed-by: Jeff Layton <[email protected]>
> Co-developed-by: Jeff Layton <[email protected]>
> Signed-off-by: Jeff Layton <[email protected]>
> Signed-off-by: Lorenzo Bianconi <[email protected]>
> ---
> Documentation/netlink/specs/nfsd.yaml | 33 ++++++++
> fs/nfsd/netlink.c | 19 +++++
> fs/nfsd/netlink.h | 2 +
> fs/nfsd/nfsctl.c | 104 ++++++++++++++++++++++++++
> include/uapi/linux/nfsd_netlink.h | 11 +++
> 5 files changed, 169 insertions(+)
>
> diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> index 05acc73e2e33..cbe6c5fd6c4d 100644
> --- a/Documentation/netlink/specs/nfsd.yaml
> +++ b/Documentation/netlink/specs/nfsd.yaml
> @@ -62,6 +62,18 @@ attribute-sets:
> name: compound-ops
> type: u32
> multi-attr: true
> + -
> + name: server-worker
> + attributes:
> + -
> + name: threads
> + type: u32
> + -
> + name: gracetime
> + type: u32
> + -
> + name: leasetime
> + type: u32
>
> operations:
> list:
> @@ -87,3 +99,24 @@ operations:
> - sport
> - dport
> - compound-ops
> + -
> + name: threads-set
> + doc: set the number of running threads
> + attribute-set: server-worker
> + flags: [ admin-perm ]
> + do:
> + request:
> + attributes:
> + - threads
> + - gracetime
> + - leasetime
> + -
> + name: threads-get
> + doc: get the number of running threads
> + attribute-set: server-worker
> + do:
> + reply:
> + attributes:
> + - threads
> + - gracetime
> + - leasetime
> diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
> index 0e1d635ec5f9..20a646af0324 100644
> --- a/fs/nfsd/netlink.c
> +++ b/fs/nfsd/netlink.c
> @@ -10,6 +10,13 @@
>
> #include <uapi/linux/nfsd_netlink.h>
>
> +/* NFSD_CMD_THREADS_SET - do */
> +static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
> + [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
> + [NFSD_A_SERVER_WORKER_GRACETIME] = { .type = NLA_U32, },
> + [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
> +};
> +
> /* Ops table for nfsd */
> static const struct genl_split_ops nfsd_nl_ops[] = {
> {
> @@ -19,6 +26,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
> .done = nfsd_nl_rpc_status_get_done,
> .flags = GENL_CMD_CAP_DUMP,
> },
> + {
> + .cmd = NFSD_CMD_THREADS_SET,
> + .doit = nfsd_nl_threads_set_doit,
> + .policy = nfsd_threads_set_nl_policy,
> + .maxattr = NFSD_A_SERVER_WORKER_LEASETIME,
> + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> + },
> + {
> + .cmd = NFSD_CMD_THREADS_GET,
> + .doit = nfsd_nl_threads_get_doit,
> + .flags = GENL_CMD_CAP_DO,
> + },
> };
>
> struct genl_family nfsd_nl_family __ro_after_init = {
> diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
> index d83dd6bdee92..4137fac477e4 100644
> --- a/fs/nfsd/netlink.h
> +++ b/fs/nfsd/netlink.h
> @@ -16,6 +16,8 @@ int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
>
> int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> struct netlink_callback *cb);
> +int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
> +int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
>
> extern struct genl_family nfsd_nl_family;
>
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index f2e442d7fe16..38a5df03981b 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -1653,6 +1653,110 @@ int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb)
> return 0;
> }
>
> +/**
> + * nfsd_nl_threads_set_doit - set the number of running threads
> + * @skb: reply buffer
> + * @info: netlink metadata and command arguments
> + *
> + * Return 0 on success or a negative errno.
> + */
> +int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info)
> +{
> + struct net *net = genl_info_net(info);
> + struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> + int ret = -EBUSY;
> + u32 nthreads;
> +
> + if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_WORKER_THREADS))
> + return -EINVAL;
> +
> + nthreads = nla_get_u32(info->attrs[NFSD_A_SERVER_WORKER_THREADS]);
> +
> + mutex_lock(&nfsd_mutex);
> + if (info->attrs[NFSD_A_SERVER_WORKER_GRACETIME] ||
> + info->attrs[NFSD_A_SERVER_WORKER_LEASETIME]) {
> + const struct nlattr *attr;
> +
> + if (nn->nfsd_serv && nn->nfsd_serv->sv_nrthreads)
> + goto out_unlock;
> +
> + ret = -EINVAL;
> + attr = info->attrs[NFSD_A_SERVER_WORKER_GRACETIME];
> + if (attr) {
> + u32 gracetime = nla_get_u32(attr);
> +
> + if (gracetime < 10 || gracetime > 3600)
> + goto out_unlock;
> +
> + nn->nfsd4_grace = gracetime;
> + }
> +
> + attr = info->attrs[NFSD_A_SERVER_WORKER_LEASETIME];
> + if (attr) {
> + u32 leasetime = nla_get_u32(attr);
> +
> + if (leasetime < 10 || leasetime > 3600)
> + goto out_unlock;
> +
> + nn->nfsd4_lease = leasetime;
> + }
> + }
> +
> + ret = nfsd_svc(nthreads, net, get_current_cred());
> +out_unlock:
> + mutex_unlock(&nfsd_mutex);
> +
> + return ret == nthreads ? 0 : ret;
> +}
> +
> +/**
> + * nfsd_nl_threads_get_doit - get the number of running threads
> + * @skb: reply buffer
> + * @info: netlink metadata and command arguments
> + *
> + * Return 0 on success or a negative errno.
> + */
> +int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
> +{
> + struct net *net = genl_info_net(info);
> + struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> + void *hdr;
> + int err;
> +
> + skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> + if (!skb)
> + return -ENOMEM;
> +
> + hdr = genlmsg_iput(skb, info);
> + if (!hdr) {
> + err = -EMSGSIZE;
> + goto err_free_msg;
> + }
> +
> + mutex_lock(&nfsd_mutex);
> + err = nla_put_u32(skb, NFSD_A_SERVER_WORKER_GRACETIME,
> + nn->nfsd4_grace) ||
> + nla_put_u32(skb, NFSD_A_SERVER_WORKER_LEASETIME,
> + nn->nfsd4_lease) ||
> + nla_put_u32(skb, NFSD_A_SERVER_WORKER_THREADS,
> + nn->nfsd_serv ? nn->nfsd_serv->sv_nrthreads : 0);
> + mutex_unlock(&nfsd_mutex);
> +
> + if (err) {
> + err = -EINVAL;
> + goto err_free_msg;
> + }
> +
> + genlmsg_end(skb, hdr);
> +
> + return genlmsg_reply(skb, info);
> +
> +err_free_msg:
> + nlmsg_free(skb);
> +
> + return err;
> +}
> +
> /**
> * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> * @net: a freshly-created network namespace
> diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
> index 3cd044edee5d..ccc78a5ee650 100644
> --- a/include/uapi/linux/nfsd_netlink.h
> +++ b/include/uapi/linux/nfsd_netlink.h
> @@ -29,8 +29,19 @@ enum {
> NFSD_A_RPC_STATUS_MAX = (__NFSD_A_RPC_STATUS_MAX - 1)
> };
>
> +enum {
> + NFSD_A_SERVER_WORKER_THREADS = 1,
> + NFSD_A_SERVER_WORKER_GRACETIME,
> + NFSD_A_SERVER_WORKER_LEASETIME,
> +
> + __NFSD_A_SERVER_WORKER_MAX,
> + NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
> +};
> +
> enum {
> NFSD_CMD_RPC_STATUS_GET = 1,
> + NFSD_CMD_THREADS_SET,
> + NFSD_CMD_THREADS_GET,
>
> __NFSD_CMD_MAX,
> NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> --
> 2.44.0
>
>


2024-04-16 22:28:50

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v8 2/6] NFSD: convert write_threads to netlink command

On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> Introduce write_threads netlink command similar to the one available
> through the procfs.
>
> Tested-by: Jeff Layton <[email protected]>
> Reviewed-by: Jeff Layton <[email protected]>
> Co-developed-by: Jeff Layton <[email protected]>
> Signed-off-by: Jeff Layton <[email protected]>
> Signed-off-by: Lorenzo Bianconi <[email protected]>
> ---
> Documentation/netlink/specs/nfsd.yaml | 33 ++++++++
> fs/nfsd/netlink.c | 19 +++++
> fs/nfsd/netlink.h | 2 +
> fs/nfsd/nfsctl.c | 104 ++++++++++++++++++++++++++
> include/uapi/linux/nfsd_netlink.h | 11 +++
> 5 files changed, 169 insertions(+)
>
> diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> index 05acc73e2e33..cbe6c5fd6c4d 100644
> --- a/Documentation/netlink/specs/nfsd.yaml
> +++ b/Documentation/netlink/specs/nfsd.yaml
> @@ -62,6 +62,18 @@ attribute-sets:
> name: compound-ops
> type: u32
> multi-attr: true
> + -
> + name: server-worker
> + attributes:
> + -
> + name: threads
> + type: u32
> + -
> + name: gracetime
> + type: u32
> + -
> + name: leasetime
> + type: u32

Another thought: I would be really happy if the "scope" were another
optional argument here. The mechanism of setting the scope by user the
hostname works but is ugly. I'm inclined to ignore the hostname
completely when netlink is used, but I'm not completely sure about that.
(aside - I think using the hostname for the default scope was a really
bad idea. It should have been a fixed string like "Linux").

NeilBrown



>
> operations:
> list:
> @@ -87,3 +99,24 @@ operations:
> - sport
> - dport
> - compound-ops
> + -
> + name: threads-set
> + doc: set the number of running threads
> + attribute-set: server-worker
> + flags: [ admin-perm ]
> + do:
> + request:
> + attributes:
> + - threads
> + - gracetime
> + - leasetime
> + -
> + name: threads-get
> + doc: get the number of running threads
> + attribute-set: server-worker
> + do:
> + reply:
> + attributes:
> + - threads
> + - gracetime
> + - leasetime
> diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
> index 0e1d635ec5f9..20a646af0324 100644
> --- a/fs/nfsd/netlink.c
> +++ b/fs/nfsd/netlink.c
> @@ -10,6 +10,13 @@
>
> #include <uapi/linux/nfsd_netlink.h>
>
> +/* NFSD_CMD_THREADS_SET - do */
> +static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
> + [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
> + [NFSD_A_SERVER_WORKER_GRACETIME] = { .type = NLA_U32, },
> + [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
> +};
> +
> /* Ops table for nfsd */
> static const struct genl_split_ops nfsd_nl_ops[] = {
> {
> @@ -19,6 +26,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
> .done = nfsd_nl_rpc_status_get_done,
> .flags = GENL_CMD_CAP_DUMP,
> },
> + {
> + .cmd = NFSD_CMD_THREADS_SET,
> + .doit = nfsd_nl_threads_set_doit,
> + .policy = nfsd_threads_set_nl_policy,
> + .maxattr = NFSD_A_SERVER_WORKER_LEASETIME,
> + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> + },
> + {
> + .cmd = NFSD_CMD_THREADS_GET,
> + .doit = nfsd_nl_threads_get_doit,
> + .flags = GENL_CMD_CAP_DO,
> + },
> };
>
> struct genl_family nfsd_nl_family __ro_after_init = {
> diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
> index d83dd6bdee92..4137fac477e4 100644
> --- a/fs/nfsd/netlink.h
> +++ b/fs/nfsd/netlink.h
> @@ -16,6 +16,8 @@ int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
>
> int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> struct netlink_callback *cb);
> +int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
> +int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
>
> extern struct genl_family nfsd_nl_family;
>
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index f2e442d7fe16..38a5df03981b 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -1653,6 +1653,110 @@ int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb)
> return 0;
> }
>
> +/**
> + * nfsd_nl_threads_set_doit - set the number of running threads
> + * @skb: reply buffer
> + * @info: netlink metadata and command arguments
> + *
> + * Return 0 on success or a negative errno.
> + */
> +int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info)
> +{
> + struct net *net = genl_info_net(info);
> + struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> + int ret = -EBUSY;
> + u32 nthreads;
> +
> + if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_WORKER_THREADS))
> + return -EINVAL;
> +
> + nthreads = nla_get_u32(info->attrs[NFSD_A_SERVER_WORKER_THREADS]);
> +
> + mutex_lock(&nfsd_mutex);
> + if (info->attrs[NFSD_A_SERVER_WORKER_GRACETIME] ||
> + info->attrs[NFSD_A_SERVER_WORKER_LEASETIME]) {
> + const struct nlattr *attr;
> +
> + if (nn->nfsd_serv && nn->nfsd_serv->sv_nrthreads)
> + goto out_unlock;
> +
> + ret = -EINVAL;
> + attr = info->attrs[NFSD_A_SERVER_WORKER_GRACETIME];
> + if (attr) {
> + u32 gracetime = nla_get_u32(attr);
> +
> + if (gracetime < 10 || gracetime > 3600)
> + goto out_unlock;
> +
> + nn->nfsd4_grace = gracetime;
> + }
> +
> + attr = info->attrs[NFSD_A_SERVER_WORKER_LEASETIME];
> + if (attr) {
> + u32 leasetime = nla_get_u32(attr);
> +
> + if (leasetime < 10 || leasetime > 3600)
> + goto out_unlock;
> +
> + nn->nfsd4_lease = leasetime;
> + }
> + }
> +
> + ret = nfsd_svc(nthreads, net, get_current_cred());
> +out_unlock:
> + mutex_unlock(&nfsd_mutex);
> +
> + return ret == nthreads ? 0 : ret;
> +}
> +
> +/**
> + * nfsd_nl_threads_get_doit - get the number of running threads
> + * @skb: reply buffer
> + * @info: netlink metadata and command arguments
> + *
> + * Return 0 on success or a negative errno.
> + */
> +int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
> +{
> + struct net *net = genl_info_net(info);
> + struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> + void *hdr;
> + int err;
> +
> + skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> + if (!skb)
> + return -ENOMEM;
> +
> + hdr = genlmsg_iput(skb, info);
> + if (!hdr) {
> + err = -EMSGSIZE;
> + goto err_free_msg;
> + }
> +
> + mutex_lock(&nfsd_mutex);
> + err = nla_put_u32(skb, NFSD_A_SERVER_WORKER_GRACETIME,
> + nn->nfsd4_grace) ||
> + nla_put_u32(skb, NFSD_A_SERVER_WORKER_LEASETIME,
> + nn->nfsd4_lease) ||
> + nla_put_u32(skb, NFSD_A_SERVER_WORKER_THREADS,
> + nn->nfsd_serv ? nn->nfsd_serv->sv_nrthreads : 0);
> + mutex_unlock(&nfsd_mutex);
> +
> + if (err) {
> + err = -EINVAL;
> + goto err_free_msg;
> + }
> +
> + genlmsg_end(skb, hdr);
> +
> + return genlmsg_reply(skb, info);
> +
> +err_free_msg:
> + nlmsg_free(skb);
> +
> + return err;
> +}
> +
> /**
> * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> * @net: a freshly-created network namespace
> diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
> index 3cd044edee5d..ccc78a5ee650 100644
> --- a/include/uapi/linux/nfsd_netlink.h
> +++ b/include/uapi/linux/nfsd_netlink.h
> @@ -29,8 +29,19 @@ enum {
> NFSD_A_RPC_STATUS_MAX = (__NFSD_A_RPC_STATUS_MAX - 1)
> };
>
> +enum {
> + NFSD_A_SERVER_WORKER_THREADS = 1,
> + NFSD_A_SERVER_WORKER_GRACETIME,
> + NFSD_A_SERVER_WORKER_LEASETIME,
> +
> + __NFSD_A_SERVER_WORKER_MAX,
> + NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
> +};
> +
> enum {
> NFSD_CMD_RPC_STATUS_GET = 1,
> + NFSD_CMD_THREADS_SET,
> + NFSD_CMD_THREADS_GET,
>
> __NFSD_CMD_MAX,
> NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> --
> 2.44.0
>
>


2024-04-16 22:33:29

by Jeffrey Layton

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] NFSD: add write_version to netlink command

On Wed, 2024-04-17 at 07:48 +1000, NeilBrown wrote:
> On Tue, 16 Apr 2024, Jeff Layton wrote:
> > On Tue, 2024-04-16 at 13:16 +1000, NeilBrown wrote:
> > > On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> > > > Introduce write_version netlink command through a "declarative" interface.
> > > > This patch introduces a change in behavior since for version-set userspace
> > > > is expected to provide a NFS major/minor version list it wants to enable
> > > > while all the other ones will be disabled. (procfs write_version
> > > > command implements imperative interface where the admin writes +3/-3 to
> > > > enable/disable a single version.
> > >
> > > It seems a little weird to me that the interface always disables all
> > > version, but then also allows individual versions to be disabled.
> > >
> > > Would it be reasonable to simply ignore the "enabled" flag when setting
> > > version, and just enable all versions listed??
> > >
> > > Or maybe only enable those with the flag, and don't disable those
> > > without the flag?
> > >
> > > Those don't necessarily seem much better - but the current behaviour
> > > still seems odd.
> > >
> >
> > I think it makes sense.
> >
> > We disable all versions, and enable any that have the "enabled" flag set
> > in the call from userland. Userland technically needn't send down the
> > versions that are disabled in the call, but the current userland program
> > does.
> >
> > I worry about imperative interfaces that might only say -- "enable v4.1,
> > but disable v3" and leave the others in their current state. That
> > requires that both the kernel and userland keep state about what
> > versions are currently enabled and disabled, and it's possible to get
> > that wrong.
>
> I understand and support your aversion for imperative interfaces.
> But this interface, as currently implemented, looks somewhat imperative.
> The message sent to the kernel could enable some interfaces and then
> disable them. I know that isn't the intent, but it is what the code
> supports. Hence "weird".
>
> We could add code to make that sort of thing impossible, but there isn't
> much point. Better to make it syntactically impossible.
>
> Realistically there will never be NFSv4.3 as there is no need - new
> features can be added incrementally.?
>

There is no need _so_far_. Who knows what the future will bring? Maybe
we'll need v4.3 in order to add some needed feature?

> So we could just pass an array of
> 5 active flags: 2,3,4.0,4.1,4.2. I suspect you wouldn't like that and
> I'm not sure that I do either. A "read" would return the same array
> with 3 possible states: unavailable, disabled, enabled. (Maybe the
> array could be variable length so 5.0 could be added one day...).
>

A set of flags is basically what this interface is. They just happen to
also be labeled with the major and minorversion. I think that's a good
thing.

>
> I haven't managed to come up with anything *better* than the current
> proposal and I don't want to stand in its way, but I wanted to highlight
> that it looks weird - as much imperative as declarative - in case
> someone else might be able to come up with a better alternative.
>

The intention was to create a symmetrical interface. We have 2
operations: a "get" that asks "what versions are currently supported and
are they enabled?", and a "set" that says "here is the new state of
version enablement".

The userland tool always sends down a complete set of versions. The
kernel is (currently) more forgiving, and treats omitted versions as
being disabled. The kernel could require that every supported version be
represented in the "set" operation, if that's more desirable.

> >
> > My thinking was that by using a declarative interface like this, we
> > eliminate ambiguity in how these interfaces are supposed to work. The
> > client sends down an entire version map in one fell swoop, and we know
> > exactly what the result should look like.
> >
> > Note that you can enable or disable just a single version with the
> > userland tool, but this way all of that complexity stays in userland.
> >
> > > Also for getting the version, the doc says:
> > >
> > > doc: get nfs enabled versions
> > >
> > > which I don't think it quite right. The code reports all supported
> > > versions, and indicates which of those are enabled. So maybe:
> > >
> > > doc: get enabled status for all supported versions
> > >
> > >
> > > I think that fact that it actually lists all supported versions is
> > > useful and worth making explicit.
> > >
> > >
> >
> > Agreed. We should make that change before merging anything.
> >
> > > >
> > > > Reviewed-by: Jeff Layton <[email protected]>
> > > > Tested-by: Jeff Layton <[email protected]>
> > > > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > > > ---
> > > > Documentation/netlink/specs/nfsd.yaml | 37 +++++++
> > > > fs/nfsd/netlink.c | 24 +++++
> > > > fs/nfsd/netlink.h | 5 +
> > > > fs/nfsd/netns.h | 1 +
> > > > fs/nfsd/nfsctl.c | 150 ++++++++++++++++++++++++++
> > > > fs/nfsd/nfssvc.c | 3 +-
> > > > include/uapi/linux/nfsd_netlink.h | 18 ++++
> > > > 7 files changed, 236 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> > > > index cbe6c5fd6c4d..0396e8b3ea1f 100644
> > > > --- a/Documentation/netlink/specs/nfsd.yaml
> > > > +++ b/Documentation/netlink/specs/nfsd.yaml
> > > > @@ -74,6 +74,26 @@ attribute-sets:
> > > > -
> > > > name: leasetime
> > > > type: u32
> > > > + -
> > > > + name: version
> > > > + attributes:
> > > > + -
> > > > + name: major
> > > > + type: u32
> > > > + -
> > > > + name: minor
> > > > + type: u32
> > > > + -
> > > > + name: enabled
> > > > + type: flag
> > > > + -
> > > > + name: server-proto
> > > > + attributes:
> > > > + -
> > > > + name: version
> > > > + type: nest
> > > > + nested-attributes: version
> > > > + multi-attr: true
> > > >
> > > > operations:
> > > > list:
> > > > @@ -120,3 +140,20 @@ operations:
> > > > - threads
> > > > - gracetime
> > > > - leasetime
> > > > + -
> > > > + name: version-set
> > > > + doc: set nfs enabled versions
> > > > + attribute-set: server-proto
> > > > + flags: [ admin-perm ]
> > > > + do:
> > > > + request:
> > > > + attributes:
> > > > + - version
> > > > + -
> > > > + name: version-get
> > > > + doc: get nfs enabled versions
> > > > + attribute-set: server-proto
> > > > + do:
> > > > + reply:
> > > > + attributes:
> > > > + - version
> > > > diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
> > > > index 20a646af0324..bf5df9597288 100644
> > > > --- a/fs/nfsd/netlink.c
> > > > +++ b/fs/nfsd/netlink.c
> > > > @@ -10,6 +10,13 @@
> > > >
> > > > #include <uapi/linux/nfsd_netlink.h>
> > > >
> > > > +/* Common nested types */
> > > > +const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1] = {
> > > > + [NFSD_A_VERSION_MAJOR] = { .type = NLA_U32, },
> > > > + [NFSD_A_VERSION_MINOR] = { .type = NLA_U32, },
> > > > + [NFSD_A_VERSION_ENABLED] = { .type = NLA_FLAG, },
> > > > +};
> > > > +
> > > > /* NFSD_CMD_THREADS_SET - do */
> > > > static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
> > > > [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
> > > > @@ -17,6 +24,11 @@ static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_L
> > > > [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
> > > > };
> > > >
> > > > +/* NFSD_CMD_VERSION_SET - do */
> > > > +static const struct nla_policy nfsd_version_set_nl_policy[NFSD_A_SERVER_PROTO_VERSION + 1] = {
> > > > + [NFSD_A_SERVER_PROTO_VERSION] = NLA_POLICY_NESTED(nfsd_version_nl_policy),
> > > > +};
> > > > +
> > > > /* Ops table for nfsd */
> > > > static const struct genl_split_ops nfsd_nl_ops[] = {
> > > > {
> > > > @@ -38,6 +50,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
> > > > .doit = nfsd_nl_threads_get_doit,
> > > > .flags = GENL_CMD_CAP_DO,
> > > > },
> > > > + {
> > > > + .cmd = NFSD_CMD_VERSION_SET,
> > > > + .doit = nfsd_nl_version_set_doit,
> > > > + .policy = nfsd_version_set_nl_policy,
> > > > + .maxattr = NFSD_A_SERVER_PROTO_VERSION,
> > > > + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> > > > + },
> > > > + {
> > > > + .cmd = NFSD_CMD_VERSION_GET,
> > > > + .doit = nfsd_nl_version_get_doit,
> > > > + .flags = GENL_CMD_CAP_DO,
> > > > + },
> > > > };
> > > >
> > > > struct genl_family nfsd_nl_family __ro_after_init = {
> > > > diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
> > > > index 4137fac477e4..c7c0da275481 100644
> > > > --- a/fs/nfsd/netlink.h
> > > > +++ b/fs/nfsd/netlink.h
> > > > @@ -11,6 +11,9 @@
> > > >
> > > > #include <uapi/linux/nfsd_netlink.h>
> > > >
> > > > +/* Common nested types */
> > > > +extern const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1];
> > > > +
> > > > int nfsd_nl_rpc_status_get_start(struct netlink_callback *cb);
> > > > int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
> > > >
> > > > @@ -18,6 +21,8 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > > > struct netlink_callback *cb);
> > > > int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
> > > > int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
> > > > +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info);
> > > > +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info);
> > > >
> > > > extern struct genl_family nfsd_nl_family;
> > > >
> > > > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > > > index d4be519b5734..14ec15656320 100644
> > > > --- a/fs/nfsd/netns.h
> > > > +++ b/fs/nfsd/netns.h
> > > > @@ -218,6 +218,7 @@ struct nfsd_net {
> > > > /* Simple check to find out if a given net was properly initialized */
> > > > #define nfsd_netns_ready(nn) ((nn)->sessionid_hashtbl)
> > > >
> > > > +extern bool nfsd_support_version(int vers);
> > > > extern void nfsd_netns_free_versions(struct nfsd_net *nn);
> > > >
> > > > extern unsigned int nfsd_net_id;
> > > > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > > > index 38a5df03981b..2c8929ef79e9 100644
> > > > --- a/fs/nfsd/nfsctl.c
> > > > +++ b/fs/nfsd/nfsctl.c
> > > > @@ -1757,6 +1757,156 @@ int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
> > > > return err;
> > > > }
> > > >
> > > > +/**
> > > > + * nfsd_nl_version_set_doit - set the nfs enabled versions
> > > > + * @skb: reply buffer
> > > > + * @info: netlink metadata and command arguments
> > > > + *
> > > > + * Return 0 on success or a negative errno.
> > > > + */
> > > > +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info)
> > > > +{
> > > > + const struct nlattr *attr;
> > > > + struct nfsd_net *nn;
> > > > + int i, rem;
> > > > +
> > > > + if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_PROTO_VERSION))
> > > > + return -EINVAL;
> > > > +
> > > > + mutex_lock(&nfsd_mutex);
> > > > +
> > > > + nn = net_generic(genl_info_net(info), nfsd_net_id);
> > > > + if (nn->nfsd_serv) {
> > > > + mutex_unlock(&nfsd_mutex);
> > > > + return -EBUSY;
> > > > + }
> > > > +
> > > > + /* clear current supported versions. */
> > > > + nfsd_vers(nn, 2, NFSD_CLEAR);
> > > > + nfsd_vers(nn, 3, NFSD_CLEAR);
> > > > + for (i = 0; i <= NFSD_SUPPORTED_MINOR_VERSION; i++)
> > > > + nfsd_minorversion(nn, i, NFSD_CLEAR);
> > > > +
> > > > + nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
> > > > + struct nlattr *tb[NFSD_A_VERSION_MAX + 1];
> > > > + u32 major, minor = 0;
> > > > + bool enabled;
> > > > +
> > > > + if (nla_type(attr) != NFSD_A_SERVER_PROTO_VERSION)
> > > > + continue;
> > > > +
> > > > + if (nla_parse_nested(tb, NFSD_A_VERSION_MAX, attr,
> > > > + nfsd_version_nl_policy, info->extack) < 0)
> > > > + continue;
> > > > +
> > > > + if (!tb[NFSD_A_VERSION_MAJOR])
> > > > + continue;
> > > > +
> > > > + major = nla_get_u32(tb[NFSD_A_VERSION_MAJOR]);
> > > > + if (tb[NFSD_A_VERSION_MINOR])
> > > > + minor = nla_get_u32(tb[NFSD_A_VERSION_MINOR]);
> > > > +
> > > > + enabled = nla_get_flag(tb[NFSD_A_VERSION_ENABLED]);
> > > > +
> > > > + switch (major) {
> > > > + case 4:
> > > > + nfsd_minorversion(nn, minor, enabled ? NFSD_SET : NFSD_CLEAR);
> > > > + break;
> > > > + case 3:
> > > > + case 2:
> > > > + if (!minor)
> > > > + nfsd_vers(nn, major, enabled ? NFSD_SET : NFSD_CLEAR);
> > > > + break;
> > > > + default:
> > > > + break;
> > > > + }
> > > > + }
> > > > +
> > > > + mutex_unlock(&nfsd_mutex);
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +/**
> > > > + * nfsd_nl_version_get_doit - get the nfs enabled versions
> > > > + * @skb: reply buffer
> > > > + * @info: netlink metadata and command arguments
> > > > + *
> > > > + * Return 0 on success or a negative errno.
> > > > + */
> > > > +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info)
> > > > +{
> > > > + struct nfsd_net *nn;
> > > > + int i, err;
> > > > + void *hdr;
> > > > +
> > > > + skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> > > > + if (!skb)
> > > > + return -ENOMEM;
> > > > +
> > > > + hdr = genlmsg_iput(skb, info);
> > > > + if (!hdr) {
> > > > + err = -EMSGSIZE;
> > > > + goto err_free_msg;
> > > > + }
> > > > +
> > > > + mutex_lock(&nfsd_mutex);
> > > > + nn = net_generic(genl_info_net(info), nfsd_net_id);
> > > > +
> > > > + for (i = 2; i <= 4; i++) {
> > > > + int j;
> > > > +
> > > > + for (j = 0; j <= NFSD_SUPPORTED_MINOR_VERSION; j++) {
> > > > + struct nlattr *attr;
> > > > +
> > > > + /* Don't record any versions the kernel doesn't have
> > > > + * compiled in
> > > > + */
> > > > + if (!nfsd_support_version(i))
> > > > + continue;
> > > > +
> > > > + /* NFSv{2,3} does not support minor numbers */
> > > > + if (i < 4 && j)
> > > > + continue;
> > > > +
> > > > + attr = nla_nest_start(skb,
> > > > + NFSD_A_SERVER_PROTO_VERSION);
> > > > + if (!attr) {
> > > > + err = -EINVAL;
> > > > + goto err_nfsd_unlock;
> > > > + }
> > > > +
> > > > + if (nla_put_u32(skb, NFSD_A_VERSION_MAJOR, i) ||
> > > > + nla_put_u32(skb, NFSD_A_VERSION_MINOR, j)) {
> > > > + err = -EINVAL;
> > > > + goto err_nfsd_unlock;
> > > > + }
> > > > +
> > > > + /* Set the enabled flag if the version is enabled */
> > > > + if (nfsd_vers(nn, i, NFSD_TEST) &&
> > > > + (i < 4 || nfsd_minorversion(nn, j, NFSD_TEST)) &&
> > > > + nla_put_flag(skb, NFSD_A_VERSION_ENABLED)) {
> > > > + err = -EINVAL;
> > > > + goto err_nfsd_unlock;
> > > > + }
> > > > +
> > > > + nla_nest_end(skb, attr);
> > > > + }
> > > > + }
> > > > +
> > > > + mutex_unlock(&nfsd_mutex);
> > > > + genlmsg_end(skb, hdr);
> > > > +
> > > > + return genlmsg_reply(skb, info);
> > > > +
> > > > +err_nfsd_unlock:
> > > > + mutex_unlock(&nfsd_mutex);
> > > > +err_free_msg:
> > > > + nlmsg_free(skb);
> > > > +
> > > > + return err;
> > > > +}
> > > > +
> > > > /**
> > > > * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> > > > * @net: a freshly-created network namespace
> > > > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> > > > index ca193f7ff0e1..4fc91f50138a 100644
> > > > --- a/fs/nfsd/nfssvc.c
> > > > +++ b/fs/nfsd/nfssvc.c
> > > > @@ -133,8 +133,7 @@ struct svc_program nfsd_program = {
> > > > .pg_rpcbind_set = nfsd_rpcbind_set,
> > > > };
> > > >
> > > > -static bool
> > > > -nfsd_support_version(int vers)
> > > > +bool nfsd_support_version(int vers)
> > > > {
> > > > if (vers >= NFSD_MINVERS && vers < NFSD_NRVERS)
> > > > return nfsd_version[vers] != NULL;
> > > > diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
> > > > index ccc78a5ee650..8a0a2b344923 100644
> > > > --- a/include/uapi/linux/nfsd_netlink.h
> > > > +++ b/include/uapi/linux/nfsd_netlink.h
> > > > @@ -38,10 +38,28 @@ enum {
> > > > NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
> > > > };
> > > >
> > > > +enum {
> > > > + NFSD_A_VERSION_MAJOR = 1,
> > > > + NFSD_A_VERSION_MINOR,
> > > > + NFSD_A_VERSION_ENABLED,
> > > > +
> > > > + __NFSD_A_VERSION_MAX,
> > > > + NFSD_A_VERSION_MAX = (__NFSD_A_VERSION_MAX - 1)
> > > > +};
> > > > +
> > > > +enum {
> > > > + NFSD_A_SERVER_PROTO_VERSION = 1,
> > > > +
> > > > + __NFSD_A_SERVER_PROTO_MAX,
> > > > + NFSD_A_SERVER_PROTO_MAX = (__NFSD_A_SERVER_PROTO_MAX - 1)
> > > > +};
> > > > +
> > > > enum {
> > > > NFSD_CMD_RPC_STATUS_GET = 1,
> > > > NFSD_CMD_THREADS_SET,
> > > > NFSD_CMD_THREADS_GET,
> > > > + NFSD_CMD_VERSION_SET,
> > > > + NFSD_CMD_VERSION_GET,
> > > >
> > > > __NFSD_CMD_MAX,
> > > > NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> > > > --
> > > > 2.44.0
> > > >
> > > >
> > >
> >
> > --
> > Jeff Layton <[email protected]>
> >
>

--
Jeff Layton <[email protected]>

2024-04-16 22:39:36

by Jeffrey Layton

[permalink] [raw]
Subject: Re: [PATCH v8 2/6] NFSD: convert write_threads to netlink command

On Wed, 2024-04-17 at 07:55 +1000, NeilBrown wrote:
> On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> > Introduce write_threads netlink command similar to the one available
> > through the procfs.
>
> I think this should support write_pool_threads too.
> i.e. the number of threads should be an array. If it is a singleton,
> the it does write_threads. If larger it does write_pool_threads.
> I don't think we want to add a separate command later for pool_threads.
>
> NeilBrown
>

That is a very good point. We'll take a closer look at the current
pool_threads interface and see how we can extend this one to cover that
use case. Making it an array sounds reasonable at first blush.

>
> >
> > Tested-by: Jeff Layton <[email protected]>
> > Reviewed-by: Jeff Layton <[email protected]>
> > Co-developed-by: Jeff Layton <[email protected]>
> > Signed-off-by: Jeff Layton <[email protected]>
> > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > ---
> > Documentation/netlink/specs/nfsd.yaml | 33 ++++++++
> > fs/nfsd/netlink.c | 19 +++++
> > fs/nfsd/netlink.h | 2 +
> > fs/nfsd/nfsctl.c | 104 ++++++++++++++++++++++++++
> > include/uapi/linux/nfsd_netlink.h | 11 +++
> > 5 files changed, 169 insertions(+)
> >
> > diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> > index 05acc73e2e33..cbe6c5fd6c4d 100644
> > --- a/Documentation/netlink/specs/nfsd.yaml
> > +++ b/Documentation/netlink/specs/nfsd.yaml
> > @@ -62,6 +62,18 @@ attribute-sets:
> > name: compound-ops
> > type: u32
> > multi-attr: true
> > + -
> > + name: server-worker
> > + attributes:
> > + -
> > + name: threads
> > + type: u32
> > + -
> > + name: gracetime
> > + type: u32
> > + -
> > + name: leasetime
> > + type: u32
> >
> > operations:
> > list:
> > @@ -87,3 +99,24 @@ operations:
> > - sport
> > - dport
> > - compound-ops
> > + -
> > + name: threads-set
> > + doc: set the number of running threads
> > + attribute-set: server-worker
> > + flags: [ admin-perm ]
> > + do:
> > + request:
> > + attributes:
> > + - threads
> > + - gracetime
> > + - leasetime
> > + -
> > + name: threads-get
> > + doc: get the number of running threads
> > + attribute-set: server-worker
> > + do:
> > + reply:
> > + attributes:
> > + - threads
> > + - gracetime
> > + - leasetime
> > diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
> > index 0e1d635ec5f9..20a646af0324 100644
> > --- a/fs/nfsd/netlink.c
> > +++ b/fs/nfsd/netlink.c
> > @@ -10,6 +10,13 @@
> >
> > #include <uapi/linux/nfsd_netlink.h>
> >
> > +/* NFSD_CMD_THREADS_SET - do */
> > +static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
> > + [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
> > + [NFSD_A_SERVER_WORKER_GRACETIME] = { .type = NLA_U32, },
> > + [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
> > +};
> > +
> > /* Ops table for nfsd */
> > static const struct genl_split_ops nfsd_nl_ops[] = {
> > {
> > @@ -19,6 +26,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
> > .done = nfsd_nl_rpc_status_get_done,
> > .flags = GENL_CMD_CAP_DUMP,
> > },
> > + {
> > + .cmd = NFSD_CMD_THREADS_SET,
> > + .doit = nfsd_nl_threads_set_doit,
> > + .policy = nfsd_threads_set_nl_policy,
> > + .maxattr = NFSD_A_SERVER_WORKER_LEASETIME,
> > + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> > + },
> > + {
> > + .cmd = NFSD_CMD_THREADS_GET,
> > + .doit = nfsd_nl_threads_get_doit,
> > + .flags = GENL_CMD_CAP_DO,
> > + },
> > };
> >
> > struct genl_family nfsd_nl_family __ro_after_init = {
> > diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
> > index d83dd6bdee92..4137fac477e4 100644
> > --- a/fs/nfsd/netlink.h
> > +++ b/fs/nfsd/netlink.h
> > @@ -16,6 +16,8 @@ int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
> >
> > int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > struct netlink_callback *cb);
> > +int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
> > +int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
> >
> > extern struct genl_family nfsd_nl_family;
> >
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index f2e442d7fe16..38a5df03981b 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -1653,6 +1653,110 @@ int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb)
> > return 0;
> > }
> >
> > +/**
> > + * nfsd_nl_threads_set_doit - set the number of running threads
> > + * @skb: reply buffer
> > + * @info: netlink metadata and command arguments
> > + *
> > + * Return 0 on success or a negative errno.
> > + */
> > +int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info)
> > +{
> > + struct net *net = genl_info_net(info);
> > + struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > + int ret = -EBUSY;
> > + u32 nthreads;
> > +
> > + if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_WORKER_THREADS))
> > + return -EINVAL;
> > +
> > + nthreads = nla_get_u32(info->attrs[NFSD_A_SERVER_WORKER_THREADS]);
> > +
> > + mutex_lock(&nfsd_mutex);
> > + if (info->attrs[NFSD_A_SERVER_WORKER_GRACETIME] ||
> > + info->attrs[NFSD_A_SERVER_WORKER_LEASETIME]) {
> > + const struct nlattr *attr;
> > +
> > + if (nn->nfsd_serv && nn->nfsd_serv->sv_nrthreads)
> > + goto out_unlock;
> > +
> > + ret = -EINVAL;
> > + attr = info->attrs[NFSD_A_SERVER_WORKER_GRACETIME];
> > + if (attr) {
> > + u32 gracetime = nla_get_u32(attr);
> > +
> > + if (gracetime < 10 || gracetime > 3600)
> > + goto out_unlock;
> > +
> > + nn->nfsd4_grace = gracetime;
> > + }
> > +
> > + attr = info->attrs[NFSD_A_SERVER_WORKER_LEASETIME];
> > + if (attr) {
> > + u32 leasetime = nla_get_u32(attr);
> > +
> > + if (leasetime < 10 || leasetime > 3600)
> > + goto out_unlock;
> > +
> > + nn->nfsd4_lease = leasetime;
> > + }
> > + }
> > +
> > + ret = nfsd_svc(nthreads, net, get_current_cred());
> > +out_unlock:
> > + mutex_unlock(&nfsd_mutex);
> > +
> > + return ret == nthreads ? 0 : ret;
> > +}
> > +
> > +/**
> > + * nfsd_nl_threads_get_doit - get the number of running threads
> > + * @skb: reply buffer
> > + * @info: netlink metadata and command arguments
> > + *
> > + * Return 0 on success or a negative errno.
> > + */
> > +int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
> > +{
> > + struct net *net = genl_info_net(info);
> > + struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > + void *hdr;
> > + int err;
> > +
> > + skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> > + if (!skb)
> > + return -ENOMEM;
> > +
> > + hdr = genlmsg_iput(skb, info);
> > + if (!hdr) {
> > + err = -EMSGSIZE;
> > + goto err_free_msg;
> > + }
> > +
> > + mutex_lock(&nfsd_mutex);
> > + err = nla_put_u32(skb, NFSD_A_SERVER_WORKER_GRACETIME,
> > + nn->nfsd4_grace) ||
> > + nla_put_u32(skb, NFSD_A_SERVER_WORKER_LEASETIME,
> > + nn->nfsd4_lease) ||
> > + nla_put_u32(skb, NFSD_A_SERVER_WORKER_THREADS,
> > + nn->nfsd_serv ? nn->nfsd_serv->sv_nrthreads : 0);
> > + mutex_unlock(&nfsd_mutex);
> > +
> > + if (err) {
> > + err = -EINVAL;
> > + goto err_free_msg;
> > + }
> > +
> > + genlmsg_end(skb, hdr);
> > +
> > + return genlmsg_reply(skb, info);
> > +
> > +err_free_msg:
> > + nlmsg_free(skb);
> > +
> > + return err;
> > +}
> > +
> > /**
> > * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> > * @net: a freshly-created network namespace
> > diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
> > index 3cd044edee5d..ccc78a5ee650 100644
> > --- a/include/uapi/linux/nfsd_netlink.h
> > +++ b/include/uapi/linux/nfsd_netlink.h
> > @@ -29,8 +29,19 @@ enum {
> > NFSD_A_RPC_STATUS_MAX = (__NFSD_A_RPC_STATUS_MAX - 1)
> > };
> >
> > +enum {
> > + NFSD_A_SERVER_WORKER_THREADS = 1,
> > + NFSD_A_SERVER_WORKER_GRACETIME,
> > + NFSD_A_SERVER_WORKER_LEASETIME,
> > +
> > + __NFSD_A_SERVER_WORKER_MAX,
> > + NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
> > +};
> > +
> > enum {
> > NFSD_CMD_RPC_STATUS_GET = 1,
> > + NFSD_CMD_THREADS_SET,
> > + NFSD_CMD_THREADS_GET,
> >
> > __NFSD_CMD_MAX,
> > NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> > --
> > 2.44.0
> >
> >
>

--
Jeff Layton <[email protected]>

2024-04-16 22:50:24

by Jeffrey Layton

[permalink] [raw]
Subject: Re: [PATCH v8 2/6] NFSD: convert write_threads to netlink command

On Wed, 2024-04-17 at 08:28 +1000, NeilBrown wrote:
> On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> > Introduce write_threads netlink command similar to the one available
> > through the procfs.
> >
> > Tested-by: Jeff Layton <[email protected]>
> > Reviewed-by: Jeff Layton <[email protected]>
> > Co-developed-by: Jeff Layton <[email protected]>
> > Signed-off-by: Jeff Layton <[email protected]>
> > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > ---
> > Documentation/netlink/specs/nfsd.yaml | 33 ++++++++
> > fs/nfsd/netlink.c | 19 +++++
> > fs/nfsd/netlink.h | 2 +
> > fs/nfsd/nfsctl.c | 104 ++++++++++++++++++++++++++
> > include/uapi/linux/nfsd_netlink.h | 11 +++
> > 5 files changed, 169 insertions(+)
> >
> > diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> > index 05acc73e2e33..cbe6c5fd6c4d 100644
> > --- a/Documentation/netlink/specs/nfsd.yaml
> > +++ b/Documentation/netlink/specs/nfsd.yaml
> > @@ -62,6 +62,18 @@ attribute-sets:
> > name: compound-ops
> > type: u32
> > multi-attr: true
> > + -
> > + name: server-worker
> > + attributes:
> > + -
> > + name: threads
> > + type: u32
> > + -
> > + name: gracetime
> > + type: u32
> > + -
> > + name: leasetime
> > + type: u32
>
> Another thought: I would be really happy if the "scope" were another
> optional argument here. The mechanism of setting the scope by user the
> hostname works but is ugly. I'm inclined to ignore the hostname
> completely when netlink is used, but I'm not completely sure about that.
> (aside - I think using the hostname for the default scope was a really
> bad idea. It should have been a fixed string like "Linux").
>

I'd be ok with that.

I replicated how rpc.nfsd does this in the userland tool, but I also
thought it was pretty ugly. I think all that it does currently is make
this work in nfsd_svc() since it overrides the nodename in the new
namespace:

strscpy(nn->nfsd_name, utsname()->nodename,
sizeof(nn->nfsd_name));

If we make that a new optional parameter, we could pass the scope in as
a new const char * argument to nfsd_svc. Then we'd just copy that into
nfsd_name instead of the nodename when it's provided.

I wonder too if we ought to rename this particular operations too. It's
becoming less of a "set threads" interface and more of a generic server
setup.

>
>
> >
> > operations:
> > list:
> > @@ -87,3 +99,24 @@ operations:
> > - sport
> > - dport
> > - compound-ops
> > + -
> > + name: threads-set
> > + doc: set the number of running threads
> > + attribute-set: server-worker
> > + flags: [ admin-perm ]
> > + do:
> > + request:
> > + attributes:
> > + - threads
> > + - gracetime
> > + - leasetime
> > + -
> > + name: threads-get
> > + doc: get the number of running threads
> > + attribute-set: server-worker
> > + do:
> > + reply:
> > + attributes:
> > + - threads
> > + - gracetime
> > + - leasetime
> > diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
> > index 0e1d635ec5f9..20a646af0324 100644
> > --- a/fs/nfsd/netlink.c
> > +++ b/fs/nfsd/netlink.c
> > @@ -10,6 +10,13 @@
> >
> > #include <uapi/linux/nfsd_netlink.h>
> >
> > +/* NFSD_CMD_THREADS_SET - do */
> > +static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
> > + [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
> > + [NFSD_A_SERVER_WORKER_GRACETIME] = { .type = NLA_U32, },
> > + [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
> > +};
> > +
> > /* Ops table for nfsd */
> > static const struct genl_split_ops nfsd_nl_ops[] = {
> > {
> > @@ -19,6 +26,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
> > .done = nfsd_nl_rpc_status_get_done,
> > .flags = GENL_CMD_CAP_DUMP,
> > },
> > + {
> > + .cmd = NFSD_CMD_THREADS_SET,
> > + .doit = nfsd_nl_threads_set_doit,
> > + .policy = nfsd_threads_set_nl_policy,
> > + .maxattr = NFSD_A_SERVER_WORKER_LEASETIME,
> > + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> > + },
> > + {
> > + .cmd = NFSD_CMD_THREADS_GET,
> > + .doit = nfsd_nl_threads_get_doit,
> > + .flags = GENL_CMD_CAP_DO,
> > + },
> > };
> >
> > struct genl_family nfsd_nl_family __ro_after_init = {
> > diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
> > index d83dd6bdee92..4137fac477e4 100644
> > --- a/fs/nfsd/netlink.h
> > +++ b/fs/nfsd/netlink.h
> > @@ -16,6 +16,8 @@ int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
> >
> > int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > struct netlink_callback *cb);
> > +int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
> > +int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
> >
> > extern struct genl_family nfsd_nl_family;
> >
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index f2e442d7fe16..38a5df03981b 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -1653,6 +1653,110 @@ int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb)
> > return 0;
> > }
> >
> > +/**
> > + * nfsd_nl_threads_set_doit - set the number of running threads
> > + * @skb: reply buffer
> > + * @info: netlink metadata and command arguments
> > + *
> > + * Return 0 on success or a negative errno.
> > + */
> > +int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info)
> > +{
> > + struct net *net = genl_info_net(info);
> > + struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > + int ret = -EBUSY;
> > + u32 nthreads;
> > +
> > + if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_WORKER_THREADS))
> > + return -EINVAL;
> > +
> > + nthreads = nla_get_u32(info->attrs[NFSD_A_SERVER_WORKER_THREADS]);
> > +
> > + mutex_lock(&nfsd_mutex);
> > + if (info->attrs[NFSD_A_SERVER_WORKER_GRACETIME] ||
> > + info->attrs[NFSD_A_SERVER_WORKER_LEASETIME]) {
> > + const struct nlattr *attr;
> > +
> > + if (nn->nfsd_serv && nn->nfsd_serv->sv_nrthreads)
> > + goto out_unlock;
> > +
> > + ret = -EINVAL;
> > + attr = info->attrs[NFSD_A_SERVER_WORKER_GRACETIME];
> > + if (attr) {
> > + u32 gracetime = nla_get_u32(attr);
> > +
> > + if (gracetime < 10 || gracetime > 3600)
> > + goto out_unlock;
> > +
> > + nn->nfsd4_grace = gracetime;
> > + }
> > +
> > + attr = info->attrs[NFSD_A_SERVER_WORKER_LEASETIME];
> > + if (attr) {
> > + u32 leasetime = nla_get_u32(attr);
> > +
> > + if (leasetime < 10 || leasetime > 3600)
> > + goto out_unlock;
> > +
> > + nn->nfsd4_lease = leasetime;
> > + }
> > + }
> > +
> > + ret = nfsd_svc(nthreads, net, get_current_cred());
> > +out_unlock:
> > + mutex_unlock(&nfsd_mutex);
> > +
> > + return ret == nthreads ? 0 : ret;
> > +}
> > +
> > +/**
> > + * nfsd_nl_threads_get_doit - get the number of running threads
> > + * @skb: reply buffer
> > + * @info: netlink metadata and command arguments
> > + *
> > + * Return 0 on success or a negative errno.
> > + */
> > +int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
> > +{
> > + struct net *net = genl_info_net(info);
> > + struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > + void *hdr;
> > + int err;
> > +
> > + skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> > + if (!skb)
> > + return -ENOMEM;
> > +
> > + hdr = genlmsg_iput(skb, info);
> > + if (!hdr) {
> > + err = -EMSGSIZE;
> > + goto err_free_msg;
> > + }
> > +
> > + mutex_lock(&nfsd_mutex);
> > + err = nla_put_u32(skb, NFSD_A_SERVER_WORKER_GRACETIME,
> > + nn->nfsd4_grace) ||
> > + nla_put_u32(skb, NFSD_A_SERVER_WORKER_LEASETIME,
> > + nn->nfsd4_lease) ||
> > + nla_put_u32(skb, NFSD_A_SERVER_WORKER_THREADS,
> > + nn->nfsd_serv ? nn->nfsd_serv->sv_nrthreads : 0);
> > + mutex_unlock(&nfsd_mutex);
> > +
> > + if (err) {
> > + err = -EINVAL;
> > + goto err_free_msg;
> > + }
> > +
> > + genlmsg_end(skb, hdr);
> > +
> > + return genlmsg_reply(skb, info);
> > +
> > +err_free_msg:
> > + nlmsg_free(skb);
> > +
> > + return err;
> > +}
> > +
> > /**
> > * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> > * @net: a freshly-created network namespace
> > diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
> > index 3cd044edee5d..ccc78a5ee650 100644
> > --- a/include/uapi/linux/nfsd_netlink.h
> > +++ b/include/uapi/linux/nfsd_netlink.h
> > @@ -29,8 +29,19 @@ enum {
> > NFSD_A_RPC_STATUS_MAX = (__NFSD_A_RPC_STATUS_MAX - 1)
> > };
> >
> > +enum {
> > + NFSD_A_SERVER_WORKER_THREADS = 1,
> > + NFSD_A_SERVER_WORKER_GRACETIME,
> > + NFSD_A_SERVER_WORKER_LEASETIME,
> > +
> > + __NFSD_A_SERVER_WORKER_MAX,
> > + NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
> > +};
> > +
> > enum {
> > NFSD_CMD_RPC_STATUS_GET = 1,
> > + NFSD_CMD_THREADS_SET,
> > + NFSD_CMD_THREADS_GET,
> >
> > __NFSD_CMD_MAX,
> > NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> > --
> > 2.44.0
> >
> >
>

--
Jeff Layton <[email protected]>

2024-04-16 23:05:26

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] NFSD: add write_version to netlink command

On Wed, 17 Apr 2024, Jeff Layton wrote:
> On Wed, 2024-04-17 at 07:48 +1000, NeilBrown wrote:
> > On Tue, 16 Apr 2024, Jeff Layton wrote:
> > > On Tue, 2024-04-16 at 13:16 +1000, NeilBrown wrote:
> > > > On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> > > > > Introduce write_version netlink command through a "declarative" interface.
> > > > > This patch introduces a change in behavior since for version-set userspace
> > > > > is expected to provide a NFS major/minor version list it wants to enable
> > > > > while all the other ones will be disabled. (procfs write_version
> > > > > command implements imperative interface where the admin writes +3/-3 to
> > > > > enable/disable a single version.
> > > >
> > > > It seems a little weird to me that the interface always disables all
> > > > version, but then also allows individual versions to be disabled.
> > > >
> > > > Would it be reasonable to simply ignore the "enabled" flag when setting
> > > > version, and just enable all versions listed??
> > > >
> > > > Or maybe only enable those with the flag, and don't disable those
> > > > without the flag?
> > > >
> > > > Those don't necessarily seem much better - but the current behaviour
> > > > still seems odd.
> > > >
> > >
> > > I think it makes sense.
> > >
> > > We disable all versions, and enable any that have the "enabled" flag set
> > > in the call from userland. Userland technically needn't send down the
> > > versions that are disabled in the call, but the current userland program
> > > does.
> > >
> > > I worry about imperative interfaces that might only say -- "enable v4.1,
> > > but disable v3" and leave the others in their current state. That
> > > requires that both the kernel and userland keep state about what
> > > versions are currently enabled and disabled, and it's possible to get
> > > that wrong.
> >
> > I understand and support your aversion for imperative interfaces.
> > But this interface, as currently implemented, looks somewhat imperative.
> > The message sent to the kernel could enable some interfaces and then
> > disable them. I know that isn't the intent, but it is what the code
> > supports. Hence "weird".
> >
> > We could add code to make that sort of thing impossible, but there isn't
> > much point. Better to make it syntactically impossible.
> >
> > Realistically there will never be NFSv4.3 as there is no need - new
> > features can be added incrementally. 
> >
>
> There is no need _so_far_. Who knows what the future will bring? Maybe
> we'll need v4.3 in order to add some needed feature?
>
> > So we could just pass an array of
> > 5 active flags: 2,3,4.0,4.1,4.2. I suspect you wouldn't like that and
> > I'm not sure that I do either. A "read" would return the same array
> > with 3 possible states: unavailable, disabled, enabled. (Maybe the
> > array could be variable length so 5.0 could be added one day...).
> >
>
> A set of flags is basically what this interface is. They just happen to
> also be labeled with the major and minorversion. I think that's a good
> thing.

Good for whom? Labelling allows for labelling inconsistencies.

Maybe the kernel should be able to provide an ordered list of labels,
and separately an array of which labels are enabled/disabled.
Userspace could send down a new set of enabled/disabled flags based on
the agreed set of labels.

Here is a question that is a bit of a diversion, but might help us
understand the context more fully:

Why would anyone disable v4.2 separately from v4.1 ??

I understand that v2, v3, v4.0, v4.1 are effectively different protocols
and you might want to ensure that only the approved/tested protocol is
used. But v4.2 is just a few enhancements on v4.1. Why would you want
to disable it?

The answer I can think of that there might be bugs (perish the
thought!!) in some of those features so you might want to avoid using
them.
But in that case, it is really the features that you want to suppress,
not the protocol version.

Maybe I might want to disable delegation - to just write delegation.
Can I do that? What if I just want to disable server-side copy, but
keep fallocate and umask support?

i.e. is a list of versions really the granularity that we want to use
for this interface?

NeilBrown

>
> >
> > I haven't managed to come up with anything *better* than the current
> > proposal and I don't want to stand in its way, but I wanted to highlight
> > that it looks weird - as much imperative as declarative - in case
> > someone else might be able to come up with a better alternative.
> >
>
> The intention was to create a symmetrical interface. We have 2
> operations: a "get" that asks "what versions are currently supported and
> are they enabled?", and a "set" that says "here is the new state of
> version enablement".
>
> The userland tool always sends down a complete set of versions. The
> kernel is (currently) more forgiving, and treats omitted versions as
> being disabled. The kernel could require that every supported version be
> represented in the "set" operation, if that's more desirable.
>
> > >
> > > My thinking was that by using a declarative interface like this, we
> > > eliminate ambiguity in how these interfaces are supposed to work. The
> > > client sends down an entire version map in one fell swoop, and we know
> > > exactly what the result should look like.
> > >
> > > Note that you can enable or disable just a single version with the
> > > userland tool, but this way all of that complexity stays in userland.
> > >
> > > > Also for getting the version, the doc says:
> > > >
> > > > doc: get nfs enabled versions
> > > >
> > > > which I don't think it quite right. The code reports all supported
> > > > versions, and indicates which of those are enabled. So maybe:
> > > >
> > > > doc: get enabled status for all supported versions
> > > >
> > > >
> > > > I think that fact that it actually lists all supported versions is
> > > > useful and worth making explicit.
> > > >
> > > >
> > >
> > > Agreed. We should make that change before merging anything.
> > >
> > > > >
> > > > > Reviewed-by: Jeff Layton <[email protected]>
> > > > > Tested-by: Jeff Layton <[email protected]>
> > > > > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > > > > ---
> > > > > Documentation/netlink/specs/nfsd.yaml | 37 +++++++
> > > > > fs/nfsd/netlink.c | 24 +++++
> > > > > fs/nfsd/netlink.h | 5 +
> > > > > fs/nfsd/netns.h | 1 +
> > > > > fs/nfsd/nfsctl.c | 150 ++++++++++++++++++++++++++
> > > > > fs/nfsd/nfssvc.c | 3 +-
> > > > > include/uapi/linux/nfsd_netlink.h | 18 ++++
> > > > > 7 files changed, 236 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> > > > > index cbe6c5fd6c4d..0396e8b3ea1f 100644
> > > > > --- a/Documentation/netlink/specs/nfsd.yaml
> > > > > +++ b/Documentation/netlink/specs/nfsd.yaml
> > > > > @@ -74,6 +74,26 @@ attribute-sets:
> > > > > -
> > > > > name: leasetime
> > > > > type: u32
> > > > > + -
> > > > > + name: version
> > > > > + attributes:
> > > > > + -
> > > > > + name: major
> > > > > + type: u32
> > > > > + -
> > > > > + name: minor
> > > > > + type: u32
> > > > > + -
> > > > > + name: enabled
> > > > > + type: flag
> > > > > + -
> > > > > + name: server-proto
> > > > > + attributes:
> > > > > + -
> > > > > + name: version
> > > > > + type: nest
> > > > > + nested-attributes: version
> > > > > + multi-attr: true
> > > > >
> > > > > operations:
> > > > > list:
> > > > > @@ -120,3 +140,20 @@ operations:
> > > > > - threads
> > > > > - gracetime
> > > > > - leasetime
> > > > > + -
> > > > > + name: version-set
> > > > > + doc: set nfs enabled versions
> > > > > + attribute-set: server-proto
> > > > > + flags: [ admin-perm ]
> > > > > + do:
> > > > > + request:
> > > > > + attributes:
> > > > > + - version
> > > > > + -
> > > > > + name: version-get
> > > > > + doc: get nfs enabled versions
> > > > > + attribute-set: server-proto
> > > > > + do:
> > > > > + reply:
> > > > > + attributes:
> > > > > + - version
> > > > > diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
> > > > > index 20a646af0324..bf5df9597288 100644
> > > > > --- a/fs/nfsd/netlink.c
> > > > > +++ b/fs/nfsd/netlink.c
> > > > > @@ -10,6 +10,13 @@
> > > > >
> > > > > #include <uapi/linux/nfsd_netlink.h>
> > > > >
> > > > > +/* Common nested types */
> > > > > +const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1] = {
> > > > > + [NFSD_A_VERSION_MAJOR] = { .type = NLA_U32, },
> > > > > + [NFSD_A_VERSION_MINOR] = { .type = NLA_U32, },
> > > > > + [NFSD_A_VERSION_ENABLED] = { .type = NLA_FLAG, },
> > > > > +};
> > > > > +
> > > > > /* NFSD_CMD_THREADS_SET - do */
> > > > > static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
> > > > > [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
> > > > > @@ -17,6 +24,11 @@ static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_L
> > > > > [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
> > > > > };
> > > > >
> > > > > +/* NFSD_CMD_VERSION_SET - do */
> > > > > +static const struct nla_policy nfsd_version_set_nl_policy[NFSD_A_SERVER_PROTO_VERSION + 1] = {
> > > > > + [NFSD_A_SERVER_PROTO_VERSION] = NLA_POLICY_NESTED(nfsd_version_nl_policy),
> > > > > +};
> > > > > +
> > > > > /* Ops table for nfsd */
> > > > > static const struct genl_split_ops nfsd_nl_ops[] = {
> > > > > {
> > > > > @@ -38,6 +50,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
> > > > > .doit = nfsd_nl_threads_get_doit,
> > > > > .flags = GENL_CMD_CAP_DO,
> > > > > },
> > > > > + {
> > > > > + .cmd = NFSD_CMD_VERSION_SET,
> > > > > + .doit = nfsd_nl_version_set_doit,
> > > > > + .policy = nfsd_version_set_nl_policy,
> > > > > + .maxattr = NFSD_A_SERVER_PROTO_VERSION,
> > > > > + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> > > > > + },
> > > > > + {
> > > > > + .cmd = NFSD_CMD_VERSION_GET,
> > > > > + .doit = nfsd_nl_version_get_doit,
> > > > > + .flags = GENL_CMD_CAP_DO,
> > > > > + },
> > > > > };
> > > > >
> > > > > struct genl_family nfsd_nl_family __ro_after_init = {
> > > > > diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
> > > > > index 4137fac477e4..c7c0da275481 100644
> > > > > --- a/fs/nfsd/netlink.h
> > > > > +++ b/fs/nfsd/netlink.h
> > > > > @@ -11,6 +11,9 @@
> > > > >
> > > > > #include <uapi/linux/nfsd_netlink.h>
> > > > >
> > > > > +/* Common nested types */
> > > > > +extern const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1];
> > > > > +
> > > > > int nfsd_nl_rpc_status_get_start(struct netlink_callback *cb);
> > > > > int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
> > > > >
> > > > > @@ -18,6 +21,8 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > > > > struct netlink_callback *cb);
> > > > > int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
> > > > > int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
> > > > > +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info);
> > > > > +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info);
> > > > >
> > > > > extern struct genl_family nfsd_nl_family;
> > > > >
> > > > > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > > > > index d4be519b5734..14ec15656320 100644
> > > > > --- a/fs/nfsd/netns.h
> > > > > +++ b/fs/nfsd/netns.h
> > > > > @@ -218,6 +218,7 @@ struct nfsd_net {
> > > > > /* Simple check to find out if a given net was properly initialized */
> > > > > #define nfsd_netns_ready(nn) ((nn)->sessionid_hashtbl)
> > > > >
> > > > > +extern bool nfsd_support_version(int vers);
> > > > > extern void nfsd_netns_free_versions(struct nfsd_net *nn);
> > > > >
> > > > > extern unsigned int nfsd_net_id;
> > > > > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > > > > index 38a5df03981b..2c8929ef79e9 100644
> > > > > --- a/fs/nfsd/nfsctl.c
> > > > > +++ b/fs/nfsd/nfsctl.c
> > > > > @@ -1757,6 +1757,156 @@ int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
> > > > > return err;
> > > > > }
> > > > >
> > > > > +/**
> > > > > + * nfsd_nl_version_set_doit - set the nfs enabled versions
> > > > > + * @skb: reply buffer
> > > > > + * @info: netlink metadata and command arguments
> > > > > + *
> > > > > + * Return 0 on success or a negative errno.
> > > > > + */
> > > > > +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info)
> > > > > +{
> > > > > + const struct nlattr *attr;
> > > > > + struct nfsd_net *nn;
> > > > > + int i, rem;
> > > > > +
> > > > > + if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_PROTO_VERSION))
> > > > > + return -EINVAL;
> > > > > +
> > > > > + mutex_lock(&nfsd_mutex);
> > > > > +
> > > > > + nn = net_generic(genl_info_net(info), nfsd_net_id);
> > > > > + if (nn->nfsd_serv) {
> > > > > + mutex_unlock(&nfsd_mutex);
> > > > > + return -EBUSY;
> > > > > + }
> > > > > +
> > > > > + /* clear current supported versions. */
> > > > > + nfsd_vers(nn, 2, NFSD_CLEAR);
> > > > > + nfsd_vers(nn, 3, NFSD_CLEAR);
> > > > > + for (i = 0; i <= NFSD_SUPPORTED_MINOR_VERSION; i++)
> > > > > + nfsd_minorversion(nn, i, NFSD_CLEAR);
> > > > > +
> > > > > + nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
> > > > > + struct nlattr *tb[NFSD_A_VERSION_MAX + 1];
> > > > > + u32 major, minor = 0;
> > > > > + bool enabled;
> > > > > +
> > > > > + if (nla_type(attr) != NFSD_A_SERVER_PROTO_VERSION)
> > > > > + continue;
> > > > > +
> > > > > + if (nla_parse_nested(tb, NFSD_A_VERSION_MAX, attr,
> > > > > + nfsd_version_nl_policy, info->extack) < 0)
> > > > > + continue;
> > > > > +
> > > > > + if (!tb[NFSD_A_VERSION_MAJOR])
> > > > > + continue;
> > > > > +
> > > > > + major = nla_get_u32(tb[NFSD_A_VERSION_MAJOR]);
> > > > > + if (tb[NFSD_A_VERSION_MINOR])
> > > > > + minor = nla_get_u32(tb[NFSD_A_VERSION_MINOR]);
> > > > > +
> > > > > + enabled = nla_get_flag(tb[NFSD_A_VERSION_ENABLED]);
> > > > > +
> > > > > + switch (major) {
> > > > > + case 4:
> > > > > + nfsd_minorversion(nn, minor, enabled ? NFSD_SET : NFSD_CLEAR);
> > > > > + break;
> > > > > + case 3:
> > > > > + case 2:
> > > > > + if (!minor)
> > > > > + nfsd_vers(nn, major, enabled ? NFSD_SET : NFSD_CLEAR);
> > > > > + break;
> > > > > + default:
> > > > > + break;
> > > > > + }
> > > > > + }
> > > > > +
> > > > > + mutex_unlock(&nfsd_mutex);
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * nfsd_nl_version_get_doit - get the nfs enabled versions
> > > > > + * @skb: reply buffer
> > > > > + * @info: netlink metadata and command arguments
> > > > > + *
> > > > > + * Return 0 on success or a negative errno.
> > > > > + */
> > > > > +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info)
> > > > > +{
> > > > > + struct nfsd_net *nn;
> > > > > + int i, err;
> > > > > + void *hdr;
> > > > > +
> > > > > + skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> > > > > + if (!skb)
> > > > > + return -ENOMEM;
> > > > > +
> > > > > + hdr = genlmsg_iput(skb, info);
> > > > > + if (!hdr) {
> > > > > + err = -EMSGSIZE;
> > > > > + goto err_free_msg;
> > > > > + }
> > > > > +
> > > > > + mutex_lock(&nfsd_mutex);
> > > > > + nn = net_generic(genl_info_net(info), nfsd_net_id);
> > > > > +
> > > > > + for (i = 2; i <= 4; i++) {
> > > > > + int j;
> > > > > +
> > > > > + for (j = 0; j <= NFSD_SUPPORTED_MINOR_VERSION; j++) {
> > > > > + struct nlattr *attr;
> > > > > +
> > > > > + /* Don't record any versions the kernel doesn't have
> > > > > + * compiled in
> > > > > + */
> > > > > + if (!nfsd_support_version(i))
> > > > > + continue;
> > > > > +
> > > > > + /* NFSv{2,3} does not support minor numbers */
> > > > > + if (i < 4 && j)
> > > > > + continue;
> > > > > +
> > > > > + attr = nla_nest_start(skb,
> > > > > + NFSD_A_SERVER_PROTO_VERSION);
> > > > > + if (!attr) {
> > > > > + err = -EINVAL;
> > > > > + goto err_nfsd_unlock;
> > > > > + }
> > > > > +
> > > > > + if (nla_put_u32(skb, NFSD_A_VERSION_MAJOR, i) ||
> > > > > + nla_put_u32(skb, NFSD_A_VERSION_MINOR, j)) {
> > > > > + err = -EINVAL;
> > > > > + goto err_nfsd_unlock;
> > > > > + }
> > > > > +
> > > > > + /* Set the enabled flag if the version is enabled */
> > > > > + if (nfsd_vers(nn, i, NFSD_TEST) &&
> > > > > + (i < 4 || nfsd_minorversion(nn, j, NFSD_TEST)) &&
> > > > > + nla_put_flag(skb, NFSD_A_VERSION_ENABLED)) {
> > > > > + err = -EINVAL;
> > > > > + goto err_nfsd_unlock;
> > > > > + }
> > > > > +
> > > > > + nla_nest_end(skb, attr);
> > > > > + }
> > > > > + }
> > > > > +
> > > > > + mutex_unlock(&nfsd_mutex);
> > > > > + genlmsg_end(skb, hdr);
> > > > > +
> > > > > + return genlmsg_reply(skb, info);
> > > > > +
> > > > > +err_nfsd_unlock:
> > > > > + mutex_unlock(&nfsd_mutex);
> > > > > +err_free_msg:
> > > > > + nlmsg_free(skb);
> > > > > +
> > > > > + return err;
> > > > > +}
> > > > > +
> > > > > /**
> > > > > * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> > > > > * @net: a freshly-created network namespace
> > > > > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> > > > > index ca193f7ff0e1..4fc91f50138a 100644
> > > > > --- a/fs/nfsd/nfssvc.c
> > > > > +++ b/fs/nfsd/nfssvc.c
> > > > > @@ -133,8 +133,7 @@ struct svc_program nfsd_program = {
> > > > > .pg_rpcbind_set = nfsd_rpcbind_set,
> > > > > };
> > > > >
> > > > > -static bool
> > > > > -nfsd_support_version(int vers)
> > > > > +bool nfsd_support_version(int vers)
> > > > > {
> > > > > if (vers >= NFSD_MINVERS && vers < NFSD_NRVERS)
> > > > > return nfsd_version[vers] != NULL;
> > > > > diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
> > > > > index ccc78a5ee650..8a0a2b344923 100644
> > > > > --- a/include/uapi/linux/nfsd_netlink.h
> > > > > +++ b/include/uapi/linux/nfsd_netlink.h
> > > > > @@ -38,10 +38,28 @@ enum {
> > > > > NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
> > > > > };
> > > > >
> > > > > +enum {
> > > > > + NFSD_A_VERSION_MAJOR = 1,
> > > > > + NFSD_A_VERSION_MINOR,
> > > > > + NFSD_A_VERSION_ENABLED,
> > > > > +
> > > > > + __NFSD_A_VERSION_MAX,
> > > > > + NFSD_A_VERSION_MAX = (__NFSD_A_VERSION_MAX - 1)
> > > > > +};
> > > > > +
> > > > > +enum {
> > > > > + NFSD_A_SERVER_PROTO_VERSION = 1,
> > > > > +
> > > > > + __NFSD_A_SERVER_PROTO_MAX,
> > > > > + NFSD_A_SERVER_PROTO_MAX = (__NFSD_A_SERVER_PROTO_MAX - 1)
> > > > > +};
> > > > > +
> > > > > enum {
> > > > > NFSD_CMD_RPC_STATUS_GET = 1,
> > > > > NFSD_CMD_THREADS_SET,
> > > > > NFSD_CMD_THREADS_GET,
> > > > > + NFSD_CMD_VERSION_SET,
> > > > > + NFSD_CMD_VERSION_GET,
> > > > >
> > > > > __NFSD_CMD_MAX,
> > > > > NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> > > > > --
> > > > > 2.44.0
> > > > >
> > > > >
> > > >
> > >
> > > --
> > > Jeff Layton <[email protected]>
> > >
> >
>
> --
> Jeff Layton <[email protected]>
>


2024-04-17 00:11:04

by Jeffrey Layton

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] NFSD: add write_version to netlink command

On Wed, 2024-04-17 at 09:05 +1000, NeilBrown wrote:
> On Wed, 17 Apr 2024, Jeff Layton wrote:
> > On Wed, 2024-04-17 at 07:48 +1000, NeilBrown wrote:
> > > On Tue, 16 Apr 2024, Jeff Layton wrote:
> > > > On Tue, 2024-04-16 at 13:16 +1000, NeilBrown wrote:
> > > > > On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> > > > > > Introduce write_version netlink command through a "declarative" interface.
> > > > > > This patch introduces a change in behavior since for version-set userspace
> > > > > > is expected to provide a NFS major/minor version list it wants to enable
> > > > > > while all the other ones will be disabled. (procfs write_version
> > > > > > command implements imperative interface where the admin writes +3/-3 to
> > > > > > enable/disable a single version.
> > > > >
> > > > > It seems a little weird to me that the interface always disables all
> > > > > version, but then also allows individual versions to be disabled.
> > > > >
> > > > > Would it be reasonable to simply ignore the "enabled" flag when setting
> > > > > version, and just enable all versions listed??
> > > > >
> > > > > Or maybe only enable those with the flag, and don't disable those
> > > > > without the flag?
> > > > >
> > > > > Those don't necessarily seem much better - but the current behaviour
> > > > > still seems odd.
> > > > >
> > > >
> > > > I think it makes sense.
> > > >
> > > > We disable all versions, and enable any that have the "enabled" flag set
> > > > in the call from userland. Userland technically needn't send down the
> > > > versions that are disabled in the call, but the current userland program
> > > > does.
> > > >
> > > > I worry about imperative interfaces that might only say -- "enable v4.1,
> > > > but disable v3" and leave the others in their current state. That
> > > > requires that both the kernel and userland keep state about what
> > > > versions are currently enabled and disabled, and it's possible to get
> > > > that wrong.
> > >
> > > I understand and support your aversion for imperative interfaces.
> > > But this interface, as currently implemented, looks somewhat imperative.
> > > The message sent to the kernel could enable some interfaces and then
> > > disable them. I know that isn't the intent, but it is what the code
> > > supports. Hence "weird".
> > >
> > > We could add code to make that sort of thing impossible, but there isn't
> > > much point. Better to make it syntactically impossible.
> > >
> > > Realistically there will never be NFSv4.3 as there is no need - new
> > > features can be added incrementally.?
> > >
> >
> > There is no need _so_far_. Who knows what the future will bring? Maybe
> > we'll need v4.3 in order to add some needed feature?
> >
> > > So we could just pass an array of
> > > 5 active flags: 2,3,4.0,4.1,4.2. I suspect you wouldn't like that and
> > > I'm not sure that I do either. A "read" would return the same array
> > > with 3 possible states: unavailable, disabled, enabled. (Maybe the
> > > array could be variable length so 5.0 could be added one day...).
> > >
> >
> > A set of flags is basically what this interface is. They just happen to
> > also be labeled with the major and minorversion. I think that's a good
> > thing.
>
> Good for whom? Labelling allows for labelling inconsistencies.
>

Now you're just being silly. You wanted a variable length array, but
what if the next slot is not v4.3 but 5.0? A positional interpretation
of a slot in an array is just as just as subject to problems.

> Maybe the kernel should be able to provide an ordered list of labels,
> and separately an array of which labels are enabled/disabled.
> Userspace could send down a new set of enabled/disabled flags based on
> the agreed set of labels.
>

How is this better than what's been proposed? One strength of netlink is
that the data is structured. The already proposed interface takes
advantage of that.

> Here is a question that is a bit of a diversion, but might help us
> understand the context more fully:
>
> Why would anyone disable v4.2 separately from v4.1 ??
>

Furthermore, what does it mean to disable v4.1 but leave v4.2 enabled?

> I understand that v2, v3, v4.0, v4.1 are effectively different protocols
> and you might want to ensure that only the approved/tested protocol is
> used. But v4.2 is just a few enhancements on v4.1. Why would you want
> to disable it?
>
> The answer I can think of that there might be bugs (perish the
> thought!!) in some of those features so you might want to avoid using
> them.
> But in that case, it is really the features that you want to suppress,
> not the protocol version.
>
> Maybe I might want to disable delegation - to just write delegation.
> Can I do that? What if I just want to disable server-side copy, but
> keep fallocate and umask support?
>
> i.e. is a list of versions really the granularity that we want to use
> for this interface?
>

Our current goal is to replace rpc.nfsd with a new program that works
via netlink. An important bit of what rpc.nfsd does is start the NFS
server with the settings in /etc/nfs.conf. Some of those settings are
vers3=, vers4.0=, etc. that govern how /proc/fs/nfsd/versions is set.
We have an immediate need to deal with those settings today, and
probably will for quite some time.

I'm not opposed to augmenting that with something more granular, but I
don't think we should block this interface and wait on that. We can
extend the interface at some point in the future to take a new feature
bitmask or something, and just declare that (e.g.) declaring vers4.2=n
disables some subset of those features.

>
> >
> > >
> > > I haven't managed to come up with anything *better* than the current
> > > proposal and I don't want to stand in its way, but I wanted to highlight
> > > that it looks weird - as much imperative as declarative - in case
> > > someone else might be able to come up with a better alternative.
> > >
> >
> > The intention was to create a symmetrical interface. We have 2
> > operations: a "get" that asks "what versions are currently supported and
> > are they enabled?", and a "set" that says "here is the new state of
> > version enablement".
> >
> > The userland tool always sends down a complete set of versions. The
> > kernel is (currently) more forgiving, and treats omitted versions as
> > being disabled. The kernel could require that every supported version be
> > represented in the "set" operation, if that's more desirable.
> >
> > > >
> > > > My thinking was that by using a declarative interface like this, we
> > > > eliminate ambiguity in how these interfaces are supposed to work. The
> > > > client sends down an entire version map in one fell swoop, and we know
> > > > exactly what the result should look like.
> > > >
> > > > Note that you can enable or disable just a single version with the
> > > > userland tool, but this way all of that complexity stays in userland.
> > > >
> > > > > Also for getting the version, the doc says:
> > > > >
> > > > > doc: get nfs enabled versions
> > > > >
> > > > > which I don't think it quite right. The code reports all supported
> > > > > versions, and indicates which of those are enabled. So maybe:
> > > > >
> > > > > doc: get enabled status for all supported versions
> > > > >
> > > > >
> > > > > I think that fact that it actually lists all supported versions is
> > > > > useful and worth making explicit.
> > > > >
> > > > >
> > > >
> > > > Agreed. We should make that change before merging anything.
> > > >
> > > > > >
> > > > > > Reviewed-by: Jeff Layton <[email protected]>
> > > > > > Tested-by: Jeff Layton <[email protected]>
> > > > > > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > > > > > ---
> > > > > > Documentation/netlink/specs/nfsd.yaml | 37 +++++++
> > > > > > fs/nfsd/netlink.c | 24 +++++
> > > > > > fs/nfsd/netlink.h | 5 +
> > > > > > fs/nfsd/netns.h | 1 +
> > > > > > fs/nfsd/nfsctl.c | 150 ++++++++++++++++++++++++++
> > > > > > fs/nfsd/nfssvc.c | 3 +-
> > > > > > include/uapi/linux/nfsd_netlink.h | 18 ++++
> > > > > > 7 files changed, 236 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> > > > > > index cbe6c5fd6c4d..0396e8b3ea1f 100644
> > > > > > --- a/Documentation/netlink/specs/nfsd.yaml
> > > > > > +++ b/Documentation/netlink/specs/nfsd.yaml
> > > > > > @@ -74,6 +74,26 @@ attribute-sets:
> > > > > > -
> > > > > > name: leasetime
> > > > > > type: u32
> > > > > > + -
> > > > > > + name: version
> > > > > > + attributes:
> > > > > > + -
> > > > > > + name: major
> > > > > > + type: u32
> > > > > > + -
> > > > > > + name: minor
> > > > > > + type: u32
> > > > > > + -
> > > > > > + name: enabled
> > > > > > + type: flag
> > > > > > + -
> > > > > > + name: server-proto
> > > > > > + attributes:
> > > > > > + -
> > > > > > + name: version
> > > > > > + type: nest
> > > > > > + nested-attributes: version
> > > > > > + multi-attr: true
> > > > > >
> > > > > > operations:
> > > > > > list:
> > > > > > @@ -120,3 +140,20 @@ operations:
> > > > > > - threads
> > > > > > - gracetime
> > > > > > - leasetime
> > > > > > + -
> > > > > > + name: version-set
> > > > > > + doc: set nfs enabled versions
> > > > > > + attribute-set: server-proto
> > > > > > + flags: [ admin-perm ]
> > > > > > + do:
> > > > > > + request:
> > > > > > + attributes:
> > > > > > + - version
> > > > > > + -
> > > > > > + name: version-get
> > > > > > + doc: get nfs enabled versions
> > > > > > + attribute-set: server-proto
> > > > > > + do:
> > > > > > + reply:
> > > > > > + attributes:
> > > > > > + - version
> > > > > > diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
> > > > > > index 20a646af0324..bf5df9597288 100644
> > > > > > --- a/fs/nfsd/netlink.c
> > > > > > +++ b/fs/nfsd/netlink.c
> > > > > > @@ -10,6 +10,13 @@
> > > > > >
> > > > > > #include <uapi/linux/nfsd_netlink.h>
> > > > > >
> > > > > > +/* Common nested types */
> > > > > > +const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1] = {
> > > > > > + [NFSD_A_VERSION_MAJOR] = { .type = NLA_U32, },
> > > > > > + [NFSD_A_VERSION_MINOR] = { .type = NLA_U32, },
> > > > > > + [NFSD_A_VERSION_ENABLED] = { .type = NLA_FLAG, },
> > > > > > +};
> > > > > > +
> > > > > > /* NFSD_CMD_THREADS_SET - do */
> > > > > > static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
> > > > > > [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
> > > > > > @@ -17,6 +24,11 @@ static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_L
> > > > > > [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
> > > > > > };
> > > > > >
> > > > > > +/* NFSD_CMD_VERSION_SET - do */
> > > > > > +static const struct nla_policy nfsd_version_set_nl_policy[NFSD_A_SERVER_PROTO_VERSION + 1] = {
> > > > > > + [NFSD_A_SERVER_PROTO_VERSION] = NLA_POLICY_NESTED(nfsd_version_nl_policy),
> > > > > > +};
> > > > > > +
> > > > > > /* Ops table for nfsd */
> > > > > > static const struct genl_split_ops nfsd_nl_ops[] = {
> > > > > > {
> > > > > > @@ -38,6 +50,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
> > > > > > .doit = nfsd_nl_threads_get_doit,
> > > > > > .flags = GENL_CMD_CAP_DO,
> > > > > > },
> > > > > > + {
> > > > > > + .cmd = NFSD_CMD_VERSION_SET,
> > > > > > + .doit = nfsd_nl_version_set_doit,
> > > > > > + .policy = nfsd_version_set_nl_policy,
> > > > > > + .maxattr = NFSD_A_SERVER_PROTO_VERSION,
> > > > > > + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> > > > > > + },
> > > > > > + {
> > > > > > + .cmd = NFSD_CMD_VERSION_GET,
> > > > > > + .doit = nfsd_nl_version_get_doit,
> > > > > > + .flags = GENL_CMD_CAP_DO,
> > > > > > + },
> > > > > > };
> > > > > >
> > > > > > struct genl_family nfsd_nl_family __ro_after_init = {
> > > > > > diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
> > > > > > index 4137fac477e4..c7c0da275481 100644
> > > > > > --- a/fs/nfsd/netlink.h
> > > > > > +++ b/fs/nfsd/netlink.h
> > > > > > @@ -11,6 +11,9 @@
> > > > > >
> > > > > > #include <uapi/linux/nfsd_netlink.h>
> > > > > >
> > > > > > +/* Common nested types */
> > > > > > +extern const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1];
> > > > > > +
> > > > > > int nfsd_nl_rpc_status_get_start(struct netlink_callback *cb);
> > > > > > int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
> > > > > >
> > > > > > @@ -18,6 +21,8 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > > > > > struct netlink_callback *cb);
> > > > > > int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
> > > > > > int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
> > > > > > +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info);
> > > > > > +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info);
> > > > > >
> > > > > > extern struct genl_family nfsd_nl_family;
> > > > > >
> > > > > > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > > > > > index d4be519b5734..14ec15656320 100644
> > > > > > --- a/fs/nfsd/netns.h
> > > > > > +++ b/fs/nfsd/netns.h
> > > > > > @@ -218,6 +218,7 @@ struct nfsd_net {
> > > > > > /* Simple check to find out if a given net was properly initialized */
> > > > > > #define nfsd_netns_ready(nn) ((nn)->sessionid_hashtbl)
> > > > > >
> > > > > > +extern bool nfsd_support_version(int vers);
> > > > > > extern void nfsd_netns_free_versions(struct nfsd_net *nn);
> > > > > >
> > > > > > extern unsigned int nfsd_net_id;
> > > > > > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > > > > > index 38a5df03981b..2c8929ef79e9 100644
> > > > > > --- a/fs/nfsd/nfsctl.c
> > > > > > +++ b/fs/nfsd/nfsctl.c
> > > > > > @@ -1757,6 +1757,156 @@ int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
> > > > > > return err;
> > > > > > }
> > > > > >
> > > > > > +/**
> > > > > > + * nfsd_nl_version_set_doit - set the nfs enabled versions
> > > > > > + * @skb: reply buffer
> > > > > > + * @info: netlink metadata and command arguments
> > > > > > + *
> > > > > > + * Return 0 on success or a negative errno.
> > > > > > + */
> > > > > > +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info)
> > > > > > +{
> > > > > > + const struct nlattr *attr;
> > > > > > + struct nfsd_net *nn;
> > > > > > + int i, rem;
> > > > > > +
> > > > > > + if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_PROTO_VERSION))
> > > > > > + return -EINVAL;
> > > > > > +
> > > > > > + mutex_lock(&nfsd_mutex);
> > > > > > +
> > > > > > + nn = net_generic(genl_info_net(info), nfsd_net_id);
> > > > > > + if (nn->nfsd_serv) {
> > > > > > + mutex_unlock(&nfsd_mutex);
> > > > > > + return -EBUSY;
> > > > > > + }
> > > > > > +
> > > > > > + /* clear current supported versions. */
> > > > > > + nfsd_vers(nn, 2, NFSD_CLEAR);
> > > > > > + nfsd_vers(nn, 3, NFSD_CLEAR);
> > > > > > + for (i = 0; i <= NFSD_SUPPORTED_MINOR_VERSION; i++)
> > > > > > + nfsd_minorversion(nn, i, NFSD_CLEAR);
> > > > > > +
> > > > > > + nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
> > > > > > + struct nlattr *tb[NFSD_A_VERSION_MAX + 1];
> > > > > > + u32 major, minor = 0;
> > > > > > + bool enabled;
> > > > > > +
> > > > > > + if (nla_type(attr) != NFSD_A_SERVER_PROTO_VERSION)
> > > > > > + continue;
> > > > > > +
> > > > > > + if (nla_parse_nested(tb, NFSD_A_VERSION_MAX, attr,
> > > > > > + nfsd_version_nl_policy, info->extack) < 0)
> > > > > > + continue;
> > > > > > +
> > > > > > + if (!tb[NFSD_A_VERSION_MAJOR])
> > > > > > + continue;
> > > > > > +
> > > > > > + major = nla_get_u32(tb[NFSD_A_VERSION_MAJOR]);
> > > > > > + if (tb[NFSD_A_VERSION_MINOR])
> > > > > > + minor = nla_get_u32(tb[NFSD_A_VERSION_MINOR]);
> > > > > > +
> > > > > > + enabled = nla_get_flag(tb[NFSD_A_VERSION_ENABLED]);
> > > > > > +
> > > > > > + switch (major) {
> > > > > > + case 4:
> > > > > > + nfsd_minorversion(nn, minor, enabled ? NFSD_SET : NFSD_CLEAR);
> > > > > > + break;
> > > > > > + case 3:
> > > > > > + case 2:
> > > > > > + if (!minor)
> > > > > > + nfsd_vers(nn, major, enabled ? NFSD_SET : NFSD_CLEAR);
> > > > > > + break;
> > > > > > + default:
> > > > > > + break;
> > > > > > + }
> > > > > > + }
> > > > > > +
> > > > > > + mutex_unlock(&nfsd_mutex);
> > > > > > +
> > > > > > + return 0;
> > > > > > +}
> > > > > > +
> > > > > > +/**
> > > > > > + * nfsd_nl_version_get_doit - get the nfs enabled versions
> > > > > > + * @skb: reply buffer
> > > > > > + * @info: netlink metadata and command arguments
> > > > > > + *
> > > > > > + * Return 0 on success or a negative errno.
> > > > > > + */
> > > > > > +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info)
> > > > > > +{
> > > > > > + struct nfsd_net *nn;
> > > > > > + int i, err;
> > > > > > + void *hdr;
> > > > > > +
> > > > > > + skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> > > > > > + if (!skb)
> > > > > > + return -ENOMEM;
> > > > > > +
> > > > > > + hdr = genlmsg_iput(skb, info);
> > > > > > + if (!hdr) {
> > > > > > + err = -EMSGSIZE;
> > > > > > + goto err_free_msg;
> > > > > > + }
> > > > > > +
> > > > > > + mutex_lock(&nfsd_mutex);
> > > > > > + nn = net_generic(genl_info_net(info), nfsd_net_id);
> > > > > > +
> > > > > > + for (i = 2; i <= 4; i++) {
> > > > > > + int j;
> > > > > > +
> > > > > > + for (j = 0; j <= NFSD_SUPPORTED_MINOR_VERSION; j++) {
> > > > > > + struct nlattr *attr;
> > > > > > +
> > > > > > + /* Don't record any versions the kernel doesn't have
> > > > > > + * compiled in
> > > > > > + */
> > > > > > + if (!nfsd_support_version(i))
> > > > > > + continue;
> > > > > > +
> > > > > > + /* NFSv{2,3} does not support minor numbers */
> > > > > > + if (i < 4 && j)
> > > > > > + continue;
> > > > > > +
> > > > > > + attr = nla_nest_start(skb,
> > > > > > + NFSD_A_SERVER_PROTO_VERSION);
> > > > > > + if (!attr) {
> > > > > > + err = -EINVAL;
> > > > > > + goto err_nfsd_unlock;
> > > > > > + }
> > > > > > +
> > > > > > + if (nla_put_u32(skb, NFSD_A_VERSION_MAJOR, i) ||
> > > > > > + nla_put_u32(skb, NFSD_A_VERSION_MINOR, j)) {
> > > > > > + err = -EINVAL;
> > > > > > + goto err_nfsd_unlock;
> > > > > > + }
> > > > > > +
> > > > > > + /* Set the enabled flag if the version is enabled */
> > > > > > + if (nfsd_vers(nn, i, NFSD_TEST) &&
> > > > > > + (i < 4 || nfsd_minorversion(nn, j, NFSD_TEST)) &&
> > > > > > + nla_put_flag(skb, NFSD_A_VERSION_ENABLED)) {
> > > > > > + err = -EINVAL;
> > > > > > + goto err_nfsd_unlock;
> > > > > > + }
> > > > > > +
> > > > > > + nla_nest_end(skb, attr);
> > > > > > + }
> > > > > > + }
> > > > > > +
> > > > > > + mutex_unlock(&nfsd_mutex);
> > > > > > + genlmsg_end(skb, hdr);
> > > > > > +
> > > > > > + return genlmsg_reply(skb, info);
> > > > > > +
> > > > > > +err_nfsd_unlock:
> > > > > > + mutex_unlock(&nfsd_mutex);
> > > > > > +err_free_msg:
> > > > > > + nlmsg_free(skb);
> > > > > > +
> > > > > > + return err;
> > > > > > +}
> > > > > > +
> > > > > > /**
> > > > > > * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> > > > > > * @net: a freshly-created network namespace
> > > > > > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> > > > > > index ca193f7ff0e1..4fc91f50138a 100644
> > > > > > --- a/fs/nfsd/nfssvc.c
> > > > > > +++ b/fs/nfsd/nfssvc.c
> > > > > > @@ -133,8 +133,7 @@ struct svc_program nfsd_program = {
> > > > > > .pg_rpcbind_set = nfsd_rpcbind_set,
> > > > > > };
> > > > > >
> > > > > > -static bool
> > > > > > -nfsd_support_version(int vers)
> > > > > > +bool nfsd_support_version(int vers)
> > > > > > {
> > > > > > if (vers >= NFSD_MINVERS && vers < NFSD_NRVERS)
> > > > > > return nfsd_version[vers] != NULL;
> > > > > > diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
> > > > > > index ccc78a5ee650..8a0a2b344923 100644
> > > > > > --- a/include/uapi/linux/nfsd_netlink.h
> > > > > > +++ b/include/uapi/linux/nfsd_netlink.h
> > > > > > @@ -38,10 +38,28 @@ enum {
> > > > > > NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
> > > > > > };
> > > > > >
> > > > > > +enum {
> > > > > > + NFSD_A_VERSION_MAJOR = 1,
> > > > > > + NFSD_A_VERSION_MINOR,
> > > > > > + NFSD_A_VERSION_ENABLED,
> > > > > > +
> > > > > > + __NFSD_A_VERSION_MAX,
> > > > > > + NFSD_A_VERSION_MAX = (__NFSD_A_VERSION_MAX - 1)
> > > > > > +};
> > > > > > +
> > > > > > +enum {
> > > > > > + NFSD_A_SERVER_PROTO_VERSION = 1,
> > > > > > +
> > > > > > + __NFSD_A_SERVER_PROTO_MAX,
> > > > > > + NFSD_A_SERVER_PROTO_MAX = (__NFSD_A_SERVER_PROTO_MAX - 1)
> > > > > > +};
> > > > > > +
> > > > > > enum {
> > > > > > NFSD_CMD_RPC_STATUS_GET = 1,
> > > > > > NFSD_CMD_THREADS_SET,
> > > > > > NFSD_CMD_THREADS_GET,
> > > > > > + NFSD_CMD_VERSION_SET,
> > > > > > + NFSD_CMD_VERSION_GET,
> > > > > >
> > > > > > __NFSD_CMD_MAX,
> > > > > > NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> > > > > > --
> > > > > > 2.44.0
> > > > > >
> > > > > >
> > > > >
> > > >
> > > > --
> > > > Jeff Layton <[email protected]>
> > > >
> > >
> >
> > --
> > Jeff Layton <[email protected]>
> >
>

--
Jeff Layton <[email protected]>

2024-04-17 00:44:17

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] NFSD: add write_version to netlink command

On Wed, 17 Apr 2024, Jeff Layton wrote:
> On Wed, 2024-04-17 at 09:05 +1000, NeilBrown wrote:
> > On Wed, 17 Apr 2024, Jeff Layton wrote:
> > > On Wed, 2024-04-17 at 07:48 +1000, NeilBrown wrote:
> > > > On Tue, 16 Apr 2024, Jeff Layton wrote:
> > > > > On Tue, 2024-04-16 at 13:16 +1000, NeilBrown wrote:
> > > > > > On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> > > > > > > Introduce write_version netlink command through a "declarative" interface.
> > > > > > > This patch introduces a change in behavior since for version-set userspace
> > > > > > > is expected to provide a NFS major/minor version list it wants to enable
> > > > > > > while all the other ones will be disabled. (procfs write_version
> > > > > > > command implements imperative interface where the admin writes +3/-3 to
> > > > > > > enable/disable a single version.
> > > > > >
> > > > > > It seems a little weird to me that the interface always disables all
> > > > > > version, but then also allows individual versions to be disabled.
> > > > > >
> > > > > > Would it be reasonable to simply ignore the "enabled" flag when setting
> > > > > > version, and just enable all versions listed??
> > > > > >
> > > > > > Or maybe only enable those with the flag, and don't disable those
> > > > > > without the flag?
> > > > > >
> > > > > > Those don't necessarily seem much better - but the current behaviour
> > > > > > still seems odd.
> > > > > >
> > > > >
> > > > > I think it makes sense.
> > > > >
> > > > > We disable all versions, and enable any that have the "enabled" flag set
> > > > > in the call from userland. Userland technically needn't send down the
> > > > > versions that are disabled in the call, but the current userland program
> > > > > does.
> > > > >
> > > > > I worry about imperative interfaces that might only say -- "enable v4.1,
> > > > > but disable v3" and leave the others in their current state. That
> > > > > requires that both the kernel and userland keep state about what
> > > > > versions are currently enabled and disabled, and it's possible to get
> > > > > that wrong.
> > > >
> > > > I understand and support your aversion for imperative interfaces.
> > > > But this interface, as currently implemented, looks somewhat imperative.
> > > > The message sent to the kernel could enable some interfaces and then
> > > > disable them. I know that isn't the intent, but it is what the code
> > > > supports. Hence "weird".
> > > >
> > > > We could add code to make that sort of thing impossible, but there isn't
> > > > much point. Better to make it syntactically impossible.
> > > >
> > > > Realistically there will never be NFSv4.3 as there is no need - new
> > > > features can be added incrementally. 
> > > >
> > >
> > > There is no need _so_far_. Who knows what the future will bring? Maybe
> > > we'll need v4.3 in order to add some needed feature?
> > >
> > > > So we could just pass an array of
> > > > 5 active flags: 2,3,4.0,4.1,4.2. I suspect you wouldn't like that and
> > > > I'm not sure that I do either. A "read" would return the same array
> > > > with 3 possible states: unavailable, disabled, enabled. (Maybe the
> > > > array could be variable length so 5.0 could be added one day...).
> > > >
> > >
> > > A set of flags is basically what this interface is. They just happen to
> > > also be labeled with the major and minorversion. I think that's a good
> > > thing.
> >
> > Good for whom? Labelling allows for labelling inconsistencies.
> >
>
> Now you're just being silly. You wanted a variable length array, but
> what if the next slot is not v4.3 but 5.0? A positional interpretation
> of a slot in an array is just as just as subject to problems.

I don't think it could look like a imperative interface, which the
current one does a bit.

>
> > Maybe the kernel should be able to provide an ordered list of labels,
> > and separately an array of which labels are enabled/disabled.
> > Userspace could send down a new set of enabled/disabled flags based on
> > the agreed set of labels.
> >
>
> How is this better than what's been proposed? One strength of netlink is
> that the data is structured. The already proposed interface takes
> advantage of that.
>
> > Here is a question that is a bit of a diversion, but might help us
> > understand the context more fully:
> >
> > Why would anyone disable v4.2 separately from v4.1 ??
> >
>
> Furthermore, what does it mean to disable v4.1 but leave v4.2 enabled?

Indeed!

>
> > I understand that v2, v3, v4.0, v4.1 are effectively different protocols
> > and you might want to ensure that only the approved/tested protocol is
> > used. But v4.2 is just a few enhancements on v4.1. Why would you want
> > to disable it?
> >
> > The answer I can think of that there might be bugs (perish the
> > thought!!) in some of those features so you might want to avoid using
> > them.
> > But in that case, it is really the features that you want to suppress,
> > not the protocol version.
> >
> > Maybe I might want to disable delegation - to just write delegation.
> > Can I do that? What if I just want to disable server-side copy, but
> > keep fallocate and umask support?
> >
> > i.e. is a list of versions really the granularity that we want to use
> > for this interface?
> >
>
> Our current goal is to replace rpc.nfsd with a new program that works
> via netlink. An important bit of what rpc.nfsd does is start the NFS
> server with the settings in /etc/nfs.conf. Some of those settings are
> vers3=, vers4.0=, etc. that govern how /proc/fs/nfsd/versions is set.
> We have an immediate need to deal with those settings today, and
> probably will for quite some time.
>
> I'm not opposed to augmenting that with something more granular, but I
> don't think we should block this interface and wait on that. We can
> extend the interface at some point in the future to take a new feature
> bitmask or something, and just declare that (e.g.) declaring vers4.2=n
> disables some subset of those features.

I agree that we don't want to block "good" while we wait for "perfect". I
just want to be sure that what we have is "good".

The current /proc interface uses strings like "v3" and "v4.1".
The proposed netlink interface uses pairs on u32s - "major" and "minor".
So we lose some easy paths for extensibility. Are we comfortable with
that?

This isn't a big deal - certainly not a blocked. I just don't feel
entirely comfortable about the current interface and I'm exploring to
see if there might be something better.

Thanks,
NeilBrown


>
> >
> > >
> > > >
> > > > I haven't managed to come up with anything *better* than the current
> > > > proposal and I don't want to stand in its way, but I wanted to highlight
> > > > that it looks weird - as much imperative as declarative - in case
> > > > someone else might be able to come up with a better alternative.
> > > >
> > >
> > > The intention was to create a symmetrical interface. We have 2
> > > operations: a "get" that asks "what versions are currently supported and
> > > are they enabled?", and a "set" that says "here is the new state of
> > > version enablement".
> > >
> > > The userland tool always sends down a complete set of versions. The
> > > kernel is (currently) more forgiving, and treats omitted versions as
> > > being disabled. The kernel could require that every supported version be
> > > represented in the "set" operation, if that's more desirable.
> > >
> > > > >
> > > > > My thinking was that by using a declarative interface like this, we
> > > > > eliminate ambiguity in how these interfaces are supposed to work. The
> > > > > client sends down an entire version map in one fell swoop, and we know
> > > > > exactly what the result should look like.
> > > > >
> > > > > Note that you can enable or disable just a single version with the
> > > > > userland tool, but this way all of that complexity stays in userland.
> > > > >
> > > > > > Also for getting the version, the doc says:
> > > > > >
> > > > > > doc: get nfs enabled versions
> > > > > >
> > > > > > which I don't think it quite right. The code reports all supported
> > > > > > versions, and indicates which of those are enabled. So maybe:
> > > > > >
> > > > > > doc: get enabled status for all supported versions
> > > > > >
> > > > > >
> > > > > > I think that fact that it actually lists all supported versions is
> > > > > > useful and worth making explicit.
> > > > > >
> > > > > >
> > > > >
> > > > > Agreed. We should make that change before merging anything.
> > > > >
> > > > > > >
> > > > > > > Reviewed-by: Jeff Layton <[email protected]>
> > > > > > > Tested-by: Jeff Layton <[email protected]>
> > > > > > > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > > > > > > ---
> > > > > > > Documentation/netlink/specs/nfsd.yaml | 37 +++++++
> > > > > > > fs/nfsd/netlink.c | 24 +++++
> > > > > > > fs/nfsd/netlink.h | 5 +
> > > > > > > fs/nfsd/netns.h | 1 +
> > > > > > > fs/nfsd/nfsctl.c | 150 ++++++++++++++++++++++++++
> > > > > > > fs/nfsd/nfssvc.c | 3 +-
> > > > > > > include/uapi/linux/nfsd_netlink.h | 18 ++++
> > > > > > > 7 files changed, 236 insertions(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> > > > > > > index cbe6c5fd6c4d..0396e8b3ea1f 100644
> > > > > > > --- a/Documentation/netlink/specs/nfsd.yaml
> > > > > > > +++ b/Documentation/netlink/specs/nfsd.yaml
> > > > > > > @@ -74,6 +74,26 @@ attribute-sets:
> > > > > > > -
> > > > > > > name: leasetime
> > > > > > > type: u32
> > > > > > > + -
> > > > > > > + name: version
> > > > > > > + attributes:
> > > > > > > + -
> > > > > > > + name: major
> > > > > > > + type: u32
> > > > > > > + -
> > > > > > > + name: minor
> > > > > > > + type: u32
> > > > > > > + -
> > > > > > > + name: enabled
> > > > > > > + type: flag
> > > > > > > + -
> > > > > > > + name: server-proto
> > > > > > > + attributes:
> > > > > > > + -
> > > > > > > + name: version
> > > > > > > + type: nest
> > > > > > > + nested-attributes: version
> > > > > > > + multi-attr: true
> > > > > > >
> > > > > > > operations:
> > > > > > > list:
> > > > > > > @@ -120,3 +140,20 @@ operations:
> > > > > > > - threads
> > > > > > > - gracetime
> > > > > > > - leasetime
> > > > > > > + -
> > > > > > > + name: version-set
> > > > > > > + doc: set nfs enabled versions
> > > > > > > + attribute-set: server-proto
> > > > > > > + flags: [ admin-perm ]
> > > > > > > + do:
> > > > > > > + request:
> > > > > > > + attributes:
> > > > > > > + - version
> > > > > > > + -
> > > > > > > + name: version-get
> > > > > > > + doc: get nfs enabled versions
> > > > > > > + attribute-set: server-proto
> > > > > > > + do:
> > > > > > > + reply:
> > > > > > > + attributes:
> > > > > > > + - version
> > > > > > > diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
> > > > > > > index 20a646af0324..bf5df9597288 100644
> > > > > > > --- a/fs/nfsd/netlink.c
> > > > > > > +++ b/fs/nfsd/netlink.c
> > > > > > > @@ -10,6 +10,13 @@
> > > > > > >
> > > > > > > #include <uapi/linux/nfsd_netlink.h>
> > > > > > >
> > > > > > > +/* Common nested types */
> > > > > > > +const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1] = {
> > > > > > > + [NFSD_A_VERSION_MAJOR] = { .type = NLA_U32, },
> > > > > > > + [NFSD_A_VERSION_MINOR] = { .type = NLA_U32, },
> > > > > > > + [NFSD_A_VERSION_ENABLED] = { .type = NLA_FLAG, },
> > > > > > > +};
> > > > > > > +
> > > > > > > /* NFSD_CMD_THREADS_SET - do */
> > > > > > > static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
> > > > > > > [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
> > > > > > > @@ -17,6 +24,11 @@ static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_L
> > > > > > > [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
> > > > > > > };
> > > > > > >
> > > > > > > +/* NFSD_CMD_VERSION_SET - do */
> > > > > > > +static const struct nla_policy nfsd_version_set_nl_policy[NFSD_A_SERVER_PROTO_VERSION + 1] = {
> > > > > > > + [NFSD_A_SERVER_PROTO_VERSION] = NLA_POLICY_NESTED(nfsd_version_nl_policy),
> > > > > > > +};
> > > > > > > +
> > > > > > > /* Ops table for nfsd */
> > > > > > > static const struct genl_split_ops nfsd_nl_ops[] = {
> > > > > > > {
> > > > > > > @@ -38,6 +50,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
> > > > > > > .doit = nfsd_nl_threads_get_doit,
> > > > > > > .flags = GENL_CMD_CAP_DO,
> > > > > > > },
> > > > > > > + {
> > > > > > > + .cmd = NFSD_CMD_VERSION_SET,
> > > > > > > + .doit = nfsd_nl_version_set_doit,
> > > > > > > + .policy = nfsd_version_set_nl_policy,
> > > > > > > + .maxattr = NFSD_A_SERVER_PROTO_VERSION,
> > > > > > > + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> > > > > > > + },
> > > > > > > + {
> > > > > > > + .cmd = NFSD_CMD_VERSION_GET,
> > > > > > > + .doit = nfsd_nl_version_get_doit,
> > > > > > > + .flags = GENL_CMD_CAP_DO,
> > > > > > > + },
> > > > > > > };
> > > > > > >
> > > > > > > struct genl_family nfsd_nl_family __ro_after_init = {
> > > > > > > diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
> > > > > > > index 4137fac477e4..c7c0da275481 100644
> > > > > > > --- a/fs/nfsd/netlink.h
> > > > > > > +++ b/fs/nfsd/netlink.h
> > > > > > > @@ -11,6 +11,9 @@
> > > > > > >
> > > > > > > #include <uapi/linux/nfsd_netlink.h>
> > > > > > >
> > > > > > > +/* Common nested types */
> > > > > > > +extern const struct nla_policy nfsd_version_nl_policy[NFSD_A_VERSION_ENABLED + 1];
> > > > > > > +
> > > > > > > int nfsd_nl_rpc_status_get_start(struct netlink_callback *cb);
> > > > > > > int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
> > > > > > >
> > > > > > > @@ -18,6 +21,8 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > > > > > > struct netlink_callback *cb);
> > > > > > > int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
> > > > > > > int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
> > > > > > > +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info);
> > > > > > > +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info);
> > > > > > >
> > > > > > > extern struct genl_family nfsd_nl_family;
> > > > > > >
> > > > > > > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > > > > > > index d4be519b5734..14ec15656320 100644
> > > > > > > --- a/fs/nfsd/netns.h
> > > > > > > +++ b/fs/nfsd/netns.h
> > > > > > > @@ -218,6 +218,7 @@ struct nfsd_net {
> > > > > > > /* Simple check to find out if a given net was properly initialized */
> > > > > > > #define nfsd_netns_ready(nn) ((nn)->sessionid_hashtbl)
> > > > > > >
> > > > > > > +extern bool nfsd_support_version(int vers);
> > > > > > > extern void nfsd_netns_free_versions(struct nfsd_net *nn);
> > > > > > >
> > > > > > > extern unsigned int nfsd_net_id;
> > > > > > > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > > > > > > index 38a5df03981b..2c8929ef79e9 100644
> > > > > > > --- a/fs/nfsd/nfsctl.c
> > > > > > > +++ b/fs/nfsd/nfsctl.c
> > > > > > > @@ -1757,6 +1757,156 @@ int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
> > > > > > > return err;
> > > > > > > }
> > > > > > >
> > > > > > > +/**
> > > > > > > + * nfsd_nl_version_set_doit - set the nfs enabled versions
> > > > > > > + * @skb: reply buffer
> > > > > > > + * @info: netlink metadata and command arguments
> > > > > > > + *
> > > > > > > + * Return 0 on success or a negative errno.
> > > > > > > + */
> > > > > > > +int nfsd_nl_version_set_doit(struct sk_buff *skb, struct genl_info *info)
> > > > > > > +{
> > > > > > > + const struct nlattr *attr;
> > > > > > > + struct nfsd_net *nn;
> > > > > > > + int i, rem;
> > > > > > > +
> > > > > > > + if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_PROTO_VERSION))
> > > > > > > + return -EINVAL;
> > > > > > > +
> > > > > > > + mutex_lock(&nfsd_mutex);
> > > > > > > +
> > > > > > > + nn = net_generic(genl_info_net(info), nfsd_net_id);
> > > > > > > + if (nn->nfsd_serv) {
> > > > > > > + mutex_unlock(&nfsd_mutex);
> > > > > > > + return -EBUSY;
> > > > > > > + }
> > > > > > > +
> > > > > > > + /* clear current supported versions. */
> > > > > > > + nfsd_vers(nn, 2, NFSD_CLEAR);
> > > > > > > + nfsd_vers(nn, 3, NFSD_CLEAR);
> > > > > > > + for (i = 0; i <= NFSD_SUPPORTED_MINOR_VERSION; i++)
> > > > > > > + nfsd_minorversion(nn, i, NFSD_CLEAR);
> > > > > > > +
> > > > > > > + nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
> > > > > > > + struct nlattr *tb[NFSD_A_VERSION_MAX + 1];
> > > > > > > + u32 major, minor = 0;
> > > > > > > + bool enabled;
> > > > > > > +
> > > > > > > + if (nla_type(attr) != NFSD_A_SERVER_PROTO_VERSION)
> > > > > > > + continue;
> > > > > > > +
> > > > > > > + if (nla_parse_nested(tb, NFSD_A_VERSION_MAX, attr,
> > > > > > > + nfsd_version_nl_policy, info->extack) < 0)
> > > > > > > + continue;
> > > > > > > +
> > > > > > > + if (!tb[NFSD_A_VERSION_MAJOR])
> > > > > > > + continue;
> > > > > > > +
> > > > > > > + major = nla_get_u32(tb[NFSD_A_VERSION_MAJOR]);
> > > > > > > + if (tb[NFSD_A_VERSION_MINOR])
> > > > > > > + minor = nla_get_u32(tb[NFSD_A_VERSION_MINOR]);
> > > > > > > +
> > > > > > > + enabled = nla_get_flag(tb[NFSD_A_VERSION_ENABLED]);
> > > > > > > +
> > > > > > > + switch (major) {
> > > > > > > + case 4:
> > > > > > > + nfsd_minorversion(nn, minor, enabled ? NFSD_SET : NFSD_CLEAR);
> > > > > > > + break;
> > > > > > > + case 3:
> > > > > > > + case 2:
> > > > > > > + if (!minor)
> > > > > > > + nfsd_vers(nn, major, enabled ? NFSD_SET : NFSD_CLEAR);
> > > > > > > + break;
> > > > > > > + default:
> > > > > > > + break;
> > > > > > > + }
> > > > > > > + }
> > > > > > > +
> > > > > > > + mutex_unlock(&nfsd_mutex);
> > > > > > > +
> > > > > > > + return 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * nfsd_nl_version_get_doit - get the nfs enabled versions
> > > > > > > + * @skb: reply buffer
> > > > > > > + * @info: netlink metadata and command arguments
> > > > > > > + *
> > > > > > > + * Return 0 on success or a negative errno.
> > > > > > > + */
> > > > > > > +int nfsd_nl_version_get_doit(struct sk_buff *skb, struct genl_info *info)
> > > > > > > +{
> > > > > > > + struct nfsd_net *nn;
> > > > > > > + int i, err;
> > > > > > > + void *hdr;
> > > > > > > +
> > > > > > > + skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> > > > > > > + if (!skb)
> > > > > > > + return -ENOMEM;
> > > > > > > +
> > > > > > > + hdr = genlmsg_iput(skb, info);
> > > > > > > + if (!hdr) {
> > > > > > > + err = -EMSGSIZE;
> > > > > > > + goto err_free_msg;
> > > > > > > + }
> > > > > > > +
> > > > > > > + mutex_lock(&nfsd_mutex);
> > > > > > > + nn = net_generic(genl_info_net(info), nfsd_net_id);
> > > > > > > +
> > > > > > > + for (i = 2; i <= 4; i++) {
> > > > > > > + int j;
> > > > > > > +
> > > > > > > + for (j = 0; j <= NFSD_SUPPORTED_MINOR_VERSION; j++) {
> > > > > > > + struct nlattr *attr;
> > > > > > > +
> > > > > > > + /* Don't record any versions the kernel doesn't have
> > > > > > > + * compiled in
> > > > > > > + */
> > > > > > > + if (!nfsd_support_version(i))
> > > > > > > + continue;
> > > > > > > +
> > > > > > > + /* NFSv{2,3} does not support minor numbers */
> > > > > > > + if (i < 4 && j)
> > > > > > > + continue;
> > > > > > > +
> > > > > > > + attr = nla_nest_start(skb,
> > > > > > > + NFSD_A_SERVER_PROTO_VERSION);
> > > > > > > + if (!attr) {
> > > > > > > + err = -EINVAL;
> > > > > > > + goto err_nfsd_unlock;
> > > > > > > + }
> > > > > > > +
> > > > > > > + if (nla_put_u32(skb, NFSD_A_VERSION_MAJOR, i) ||
> > > > > > > + nla_put_u32(skb, NFSD_A_VERSION_MINOR, j)) {
> > > > > > > + err = -EINVAL;
> > > > > > > + goto err_nfsd_unlock;
> > > > > > > + }
> > > > > > > +
> > > > > > > + /* Set the enabled flag if the version is enabled */
> > > > > > > + if (nfsd_vers(nn, i, NFSD_TEST) &&
> > > > > > > + (i < 4 || nfsd_minorversion(nn, j, NFSD_TEST)) &&
> > > > > > > + nla_put_flag(skb, NFSD_A_VERSION_ENABLED)) {
> > > > > > > + err = -EINVAL;
> > > > > > > + goto err_nfsd_unlock;
> > > > > > > + }
> > > > > > > +
> > > > > > > + nla_nest_end(skb, attr);
> > > > > > > + }
> > > > > > > + }
> > > > > > > +
> > > > > > > + mutex_unlock(&nfsd_mutex);
> > > > > > > + genlmsg_end(skb, hdr);
> > > > > > > +
> > > > > > > + return genlmsg_reply(skb, info);
> > > > > > > +
> > > > > > > +err_nfsd_unlock:
> > > > > > > + mutex_unlock(&nfsd_mutex);
> > > > > > > +err_free_msg:
> > > > > > > + nlmsg_free(skb);
> > > > > > > +
> > > > > > > + return err;
> > > > > > > +}
> > > > > > > +
> > > > > > > /**
> > > > > > > * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> > > > > > > * @net: a freshly-created network namespace
> > > > > > > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> > > > > > > index ca193f7ff0e1..4fc91f50138a 100644
> > > > > > > --- a/fs/nfsd/nfssvc.c
> > > > > > > +++ b/fs/nfsd/nfssvc.c
> > > > > > > @@ -133,8 +133,7 @@ struct svc_program nfsd_program = {
> > > > > > > .pg_rpcbind_set = nfsd_rpcbind_set,
> > > > > > > };
> > > > > > >
> > > > > > > -static bool
> > > > > > > -nfsd_support_version(int vers)
> > > > > > > +bool nfsd_support_version(int vers)
> > > > > > > {
> > > > > > > if (vers >= NFSD_MINVERS && vers < NFSD_NRVERS)
> > > > > > > return nfsd_version[vers] != NULL;
> > > > > > > diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
> > > > > > > index ccc78a5ee650..8a0a2b344923 100644
> > > > > > > --- a/include/uapi/linux/nfsd_netlink.h
> > > > > > > +++ b/include/uapi/linux/nfsd_netlink.h
> > > > > > > @@ -38,10 +38,28 @@ enum {
> > > > > > > NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
> > > > > > > };
> > > > > > >
> > > > > > > +enum {
> > > > > > > + NFSD_A_VERSION_MAJOR = 1,
> > > > > > > + NFSD_A_VERSION_MINOR,
> > > > > > > + NFSD_A_VERSION_ENABLED,
> > > > > > > +
> > > > > > > + __NFSD_A_VERSION_MAX,
> > > > > > > + NFSD_A_VERSION_MAX = (__NFSD_A_VERSION_MAX - 1)
> > > > > > > +};
> > > > > > > +
> > > > > > > +enum {
> > > > > > > + NFSD_A_SERVER_PROTO_VERSION = 1,
> > > > > > > +
> > > > > > > + __NFSD_A_SERVER_PROTO_MAX,
> > > > > > > + NFSD_A_SERVER_PROTO_MAX = (__NFSD_A_SERVER_PROTO_MAX - 1)
> > > > > > > +};
> > > > > > > +
> > > > > > > enum {
> > > > > > > NFSD_CMD_RPC_STATUS_GET = 1,
> > > > > > > NFSD_CMD_THREADS_SET,
> > > > > > > NFSD_CMD_THREADS_GET,
> > > > > > > + NFSD_CMD_VERSION_SET,
> > > > > > > + NFSD_CMD_VERSION_GET,
> > > > > > >
> > > > > > > __NFSD_CMD_MAX,
> > > > > > > NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> > > > > > > --
> > > > > > > 2.44.0
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > > --
> > > > > Jeff Layton <[email protected]>
> > > > >
> > > >
> > >
> > > --
> > > Jeff Layton <[email protected]>
> > >
> >
>
> --
> Jeff Layton <[email protected]>
>


2024-04-17 11:25:58

by Jeffrey Layton

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] NFSD: add write_version to netlink command

On Wed, 2024-04-17 at 10:43 +1000, NeilBrown wrote:
> On Wed, 17 Apr 2024, Jeff Layton wrote:
> > On Wed, 2024-04-17 at 09:05 +1000, NeilBrown wrote:
> > > On Wed, 17 Apr 2024, Jeff Layton wrote:
> > > > On Wed, 2024-04-17 at 07:48 +1000, NeilBrown wrote:
> > > > > On Tue, 16 Apr 2024, Jeff Layton wrote:
> > > > > > On Tue, 2024-04-16 at 13:16 +1000, NeilBrown wrote:
> > > > > > > On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> > > > > > > > Introduce write_version netlink command through a "declarative" interface.
> > > > > > > > This patch introduces a change in behavior since for version-set userspace
> > > > > > > > is expected to provide a NFS major/minor version list it wants to enable
> > > > > > > > while all the other ones will be disabled. (procfs write_version
> > > > > > > > command implements imperative interface where the admin writes +3/-3 to
> > > > > > > > enable/disable a single version.
> > > > > > >
> > > > > > > It seems a little weird to me that the interface always disables all
> > > > > > > version, but then also allows individual versions to be disabled.
> > > > > > >
> > > > > > > Would it be reasonable to simply ignore the "enabled" flag when setting
> > > > > > > version, and just enable all versions listed??
> > > > > > >
> > > > > > > Or maybe only enable those with the flag, and don't disable those
> > > > > > > without the flag?
> > > > > > >
> > > > > > > Those don't necessarily seem much better - but the current behaviour
> > > > > > > still seems odd.
> > > > > > >
> > > > > >
> > > > > > I think it makes sense.
> > > > > >
> > > > > > We disable all versions, and enable any that have the "enabled" flag set
> > > > > > in the call from userland. Userland technically needn't send down the
> > > > > > versions that are disabled in the call, but the current userland program
> > > > > > does.
> > > > > >
> > > > > > I worry about imperative interfaces that might only say -- "enable v4.1,
> > > > > > but disable v3" and leave the others in their current state. That
> > > > > > requires that both the kernel and userland keep state about what
> > > > > > versions are currently enabled and disabled, and it's possible to get
> > > > > > that wrong.
> > > > >
> > > > > I understand and support your aversion for imperative interfaces.
> > > > > But this interface, as currently implemented, looks somewhat imperative.
> > > > > The message sent to the kernel could enable some interfaces and then
> > > > > disable them. I know that isn't the intent, but it is what the code
> > > > > supports. Hence "weird".
> > > > >
> > > > > We could add code to make that sort of thing impossible, but there isn't
> > > > > much point. Better to make it syntactically impossible.
> > > > >
> > > > > Realistically there will never be NFSv4.3 as there is no need - new
> > > > > features can be added incrementally.?
> > > > >
> > > >
> > > > There is no need _so_far_. Who knows what the future will bring? Maybe
> > > > we'll need v4.3 in order to add some needed feature?
> > > >
> > > > > So we could just pass an array of
> > > > > 5 active flags: 2,3,4.0,4.1,4.2. I suspect you wouldn't like that and
> > > > > I'm not sure that I do either. A "read" would return the same array
> > > > > with 3 possible states: unavailable, disabled, enabled. (Maybe the
> > > > > array could be variable length so 5.0 could be added one day...).
> > > > >
> > > >
> > > > A set of flags is basically what this interface is. They just happen to
> > > > also be labeled with the major and minorversion. I think that's a good
> > > > thing.
> > >
> > > Good for whom? Labelling allows for labelling inconsistencies.
> > >
> >
> > Now you're just being silly. You wanted a variable length array, but
> > what if the next slot is not v4.3 but 5.0? A positional interpretation
> > of a slot in an array is just as just as subject to problems.
>
> I don't think it could look like a imperative interface, which the
> current one does a bit.
>
> >
> > > Maybe the kernel should be able to provide an ordered list of labels,
> > > and separately an array of which labels are enabled/disabled.
> > > Userspace could send down a new set of enabled/disabled flags based on
> > > the agreed set of labels.
> > >
> >
> > How is this better than what's been proposed? One strength of netlink is
> > that the data is structured. The already proposed interface takes
> > advantage of that.
> >
> > > Here is a question that is a bit of a diversion, but might help us
> > > understand the context more fully:
> > >
> > > Why would anyone disable v4.2 separately from v4.1 ??
> > >
> >
> > Furthermore, what does it mean to disable v4.1 but leave v4.2 enabled?
>
> Indeed!
>
> >
> > > I understand that v2, v3, v4.0, v4.1 are effectively different protocols
> > > and you might want to ensure that only the approved/tested protocol is
> > > used. But v4.2 is just a few enhancements on v4.1. Why would you want
> > > to disable it?
> > >
> > > The answer I can think of that there might be bugs (perish the
> > > thought!!) in some of those features so you might want to avoid using
> > > them.
> > > But in that case, it is really the features that you want to suppress,
> > > not the protocol version.
> > >
> > > Maybe I might want to disable delegation - to just write delegation.
> > > Can I do that? What if I just want to disable server-side copy, but
> > > keep fallocate and umask support?
> > >
> > > i.e. is a list of versions really the granularity that we want to use
> > > for this interface?
> > >
> >
> > Our current goal is to replace rpc.nfsd with a new program that works
> > via netlink. An important bit of what rpc.nfsd does is start the NFS
> > server with the settings in /etc/nfs.conf. Some of those settings are
> > vers3=, vers4.0=, etc. that govern how /proc/fs/nfsd/versions is set.
> > We have an immediate need to deal with those settings today, and
> > probably will for quite some time.
> >
> > I'm not opposed to augmenting that with something more granular, but I
> > don't think we should block this interface and wait on that. We can
> > extend the interface at some point in the future to take a new feature
> > bitmask or something, and just declare that (e.g.) declaring vers4.2=n
> > disables some subset of those features.
>
> I agree that we don't want to block "good" while we wait for "perfect". I
> just want to be sure that what we have is "good".
>
> The current /proc interface uses strings like "v3" and "v4.1".
> The proposed netlink interface uses pairs on u32s - "major" and "minor".
> So we lose some easy paths for extensibility. Are we comfortable with
> that?
>
> This isn't a big deal - certainly not a blocked. I just don't feel
> entirely comfortable about the current interface and I'm exploring to
> see if there might be something better.
>
>

Ok, I'm not convinced that anything you've proposed so far is better
than what we have, but I'm open-minded.

At this point we have these to-dos:

1) make the threads interface accept an array of integers rather than a
singleton. This is needed when sunrpc.pool_mode is not global.

2) have the interface pass down the scope string in the THREADS_SET
instead of making userland change the UTS namespace before starting
threads. This should just mean adding a new optional string attribute to
the threads interfaces.

...anything else we're missing that needs doing here? We'd really like
to see this in -next soon, so we can possibly make v6.10 with this.

Thanks,
--
Jeff Layton <[email protected]>

2024-04-17 13:04:28

by Jeffrey Layton

[permalink] [raw]
Subject: Re: [PATCH v8 2/6] NFSD: convert write_threads to netlink command

On Tue, 2024-04-16 at 18:39 -0400, Jeff Layton wrote:
> On Wed, 2024-04-17 at 07:55 +1000, NeilBrown wrote:
> > On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> > > Introduce write_threads netlink command similar to the one available
> > > through the procfs.
> >
> > I think this should support write_pool_threads too.
> > i.e. the number of threads should be an array. If it is a singleton,
> > the it does write_threads. If larger it does write_pool_threads.
> > I don't think we want to add a separate command later for pool_threads.
> >
> > NeilBrown
> >
>
> That is a very good point. We'll take a closer look at the current
> pool_threads interface and see how we can extend this one to cover that
> use case. Making it an array sounds reasonable at first blush.
>

Lorenzo and I had a look this morning:

Doing this properly is going to take some refactoring, I think. The
existing pool code has a weird restriction where you can't bring the
server up and down exclusively using the pool_threads interface. There
are no userland tools for it either, so it's out of scope since it's not
really part of the how rpc.nfsd works today.

I think the way forward here is to go ahead and make this interface use
an array of integers (like Neil suggests), but for now if someone tries
to send down a THREAD_SET operation and the pool_mode isn't global,
we'll just return an error of some sort (EINVAL, EOPNOTSUPP, ?).

One the new interfaces are in I'll take a look at adding support for
managing other pool_modes properly.

Does that sound ok?


> >
> > >
> > > Tested-by: Jeff Layton <[email protected]>
> > > Reviewed-by: Jeff Layton <[email protected]>
> > > Co-developed-by: Jeff Layton <[email protected]>
> > > Signed-off-by: Jeff Layton <[email protected]>
> > > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > > ---
> > > Documentation/netlink/specs/nfsd.yaml | 33 ++++++++
> > > fs/nfsd/netlink.c | 19 +++++
> > > fs/nfsd/netlink.h | 2 +
> > > fs/nfsd/nfsctl.c | 104 ++++++++++++++++++++++++++
> > > include/uapi/linux/nfsd_netlink.h | 11 +++
> > > 5 files changed, 169 insertions(+)
> > >
> > > diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> > > index 05acc73e2e33..cbe6c5fd6c4d 100644
> > > --- a/Documentation/netlink/specs/nfsd.yaml
> > > +++ b/Documentation/netlink/specs/nfsd.yaml
> > > @@ -62,6 +62,18 @@ attribute-sets:
> > > name: compound-ops
> > > type: u32
> > > multi-attr: true
> > > + -
> > > + name: server-worker
> > > + attributes:
> > > + -
> > > + name: threads
> > > + type: u32
> > > + -
> > > + name: gracetime
> > > + type: u32
> > > + -
> > > + name: leasetime
> > > + type: u32
> > >
> > > operations:
> > > list:
> > > @@ -87,3 +99,24 @@ operations:
> > > - sport
> > > - dport
> > > - compound-ops
> > > + -
> > > + name: threads-set
> > > + doc: set the number of running threads
> > > + attribute-set: server-worker
> > > + flags: [ admin-perm ]
> > > + do:
> > > + request:
> > > + attributes:
> > > + - threads
> > > + - gracetime
> > > + - leasetime
> > > + -
> > > + name: threads-get
> > > + doc: get the number of running threads
> > > + attribute-set: server-worker
> > > + do:
> > > + reply:
> > > + attributes:
> > > + - threads
> > > + - gracetime
> > > + - leasetime
> > > diff --git a/fs/nfsd/netlink.c b/fs/nfsd/netlink.c
> > > index 0e1d635ec5f9..20a646af0324 100644
> > > --- a/fs/nfsd/netlink.c
> > > +++ b/fs/nfsd/netlink.c
> > > @@ -10,6 +10,13 @@
> > >
> > > #include <uapi/linux/nfsd_netlink.h>
> > >
> > > +/* NFSD_CMD_THREADS_SET - do */
> > > +static const struct nla_policy nfsd_threads_set_nl_policy[NFSD_A_SERVER_WORKER_LEASETIME + 1] = {
> > > + [NFSD_A_SERVER_WORKER_THREADS] = { .type = NLA_U32, },
> > > + [NFSD_A_SERVER_WORKER_GRACETIME] = { .type = NLA_U32, },
> > > + [NFSD_A_SERVER_WORKER_LEASETIME] = { .type = NLA_U32, },
> > > +};
> > > +
> > > /* Ops table for nfsd */
> > > static const struct genl_split_ops nfsd_nl_ops[] = {
> > > {
> > > @@ -19,6 +26,18 @@ static const struct genl_split_ops nfsd_nl_ops[] = {
> > > .done = nfsd_nl_rpc_status_get_done,
> > > .flags = GENL_CMD_CAP_DUMP,
> > > },
> > > + {
> > > + .cmd = NFSD_CMD_THREADS_SET,
> > > + .doit = nfsd_nl_threads_set_doit,
> > > + .policy = nfsd_threads_set_nl_policy,
> > > + .maxattr = NFSD_A_SERVER_WORKER_LEASETIME,
> > > + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> > > + },
> > > + {
> > > + .cmd = NFSD_CMD_THREADS_GET,
> > > + .doit = nfsd_nl_threads_get_doit,
> > > + .flags = GENL_CMD_CAP_DO,
> > > + },
> > > };
> > >
> > > struct genl_family nfsd_nl_family __ro_after_init = {
> > > diff --git a/fs/nfsd/netlink.h b/fs/nfsd/netlink.h
> > > index d83dd6bdee92..4137fac477e4 100644
> > > --- a/fs/nfsd/netlink.h
> > > +++ b/fs/nfsd/netlink.h
> > > @@ -16,6 +16,8 @@ int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
> > >
> > > int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > > struct netlink_callback *cb);
> > > +int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info);
> > > +int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info);
> > >
> > > extern struct genl_family nfsd_nl_family;
> > >
> > > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > > index f2e442d7fe16..38a5df03981b 100644
> > > --- a/fs/nfsd/nfsctl.c
> > > +++ b/fs/nfsd/nfsctl.c
> > > @@ -1653,6 +1653,110 @@ int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb)
> > > return 0;
> > > }
> > >
> > > +/**
> > > + * nfsd_nl_threads_set_doit - set the number of running threads
> > > + * @skb: reply buffer
> > > + * @info: netlink metadata and command arguments
> > > + *
> > > + * Return 0 on success or a negative errno.
> > > + */
> > > +int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info)
> > > +{
> > > + struct net *net = genl_info_net(info);
> > > + struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > > + int ret = -EBUSY;
> > > + u32 nthreads;
> > > +
> > > + if (GENL_REQ_ATTR_CHECK(info, NFSD_A_SERVER_WORKER_THREADS))
> > > + return -EINVAL;
> > > +
> > > + nthreads = nla_get_u32(info->attrs[NFSD_A_SERVER_WORKER_THREADS]);
> > > +
> > > + mutex_lock(&nfsd_mutex);
> > > + if (info->attrs[NFSD_A_SERVER_WORKER_GRACETIME] ||
> > > + info->attrs[NFSD_A_SERVER_WORKER_LEASETIME]) {
> > > + const struct nlattr *attr;
> > > +
> > > + if (nn->nfsd_serv && nn->nfsd_serv->sv_nrthreads)
> > > + goto out_unlock;
> > > +
> > > + ret = -EINVAL;
> > > + attr = info->attrs[NFSD_A_SERVER_WORKER_GRACETIME];
> > > + if (attr) {
> > > + u32 gracetime = nla_get_u32(attr);
> > > +
> > > + if (gracetime < 10 || gracetime > 3600)
> > > + goto out_unlock;
> > > +
> > > + nn->nfsd4_grace = gracetime;
> > > + }
> > > +
> > > + attr = info->attrs[NFSD_A_SERVER_WORKER_LEASETIME];
> > > + if (attr) {
> > > + u32 leasetime = nla_get_u32(attr);
> > > +
> > > + if (leasetime < 10 || leasetime > 3600)
> > > + goto out_unlock;
> > > +
> > > + nn->nfsd4_lease = leasetime;
> > > + }
> > > + }
> > > +
> > > + ret = nfsd_svc(nthreads, net, get_current_cred());
> > > +out_unlock:
> > > + mutex_unlock(&nfsd_mutex);
> > > +
> > > + return ret == nthreads ? 0 : ret;
> > > +}
> > > +
> > > +/**
> > > + * nfsd_nl_threads_get_doit - get the number of running threads
> > > + * @skb: reply buffer
> > > + * @info: netlink metadata and command arguments
> > > + *
> > > + * Return 0 on success or a negative errno.
> > > + */
> > > +int nfsd_nl_threads_get_doit(struct sk_buff *skb, struct genl_info *info)
> > > +{
> > > + struct net *net = genl_info_net(info);
> > > + struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > > + void *hdr;
> > > + int err;
> > > +
> > > + skb = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> > > + if (!skb)
> > > + return -ENOMEM;
> > > +
> > > + hdr = genlmsg_iput(skb, info);
> > > + if (!hdr) {
> > > + err = -EMSGSIZE;
> > > + goto err_free_msg;
> > > + }
> > > +
> > > + mutex_lock(&nfsd_mutex);
> > > + err = nla_put_u32(skb, NFSD_A_SERVER_WORKER_GRACETIME,
> > > + nn->nfsd4_grace) ||
> > > + nla_put_u32(skb, NFSD_A_SERVER_WORKER_LEASETIME,
> > > + nn->nfsd4_lease) ||
> > > + nla_put_u32(skb, NFSD_A_SERVER_WORKER_THREADS,
> > > + nn->nfsd_serv ? nn->nfsd_serv->sv_nrthreads : 0);
> > > + mutex_unlock(&nfsd_mutex);
> > > +
> > > + if (err) {
> > > + err = -EINVAL;
> > > + goto err_free_msg;
> > > + }
> > > +
> > > + genlmsg_end(skb, hdr);
> > > +
> > > + return genlmsg_reply(skb, info);
> > > +
> > > +err_free_msg:
> > > + nlmsg_free(skb);
> > > +
> > > + return err;
> > > +}
> > > +
> > > /**
> > > * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> > > * @net: a freshly-created network namespace
> > > diff --git a/include/uapi/linux/nfsd_netlink.h b/include/uapi/linux/nfsd_netlink.h
> > > index 3cd044edee5d..ccc78a5ee650 100644
> > > --- a/include/uapi/linux/nfsd_netlink.h
> > > +++ b/include/uapi/linux/nfsd_netlink.h
> > > @@ -29,8 +29,19 @@ enum {
> > > NFSD_A_RPC_STATUS_MAX = (__NFSD_A_RPC_STATUS_MAX - 1)
> > > };
> > >
> > > +enum {
> > > + NFSD_A_SERVER_WORKER_THREADS = 1,
> > > + NFSD_A_SERVER_WORKER_GRACETIME,
> > > + NFSD_A_SERVER_WORKER_LEASETIME,
> > > +
> > > + __NFSD_A_SERVER_WORKER_MAX,
> > > + NFSD_A_SERVER_WORKER_MAX = (__NFSD_A_SERVER_WORKER_MAX - 1)
> > > +};
> > > +
> > > enum {
> > > NFSD_CMD_RPC_STATUS_GET = 1,
> > > + NFSD_CMD_THREADS_SET,
> > > + NFSD_CMD_THREADS_GET,
> > >
> > > __NFSD_CMD_MAX,
> > > NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> > > --
> > > 2.44.0
> > >
> > >
> >
>

--
Jeff Layton <[email protected]>

2024-04-22 03:37:08

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] NFSD: add write_version to netlink command

On Wed, 17 Apr 2024, Jeff Layton wrote:
> On Wed, 2024-04-17 at 10:43 +1000, NeilBrown wrote:
> > On Wed, 17 Apr 2024, Jeff Layton wrote:
> > > On Wed, 2024-04-17 at 09:05 +1000, NeilBrown wrote:
> > > > On Wed, 17 Apr 2024, Jeff Layton wrote:
> > > > > On Wed, 2024-04-17 at 07:48 +1000, NeilBrown wrote:
> > > > > > On Tue, 16 Apr 2024, Jeff Layton wrote:
> > > > > > > On Tue, 2024-04-16 at 13:16 +1000, NeilBrown wrote:
> > > > > > > > On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> > > > > > > > > Introduce write_version netlink command through a "declarative" interface.
> > > > > > > > > This patch introduces a change in behavior since for version-set userspace
> > > > > > > > > is expected to provide a NFS major/minor version list it wants to enable
> > > > > > > > > while all the other ones will be disabled. (procfs write_version
> > > > > > > > > command implements imperative interface where the admin writes +3/-3 to
> > > > > > > > > enable/disable a single version.
> > > > > > > >
> > > > > > > > It seems a little weird to me that the interface always disables all
> > > > > > > > version, but then also allows individual versions to be disabled.
> > > > > > > >
> > > > > > > > Would it be reasonable to simply ignore the "enabled" flag when setting
> > > > > > > > version, and just enable all versions listed??
> > > > > > > >
> > > > > > > > Or maybe only enable those with the flag, and don't disable those
> > > > > > > > without the flag?
> > > > > > > >
> > > > > > > > Those don't necessarily seem much better - but the current behaviour
> > > > > > > > still seems odd.
> > > > > > > >
> > > > > > >
> > > > > > > I think it makes sense.
> > > > > > >
> > > > > > > We disable all versions, and enable any that have the "enabled" flag set
> > > > > > > in the call from userland. Userland technically needn't send down the
> > > > > > > versions that are disabled in the call, but the current userland program
> > > > > > > does.
> > > > > > >
> > > > > > > I worry about imperative interfaces that might only say -- "enable v4.1,
> > > > > > > but disable v3" and leave the others in their current state. That
> > > > > > > requires that both the kernel and userland keep state about what
> > > > > > > versions are currently enabled and disabled, and it's possible to get
> > > > > > > that wrong.
> > > > > >
> > > > > > I understand and support your aversion for imperative interfaces.
> > > > > > But this interface, as currently implemented, looks somewhat imperative.
> > > > > > The message sent to the kernel could enable some interfaces and then
> > > > > > disable them. I know that isn't the intent, but it is what the code
> > > > > > supports. Hence "weird".
> > > > > >
> > > > > > We could add code to make that sort of thing impossible, but there isn't
> > > > > > much point. Better to make it syntactically impossible.
> > > > > >
> > > > > > Realistically there will never be NFSv4.3 as there is no need - new
> > > > > > features can be added incrementally. 
> > > > > >
> > > > >
> > > > > There is no need _so_far_. Who knows what the future will bring? Maybe
> > > > > we'll need v4.3 in order to add some needed feature?
> > > > >
> > > > > > So we could just pass an array of
> > > > > > 5 active flags: 2,3,4.0,4.1,4.2. I suspect you wouldn't like that and
> > > > > > I'm not sure that I do either. A "read" would return the same array
> > > > > > with 3 possible states: unavailable, disabled, enabled. (Maybe the
> > > > > > array could be variable length so 5.0 could be added one day...).
> > > > > >
> > > > >
> > > > > A set of flags is basically what this interface is. They just happen to
> > > > > also be labeled with the major and minorversion. I think that's a good
> > > > > thing.
> > > >
> > > > Good for whom? Labelling allows for labelling inconsistencies.
> > > >
> > >
> > > Now you're just being silly. You wanted a variable length array, but
> > > what if the next slot is not v4.3 but 5.0? A positional interpretation
> > > of a slot in an array is just as just as subject to problems.
> >
> > I don't think it could look like a imperative interface, which the
> > current one does a bit.
> >
> > >
> > > > Maybe the kernel should be able to provide an ordered list of labels,
> > > > and separately an array of which labels are enabled/disabled.
> > > > Userspace could send down a new set of enabled/disabled flags based on
> > > > the agreed set of labels.
> > > >
> > >
> > > How is this better than what's been proposed? One strength of netlink is
> > > that the data is structured. The already proposed interface takes
> > > advantage of that.
> > >
> > > > Here is a question that is a bit of a diversion, but might help us
> > > > understand the context more fully:
> > > >
> > > > Why would anyone disable v4.2 separately from v4.1 ??
> > > >
> > >
> > > Furthermore, what does it mean to disable v4.1 but leave v4.2 enabled?
> >
> > Indeed!
> >
> > >
> > > > I understand that v2, v3, v4.0, v4.1 are effectively different protocols
> > > > and you might want to ensure that only the approved/tested protocol is
> > > > used. But v4.2 is just a few enhancements on v4.1. Why would you want
> > > > to disable it?
> > > >
> > > > The answer I can think of that there might be bugs (perish the
> > > > thought!!) in some of those features so you might want to avoid using
> > > > them.
> > > > But in that case, it is really the features that you want to suppress,
> > > > not the protocol version.
> > > >
> > > > Maybe I might want to disable delegation - to just write delegation.
> > > > Can I do that? What if I just want to disable server-side copy, but
> > > > keep fallocate and umask support?
> > > >
> > > > i.e. is a list of versions really the granularity that we want to use
> > > > for this interface?
> > > >
> > >
> > > Our current goal is to replace rpc.nfsd with a new program that works
> > > via netlink. An important bit of what rpc.nfsd does is start the NFS
> > > server with the settings in /etc/nfs.conf. Some of those settings are
> > > vers3=, vers4.0=, etc. that govern how /proc/fs/nfsd/versions is set.
> > > We have an immediate need to deal with those settings today, and
> > > probably will for quite some time.
> > >
> > > I'm not opposed to augmenting that with something more granular, but I
> > > don't think we should block this interface and wait on that. We can
> > > extend the interface at some point in the future to take a new feature
> > > bitmask or something, and just declare that (e.g.) declaring vers4.2=n
> > > disables some subset of those features.
> >
> > I agree that we don't want to block "good" while we wait for "perfect". I
> > just want to be sure that what we have is "good".
> >
> > The current /proc interface uses strings like "v3" and "v4.1".
> > The proposed netlink interface uses pairs on u32s - "major" and "minor".
> > So we lose some easy paths for extensibility. Are we comfortable with
> > that?
> >
> > This isn't a big deal - certainly not a blocked. I just don't feel
> > entirely comfortable about the current interface and I'm exploring to
> > see if there might be something better.
> >
> >
>
> Ok, I'm not convinced that anything you've proposed so far is better
> than what we have, but I'm open-minded.

Thanks. I admit that I don't think anything I've come up with is better
either.

>
> At this point we have these to-dos:
>
> 1) make the threads interface accept an array of integers rather than a
> singleton. This is needed when sunrpc.pool_mode is not global.
>

While this I think probably still makes sense, I'm feeling less
enthusiastic about it and probably wouldn't complain if it didn't
happen.

I don't like that fact that we need to configure a number of threads. I
would much rather it grow/shrink dynamically. I've got more patches
that are heading in this direction but they aren't ready yet.

If we do land these and they prove to be effective, then configuring
per-pool threads would become extremely uninteresting. The demand on
each pool would adjust each pool separately.

So if use an array here turns out to be problematic for some reason, I
won't complain.

> 2) have the interface pass down the scope string in the THREADS_SET
> instead of making userland change the UTS namespace before starting
> threads. This should just mean adding a new optional string attribute to
> the threads interfaces.
>
> ...anything else we're missing that needs doing here? We'd really like
> to see this in -next soon, so we can possibly make v6.10 with this.

I haven't yet found any obvious gaps. I agree that we can and should
move forward promptly.

Thanks,
NeilBrown


>
> Thanks,
> --
> Jeff Layton <[email protected]>
>


2024-04-23 12:43:07

by Jeffrey Layton

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] NFSD: add write_version to netlink command

On Mon, 2024-04-22 at 13:36 +1000, NeilBrown wrote:
> On Wed, 17 Apr 2024, Jeff Layton wrote:
> > On Wed, 2024-04-17 at 10:43 +1000, NeilBrown wrote:
> > > On Wed, 17 Apr 2024, Jeff Layton wrote:
> > > > On Wed, 2024-04-17 at 09:05 +1000, NeilBrown wrote:
> > > > > On Wed, 17 Apr 2024, Jeff Layton wrote:
> > > > > > On Wed, 2024-04-17 at 07:48 +1000, NeilBrown wrote:
> > > > > > > On Tue, 16 Apr 2024, Jeff Layton wrote:
> > > > > > > > On Tue, 2024-04-16 at 13:16 +1000, NeilBrown wrote:
> > > > > > > > > On Tue, 16 Apr 2024, Lorenzo Bianconi wrote:
> > > > > > > > > > Introduce write_version netlink command through a "declarative" interface.
> > > > > > > > > > This patch introduces a change in behavior since for version-set userspace
> > > > > > > > > > is expected to provide a NFS major/minor version list it wants to enable
> > > > > > > > > > while all the other ones will be disabled. (procfs write_version
> > > > > > > > > > command implements imperative interface where the admin writes +3/-3 to
> > > > > > > > > > enable/disable a single version.
> > > > > > > > >
> > > > > > > > > It seems a little weird to me that the interface always disables all
> > > > > > > > > version, but then also allows individual versions to be disabled.
> > > > > > > > >
> > > > > > > > > Would it be reasonable to simply ignore the "enabled" flag when setting
> > > > > > > > > version, and just enable all versions listed??
> > > > > > > > >
> > > > > > > > > Or maybe only enable those with the flag, and don't disable those
> > > > > > > > > without the flag?
> > > > > > > > >
> > > > > > > > > Those don't necessarily seem much better - but the current behaviour
> > > > > > > > > still seems odd.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I think it makes sense.
> > > > > > > >
> > > > > > > > We disable all versions, and enable any that have the "enabled" flag set
> > > > > > > > in the call from userland. Userland technically needn't send down the
> > > > > > > > versions that are disabled in the call, but the current userland program
> > > > > > > > does.
> > > > > > > >
> > > > > > > > I worry about imperative interfaces that might only say -- "enable v4.1,
> > > > > > > > but disable v3" and leave the others in their current state. That
> > > > > > > > requires that both the kernel and userland keep state about what
> > > > > > > > versions are currently enabled and disabled, and it's possible to get
> > > > > > > > that wrong.
> > > > > > >
> > > > > > > I understand and support your aversion for imperative interfaces.
> > > > > > > But this interface, as currently implemented, looks somewhat imperative.
> > > > > > > The message sent to the kernel could enable some interfaces and then
> > > > > > > disable them. I know that isn't the intent, but it is what the code
> > > > > > > supports. Hence "weird".
> > > > > > >
> > > > > > > We could add code to make that sort of thing impossible, but there isn't
> > > > > > > much point. Better to make it syntactically impossible.
> > > > > > >
> > > > > > > Realistically there will never be NFSv4.3 as there is no need - new
> > > > > > > features can be added incrementally.?
> > > > > > >
> > > > > >
> > > > > > There is no need _so_far_. Who knows what the future will bring? Maybe
> > > > > > we'll need v4.3 in order to add some needed feature?
> > > > > >
> > > > > > > So we could just pass an array of
> > > > > > > 5 active flags: 2,3,4.0,4.1,4.2. I suspect you wouldn't like that and
> > > > > > > I'm not sure that I do either. A "read" would return the same array
> > > > > > > with 3 possible states: unavailable, disabled, enabled. (Maybe the
> > > > > > > array could be variable length so 5.0 could be added one day...).
> > > > > > >
> > > > > >
> > > > > > A set of flags is basically what this interface is. They just happen to
> > > > > > also be labeled with the major and minorversion. I think that's a good
> > > > > > thing.
> > > > >
> > > > > Good for whom? Labelling allows for labelling inconsistencies.
> > > > >
> > > >
> > > > Now you're just being silly. You wanted a variable length array, but
> > > > what if the next slot is not v4.3 but 5.0? A positional interpretation
> > > > of a slot in an array is just as just as subject to problems.
> > >
> > > I don't think it could look like a imperative interface, which the
> > > current one does a bit.
> > >
> > > >
> > > > > Maybe the kernel should be able to provide an ordered list of labels,
> > > > > and separately an array of which labels are enabled/disabled.
> > > > > Userspace could send down a new set of enabled/disabled flags based on
> > > > > the agreed set of labels.
> > > > >
> > > >
> > > > How is this better than what's been proposed? One strength of netlink is
> > > > that the data is structured. The already proposed interface takes
> > > > advantage of that.
> > > >
> > > > > Here is a question that is a bit of a diversion, but might help us
> > > > > understand the context more fully:
> > > > >
> > > > > Why would anyone disable v4.2 separately from v4.1 ??
> > > > >
> > > >
> > > > Furthermore, what does it mean to disable v4.1 but leave v4.2 enabled?
> > >
> > > Indeed!
> > >
> > > >
> > > > > I understand that v2, v3, v4.0, v4.1 are effectively different protocols
> > > > > and you might want to ensure that only the approved/tested protocol is
> > > > > used. But v4.2 is just a few enhancements on v4.1. Why would you want
> > > > > to disable it?
> > > > >
> > > > > The answer I can think of that there might be bugs (perish the
> > > > > thought!!) in some of those features so you might want to avoid using
> > > > > them.
> > > > > But in that case, it is really the features that you want to suppress,
> > > > > not the protocol version.
> > > > >
> > > > > Maybe I might want to disable delegation - to just write delegation.
> > > > > Can I do that? What if I just want to disable server-side copy, but
> > > > > keep fallocate and umask support?
> > > > >
> > > > > i.e. is a list of versions really the granularity that we want to use
> > > > > for this interface?
> > > > >
> > > >
> > > > Our current goal is to replace rpc.nfsd with a new program that works
> > > > via netlink. An important bit of what rpc.nfsd does is start the NFS
> > > > server with the settings in /etc/nfs.conf. Some of those settings are
> > > > vers3=, vers4.0=, etc. that govern how /proc/fs/nfsd/versions is set.
> > > > We have an immediate need to deal with those settings today, and
> > > > probably will for quite some time.
> > > >
> > > > I'm not opposed to augmenting that with something more granular, but I
> > > > don't think we should block this interface and wait on that. We can
> > > > extend the interface at some point in the future to take a new feature
> > > > bitmask or something, and just declare that (e.g.) declaring vers4.2=n
> > > > disables some subset of those features.
> > >
> > > I agree that we don't want to block "good" while we wait for "perfect". I
> > > just want to be sure that what we have is "good".
> > >
> > > The current /proc interface uses strings like "v3" and "v4.1".
> > > The proposed netlink interface uses pairs on u32s - "major" and "minor".
> > > So we lose some easy paths for extensibility. Are we comfortable with
> > > that?
> > >
> > > This isn't a big deal - certainly not a blocked. I just don't feel
> > > entirely comfortable about the current interface and I'm exploring to
> > > see if there might be something better.
> > >
> > >
> >
> > Ok, I'm not convinced that anything you've proposed so far is better
> > than what we have, but I'm open-minded.
>
> Thanks. I admit that I don't think anything I've come up with is better
> either.
>
> >
> > At this point we have these to-dos:
> >
> > 1) make the threads interface accept an array of integers rather than a
> > singleton. This is needed when sunrpc.pool_mode is not global.
> >
>
> While this I think probably still makes sense, I'm feeling less
> enthusiastic about it and probably wouldn't complain if it didn't
> happen.
>

Ok. I think making it an array is fine. For now, it will just error out
when you send down more than one value, but I think we can make it work
the way you'd expect. Regardlness, cleaning up and unifying the pooled
vs. non-pooled server handling seems like a good thing to do. The
existing one is a long-standing kludge.

> I don't like that fact that we need to configure a number of threads. I
> would much rather it grow/shrink dynamically. I've got more patches
> that are heading in this direction but they aren't ready yet.
>
> If we do land these and they prove to be effective, then configuring
> per-pool threads would become extremely uninteresting. The demand on
> each pool would adjust each pool separately.
>
> So if use an array here turns out to be problematic for some reason, I
> won't complain.
>

+1 on making the thread pool sizing dynamic. I had started looking at
what heuristics we should use, but I haven't gotten very far yet.

Some general thoughts:

nfsd threads spend most of their time waiting. They are only very rarely
cpu-bound, so the max size of the thread pools should mostly scale with
the amount of memory in the host.

I have a patch that adds a new percpu counter to count how often we go
to wake a thread and don't find one, and have to queue it. It clearly
hits a lot more when there are fewer running nfsd threads.

I don't think that's sufficient to know when to spawn a new thread
though. For that, we probably ought to be looking at how long RPCs are
sitting and waiting for a thread to pick them up. When that starts
growing then we probably need to bring in more threads.

Conversely, when a thread is sitting idle for a long time (10s? 60s?),
we can probably spin it down. Maybe we can have a new LRU for the
threads and have a periodic thread reaper job that checks for those.


> > 2) have the interface pass down the scope string in the THREADS_SET
> > instead of making userland change the UTS namespace before starting
> > threads. This should just mean adding a new optional string attribute to
> > the threads interfaces.
> >
> > ...anything else we're missing that needs doing here? We'd really like
> > to see this in -next soon, so we can possibly make v6.10 with this.
>
> I haven't yet found any obvious gaps. I agree that we can and should
> move forward promptly.
>

Great. I think Lorenzo is planning to send another set soon that should
hopefully be reasonable for merge.

>
> >
> > Thanks,
> > --
> > Jeff Layton <[email protected]>
> >
>

--
Jeff Layton <[email protected]>