2023-09-11 22:00:07

by Lorenzo Bianconi

[permalink] [raw]
Subject: [PATCH v8 0/3] add rpc_status netlink support for NFSD

Introduce rpc_status netlink support for NFSD in order to dump pending
RPC requests debugging information from userspace.
The new code can be tested with the user-space tool reported below:
https://github.com/LorenzoBianconi/nfsd-rpc-netlink-monitor/tree/main

Changes since v7:
- introduce nfsd_server.yaml for netlink messages

Changes since v6:
- report info to user-space through netlink and get rid of the related
entry in the procfs

Changes since v5:
- add missing delimiters for nfs4 compound ops
- add a header line for rpc_status handler
- do not dump rq_prog
- do not dump rq_flags in hex

Changes since v4:
- rely on acquire/release APIs and get rid of atomic operation
- fix kdoc for nfsd_rpc_status_open
- get rid of ',' as field delimiter in nfsd_rpc_status hanlder
- move nfsd_rpc_status before nfsd_v4 enum entries
- fix compilantion error if nfsdv4 is not enabled

Changes since v3:
- introduce rq_status_counter in order to detect if the RPC request is
pending and RPC info are stable
- rely on __svc_print_addr to dump IP info

Changes since v2:
- minor changes in nfsd_rpc_status_show output

Changes since v1:
- rework nfsd_rpc_status_show output

Changes since RFCv1:
- riduce time holding nfsd_mutex bumping svc_serv refcoung in
nfsd_rpc_status_open()
- dump rqstp->rq_stime
- add missing kdoc for nfsd_rpc_status_open()

Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=3D3D3D366

Lorenzo Bianconi (3):
Documentation: netlink: add a YAML spec for nfsd_server
NFSD: introduce netlink rpc_status stubs
NFSD: add rpc_status netlink support

Documentation/netlink/specs/nfsd_server.yaml | 97 +++++++++
fs/nfsd/Makefile | 3 +-
fs/nfsd/nfs_netlink_gen.c | 32 +++
fs/nfsd/nfs_netlink_gen.h | 22 ++
fs/nfsd/nfsctl.c | 204 +++++++++++++++++++
fs/nfsd/nfsd.h | 16 ++
fs/nfsd/nfssvc.c | 15 ++
fs/nfsd/state.h | 2 -
include/linux/sunrpc/svc.h | 1 +
include/uapi/linux/nfsd_server.h | 49 +++++
10 files changed, 438 insertions(+), 3 deletions(-)
create mode 100644 Documentation/netlink/specs/nfsd_server.yaml
create mode 100644 fs/nfsd/nfs_netlink_gen.c
create mode 100644 fs/nfsd/nfs_netlink_gen.h
create mode 100644 include/uapi/linux/nfsd_server.h

--
2.41.0


2023-09-11 22:00:25

by Lorenzo Bianconi

[permalink] [raw]
Subject: [PATCH v8 1/3] Documentation: netlink: add a YAML spec for nfsd_server

Introduce nfsd_server.yaml specs to generate uAPI and netlink
code for nfsd server.
Add rpc-status specs to define message reported by the nfsd server
dumping the pending RPC requests.

Tested-by: Jeff Layton <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
---
Documentation/netlink/specs/nfsd_server.yaml | 97 ++++++++++++++++++++
1 file changed, 97 insertions(+)
create mode 100644 Documentation/netlink/specs/nfsd_server.yaml

diff --git a/Documentation/netlink/specs/nfsd_server.yaml b/Documentation/netlink/specs/nfsd_server.yaml
new file mode 100644
index 000000000000..e681b493847b
--- /dev/null
+++ b/Documentation/netlink/specs/nfsd_server.yaml
@@ -0,0 +1,97 @@
+# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
+
+name: nfsd_server
+
+doc:
+ nfsd server configuration over generic netlink.
+
+attribute-sets:
+ -
+ name: rpc-status-comp-op-attr
+ enum-name: nfsd-rpc-status-comp-attr
+ name-prefix: nfsd-attr-rpc-status-comp-
+ attributes:
+ -
+ name: unspec
+ type: unused
+ value: 0
+ -
+ name: op
+ type: u32
+ -
+ name: rpc-status-attr
+ enum-name: nfsd-rpc-status-attr
+ name-prefix: nfsd-attr-rpc-status-
+ attributes:
+ -
+ name: unspec
+ type: unused
+ value: 0
+ -
+ name: xid
+ type: u32
+ byte-order: big-endian
+ -
+ name: flags
+ type: u32
+ -
+ name: prog
+ type: u32
+ -
+ name: version
+ type: u8
+ -
+ name: proc
+ type: u32
+ -
+ name: service_time
+ type: s64
+ -
+ name: pad
+ type: pad
+ -
+ name: saddr4
+ type: u32
+ byte-order: big-endian
+ display-hint: ipv4
+ -
+ name: daddr4
+ type: u32
+ byte-order: big-endian
+ display-hint: ipv4
+ -
+ name: saddr6
+ type: binary
+ display-hint: ipv6
+ -
+ name: daddr6
+ type: binary
+ display-hint: ipv6
+ -
+ name: sport
+ type: u16
+ byte-order: big-endian
+ -
+ name: dport
+ type: u16
+ byte-order: big-endian
+ -
+ name: compond-op
+ type: array-nest
+ nested-attributes: rpc-status-comp-op-attr
+
+operations:
+ enum-name: nfsd-commands
+ name-prefix: nfsd-cmd-
+ list:
+ -
+ name: unspec
+ doc: unused
+ value: 0
+ -
+ name: rpc-status-get
+ doc: dump pending nfsd rpc
+ attribute-set: rpc-status-attr
+ dump:
+ pre: nfsd-server-nl-rpc-status-get-start
+ post: nfsd-server-nl-rpc-status-get-done
--
2.41.0

2023-09-11 22:01:04

by Lorenzo Bianconi

[permalink] [raw]
Subject: [PATCH v8 3/3] NFSD: add rpc_status netlink support

Introduce rpc_status netlink support for NFSD in order to dump pending
RPC requests debugging information from userspace.

Tested-by: Jeff Layton <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
---
fs/nfsd/nfsctl.c | 192 ++++++++++++++++++++++++++++++++++++-
fs/nfsd/nfsd.h | 16 ++++
fs/nfsd/nfssvc.c | 15 +++
fs/nfsd/state.h | 2 -
include/linux/sunrpc/svc.h | 1 +
5 files changed, 222 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 1be66088849c..b862a759ea15 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -26,6 +26,7 @@
#include "pnfs.h"
#include "filecache.h"
#include "trace.h"
+#include "nfs_netlink_gen.h"

/*
* We have a single directory with several nodes in it.
@@ -1497,17 +1498,199 @@ unsigned int nfsd_net_id;

int nfsd_server_nl_rpc_status_get_start(struct netlink_callback *cb)
{
- return 0;
+ struct nfsd_net *nn = net_generic(sock_net(cb->skb->sk), nfsd_net_id);
+ int ret = -ENODEV;
+
+ mutex_lock(&nfsd_mutex);
+ if (nn->nfsd_serv) {
+ svc_get(nn->nfsd_serv);
+ ret = 0;
+ }
+ mutex_unlock(&nfsd_mutex);
+
+ return ret;
}

-int nfsd_server_nl_rpc_status_get_done(struct netlink_callback *cb)
+static int nfsd_genl_rpc_status_compose_msg(struct sk_buff *skb,
+ struct netlink_callback *cb,
+ struct nfsd_genl_rqstp *rqstp)
{
+ void *hdr;
+ int i;
+
+ hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
+ &nfsd_server_nl_family, NLM_F_MULTI,
+ NFSD_CMD_RPC_STATUS_GET);
+ if (!hdr)
+ return -ENOBUFS;
+
+ if (nla_put_be32(skb, NFSD_ATTR_RPC_STATUS_XID, rqstp->rq_xid) ||
+ nla_put_u32(skb, NFSD_ATTR_RPC_STATUS_FLAGS, rqstp->rq_flags) ||
+ nla_put_u32(skb, NFSD_ATTR_RPC_STATUS_PROG, rqstp->rq_prog) ||
+ nla_put_u32(skb, NFSD_ATTR_RPC_STATUS_PROC, rqstp->rq_proc) ||
+ nla_put_u8(skb, NFSD_ATTR_RPC_STATUS_VERSION, rqstp->rq_vers) ||
+ nla_put_s64(skb, NFSD_ATTR_RPC_STATUS_SERVICE_TIME,
+ ktime_to_us(rqstp->rq_stime),
+ NFSD_ATTR_RPC_STATUS_PAD))
+ return -ENOBUFS;
+
+ switch (rqstp->saddr.sa_family) {
+ case AF_INET: {
+ const struct sockaddr_in *s_in, *d_in;
+
+ s_in = (const struct sockaddr_in *)&rqstp->saddr;
+ d_in = (const struct sockaddr_in *)&rqstp->daddr;
+ if (nla_put_in_addr(skb, NFSD_ATTR_RPC_STATUS_SADDR4,
+ s_in->sin_addr.s_addr) ||
+ nla_put_in_addr(skb, NFSD_ATTR_RPC_STATUS_DADDR4,
+ d_in->sin_addr.s_addr) ||
+ nla_put_be16(skb, NFSD_ATTR_RPC_STATUS_SPORT,
+ s_in->sin_port) ||
+ nla_put_be16(skb, NFSD_ATTR_RPC_STATUS_DPORT,
+ d_in->sin_port))
+ return -ENOBUFS;
+ break;
+ }
+ case AF_INET6: {
+ const struct sockaddr_in6 *s_in, *d_in;
+
+ s_in = (const struct sockaddr_in6 *)&rqstp->saddr;
+ d_in = (const struct sockaddr_in6 *)&rqstp->daddr;
+ if (nla_put_in6_addr(skb, NFSD_ATTR_RPC_STATUS_SADDR6,
+ &s_in->sin6_addr) ||
+ nla_put_in6_addr(skb, NFSD_ATTR_RPC_STATUS_DADDR6,
+ &d_in->sin6_addr) ||
+ nla_put_be16(skb, NFSD_ATTR_RPC_STATUS_SPORT,
+ s_in->sin6_port) ||
+ nla_put_be16(skb, NFSD_ATTR_RPC_STATUS_DPORT,
+ d_in->sin6_port))
+ return -ENOBUFS;
+ break;
+ }
+ default:
+ break;
+ }
+
+ if (rqstp->opcnt) {
+ struct nlattr *attr;
+
+ attr = nla_nest_start(skb, NFSD_ATTR_RPC_STATUS_COMPOND_OP);
+ if (!attr)
+ return -ENOBUFS;
+
+ for (i = 0; i < rqstp->opcnt; i++) {
+ struct nlattr *op_attr;
+
+ op_attr = nla_nest_start(skb, i);
+ if (!op_attr)
+ return -ENOBUFS;
+
+ if (nla_put_u32(skb, NFSD_ATTR_RPC_STATUS_COMP_OP,
+ rqstp->opnum[i]))
+ return -ENOBUFS;
+
+ nla_nest_end(skb, op_attr);
+ }
+
+ nla_nest_end(skb, attr);
+ }
+
+ genlmsg_end(skb, hdr);
+
return 0;
}

int nfsd_server_nl_rpc_status_get_dumpit(struct sk_buff *skb,
struct netlink_callback *cb)
{
+ struct nfsd_net *nn = net_generic(sock_net(skb->sk), nfsd_net_id);
+ int i, ret, rqstp_index;
+
+ rcu_read_lock();
+
+ for (i = 0; i < nn->nfsd_serv->sv_nrpools; i++) {
+ struct svc_rqst *rqstp;
+
+ if (i < cb->args[0]) /* already consumed */
+ continue;
+
+ rqstp_index = 0;
+ list_for_each_entry_rcu(rqstp,
+ &nn->nfsd_serv->sv_pools[i].sp_all_threads,
+ rq_all) {
+ struct nfsd_genl_rqstp genl_rqstp;
+ unsigned int status_counter;
+
+ if (rqstp_index++ < cb->args[1]) /* already consumed */
+ continue;
+ /*
+ * Acquire rq_status_counter before parsing the rqst
+ * fields. rq_status_counter is set to an odd value in
+ * order to notify the consumers the rqstp fields are
+ * meaningful.
+ */
+ status_counter =
+ smp_load_acquire(&rqstp->rq_status_counter);
+ if (!(status_counter & 1))
+ continue;
+
+ genl_rqstp.rq_xid = rqstp->rq_xid;
+ genl_rqstp.rq_flags = rqstp->rq_flags;
+ genl_rqstp.rq_vers = rqstp->rq_vers;
+ genl_rqstp.rq_prog = rqstp->rq_prog;
+ genl_rqstp.rq_proc = rqstp->rq_proc;
+ genl_rqstp.rq_stime = rqstp->rq_stime;
+ genl_rqstp.opcnt = 0;
+ memcpy(&genl_rqstp.daddr, svc_daddr(rqstp),
+ sizeof(struct sockaddr));
+ memcpy(&genl_rqstp.saddr, svc_addr(rqstp),
+ sizeof(struct sockaddr));
+
+#ifdef CONFIG_NFSD_V4
+ if (rqstp->rq_vers == NFS4_VERSION &&
+ rqstp->rq_proc == NFSPROC4_COMPOUND) {
+ /* NFSv4 compund */
+ struct nfsd4_compoundargs *args;
+ int j;
+
+ args = rqstp->rq_argp;
+ genl_rqstp.opcnt = args->opcnt;
+ for (j = 0; j < genl_rqstp.opcnt; j++)
+ genl_rqstp.opnum[j] =
+ args->ops[j].opnum;
+ }
+#endif /* CONFIG_NFSD_V4 */
+
+ /*
+ * Acquire rq_status_counter before reporting the rqst
+ * fields to the user.
+ */
+ if (smp_load_acquire(&rqstp->rq_status_counter) !=
+ status_counter)
+ continue;
+
+ ret = nfsd_genl_rpc_status_compose_msg(skb, cb,
+ &genl_rqstp);
+ if (ret)
+ goto out;
+ }
+ }
+
+ cb->args[0] = i;
+ cb->args[1] = rqstp_index;
+ ret = skb->len;
+out:
+ rcu_read_unlock();
+
+ return ret;
+}
+
+int nfsd_server_nl_rpc_status_get_done(struct netlink_callback *cb)
+{
+ mutex_lock(&nfsd_mutex);
+ nfsd_put(sock_net(cb->skb->sk));
+ mutex_unlock(&nfsd_mutex);
+
return 0;
}

@@ -1605,6 +1788,10 @@ static int __init init_nfsd(void)
retval = register_filesystem(&nfsd_fs_type);
if (retval)
goto out_free_all;
+ retval = genl_register_family(&nfsd_server_nl_family);
+ if (retval)
+ goto out_free_all;
+
return 0;
out_free_all:
nfsd4_destroy_laundry_wq();
@@ -1629,6 +1816,7 @@ static int __init init_nfsd(void)

static void __exit exit_nfsd(void)
{
+ genl_unregister_family(&nfsd_server_nl_family);
unregister_filesystem(&nfsd_fs_type);
nfsd4_destroy_laundry_wq();
unregister_cld_notifier();
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index 11c14faa6c67..d787bd38c053 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -62,6 +62,22 @@ struct readdir_cd {
__be32 err; /* 0, nfserr, or nfserr_eof */
};

+/* Maximum number of operations per session compound */
+#define NFSD_MAX_OPS_PER_COMPOUND 50
+
+struct nfsd_genl_rqstp {
+ struct sockaddr daddr;
+ struct sockaddr saddr;
+ unsigned long rq_flags;
+ ktime_t rq_stime;
+ __be32 rq_xid;
+ u32 rq_vers;
+ u32 rq_prog;
+ u32 rq_proc;
+ /* NFSv4 compund */
+ u32 opnum[NFSD_MAX_OPS_PER_COMPOUND];
+ u16 opcnt;
+};

extern struct svc_program nfsd_program;
extern const struct svc_version nfsd_version2, nfsd_version3, nfsd_version4;
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 1582af33e204..fad34a7325b3 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -998,6 +998,15 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
if (!proc->pc_decode(rqstp, &rqstp->rq_arg_stream))
goto out_decode_err;

+ /*
+ * Release rq_status_counter setting it to an odd value after the rpc
+ * request has been properly parsed. rq_status_counter is used to
+ * notify the consumers if the rqstp fields are stable
+ * (rq_status_counter is odd) or not meaningful (rq_status_counter
+ * is even).
+ */
+ smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter | 1);
+
rp = NULL;
switch (nfsd_cache_lookup(rqstp, &rp)) {
case RC_DOIT:
@@ -1015,6 +1024,12 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
if (!proc->pc_encode(rqstp, &rqstp->rq_res_stream))
goto out_encode_err;

+ /*
+ * Release rq_status_counter setting it to an even value after the rpc
+ * request has been properly processed.
+ */
+ smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter + 1);
+
nfsd_cache_update(rqstp, rp, rqstp->rq_cachetype, statp + 1);
out_cached_reply:
return 1;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index cbddcf484dba..41bdc913fa71 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -174,8 +174,6 @@ static inline struct nfs4_delegation *delegstateid(struct nfs4_stid *s)

/* Maximum number of slots per session. 160 is useful for long haul TCP */
#define NFSD_MAX_SLOTS_PER_SESSION 160
-/* Maximum number of operations per session compound */
-#define NFSD_MAX_OPS_PER_COMPOUND 50
/* Maximum session per slot cache size */
#define NFSD_SLOT_CACHE_SIZE 2048
/* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index dbf5b21feafe..caa20defd255 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -251,6 +251,7 @@ struct svc_rqst {
* net namespace
*/
void ** rq_lease_breaker; /* The v4 client breaking a lease */
+ unsigned int rq_status_counter; /* RPC processing counter */
};

/* bits for rq_flags */
--
2.41.0

2023-09-11 22:01:05

by Lorenzo Bianconi

[permalink] [raw]
Subject: [PATCH v8 2/3] NFSD: introduce netlink rpc_status stubs

Generate empty netlink stubs and uAPI through nfsd_server.yaml specs:

$./tools/net/ynl/ynl-gen-c.py --mode uapi \
--spec Documentation/netlink/specs/nfsd_server.yaml \
--header -o include/uapi/linux/nfsd_server.h
$./tools/net/ynl/ynl-gen-c.py --mode kernel \
--spec Documentation/netlink/specs/nfsd_server.yaml \
--header -o fs/nfsd/nfs_netlink_gen.h
$./tools/net/ynl/ynl-gen-c.py --mode kernel \
--spec Documentation/netlink/specs/nfsd_server.yaml \
--source -o fs/nfsd/nfs_netlink_gen.c

Tested-by: Jeff Layton <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
---
fs/nfsd/Makefile | 3 +-
fs/nfsd/nfs_netlink_gen.c | 32 +++++++++++++++++++++
fs/nfsd/nfs_netlink_gen.h | 22 ++++++++++++++
fs/nfsd/nfsctl.c | 16 +++++++++++
include/uapi/linux/nfsd_server.h | 49 ++++++++++++++++++++++++++++++++
5 files changed, 121 insertions(+), 1 deletion(-)
create mode 100644 fs/nfsd/nfs_netlink_gen.c
create mode 100644 fs/nfsd/nfs_netlink_gen.h
create mode 100644 include/uapi/linux/nfsd_server.h

diff --git a/fs/nfsd/Makefile b/fs/nfsd/Makefile
index 6fffc8f03f74..6ae1d5450bf6 100644
--- a/fs/nfsd/Makefile
+++ b/fs/nfsd/Makefile
@@ -12,7 +12,8 @@ nfsd-y += trace.o

nfsd-y += nfssvc.o nfsctl.o nfsfh.o vfs.o \
export.o auth.o lockd.o nfscache.o \
- stats.o filecache.o nfs3proc.o nfs3xdr.o
+ stats.o filecache.o nfs3proc.o nfs3xdr.o \
+ nfs_netlink_gen.o
nfsd-$(CONFIG_NFSD_V2) += nfsproc.o nfsxdr.o
nfsd-$(CONFIG_NFSD_V2_ACL) += nfs2acl.o
nfsd-$(CONFIG_NFSD_V3_ACL) += nfs3acl.o
diff --git a/fs/nfsd/nfs_netlink_gen.c b/fs/nfsd/nfs_netlink_gen.c
new file mode 100644
index 000000000000..4d71b80bf4a7
--- /dev/null
+++ b/fs/nfsd/nfs_netlink_gen.c
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
+/* Do not edit directly, auto-generated from: */
+/* Documentation/netlink/specs/nfsd_server.yaml */
+/* YNL-GEN kernel source */
+
+#include <net/netlink.h>
+#include <net/genetlink.h>
+
+#include "nfs_netlink_gen.h"
+
+#include <uapi/linux/nfsd_server.h>
+
+/* Ops table for nfsd_server */
+static const struct genl_split_ops nfsd_server_nl_ops[] = {
+ {
+ .cmd = NFSD_CMD_RPC_STATUS_GET,
+ .start = nfsd_server_nl_rpc_status_get_start,
+ .dumpit = nfsd_server_nl_rpc_status_get_dumpit,
+ .done = nfsd_server_nl_rpc_status_get_done,
+ .flags = GENL_CMD_CAP_DUMP,
+ },
+};
+
+struct genl_family nfsd_server_nl_family __ro_after_init = {
+ .name = NFSD_SERVER_FAMILY_NAME,
+ .version = NFSD_SERVER_FAMILY_VERSION,
+ .netnsok = true,
+ .parallel_ops = true,
+ .module = THIS_MODULE,
+ .split_ops = nfsd_server_nl_ops,
+ .n_split_ops = ARRAY_SIZE(nfsd_server_nl_ops),
+};
diff --git a/fs/nfsd/nfs_netlink_gen.h b/fs/nfsd/nfs_netlink_gen.h
new file mode 100644
index 000000000000..f66b29e528c1
--- /dev/null
+++ b/fs/nfsd/nfs_netlink_gen.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
+/* Do not edit directly, auto-generated from: */
+/* Documentation/netlink/specs/nfsd_server.yaml */
+/* YNL-GEN kernel header */
+
+#ifndef _LINUX_NFSD_SERVER_GEN_H
+#define _LINUX_NFSD_SERVER_GEN_H
+
+#include <net/netlink.h>
+#include <net/genetlink.h>
+
+#include <uapi/linux/nfsd_server.h>
+
+int nfsd_server_nl_rpc_status_get_start(struct netlink_callback *cb);
+int nfsd_server_nl_rpc_status_get_done(struct netlink_callback *cb);
+
+int nfsd_server_nl_rpc_status_get_dumpit(struct sk_buff *skb,
+ struct netlink_callback *cb);
+
+extern struct genl_family nfsd_server_nl_family;
+
+#endif /* _LINUX_NFSD_SERVER_GEN_H */
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 33f80d289d63..1be66088849c 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1495,6 +1495,22 @@ static int create_proc_exports_entry(void)

unsigned int nfsd_net_id;

+int nfsd_server_nl_rpc_status_get_start(struct netlink_callback *cb)
+{
+ return 0;
+}
+
+int nfsd_server_nl_rpc_status_get_done(struct netlink_callback *cb)
+{
+ return 0;
+}
+
+int nfsd_server_nl_rpc_status_get_dumpit(struct sk_buff *skb,
+ struct netlink_callback *cb)
+{
+ return 0;
+}
+
/**
* nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
* @net: a freshly-created network namespace
diff --git a/include/uapi/linux/nfsd_server.h b/include/uapi/linux/nfsd_server.h
new file mode 100644
index 000000000000..c9ee00ceca3b
--- /dev/null
+++ b/include/uapi/linux/nfsd_server.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
+/* Do not edit directly, auto-generated from: */
+/* Documentation/netlink/specs/nfsd_server.yaml */
+/* YNL-GEN uapi header */
+
+#ifndef _UAPI_LINUX_NFSD_SERVER_H
+#define _UAPI_LINUX_NFSD_SERVER_H
+
+#define NFSD_SERVER_FAMILY_NAME "nfsd_server"
+#define NFSD_SERVER_FAMILY_VERSION 1
+
+enum nfsd_rpc_status_comp_attr {
+ NFSD_ATTR_RPC_STATUS_COMP_UNSPEC,
+ NFSD_ATTR_RPC_STATUS_COMP_OP,
+
+ __NFSD_ATTR_RPC_STATUS_COMP_MAX,
+ NFSD_ATTR_RPC_STATUS_COMP_MAX = (__NFSD_ATTR_RPC_STATUS_COMP_MAX - 1)
+};
+
+enum nfsd_rpc_status_attr {
+ NFSD_ATTR_RPC_STATUS_UNSPEC,
+ NFSD_ATTR_RPC_STATUS_XID,
+ NFSD_ATTR_RPC_STATUS_FLAGS,
+ NFSD_ATTR_RPC_STATUS_PROG,
+ NFSD_ATTR_RPC_STATUS_VERSION,
+ NFSD_ATTR_RPC_STATUS_PROC,
+ NFSD_ATTR_RPC_STATUS_SERVICE_TIME,
+ NFSD_ATTR_RPC_STATUS_PAD,
+ NFSD_ATTR_RPC_STATUS_SADDR4,
+ NFSD_ATTR_RPC_STATUS_DADDR4,
+ NFSD_ATTR_RPC_STATUS_SADDR6,
+ NFSD_ATTR_RPC_STATUS_DADDR6,
+ NFSD_ATTR_RPC_STATUS_SPORT,
+ NFSD_ATTR_RPC_STATUS_DPORT,
+ NFSD_ATTR_RPC_STATUS_COMPOND_OP,
+
+ __NFSD_ATTR_RPC_STATUS_MAX,
+ NFSD_ATTR_RPC_STATUS_MAX = (__NFSD_ATTR_RPC_STATUS_MAX - 1)
+};
+
+enum nfsd_commands {
+ NFSD_CMD_UNSPEC,
+ NFSD_CMD_RPC_STATUS_GET,
+
+ __NFSD_CMD_MAX,
+ NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
+};
+
+#endif /* _UAPI_LINUX_NFSD_SERVER_H */
--
2.41.0

2023-09-12 15:13:07

by Simon Horman

[permalink] [raw]
Subject: Re: [PATCH v8 2/3] NFSD: introduce netlink rpc_status stubs

On Mon, Sep 11, 2023 at 02:49:45PM +0200, Lorenzo Bianconi wrote:
> Generate empty netlink stubs and uAPI through nfsd_server.yaml specs:
>
> $./tools/net/ynl/ynl-gen-c.py --mode uapi \
> --spec Documentation/netlink/specs/nfsd_server.yaml \
> --header -o include/uapi/linux/nfsd_server.h
> $./tools/net/ynl/ynl-gen-c.py --mode kernel \
> --spec Documentation/netlink/specs/nfsd_server.yaml \
> --header -o fs/nfsd/nfs_netlink_gen.h
> $./tools/net/ynl/ynl-gen-c.py --mode kernel \
> --spec Documentation/netlink/specs/nfsd_server.yaml \
> --source -o fs/nfsd/nfs_netlink_gen.c
>
> Tested-by: Jeff Layton <[email protected]>
> Signed-off-by: Lorenzo Bianconi <[email protected]>

...

> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 33f80d289d63..1be66088849c 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -1495,6 +1495,22 @@ static int create_proc_exports_entry(void)
>
> unsigned int nfsd_net_id;
>
> +int nfsd_server_nl_rpc_status_get_start(struct netlink_callback *cb)
> +{
> + return 0;
> +}
> +
> +int nfsd_server_nl_rpc_status_get_done(struct netlink_callback *cb)
> +{
> + return 0;
> +}
> +
> +int nfsd_server_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> + struct netlink_callback *cb)
> +{
> + return 0;
> +}
> +

Hi Lorenzo,

W=1 build for gcc-13 and clang-16, and Smatch, complain that
there is no prototype for the above functions.

Perhaps nfs_netlink_gen.h should be included in this file?

> /**
> * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> * @net: a freshly-created network namespace

...

2023-09-12 15:27:07

by Simon Horman

[permalink] [raw]
Subject: Re: [PATCH v8 3/3] NFSD: add rpc_status netlink support

On Mon, Sep 11, 2023 at 02:49:46PM +0200, Lorenzo Bianconi wrote:
> Introduce rpc_status netlink support for NFSD in order to dump pending
> RPC requests debugging information from userspace.
>
> Tested-by: Jeff Layton <[email protected]>
> Signed-off-by: Lorenzo Bianconi <[email protected]>
o
Hi Lorenzo,

some minor feedback from my side.

...

> int nfsd_server_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> struct netlink_callback *cb)
> {
> + struct nfsd_net *nn = net_generic(sock_net(skb->sk), nfsd_net_id);
> + int i, ret, rqstp_index;
> +
> + rcu_read_lock();
> +
> + for (i = 0; i < nn->nfsd_serv->sv_nrpools; i++) {
> + struct svc_rqst *rqstp;
> +
> + if (i < cb->args[0]) /* already consumed */
> + continue;
> +
> + rqstp_index = 0;
> + list_for_each_entry_rcu(rqstp,
> + &nn->nfsd_serv->sv_pools[i].sp_all_threads,
> + rq_all) {
> + struct nfsd_genl_rqstp genl_rqstp;
> + unsigned int status_counter;
> +
> + if (rqstp_index++ < cb->args[1]) /* already consumed */
> + continue;
> + /*
> + * Acquire rq_status_counter before parsing the rqst
> + * fields. rq_status_counter is set to an odd value in
> + * order to notify the consumers the rqstp fields are
> + * meaningful.
> + */
> + status_counter =
> + smp_load_acquire(&rqstp->rq_status_counter);
> + if (!(status_counter & 1))
> + continue;
> +
> + genl_rqstp.rq_xid = rqstp->rq_xid;
> + genl_rqstp.rq_flags = rqstp->rq_flags;
> + genl_rqstp.rq_vers = rqstp->rq_vers;
> + genl_rqstp.rq_prog = rqstp->rq_prog;
> + genl_rqstp.rq_proc = rqstp->rq_proc;
> + genl_rqstp.rq_stime = rqstp->rq_stime;
> + genl_rqstp.opcnt = 0;
> + memcpy(&genl_rqstp.daddr, svc_daddr(rqstp),
> + sizeof(struct sockaddr));
> + memcpy(&genl_rqstp.saddr, svc_addr(rqstp),
> + sizeof(struct sockaddr));
> +
> +#ifdef CONFIG_NFSD_V4
> + if (rqstp->rq_vers == NFS4_VERSION &&
> + rqstp->rq_proc == NFSPROC4_COMPOUND) {
> + /* NFSv4 compund */

nit: compound

> + struct nfsd4_compoundargs *args;
> + int j;
> +
> + args = rqstp->rq_argp;
> + genl_rqstp.opcnt = args->opcnt;
> + for (j = 0; j < genl_rqstp.opcnt; j++)
> + genl_rqstp.opnum[j] =
> + args->ops[j].opnum;
> + }
> +#endif /* CONFIG_NFSD_V4 */
> +
> + /*
> + * Acquire rq_status_counter before reporting the rqst
> + * fields to the user.
> + */
> + if (smp_load_acquire(&rqstp->rq_status_counter) !=
> + status_counter)
> + continue;
> +
> + ret = nfsd_genl_rpc_status_compose_msg(skb, cb,
> + &genl_rqstp);
> + if (ret)
> + goto out;
> + }
> + }
> +
> + cb->args[0] = i;
> + cb->args[1] = rqstp_index;

I'm unsure if this is possible, but if the for loop above iterates zero
times, or for all iterations (i < cb->args[0]), then rqstp_index will
be used uninitialised here.

Flagged by Smatch.

> + ret = skb->len;
> +out:
> + rcu_read_unlock();
> +
> + return ret;
> +}

...

> diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> index 11c14faa6c67..d787bd38c053 100644
> --- a/fs/nfsd/nfsd.h
> +++ b/fs/nfsd/nfsd.h
> @@ -62,6 +62,22 @@ struct readdir_cd {
> __be32 err; /* 0, nfserr, or nfserr_eof */
> };
>
> +/* Maximum number of operations per session compound */
> +#define NFSD_MAX_OPS_PER_COMPOUND 50
> +
> +struct nfsd_genl_rqstp {
> + struct sockaddr daddr;
> + struct sockaddr saddr;
> + unsigned long rq_flags;
> + ktime_t rq_stime;
> + __be32 rq_xid;
> + u32 rq_vers;
> + u32 rq_prog;
> + u32 rq_proc;
> + /* NFSv4 compund */

nit: compound

> + u32 opnum[NFSD_MAX_OPS_PER_COMPOUND];
> + u16 opcnt;
> +};
>
> extern struct svc_program nfsd_program;
> extern const struct svc_version nfsd_version2, nfsd_version3, nfsd_version4;

...

2023-09-14 14:10:10

by Lorenzo Bianconi

[permalink] [raw]
Subject: Re: [PATCH v8 1/3] Documentation: netlink: add a YAML spec for nfsd_server

> On Mon, Sep 11, 2023 at 02:49:44PM +0200, Lorenzo Bianconi wrote:
> > Introduce nfsd_server.yaml specs to generate uAPI and netlink
> > code for nfsd server.
> > Add rpc-status specs to define message reported by the nfsd server
> > dumping the pending RPC requests.
> >
> > Tested-by: Jeff Layton <[email protected]>
> > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > ---
> > Documentation/netlink/specs/nfsd_server.yaml | 97 ++++++++++++++++++++
> > 1 file changed, 97 insertions(+)
> > create mode 100644 Documentation/netlink/specs/nfsd_server.yaml
>
> I've had a look... the series is simple and short. Thanks!
>
> My only quibbles right now are cosmetic and naming-related, all
> of which can be addressed when I apply these. So I'm going to
> wait for other review comments to see if we need another version
> or whether I can apply v8 with by-hand clean-ups.
>
> Comments below are what I might change when applying this one.
> This is not (yet) a request for a new version.

Hi Chuck,

thx for the review. Please let me know if I need to post a v9.

>
>
> > diff --git a/Documentation/netlink/specs/nfsd_server.yaml b/Documentation/netlink/specs/nfsd_server.yaml
> > new file mode 100644
> > index 000000000000..e681b493847b
> > --- /dev/null
> > +++ b/Documentation/netlink/specs/nfsd_server.yaml
> > @@ -0,0 +1,97 @@
> > +# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
> > +
> > +name: nfsd_server
>
> IMHO "nfsd_server" is redundant. "nfsd" should work.

ack, fine

>
>
> > +
> > +doc:
> > + nfsd server configuration over generic netlink.
> > +
> > +attribute-sets:
> > + -
> > + name: rpc-status-comp-op-attr
> > + enum-name: nfsd-rpc-status-comp-attr
> > + name-prefix: nfsd-attr-rpc-status-comp-
> > + attributes:
> > + -
> > + name: unspec
> > + type: unused
> > + value: 0
>
> I don't recall whether a zero-value definition is explicitly
> necessary. Maybe "value-start: 1" would work rather than
> these three lines? Why is zero a special attribute value?

I do not think it is mandatory, I added them just to be aligned with other
netlink definitions, but we do not use them.

>
>
> > + -
> > + name: op
> > + type: u32
> > + -
> > + name: rpc-status-attr
> > + enum-name: nfsd-rpc-status-attr
> > + name-prefix: nfsd-attr-rpc-status-
>
> Specifying all three of these name settings seems a bit
> cluttered.

ack, I would say we can get rid of enum-name, agree?

>
>
> > + attributes:
> > + -
> > + name: unspec
> > + type: unused
> > + value: 0
> > + -
> > + name: xid
> > + type: u32
> > + byte-order: big-endian
> > + -
> > + name: flags
> > + type: u32
> > + -
> > + name: prog
> > + type: u32
> > + -
> > + name: version
> > + type: u8
> > + -
> > + name: proc
> > + type: u32
> > + -
> > + name: service_time
> > + type: s64
> > + -
> > + name: pad
> > + type: pad
> > + -
> > + name: saddr4
> > + type: u32
> > + byte-order: big-endian
> > + display-hint: ipv4
> > + -
> > + name: daddr4
> > + type: u32
> > + byte-order: big-endian
> > + display-hint: ipv4
> > + -
> > + name: saddr6
> > + type: binary
> > + display-hint: ipv6
> > + -
> > + name: daddr6
> > + type: binary
> > + display-hint: ipv6
> > + -
> > + name: sport
> > + type: u16
> > + byte-order: big-endian
> > + -
> > + name: dport
> > + type: u16
> > + byte-order: big-endian
> > + -
> > + name: compond-op
>
> s/compond-op/compound-op

ack

>
> > + type: array-nest
> > + nested-attributes: rpc-status-comp-op-attr
>
> So, this is supposed to be a counted array of op numbers? Is there
> an existing type that could be used for this instead?

I think the attribute-sets available types are defined here:

https://github.com/torvalds/linux/blob/master/Documentation/netlink/genetlink-c.yaml#L151

Regards,
Lorenzo

>
>
> > +
> > +operations:
> > + enum-name: nfsd-commands
> > + name-prefix: nfsd-cmd-
> > + list:
> > + -
> > + name: unspec
> > + doc: unused
> > + value: 0
> > + -
> > + name: rpc-status-get
> > + doc: dump pending nfsd rpc
> > + attribute-set: rpc-status-attr
> > + dump:
> > + pre: nfsd-server-nl-rpc-status-get-start
> > + post: nfsd-server-nl-rpc-status-get-done
>
> --
> Chuck Lever
>


Attachments:
(No filename) (4.73 kB)
signature.asc (235.00 B)
Download all attachments

2023-09-14 15:02:39

by Lorenzo Bianconi

[permalink] [raw]
Subject: Re: [PATCH v8 2/3] NFSD: introduce netlink rpc_status stubs

> On Mon, Sep 11, 2023 at 02:49:45PM +0200, Lorenzo Bianconi wrote:
> > Generate empty netlink stubs and uAPI through nfsd_server.yaml specs:
> >
> > $./tools/net/ynl/ynl-gen-c.py --mode uapi \
> > --spec Documentation/netlink/specs/nfsd_server.yaml \
> > --header -o include/uapi/linux/nfsd_server.h
> > $./tools/net/ynl/ynl-gen-c.py --mode kernel \
> > --spec Documentation/netlink/specs/nfsd_server.yaml \
> > --header -o fs/nfsd/nfs_netlink_gen.h
> > $./tools/net/ynl/ynl-gen-c.py --mode kernel \
> > --spec Documentation/netlink/specs/nfsd_server.yaml \
> > --source -o fs/nfsd/nfs_netlink_gen.c
> >
> > Tested-by: Jeff Layton <[email protected]>
> > Signed-off-by: Lorenzo Bianconi <[email protected]>
>
> ...
>
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index 33f80d289d63..1be66088849c 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -1495,6 +1495,22 @@ static int create_proc_exports_entry(void)
> >
> > unsigned int nfsd_net_id;
> >
> > +int nfsd_server_nl_rpc_status_get_start(struct netlink_callback *cb)
> > +{
> > + return 0;
> > +}
> > +
> > +int nfsd_server_nl_rpc_status_get_done(struct netlink_callback *cb)
> > +{
> > + return 0;
> > +}
> > +
> > +int nfsd_server_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > + struct netlink_callback *cb)
> > +{
> > + return 0;
> > +}
> > +
>
> Hi Lorenzo,

Hi Simon,

>
> W=1 build for gcc-13 and clang-16, and Smatch, complain that
> there is no prototype for the above functions.
>
> Perhaps nfs_netlink_gen.h should be included in this file?

actually I added it in patch 3/3. I will move it here. Thx.

Regards,
Lorenzo

>
> > /**
> > * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> > * @net: a freshly-created network namespace
>
> ...
>


Attachments:
(No filename) (1.82 kB)
signature.asc (235.00 B)
Download all attachments

2023-09-14 17:33:08

by Lorenzo Bianconi

[permalink] [raw]
Subject: Re: [PATCH v8 3/3] NFSD: add rpc_status netlink support

[...]

> > +
> > +#ifdef CONFIG_NFSD_V4
> > + if (rqstp->rq_vers == NFS4_VERSION &&
> > + rqstp->rq_proc == NFSPROC4_COMPOUND) {
> > + /* NFSv4 compund */
>
> nit: compound

ack, I will fix it.

>
> > + struct nfsd4_compoundargs *args;
> > + int j;
> > +
> > + args = rqstp->rq_argp;
> > + genl_rqstp.opcnt = args->opcnt;
> > + for (j = 0; j < genl_rqstp.opcnt; j++)
> > + genl_rqstp.opnum[j] =
> > + args->ops[j].opnum;
> > + }
> > +#endif /* CONFIG_NFSD_V4 */
> > +
> > + /*
> > + * Acquire rq_status_counter before reporting the rqst
> > + * fields to the user.
> > + */
> > + if (smp_load_acquire(&rqstp->rq_status_counter) !=
> > + status_counter)
> > + continue;
> > +
> > + ret = nfsd_genl_rpc_status_compose_msg(skb, cb,
> > + &genl_rqstp);
> > + if (ret)
> > + goto out;
> > + }
> > + }
> > +
> > + cb->args[0] = i;
> > + cb->args[1] = rqstp_index;
>
> I'm unsure if this is possible, but if the for loop above iterates zero
> times, or for all iterations (i < cb->args[0]), then rqstp_index will
> be used uninitialised here.

ack, thx for spotting it, I will fix it.

Regards,
Lorenzo

>
> Flagged by Smatch.
>
> > + ret = skb->len;
> > +out:
> > + rcu_read_unlock();
> > +
> > + return ret;
> > +}
>
> ...
>
> > diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> > index 11c14faa6c67..d787bd38c053 100644
> > --- a/fs/nfsd/nfsd.h
> > +++ b/fs/nfsd/nfsd.h
> > @@ -62,6 +62,22 @@ struct readdir_cd {
> > __be32 err; /* 0, nfserr, or nfserr_eof */
> > };
> >
> > +/* Maximum number of operations per session compound */
> > +#define NFSD_MAX_OPS_PER_COMPOUND 50
> > +
> > +struct nfsd_genl_rqstp {
> > + struct sockaddr daddr;
> > + struct sockaddr saddr;
> > + unsigned long rq_flags;
> > + ktime_t rq_stime;
> > + __be32 rq_xid;
> > + u32 rq_vers;
> > + u32 rq_prog;
> > + u32 rq_proc;
> > + /* NFSv4 compund */
>
> nit: compound
>
> > + u32 opnum[NFSD_MAX_OPS_PER_COMPOUND];
> > + u16 opcnt;
> > +};
> >
> > extern struct svc_program nfsd_program;
> > extern const struct svc_version nfsd_version2, nfsd_version3, nfsd_version4;
>
> ...
>


Attachments:
(No filename) (2.19 kB)
signature.asc (235.00 B)
Download all attachments

2023-09-14 22:10:41

by Lorenzo Bianconi

[permalink] [raw]
Subject: Re: [PATCH v8 2/3] NFSD: introduce netlink rpc_status stubs

> On Mon, Sep 11, 2023 at 02:49:45PM +0200, Lorenzo Bianconi wrote:
> > Generate empty netlink stubs and uAPI through nfsd_server.yaml specs:
> >
> > $./tools/net/ynl/ynl-gen-c.py --mode uapi \
> > --spec Documentation/netlink/specs/nfsd_server.yaml \
> > --header -o include/uapi/linux/nfsd_server.h
> > $./tools/net/ynl/ynl-gen-c.py --mode kernel \
> > --spec Documentation/netlink/specs/nfsd_server.yaml \
> > --header -o fs/nfsd/nfs_netlink_gen.h
> > $./tools/net/ynl/ynl-gen-c.py --mode kernel \
> > --spec Documentation/netlink/specs/nfsd_server.yaml \
> > --source -o fs/nfsd/nfs_netlink_gen.c
>
> Actually there's a tool that walks the whole kernel source tree
> and handles any files that contain the YNL-GEN tag:
>
> $ tools/net/ynl/ynl-regen.sh
>
>
> > Tested-by: Jeff Layton <[email protected]>
> > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > ---
> > fs/nfsd/Makefile | 3 +-
> > fs/nfsd/nfs_netlink_gen.c | 32 +++++++++++++++++++++
> > fs/nfsd/nfs_netlink_gen.h | 22 ++++++++++++++
> > fs/nfsd/nfsctl.c | 16 +++++++++++
> > include/uapi/linux/nfsd_server.h | 49 ++++++++++++++++++++++++++++++++
> > 5 files changed, 121 insertions(+), 1 deletion(-)
> > create mode 100644 fs/nfsd/nfs_netlink_gen.c
> > create mode 100644 fs/nfsd/nfs_netlink_gen.h
> > create mode 100644 include/uapi/linux/nfsd_server.h
> >
> > diff --git a/fs/nfsd/Makefile b/fs/nfsd/Makefile
> > index 6fffc8f03f74..6ae1d5450bf6 100644
> > --- a/fs/nfsd/Makefile
> > +++ b/fs/nfsd/Makefile
> > @@ -12,7 +12,8 @@ nfsd-y += trace.o
> >
> > nfsd-y += nfssvc.o nfsctl.o nfsfh.o vfs.o \
> > export.o auth.o lockd.o nfscache.o \
> > - stats.o filecache.o nfs3proc.o nfs3xdr.o
> > + stats.o filecache.o nfs3proc.o nfs3xdr.o \
> > + nfs_netlink_gen.o
> > nfsd-$(CONFIG_NFSD_V2) += nfsproc.o nfsxdr.o
> > nfsd-$(CONFIG_NFSD_V2_ACL) += nfs2acl.o
> > nfsd-$(CONFIG_NFSD_V3_ACL) += nfs3acl.o
> > diff --git a/fs/nfsd/nfs_netlink_gen.c b/fs/nfsd/nfs_netlink_gen.c
> > new file mode 100644
> > index 000000000000..4d71b80bf4a7
> > --- /dev/null
> > +++ b/fs/nfsd/nfs_netlink_gen.c
>
> I'd like a shorter file name. How about fs/nfsd/netlink.c
> (and below, netlink.h) ?

ack, fine

Regards,
Lorenzo

>
>
> > @@ -0,0 +1,32 @@
> > +// SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
> > +/* Do not edit directly, auto-generated from: */
> > +/* Documentation/netlink/specs/nfsd_server.yaml */
> > +/* YNL-GEN kernel source */
> > +
> > +#include <net/netlink.h>
> > +#include <net/genetlink.h>
> > +
> > +#include "nfs_netlink_gen.h"
> > +
> > +#include <uapi/linux/nfsd_server.h>
> > +
> > +/* Ops table for nfsd_server */
> > +static const struct genl_split_ops nfsd_server_nl_ops[] = {
> > + {
> > + .cmd = NFSD_CMD_RPC_STATUS_GET,
> > + .start = nfsd_server_nl_rpc_status_get_start,
> > + .dumpit = nfsd_server_nl_rpc_status_get_dumpit,
> > + .done = nfsd_server_nl_rpc_status_get_done,
> > + .flags = GENL_CMD_CAP_DUMP,
> > + },
> > +};
> > +
> > +struct genl_family nfsd_server_nl_family __ro_after_init = {
> > + .name = NFSD_SERVER_FAMILY_NAME,
> > + .version = NFSD_SERVER_FAMILY_VERSION,
> > + .netnsok = true,
> > + .parallel_ops = true,
> > + .module = THIS_MODULE,
> > + .split_ops = nfsd_server_nl_ops,
> > + .n_split_ops = ARRAY_SIZE(nfsd_server_nl_ops),
> > +};
> > diff --git a/fs/nfsd/nfs_netlink_gen.h b/fs/nfsd/nfs_netlink_gen.h
> > new file mode 100644
> > index 000000000000..f66b29e528c1
> > --- /dev/null
> > +++ b/fs/nfsd/nfs_netlink_gen.h
> > @@ -0,0 +1,22 @@
> > +/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
> > +/* Do not edit directly, auto-generated from: */
> > +/* Documentation/netlink/specs/nfsd_server.yaml */
> > +/* YNL-GEN kernel header */
> > +
> > +#ifndef _LINUX_NFSD_SERVER_GEN_H
> > +#define _LINUX_NFSD_SERVER_GEN_H
> > +
> > +#include <net/netlink.h>
> > +#include <net/genetlink.h>
> > +
> > +#include <uapi/linux/nfsd_server.h>
> > +
> > +int nfsd_server_nl_rpc_status_get_start(struct netlink_callback *cb);
> > +int nfsd_server_nl_rpc_status_get_done(struct netlink_callback *cb);
> > +
> > +int nfsd_server_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > + struct netlink_callback *cb);
> > +
> > +extern struct genl_family nfsd_server_nl_family;
> > +
> > +#endif /* _LINUX_NFSD_SERVER_GEN_H */
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index 33f80d289d63..1be66088849c 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -1495,6 +1495,22 @@ static int create_proc_exports_entry(void)
> >
> > unsigned int nfsd_net_id;
> >
> > +int nfsd_server_nl_rpc_status_get_start(struct netlink_callback *cb)
> > +{
> > + return 0;
> > +}
> > +
> > +int nfsd_server_nl_rpc_status_get_done(struct netlink_callback *cb)
> > +{
> > + return 0;
> > +}
> > +
> > +int nfsd_server_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > + struct netlink_callback *cb)
> > +{
> > + return 0;
> > +}
> > +
> > /**
> > * nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
> > * @net: a freshly-created network namespace
> > diff --git a/include/uapi/linux/nfsd_server.h b/include/uapi/linux/nfsd_server.h
> > new file mode 100644
> > index 000000000000..c9ee00ceca3b
> > --- /dev/null
> > +++ b/include/uapi/linux/nfsd_server.h
> > @@ -0,0 +1,49 @@
> > +/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
> > +/* Do not edit directly, auto-generated from: */
> > +/* Documentation/netlink/specs/nfsd_server.yaml */
> > +/* YNL-GEN uapi header */
> > +
> > +#ifndef _UAPI_LINUX_NFSD_SERVER_H
> > +#define _UAPI_LINUX_NFSD_SERVER_H
> > +
> > +#define NFSD_SERVER_FAMILY_NAME "nfsd_server"
> > +#define NFSD_SERVER_FAMILY_VERSION 1
> > +
> > +enum nfsd_rpc_status_comp_attr {
> > + NFSD_ATTR_RPC_STATUS_COMP_UNSPEC,
> > + NFSD_ATTR_RPC_STATUS_COMP_OP,
> > +
> > + __NFSD_ATTR_RPC_STATUS_COMP_MAX,
> > + NFSD_ATTR_RPC_STATUS_COMP_MAX = (__NFSD_ATTR_RPC_STATUS_COMP_MAX - 1)
> > +};
> > +
> > +enum nfsd_rpc_status_attr {
> > + NFSD_ATTR_RPC_STATUS_UNSPEC,
> > + NFSD_ATTR_RPC_STATUS_XID,
> > + NFSD_ATTR_RPC_STATUS_FLAGS,
> > + NFSD_ATTR_RPC_STATUS_PROG,
> > + NFSD_ATTR_RPC_STATUS_VERSION,
> > + NFSD_ATTR_RPC_STATUS_PROC,
> > + NFSD_ATTR_RPC_STATUS_SERVICE_TIME,
> > + NFSD_ATTR_RPC_STATUS_PAD,
> > + NFSD_ATTR_RPC_STATUS_SADDR4,
> > + NFSD_ATTR_RPC_STATUS_DADDR4,
> > + NFSD_ATTR_RPC_STATUS_SADDR6,
> > + NFSD_ATTR_RPC_STATUS_DADDR6,
> > + NFSD_ATTR_RPC_STATUS_SPORT,
> > + NFSD_ATTR_RPC_STATUS_DPORT,
> > + NFSD_ATTR_RPC_STATUS_COMPOND_OP,
> > +
> > + __NFSD_ATTR_RPC_STATUS_MAX,
> > + NFSD_ATTR_RPC_STATUS_MAX = (__NFSD_ATTR_RPC_STATUS_MAX - 1)
> > +};
> > +
> > +enum nfsd_commands {
> > + NFSD_CMD_UNSPEC,
> > + NFSD_CMD_RPC_STATUS_GET,
> > +
> > + __NFSD_CMD_MAX,
> > + NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
> > +};
> > +
> > +#endif /* _UAPI_LINUX_NFSD_SERVER_H */
> > --
> > 2.41.0
> >
>
> --
> Chuck Lever
>


Attachments:
(No filename) (7.04 kB)
signature.asc (235.00 B)
Download all attachments

2023-09-15 21:39:30

by Lorenzo Bianconi

[permalink] [raw]
Subject: Re: [PATCH v8 0/3] add rpc_status netlink support for NFSD

[...]
> > Documentation/netlink/specs/nfsd_server.yaml | 97 +++++++++
> > fs/nfsd/Makefile | 3 +-
> > fs/nfsd/nfs_netlink_gen.c | 32 +++
> > fs/nfsd/nfs_netlink_gen.h | 22 ++
> > fs/nfsd/nfsctl.c | 204 +++++++++++++++++++
> > fs/nfsd/nfsd.h | 16 ++
> > fs/nfsd/nfssvc.c | 15 ++
> > fs/nfsd/state.h | 2 -
> > include/linux/sunrpc/svc.h | 1 +
> > include/uapi/linux/nfsd_server.h | 49 +++++
> > 10 files changed, 438 insertions(+), 3 deletions(-)
> > create mode 100644 Documentation/netlink/specs/nfsd_server.yaml
> > create mode 100644 fs/nfsd/nfs_netlink_gen.c
> > create mode 100644 fs/nfsd/nfs_netlink_gen.h
> > create mode 100644 include/uapi/linux/nfsd_server.h
>
> Hi Lorenzo -
>
> I've applied these three to nfsd-next with the following changes:
>
> - Renaming as we discussed
> - Replaced the nested compound_op attribute -- may require some user
> space tooling changes
> - Simon's Smatch bug fixed
> - Squashed 1/3 and 2/3 into one patch
> - Added Closes/Acked-by etc

Hi Chuck,

Thanks for addressing the points above.

>
> If you spot a bug, send patches against nfsd-next and I can squash
> them in.

ack, I will do

>
> I was wondering if you have a little more time to try adding one or
> two control cmds. write_threads and v4_end_grace might be simple
> ones to start with. No problem if you are "done" with this project,
> I can add these over time.

sure, no worries. I will look into them soon :)

Regards,
Lorenzo

>
> --
> Chuck Lever


Attachments:
(No filename) (1.72 kB)
signature.asc (235.00 B)
Download all attachments

2023-09-15 21:45:55

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v8 0/3] add rpc_status netlink support for NFSD

On Mon, Sep 11, 2023 at 02:49:43PM +0200, Lorenzo Bianconi wrote:
> Introduce rpc_status netlink support for NFSD in order to dump pending
> RPC requests debugging information from userspace.
> The new code can be tested with the user-space tool reported below:
> https://github.com/LorenzoBianconi/nfsd-rpc-netlink-monitor/tree/main
>
> Changes since v7:
> - introduce nfsd_server.yaml for netlink messages
>
> Changes since v6:
> - report info to user-space through netlink and get rid of the related
> entry in the procfs
>
> Changes since v5:
> - add missing delimiters for nfs4 compound ops
> - add a header line for rpc_status handler
> - do not dump rq_prog
> - do not dump rq_flags in hex
>
> Changes since v4:
> - rely on acquire/release APIs and get rid of atomic operation
> - fix kdoc for nfsd_rpc_status_open
> - get rid of ',' as field delimiter in nfsd_rpc_status hanlder
> - move nfsd_rpc_status before nfsd_v4 enum entries
> - fix compilantion error if nfsdv4 is not enabled
>
> Changes since v3:
> - introduce rq_status_counter in order to detect if the RPC request is
> pending and RPC info are stable
> - rely on __svc_print_addr to dump IP info
>
> Changes since v2:
> - minor changes in nfsd_rpc_status_show output
>
> Changes since v1:
> - rework nfsd_rpc_status_show output
>
> Changes since RFCv1:
> - riduce time holding nfsd_mutex bumping svc_serv refcoung in
> nfsd_rpc_status_open()
> - dump rqstp->rq_stime
> - add missing kdoc for nfsd_rpc_status_open()
>
> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=3D3D3D366
>
> Lorenzo Bianconi (3):
> Documentation: netlink: add a YAML spec for nfsd_server
> NFSD: introduce netlink rpc_status stubs
> NFSD: add rpc_status netlink support
>
> Documentation/netlink/specs/nfsd_server.yaml | 97 +++++++++
> fs/nfsd/Makefile | 3 +-
> fs/nfsd/nfs_netlink_gen.c | 32 +++
> fs/nfsd/nfs_netlink_gen.h | 22 ++
> fs/nfsd/nfsctl.c | 204 +++++++++++++++++++
> fs/nfsd/nfsd.h | 16 ++
> fs/nfsd/nfssvc.c | 15 ++
> fs/nfsd/state.h | 2 -
> include/linux/sunrpc/svc.h | 1 +
> include/uapi/linux/nfsd_server.h | 49 +++++
> 10 files changed, 438 insertions(+), 3 deletions(-)
> create mode 100644 Documentation/netlink/specs/nfsd_server.yaml
> create mode 100644 fs/nfsd/nfs_netlink_gen.c
> create mode 100644 fs/nfsd/nfs_netlink_gen.h
> create mode 100644 include/uapi/linux/nfsd_server.h

Hi Lorenzo -

I've applied these three to nfsd-next with the following changes:

- Renaming as we discussed
- Replaced the nested compound_op attribute -- may require some user
space tooling changes
- Simon's Smatch bug fixed
- Squashed 1/3 and 2/3 into one patch
- Added Closes/Acked-by etc

If you spot a bug, send patches against nfsd-next and I can squash
them in.

I was wondering if you have a little more time to try adding one or
two control cmds. write_threads and v4_end_grace might be simple
ones to start with. No problem if you are "done" with this project,
I can add these over time.

--
Chuck Lever

2023-10-03 17:56:15

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH v8 1/3] Documentation: netlink: add a YAML spec for nfsd_server

On Mon, 11 Sep 2023 14:49:44 +0200 Lorenzo Bianconi wrote:
> Introduce nfsd_server.yaml specs to generate uAPI and netlink
> code for nfsd server.
> Add rpc-status specs to define message reported by the nfsd server
> dumping the pending RPC requests.

Sorry for the delay, some minor "take it or leave it" nits below.

> +doc:
> + nfsd server configuration over generic netlink.
> +
> +attribute-sets:
> + -
> + name: rpc-status-comp-op-attr
> + enum-name: nfsd-rpc-status-comp-attr
> + name-prefix: nfsd-attr-rpc-status-comp-
> + attributes:
> + -
> + name: unspec
> + type: unused
> + value: 0

the unused attrs can usually be skipped, the specs now start with value
of 1 by default. Same for the unused command.

> + -
> + name: dport
> + type: u16
> + byte-order: big-endian
> + -
> + name: compond-op
> + type: array-nest

Avoid array-nests if you can, they are legacy (does this spec pass JSON
schema validation?).

There's only one attribute in the nest, can you use

-
name: op
type: u32
multi-attr: true

?

> + nested-attributes: rpc-status-comp-op-attr

> + -
> + name: rpc-status-get
> + doc: dump pending nfsd rpc
> + attribute-set: rpc-status-attr
> + dump:
> + pre: nfsd-server-nl-rpc-status-get-start
> + post: nfsd-server-nl-rpc-status-get-done

No attributes listed? User space C codegen will need those to make
sense of the commands.

2023-10-03 18:04:04

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH v8 3/3] NFSD: add rpc_status netlink support

On Mon, 11 Sep 2023 14:49:46 +0200 Lorenzo Bianconi wrote:
> + hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
> + &nfsd_server_nl_family, NLM_F_MULTI,
> + NFSD_CMD_RPC_STATUS_GET);
> + if (!hdr)
> + return -ENOBUFS;

Why NLM_F_MULTI? AFAIU that means "I'm splitting one object over
multiple messages". 99% of the time the right thing to do is change
what we consider to be "an object" rather than do F_MULTI. In theory
user space should re-constitute all the NLM_F_MULTI messages into as
single object, which none of YNL does today :(

2023-10-03 18:40:57

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v8 1/3] Documentation: netlink: add a YAML spec for nfsd_server

Hi Jakub-

> On Oct 3, 2023, at 1:55 PM, Jakub Kicinski <[email protected]> wrote:
>
> On Mon, 11 Sep 2023 14:49:44 +0200 Lorenzo Bianconi wrote:
>> Introduce nfsd_server.yaml specs to generate uAPI and netlink
>> code for nfsd server.
>> Add rpc-status specs to define message reported by the nfsd server
>> dumping the pending RPC requests.
>
> Sorry for the delay, some minor "take it or leave it" nits below.
>
>> +doc:
>> + nfsd server configuration over generic netlink.
>> +
>> +attribute-sets:
>> + -
>> + name: rpc-status-comp-op-attr
>> + enum-name: nfsd-rpc-status-comp-attr
>> + name-prefix: nfsd-attr-rpc-status-comp-
>> + attributes:
>> + -
>> + name: unspec
>> + type: unused
>> + value: 0
>
> the unused attrs can usually be skipped, the specs now start with value
> of 1 by default. Same for the unused command.
>
>> + -
>> + name: dport
>> + type: u16
>> + byte-order: big-endian
>> + -
>> + name: compond-op
>> + type: array-nest
>
> Avoid array-nests if you can, they are legacy (does this spec pass JSON
> schema validation?).
>
> There's only one attribute in the nest, can you use
>
> -
> name: op
> type: u32
> multi-attr: true
>
> ?

I've made similar modifications to Lorenzo's original
contribution. The current updated version of this spec
is here:

https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=nfsd-next&id=55f55c77f624066895470306898190f090e00cda


>> + nested-attributes: rpc-status-comp-op-attr
>
>> + -
>> + name: rpc-status-get
>> + doc: dump pending nfsd rpc
>> + attribute-set: rpc-status-attr
>> + dump:
>> + pre: nfsd-server-nl-rpc-status-get-start
>> + post: nfsd-server-nl-rpc-status-get-done
>
> No attributes listed? User space C codegen will need those to make
> sense of the commands.


--
Chuck Lever


2023-10-03 19:03:09

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH v8 1/3] Documentation: netlink: add a YAML spec for nfsd_server

On Tue, 3 Oct 2023 18:40:29 +0000 Chuck Lever III wrote:
> I've made similar modifications to Lorenzo's original
> contribution. The current updated version of this spec
> is here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=nfsd-next&id=55f55c77f624066895470306898190f090e00cda

Great! I noticed too late :)
Just the attr listing is missing in the spec, then.

2023-10-03 23:00:46

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v8 1/3] Documentation: netlink: add a YAML spec for nfsd_server



> On Oct 3, 2023, at 3:02 PM, Jakub Kicinski <[email protected]> wrote:
>
> On Tue, 3 Oct 2023 18:40:29 +0000 Chuck Lever III wrote:
>> I've made similar modifications to Lorenzo's original
>> contribution. The current updated version of this spec
>> is here:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/commit/?h=nfsd-next&id=55f55c77f624066895470306898190f090e00cda
>
> Great! I noticed too late :)
> Just the attr listing is missing in the spec, then.

To ensure that I understand you correctly:

diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
index 403d3e3a04f3..f6a9f3da6291 100644
--- a/Documentation/netlink/specs/nfsd.yaml
+++ b/Documentation/netlink/specs/nfsd.yaml
@@ -72,3 +72,19 @@ operations:
dump:
pre: nfsd-nl-rpc-status-get-start
post: nfsd-nl-rpc-status-get-done
+ reply:
+ attributes:
+ - xid
+ - flags
+ - prog
+ - version
+ - proc
+ - service_time
+ - pad
+ - saddr4
+ - daddr4
+ - saddr6
+ - daddr6
+ - sport
+ - dport
+ - compound-ops

Yes?

--
Chuck Lever


2023-10-04 00:24:30

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH v8 1/3] Documentation: netlink: add a YAML spec for nfsd_server

On Tue, 3 Oct 2023 23:00:09 +0000 Chuck Lever III wrote:
> To ensure that I understand you correctly:
>
> diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml
> index 403d3e3a04f3..f6a9f3da6291 100644
> --- a/Documentation/netlink/specs/nfsd.yaml
> +++ b/Documentation/netlink/specs/nfsd.yaml
> @@ -72,3 +72,19 @@ operations:
> dump:
> pre: nfsd-nl-rpc-status-get-start
> post: nfsd-nl-rpc-status-get-done
> + reply:
> + attributes:
> + - xid
> + - flags
> + - prog
> + - version
> + - proc
> + - service_time
> + - pad
> + - saddr4
> + - daddr4
> + - saddr6
> + - daddr6
> + - sport
> + - dport
> + - compound-ops

Yup! (the pad can be skipped since it's not rendered, anyway).

2023-10-04 10:16:28

by Lorenzo Bianconi

[permalink] [raw]
Subject: Re: [PATCH v8 3/3] NFSD: add rpc_status netlink support

> On Mon, 11 Sep 2023 14:49:46 +0200 Lorenzo Bianconi wrote:
> > + hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
> > + &nfsd_server_nl_family, NLM_F_MULTI,
> > + NFSD_CMD_RPC_STATUS_GET);
> > + if (!hdr)
> > + return -ENOBUFS;
>
> Why NLM_F_MULTI? AFAIU that means "I'm splitting one object over
> multiple messages". 99% of the time the right thing to do is change
> what we consider to be "an object" rather than do F_MULTI. In theory
> user space should re-constitute all the NLM_F_MULTI messages into as
> single object, which none of YNL does today :(
>
ack, fine. I think we can get rid of it.
@chuck: do you want me to send a patch or are you taking care of it?

Regards,
Lorenzo


Attachments:
(No filename) (742.00 B)
signature.asc (235.00 B)
Download all attachments

2023-10-04 13:27:42

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v8 3/3] NFSD: add rpc_status netlink support



> On Oct 4, 2023, at 6:14 AM, Lorenzo Bianconi <[email protected]> wrote:
>
>> On Mon, 11 Sep 2023 14:49:46 +0200 Lorenzo Bianconi wrote:
>>> + hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
>>> + &nfsd_server_nl_family, NLM_F_MULTI,
>>> + NFSD_CMD_RPC_STATUS_GET);
>>> + if (!hdr)
>>> + return -ENOBUFS;
>>
>> Why NLM_F_MULTI? AFAIU that means "I'm splitting one object over
>> multiple messages". 99% of the time the right thing to do is change
>> what we consider to be "an object" rather than do F_MULTI. In theory
>> user space should re-constitute all the NLM_F_MULTI messages into as
>> single object, which none of YNL does today :(
>>
> ack, fine. I think we can get rid of it.
> @chuck: do you want me to send a patch or are you taking care of it?

Send a (tested) patch and I can squash it into this one.


--
Chuck Lever


2023-10-04 14:05:45

by Lorenzo Bianconi

[permalink] [raw]
Subject: Re: [PATCH v8 3/3] NFSD: add rpc_status netlink support

>
>
> > On Oct 4, 2023, at 6:14 AM, Lorenzo Bianconi <[email protected]> wrote:
> >
> >> On Mon, 11 Sep 2023 14:49:46 +0200 Lorenzo Bianconi wrote:
> >>> + hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
> >>> + &nfsd_server_nl_family, NLM_F_MULTI,
> >>> + NFSD_CMD_RPC_STATUS_GET);
> >>> + if (!hdr)
> >>> + return -ENOBUFS;
> >>
> >> Why NLM_F_MULTI? AFAIU that means "I'm splitting one object over
> >> multiple messages". 99% of the time the right thing to do is change
> >> what we consider to be "an object" rather than do F_MULTI. In theory
> >> user space should re-constitute all the NLM_F_MULTI messages into as
> >> single object, which none of YNL does today :(
> >>
> > ack, fine. I think we can get rid of it.
> > @chuck: do you want me to send a patch or are you taking care of it?
>
> Send a (tested) patch and I can squash it into this one.

ack, I will do.

Regards,
Lorenzo

>
>
> --
> Chuck Lever
>
>


Attachments:
(No filename) (0.98 kB)
signature.asc (235.00 B)
Download all attachments

2023-12-11 18:56:12

by Jeffrey Layton

[permalink] [raw]
Subject: Re: [PATCH v8 3/3] NFSD: add rpc_status netlink support

On Mon, 2023-09-11 at 14:49 +0200, Lorenzo Bianconi wrote:
> Introduce rpc_status netlink support for NFSD in order to dump pending
> RPC requests debugging information from userspace.
>
> Tested-by: Jeff Layton <[email protected]>
> Signed-off-by: Lorenzo Bianconi <[email protected]>
> ---
> fs/nfsd/nfsctl.c | 192 ++++++++++++++++++++++++++++++++++++-
> fs/nfsd/nfsd.h | 16 ++++
> fs/nfsd/nfssvc.c | 15 +++
> fs/nfsd/state.h | 2 -
> include/linux/sunrpc/svc.h | 1 +
> 5 files changed, 222 insertions(+), 4 deletions(-)
>
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 1be66088849c..b862a759ea15 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -26,6 +26,7 @@
> #include "pnfs.h"
> #include "filecache.h"
> #include "trace.h"
> +#include "nfs_netlink_gen.h"
>
> /*
> * We have a single directory with several nodes in it.
> @@ -1497,17 +1498,199 @@ unsigned int nfsd_net_id;
>
> int nfsd_server_nl_rpc_status_get_start(struct netlink_callback *cb)
> {
> - return 0;
> + struct nfsd_net *nn = net_generic(sock_net(cb->skb->sk), nfsd_net_id);
> + int ret = -ENODEV;
> +
> + mutex_lock(&nfsd_mutex);
> + if (nn->nfsd_serv) {
> + svc_get(nn->nfsd_serv);
> + ret = 0;
> + }
> + mutex_unlock(&nfsd_mutex);
> +
> + return ret;
> }

I think there is a potential race above. Once you've dropped the
nfsd_mutex, there is no guarantee that the nn->nfsd_serv will still be
set when you come back to put the serv. That means that we could oops
when we hit the _done method below.

Is it possible to stash a pointer to the serv while we hold the
reference?

>
> -int nfsd_server_nl_rpc_status_get_done(struct netlink_callback *cb)
> +static int nfsd_genl_rpc_status_compose_msg(struct sk_buff *skb,
> + struct netlink_callback *cb,
> + struct nfsd_genl_rqstp *rqstp)
> {
> + void *hdr;
> + int i;
> +
> + hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
> + &nfsd_server_nl_family, NLM_F_MULTI,
> + NFSD_CMD_RPC_STATUS_GET);
> + if (!hdr)
> + return -ENOBUFS;
> +
> + if (nla_put_be32(skb, NFSD_ATTR_RPC_STATUS_XID, rqstp->rq_xid) ||
> + nla_put_u32(skb, NFSD_ATTR_RPC_STATUS_FLAGS, rqstp->rq_flags) ||
> + nla_put_u32(skb, NFSD_ATTR_RPC_STATUS_PROG, rqstp->rq_prog) ||
> + nla_put_u32(skb, NFSD_ATTR_RPC_STATUS_PROC, rqstp->rq_proc) ||
> + nla_put_u8(skb, NFSD_ATTR_RPC_STATUS_VERSION, rqstp->rq_vers) ||
> + nla_put_s64(skb, NFSD_ATTR_RPC_STATUS_SERVICE_TIME,
> + ktime_to_us(rqstp->rq_stime),
> + NFSD_ATTR_RPC_STATUS_PAD))
> + return -ENOBUFS;
> +
> + switch (rqstp->saddr.sa_family) {
> + case AF_INET: {
> + const struct sockaddr_in *s_in, *d_in;
> +
> + s_in = (const struct sockaddr_in *)&rqstp->saddr;
> + d_in = (const struct sockaddr_in *)&rqstp->daddr;
> + if (nla_put_in_addr(skb, NFSD_ATTR_RPC_STATUS_SADDR4,
> + s_in->sin_addr.s_addr) ||
> + nla_put_in_addr(skb, NFSD_ATTR_RPC_STATUS_DADDR4,
> + d_in->sin_addr.s_addr) ||
> + nla_put_be16(skb, NFSD_ATTR_RPC_STATUS_SPORT,
> + s_in->sin_port) ||
> + nla_put_be16(skb, NFSD_ATTR_RPC_STATUS_DPORT,
> + d_in->sin_port))
> + return -ENOBUFS;
> + break;
> + }
> + case AF_INET6: {
> + const struct sockaddr_in6 *s_in, *d_in;
> +
> + s_in = (const struct sockaddr_in6 *)&rqstp->saddr;
> + d_in = (const struct sockaddr_in6 *)&rqstp->daddr;
> + if (nla_put_in6_addr(skb, NFSD_ATTR_RPC_STATUS_SADDR6,
> + &s_in->sin6_addr) ||
> + nla_put_in6_addr(skb, NFSD_ATTR_RPC_STATUS_DADDR6,
> + &d_in->sin6_addr) ||
> + nla_put_be16(skb, NFSD_ATTR_RPC_STATUS_SPORT,
> + s_in->sin6_port) ||
> + nla_put_be16(skb, NFSD_ATTR_RPC_STATUS_DPORT,
> + d_in->sin6_port))
> + return -ENOBUFS;
> + break;
> + }
> + default:
> + break;
> + }
> +
> + if (rqstp->opcnt) {
> + struct nlattr *attr;
> +
> + attr = nla_nest_start(skb, NFSD_ATTR_RPC_STATUS_COMPOND_OP);
> + if (!attr)
> + return -ENOBUFS;
> +
> + for (i = 0; i < rqstp->opcnt; i++) {
> + struct nlattr *op_attr;
> +
> + op_attr = nla_nest_start(skb, i);
> + if (!op_attr)
> + return -ENOBUFS;
> +
> + if (nla_put_u32(skb, NFSD_ATTR_RPC_STATUS_COMP_OP,
> + rqstp->opnum[i]))
> + return -ENOBUFS;
> +
> + nla_nest_end(skb, op_attr);
> + }
> +
> + nla_nest_end(skb, attr);
> + }
> +
> + genlmsg_end(skb, hdr);
> +
> return 0;
> }
>
> int nfsd_server_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> struct netlink_callback *cb)
> {
> + struct nfsd_net *nn = net_generic(sock_net(skb->sk), nfsd_net_id);
> + int i, ret, rqstp_index;
> +
> + rcu_read_lock();
> +
> + for (i = 0; i < nn->nfsd_serv->sv_nrpools; i++) {
> + struct svc_rqst *rqstp;
> +
> + if (i < cb->args[0]) /* already consumed */
> + continue;
> +
> + rqstp_index = 0;
> + list_for_each_entry_rcu(rqstp,
> + &nn->nfsd_serv->sv_pools[i].sp_all_threads,
> + rq_all) {
> + struct nfsd_genl_rqstp genl_rqstp;
> + unsigned int status_counter;
> +
> + if (rqstp_index++ < cb->args[1]) /* already consumed */
> + continue;
> + /*
> + * Acquire rq_status_counter before parsing the rqst
> + * fields. rq_status_counter is set to an odd value in
> + * order to notify the consumers the rqstp fields are
> + * meaningful.
> + */
> + status_counter =
> + smp_load_acquire(&rqstp->rq_status_counter);
> + if (!(status_counter & 1))
> + continue;
> +
> + genl_rqstp.rq_xid = rqstp->rq_xid;
> + genl_rqstp.rq_flags = rqstp->rq_flags;
> + genl_rqstp.rq_vers = rqstp->rq_vers;
> + genl_rqstp.rq_prog = rqstp->rq_prog;
> + genl_rqstp.rq_proc = rqstp->rq_proc;
> + genl_rqstp.rq_stime = rqstp->rq_stime;
> + genl_rqstp.opcnt = 0;
> + memcpy(&genl_rqstp.daddr, svc_daddr(rqstp),
> + sizeof(struct sockaddr));
> + memcpy(&genl_rqstp.saddr, svc_addr(rqstp),
> + sizeof(struct sockaddr));
> +
> +#ifdef CONFIG_NFSD_V4
> + if (rqstp->rq_vers == NFS4_VERSION &&
> + rqstp->rq_proc == NFSPROC4_COMPOUND) {
> + /* NFSv4 compund */
> + struct nfsd4_compoundargs *args;
> + int j;
> +
> + args = rqstp->rq_argp;
> + genl_rqstp.opcnt = args->opcnt;
> + for (j = 0; j < genl_rqstp.opcnt; j++)
> + genl_rqstp.opnum[j] =
> + args->ops[j].opnum;
> + }
> +#endif /* CONFIG_NFSD_V4 */
> +
> + /*
> + * Acquire rq_status_counter before reporting the rqst
> + * fields to the user.
> + */
> + if (smp_load_acquire(&rqstp->rq_status_counter) !=
> + status_counter)
> + continue;
> +
> + ret = nfsd_genl_rpc_status_compose_msg(skb, cb,
> + &genl_rqstp);
> + if (ret)
> + goto out;
> + }
> + }
> +
> + cb->args[0] = i;
> + cb->args[1] = rqstp_index;
> + ret = skb->len;
> +out:
> + rcu_read_unlock();
> +
> + return ret;
> +}
> +
> +int nfsd_server_nl_rpc_status_get_done(struct netlink_callback *cb)
> +{
> + mutex_lock(&nfsd_mutex);
> + nfsd_put(sock_net(cb->skb->sk));
> + mutex_unlock(&nfsd_mutex);
> +
> return 0;
> }
>

I think there is a potential race above. Once you've


> @@ -1605,6 +1788,10 @@ static int __init init_nfsd(void)
> retval = register_filesystem(&nfsd_fs_type);
> if (retval)
> goto out_free_all;
> + retval = genl_register_family(&nfsd_server_nl_family);
> + if (retval)
> + goto out_free_all;
> +
> return 0;
> out_free_all:
> nfsd4_destroy_laundry_wq();
> @@ -1629,6 +1816,7 @@ static int __init init_nfsd(void)
>
> static void __exit exit_nfsd(void)
> {
> + genl_unregister_family(&nfsd_server_nl_family);
> unregister_filesystem(&nfsd_fs_type);
> nfsd4_destroy_laundry_wq();
> unregister_cld_notifier();
> diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> index 11c14faa6c67..d787bd38c053 100644
> --- a/fs/nfsd/nfsd.h
> +++ b/fs/nfsd/nfsd.h
> @@ -62,6 +62,22 @@ struct readdir_cd {
> __be32 err; /* 0, nfserr, or nfserr_eof */
> };
>
> +/* Maximum number of operations per session compound */
> +#define NFSD_MAX_OPS_PER_COMPOUND 50
> +
> +struct nfsd_genl_rqstp {
> + struct sockaddr daddr;
> + struct sockaddr saddr;
> + unsigned long rq_flags;
> + ktime_t rq_stime;
> + __be32 rq_xid;
> + u32 rq_vers;
> + u32 rq_prog;
> + u32 rq_proc;
> + /* NFSv4 compund */
> + u32 opnum[NFSD_MAX_OPS_PER_COMPOUND];
> + u16 opcnt;
> +};
>
> extern struct svc_program nfsd_program;
> extern const struct svc_version nfsd_version2, nfsd_version3, nfsd_version4;
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 1582af33e204..fad34a7325b3 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -998,6 +998,15 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
> if (!proc->pc_decode(rqstp, &rqstp->rq_arg_stream))
> goto out_decode_err;
>
> + /*
> + * Release rq_status_counter setting it to an odd value after the rpc
> + * request has been properly parsed. rq_status_counter is used to
> + * notify the consumers if the rqstp fields are stable
> + * (rq_status_counter is odd) or not meaningful (rq_status_counter
> + * is even).
> + */
> + smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter | 1);
> +
> rp = NULL;
> switch (nfsd_cache_lookup(rqstp, &rp)) {
> case RC_DOIT:
> @@ -1015,6 +1024,12 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
> if (!proc->pc_encode(rqstp, &rqstp->rq_res_stream))
> goto out_encode_err;
>
> + /*
> + * Release rq_status_counter setting it to an even value after the rpc
> + * request has been properly processed.
> + */
> + smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter + 1);
> +
> nfsd_cache_update(rqstp, rp, rqstp->rq_cachetype, statp + 1);
> out_cached_reply:
> return 1;
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index cbddcf484dba..41bdc913fa71 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -174,8 +174,6 @@ static inline struct nfs4_delegation *delegstateid(struct nfs4_stid *s)
>
> /* Maximum number of slots per session. 160 is useful for long haul TCP */
> #define NFSD_MAX_SLOTS_PER_SESSION 160
> -/* Maximum number of operations per session compound */
> -#define NFSD_MAX_OPS_PER_COMPOUND 50
> /* Maximum session per slot cache size */
> #define NFSD_SLOT_CACHE_SIZE 2048
> /* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index dbf5b21feafe..caa20defd255 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -251,6 +251,7 @@ struct svc_rqst {
> * net namespace
> */
> void ** rq_lease_breaker; /* The v4 client breaking a lease */
> + unsigned int rq_status_counter; /* RPC processing counter */
> };
>
> /* bits for rq_flags */

--
Jeff Layton <[email protected]>

2023-12-12 12:07:42

by Jeffrey Layton

[permalink] [raw]
Subject: Re: [PATCH v8 3/3] NFSD: add rpc_status netlink support

On Mon, 2023-12-11 at 13:56 -0500, Jeff Layton wrote:
> On Mon, 2023-09-11 at 14:49 +0200, Lorenzo Bianconi wrote:
> > Introduce rpc_status netlink support for NFSD in order to dump pending
> > RPC requests debugging information from userspace.
> >
> > Tested-by: Jeff Layton <[email protected]>
> > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > ---
> > fs/nfsd/nfsctl.c | 192 ++++++++++++++++++++++++++++++++++++-
> > fs/nfsd/nfsd.h | 16 ++++
> > fs/nfsd/nfssvc.c | 15 +++
> > fs/nfsd/state.h | 2 -
> > include/linux/sunrpc/svc.h | 1 +
> > 5 files changed, 222 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index 1be66088849c..b862a759ea15 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -26,6 +26,7 @@
> > #include "pnfs.h"
> > #include "filecache.h"
> > #include "trace.h"
> > +#include "nfs_netlink_gen.h"
> >
> > /*
> > * We have a single directory with several nodes in it.
> > @@ -1497,17 +1498,199 @@ unsigned int nfsd_net_id;
> >
> > int nfsd_server_nl_rpc_status_get_start(struct netlink_callback *cb)
> > {
> > - return 0;
> > + struct nfsd_net *nn = net_generic(sock_net(cb->skb->sk), nfsd_net_id);
> > + int ret = -ENODEV;
> > +
> > + mutex_lock(&nfsd_mutex);
> > + if (nn->nfsd_serv) {
> > + svc_get(nn->nfsd_serv);
> > + ret = 0;
> > + }
> > + mutex_unlock(&nfsd_mutex);
> > +
> > + return ret;
> > }
>
> I think there is a potential race above. Once you've dropped the
> nfsd_mutex, there is no guarantee that the nn->nfsd_serv will still be
> set when you come back to put the serv. That means that we could oops
> when we hit the _done method below.
>
> Is it possible to stash a pointer to the serv while we hold the
> reference?
>

Actually, it looks like Neil may have already fixed this in the series
he sent on Oct 29th. See:

[PATCH 3/5] nfsd: hold nfsd_mutex across entire netlink operation

Another reason to go ahead and get that series in...

> >
> > -int nfsd_server_nl_rpc_status_get_done(struct netlink_callback *cb)
> > +static int nfsd_genl_rpc_status_compose_msg(struct sk_buff *skb,
> > + struct netlink_callback *cb,
> > + struct nfsd_genl_rqstp *rqstp)
> > {
> > + void *hdr;
> > + int i;
> > +
> > + hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
> > + &nfsd_server_nl_family, NLM_F_MULTI,
> > + NFSD_CMD_RPC_STATUS_GET);
> > + if (!hdr)
> > + return -ENOBUFS;
> > +
> > + if (nla_put_be32(skb, NFSD_ATTR_RPC_STATUS_XID, rqstp->rq_xid) ||
> > + nla_put_u32(skb, NFSD_ATTR_RPC_STATUS_FLAGS, rqstp->rq_flags) ||
> > + nla_put_u32(skb, NFSD_ATTR_RPC_STATUS_PROG, rqstp->rq_prog) ||
> > + nla_put_u32(skb, NFSD_ATTR_RPC_STATUS_PROC, rqstp->rq_proc) ||
> > + nla_put_u8(skb, NFSD_ATTR_RPC_STATUS_VERSION, rqstp->rq_vers) ||
> > + nla_put_s64(skb, NFSD_ATTR_RPC_STATUS_SERVICE_TIME,
> > + ktime_to_us(rqstp->rq_stime),
> > + NFSD_ATTR_RPC_STATUS_PAD))
> > + return -ENOBUFS;
> > +
> > + switch (rqstp->saddr.sa_family) {
> > + case AF_INET: {
> > + const struct sockaddr_in *s_in, *d_in;
> > +
> > + s_in = (const struct sockaddr_in *)&rqstp->saddr;
> > + d_in = (const struct sockaddr_in *)&rqstp->daddr;
> > + if (nla_put_in_addr(skb, NFSD_ATTR_RPC_STATUS_SADDR4,
> > + s_in->sin_addr.s_addr) ||
> > + nla_put_in_addr(skb, NFSD_ATTR_RPC_STATUS_DADDR4,
> > + d_in->sin_addr.s_addr) ||
> > + nla_put_be16(skb, NFSD_ATTR_RPC_STATUS_SPORT,
> > + s_in->sin_port) ||
> > + nla_put_be16(skb, NFSD_ATTR_RPC_STATUS_DPORT,
> > + d_in->sin_port))
> > + return -ENOBUFS;
> > + break;
> > + }
> > + case AF_INET6: {
> > + const struct sockaddr_in6 *s_in, *d_in;
> > +
> > + s_in = (const struct sockaddr_in6 *)&rqstp->saddr;
> > + d_in = (const struct sockaddr_in6 *)&rqstp->daddr;
> > + if (nla_put_in6_addr(skb, NFSD_ATTR_RPC_STATUS_SADDR6,
> > + &s_in->sin6_addr) ||
> > + nla_put_in6_addr(skb, NFSD_ATTR_RPC_STATUS_DADDR6,
> > + &d_in->sin6_addr) ||
> > + nla_put_be16(skb, NFSD_ATTR_RPC_STATUS_SPORT,
> > + s_in->sin6_port) ||
> > + nla_put_be16(skb, NFSD_ATTR_RPC_STATUS_DPORT,
> > + d_in->sin6_port))
> > + return -ENOBUFS;
> > + break;
> > + }
> > + default:
> > + break;
> > + }
> > +
> > + if (rqstp->opcnt) {
> > + struct nlattr *attr;
> > +
> > + attr = nla_nest_start(skb, NFSD_ATTR_RPC_STATUS_COMPOND_OP);
> > + if (!attr)
> > + return -ENOBUFS;
> > +
> > + for (i = 0; i < rqstp->opcnt; i++) {
> > + struct nlattr *op_attr;
> > +
> > + op_attr = nla_nest_start(skb, i);
> > + if (!op_attr)
> > + return -ENOBUFS;
> > +
> > + if (nla_put_u32(skb, NFSD_ATTR_RPC_STATUS_COMP_OP,
> > + rqstp->opnum[i]))
> > + return -ENOBUFS;
> > +
> > + nla_nest_end(skb, op_attr);
> > + }
> > +
> > + nla_nest_end(skb, attr);
> > + }
> > +
> > + genlmsg_end(skb, hdr);
> > +
> > return 0;
> > }
> >
> > int nfsd_server_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > struct netlink_callback *cb)
> > {
> > + struct nfsd_net *nn = net_generic(sock_net(skb->sk), nfsd_net_id);
> > + int i, ret, rqstp_index;
> > +
> > + rcu_read_lock();
> > +
> > + for (i = 0; i < nn->nfsd_serv->sv_nrpools; i++) {
> > + struct svc_rqst *rqstp;
> > +
> > + if (i < cb->args[0]) /* already consumed */
> > + continue;
> > +
> > + rqstp_index = 0;
> > + list_for_each_entry_rcu(rqstp,
> > + &nn->nfsd_serv->sv_pools[i].sp_all_threads,
> > + rq_all) {
> > + struct nfsd_genl_rqstp genl_rqstp;
> > + unsigned int status_counter;
> > +
> > + if (rqstp_index++ < cb->args[1]) /* already consumed */
> > + continue;
> > + /*
> > + * Acquire rq_status_counter before parsing the rqst
> > + * fields. rq_status_counter is set to an odd value in
> > + * order to notify the consumers the rqstp fields are
> > + * meaningful.
> > + */
> > + status_counter =
> > + smp_load_acquire(&rqstp->rq_status_counter);
> > + if (!(status_counter & 1))
> > + continue;
> > +
> > + genl_rqstp.rq_xid = rqstp->rq_xid;
> > + genl_rqstp.rq_flags = rqstp->rq_flags;
> > + genl_rqstp.rq_vers = rqstp->rq_vers;
> > + genl_rqstp.rq_prog = rqstp->rq_prog;
> > + genl_rqstp.rq_proc = rqstp->rq_proc;
> > + genl_rqstp.rq_stime = rqstp->rq_stime;
> > + genl_rqstp.opcnt = 0;
> > + memcpy(&genl_rqstp.daddr, svc_daddr(rqstp),
> > + sizeof(struct sockaddr));
> > + memcpy(&genl_rqstp.saddr, svc_addr(rqstp),
> > + sizeof(struct sockaddr));
> > +
> > +#ifdef CONFIG_NFSD_V4
> > + if (rqstp->rq_vers == NFS4_VERSION &&
> > + rqstp->rq_proc == NFSPROC4_COMPOUND) {
> > + /* NFSv4 compund */
> > + struct nfsd4_compoundargs *args;
> > + int j;
> > +
> > + args = rqstp->rq_argp;
> > + genl_rqstp.opcnt = args->opcnt;
> > + for (j = 0; j < genl_rqstp.opcnt; j++)
> > + genl_rqstp.opnum[j] =
> > + args->ops[j].opnum;
> > + }
> > +#endif /* CONFIG_NFSD_V4 */
> > +
> > + /*
> > + * Acquire rq_status_counter before reporting the rqst
> > + * fields to the user.
> > + */
> > + if (smp_load_acquire(&rqstp->rq_status_counter) !=
> > + status_counter)
> > + continue;
> > +
> > + ret = nfsd_genl_rpc_status_compose_msg(skb, cb,
> > + &genl_rqstp);
> > + if (ret)
> > + goto out;
> > + }
> > + }
> > +
> > + cb->args[0] = i;
> > + cb->args[1] = rqstp_index;
> > + ret = skb->len;
> > +out:
> > + rcu_read_unlock();
> > +
> > + return ret;
> > +}
> > +
> > +int nfsd_server_nl_rpc_status_get_done(struct netlink_callback *cb)
> > +{
> > + mutex_lock(&nfsd_mutex);
> > + nfsd_put(sock_net(cb->skb->sk));
> > + mutex_unlock(&nfsd_mutex);
> > +
> > return 0;
> > }
> >
>
> I think there is a potential race above. Once you've
>
>
> > @@ -1605,6 +1788,10 @@ static int __init init_nfsd(void)
> > retval = register_filesystem(&nfsd_fs_type);
> > if (retval)
> > goto out_free_all;
> > + retval = genl_register_family(&nfsd_server_nl_family);
> > + if (retval)
> > + goto out_free_all;
> > +
> > return 0;
> > out_free_all:
> > nfsd4_destroy_laundry_wq();
> > @@ -1629,6 +1816,7 @@ static int __init init_nfsd(void)
> >
> > static void __exit exit_nfsd(void)
> > {
> > + genl_unregister_family(&nfsd_server_nl_family);
> > unregister_filesystem(&nfsd_fs_type);
> > nfsd4_destroy_laundry_wq();
> > unregister_cld_notifier();
> > diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> > index 11c14faa6c67..d787bd38c053 100644
> > --- a/fs/nfsd/nfsd.h
> > +++ b/fs/nfsd/nfsd.h
> > @@ -62,6 +62,22 @@ struct readdir_cd {
> > __be32 err; /* 0, nfserr, or nfserr_eof */
> > };
> >
> > +/* Maximum number of operations per session compound */
> > +#define NFSD_MAX_OPS_PER_COMPOUND 50
> > +
> > +struct nfsd_genl_rqstp {
> > + struct sockaddr daddr;
> > + struct sockaddr saddr;
> > + unsigned long rq_flags;
> > + ktime_t rq_stime;
> > + __be32 rq_xid;
> > + u32 rq_vers;
> > + u32 rq_prog;
> > + u32 rq_proc;
> > + /* NFSv4 compund */
> > + u32 opnum[NFSD_MAX_OPS_PER_COMPOUND];
> > + u16 opcnt;
> > +};
> >
> > extern struct svc_program nfsd_program;
> > extern const struct svc_version nfsd_version2, nfsd_version3, nfsd_version4;
> > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> > index 1582af33e204..fad34a7325b3 100644
> > --- a/fs/nfsd/nfssvc.c
> > +++ b/fs/nfsd/nfssvc.c
> > @@ -998,6 +998,15 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
> > if (!proc->pc_decode(rqstp, &rqstp->rq_arg_stream))
> > goto out_decode_err;
> >
> > + /*
> > + * Release rq_status_counter setting it to an odd value after the rpc
> > + * request has been properly parsed. rq_status_counter is used to
> > + * notify the consumers if the rqstp fields are stable
> > + * (rq_status_counter is odd) or not meaningful (rq_status_counter
> > + * is even).
> > + */
> > + smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter | 1);
> > +
> > rp = NULL;
> > switch (nfsd_cache_lookup(rqstp, &rp)) {
> > case RC_DOIT:
> > @@ -1015,6 +1024,12 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
> > if (!proc->pc_encode(rqstp, &rqstp->rq_res_stream))
> > goto out_encode_err;
> >
> > + /*
> > + * Release rq_status_counter setting it to an even value after the rpc
> > + * request has been properly processed.
> > + */
> > + smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter + 1);
> > +
> > nfsd_cache_update(rqstp, rp, rqstp->rq_cachetype, statp + 1);
> > out_cached_reply:
> > return 1;
> > diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> > index cbddcf484dba..41bdc913fa71 100644
> > --- a/fs/nfsd/state.h
> > +++ b/fs/nfsd/state.h
> > @@ -174,8 +174,6 @@ static inline struct nfs4_delegation *delegstateid(struct nfs4_stid *s)
> >
> > /* Maximum number of slots per session. 160 is useful for long haul TCP */
> > #define NFSD_MAX_SLOTS_PER_SESSION 160
> > -/* Maximum number of operations per session compound */
> > -#define NFSD_MAX_OPS_PER_COMPOUND 50
> > /* Maximum session per slot cache size */
> > #define NFSD_SLOT_CACHE_SIZE 2048
> > /* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */
> > diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> > index dbf5b21feafe..caa20defd255 100644
> > --- a/include/linux/sunrpc/svc.h
> > +++ b/include/linux/sunrpc/svc.h
> > @@ -251,6 +251,7 @@ struct svc_rqst {
> > * net namespace
> > */
> > void ** rq_lease_breaker; /* The v4 client breaking a lease */
> > + unsigned int rq_status_counter; /* RPC processing counter */
> > };
> >
> > /* bits for rq_flags */
>

--
Jeff Layton <[email protected]>