From: "J. Bruce Fields" <[email protected]>
The following patches expose information about NFSv4 opens held by knfsd
on behalf of NFSv4 clients. Those are currently invisible to userspace,
unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
The approach is to add a new directory /proc/fs/nfsd/clients/ with
subdirectories for each active NFSv4 client. Each subdirectory has an
"info" file with some basic information to help identify the client and
an "opens" directory that lists the opens held by that client.
I got it working by cobbling together some poorly-understood code I
found in libfs, rpc_pipefs and elsewhere. If anyone wants to wade in
and tell me what I've got wrong, they're more than welcome, but at this
stage I'm more curious for feedback on the interface.
I'm also cc'ing people responsible for lsof and util-linux in case they
have any opinions.
Currently these pseudofiles look like:
# find /proc/fs/nfsd/clients -type f|xargs tail
==> /proc/fs/nfsd/clients/3741/opens <==
5cc0cd36/6debfb50/00000001/00000001 rw -- fd:10:13649 'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
5cc0cd36/6debfb50/00000003/00000001 r- -- fd:10:13650 'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
==> /proc/fs/nfsd/clients/3741/info <==
clientid: 6debfb505cc0cd36
address: 192.168.122.36:0
name: Linux NFSv4.2 test2.fieldses.org
minor version: 2
Each line of the "opens" file is tab-delimited and describes one open,
and the fields are stateid, open access bits, deny bits,
major:minor:ino, and open owner.
So, some random questions:
- I just copied the major:minor:ino thing from /proc/locks, I
suspect we would have picked something different to identify
inodes if /proc/locks were done now. (Mount id and inode?
Something else?)
- The open owner is just an opaque blob of binary data, but
clients may choose to include some useful asci-encoded
information, so I'm formatting them as strings with non-ascii
stuff escaped. For example, pynfs usually uses the name of
the test as the open owner. But as you see above, the ascii
content of the Linux client's open owners is less useful.
Also, there's no way I know of to map them back to a file
description or process or anything else useful on the client,
so perhaps they're of limited interest.
- I'm not sure about the stateid either. I did think it might
be useful just as a unique identifier for each line.
(Actually for that it'd be enough to take just the third of
those four numbers making up the stateid--maybe that would be
better.)
In the "info" file, the "name" line is the client identifier/client
owner provided by the client, which (like the stateowner) is just opaque
binary data, though as you can see here the Linux client is providing a
readable ascii string.
There's probably a lot more we could add to that info file eventually.
Other stuff to add next:
- nfsd/clients/#/kill that you can write to to revoke all a
client's state if it's wedged somehow.
- lists of locks and delegations; lower priority since most of
that information is already in /proc/locks.
--b.
J. Bruce Fields (10):
nfsd: persist nfsd filesystem across mounts
nfsd: rename cl_refcount
nfsd4: use reference count to free client
nfsd: add nfsd/clients directory
nfsd: make client/ directory names small ints
rpc: replace rpc_filelist by tree_descr
nfsd4: add a client info file
nfsd4: add file to display list of client's opens
nfsd: expose some more information about NFSv4 opens
nfsd: add more information to client info file
fs/nfsd/netns.h | 6 +
fs/nfsd/nfs4state.c | 228 ++++++++++++++++++++++++++++++---
fs/nfsd/nfsctl.c | 225 +++++++++++++++++++++++++++++++-
fs/nfsd/nfsd.h | 11 ++
fs/nfsd/state.h | 9 +-
fs/seq_file.c | 17 +++
include/linux/seq_file.h | 2 +
include/linux/string_helpers.h | 1 +
lib/string_helpers.c | 5 +-
net/sunrpc/rpc_pipe.c | 37 ++----
10 files changed, 491 insertions(+), 50 deletions(-)
--
2.20.1
From: "J. Bruce Fields" <[email protected]>
Keep around one internal mount of the nfsd filesystem so that we can
e.g. add stuff to it when clients come and go, regardless of whether
anyone has it mounted.
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/netns.h | 3 +++
fs/nfsd/nfsctl.c | 13 +++++++++++++
2 files changed, 16 insertions(+)
diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 32cb8c027483..cce335e1ec98 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -55,6 +55,9 @@ struct nfsd_net {
bool grace_ended;
time_t boot_time;
+ /* internal mount of the "nfsd" pseudofilesystem: */
+ struct vfsmount *nfsd_mnt;
+
/*
* reclaim_str_hashtbl[] holds known client info from previous reset/reboot
* used in reboot/reset lease grace period processing
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index f2feb2d11bae..8d2062428569 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1231,6 +1231,7 @@ unsigned int nfsd_net_id;
static __net_init int nfsd_init_net(struct net *net)
{
int retval;
+ struct vfsmount *mnt;
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
retval = nfsd_export_init(net);
@@ -1248,8 +1249,17 @@ static __net_init int nfsd_init_net(struct net *net)
atomic_set(&nn->ntf_refcnt, 0);
init_waitqueue_head(&nn->ntf_wq);
+
+ mnt = vfs_kern_mount(&nfsd_fs_type, SB_KERNMOUNT, "nfsd", NULL);
+ if (IS_ERR(mnt)) {
+ retval = PTR_ERR(mnt);
+ goto out_mount_err;
+ }
+ nn->nfsd_mnt = mnt;
return 0;
+out_mount_err:
+ nfsd_idmap_shutdown(net);
out_idmap_error:
nfsd_export_shutdown(net);
out_export_error:
@@ -1258,6 +1268,9 @@ static __net_init int nfsd_init_net(struct net *net)
static __net_exit void nfsd_exit_net(struct net *net)
{
+ struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+
+ mntput(nn->nfsd_mnt);
nfsd_idmap_shutdown(net);
nfsd_export_shutdown(net);
}
--
2.20.1
From: "J. Bruce Fields" <[email protected]>
Add open modes, device numbers, inode numbers, and open owners to each
line of nfsd/clients/#/opens.
Open owners are totally opaque but seem to sometimes have some useful
ascii strings included, so passing through printable ascii characters
and escaping the rest seems useful while still being machine-readable.
---
fs/nfsd/nfs4state.c | 40 ++++++++++++++++++++++++++++------
fs/nfsd/state.h | 2 +-
fs/seq_file.c | 17 +++++++++++++++
include/linux/seq_file.h | 2 ++
include/linux/string_helpers.h | 1 +
lib/string_helpers.c | 5 +++--
6 files changed, 57 insertions(+), 10 deletions(-)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 829d1e5440d3..f53621a65e60 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -42,6 +42,7 @@
#include <linux/sunrpc/svcauth_gss.h>
#include <linux/sunrpc/addr.h>
#include <linux/jhash.h>
+#include <linux/string_helpers.h>
#include "xdr4.h"
#include "xdr4cb.h"
#include "vfs.h"
@@ -2261,16 +2262,41 @@ static void opens_stop(struct seq_file *s, void *v)
static int opens_show(struct seq_file *s, void *v)
{
struct nfs4_stid *st = v;
- struct nfs4_ol_stateid *os;
- u64 stateid;
+ struct nfs4_ol_stateid *ols;
+ struct nfs4_file *nf;
+ struct file *file;
+ struct inode *inode;
+ struct nfs4_stateowner *oo;
+ unsigned int access, deny;
if (st->sc_type != NFS4_OPEN_STID)
return 0; /* XXX: or SEQ_SKIP? */
- os = openlockstateid(st);
- /* XXX: get info about file, etc., here */
+ ols = openlockstateid(st);
+ oo = ols->st_stateowner;
+ nf = st->sc_file;
+ file = find_any_file(nf);
+ inode = d_inode(file->f_path.dentry);
+
+ seq_printf(s, STATEID_FMT, STATEID_VAL(&st->sc_stateid));
+
+ access = bmap_to_share_mode(ols->st_access_bmap);
+ deny = bmap_to_share_mode(ols->st_deny_bmap);
+
+ seq_printf(s, "\t\%s\%s",
+ access & NFS4_SHARE_ACCESS_READ ? "r" : "-",
+ access & NFS4_SHARE_ACCESS_WRITE ? "w" : "-");
+ seq_printf(s, "\t\%s\%s",
+ deny & NFS4_SHARE_ACCESS_READ ? "R" : "-",
+ deny & NFS4_SHARE_ACCESS_WRITE ? "W" : "-");
+
+ seq_printf(s, "\t%02x:%02x:%ld\t", MAJOR(inode->i_sb->s_dev),
+ MINOR(inode->i_sb->s_dev),
+ inode->i_ino);
+ seq_printf(s, "'");
+ seq_escape_mem(s, oo->so_owner.data, oo->so_owner.len,
+ ESCAPE_NP|ESCAPE_HEX|ESCAPE_NOHIGH, NULL);
+ seq_printf(s, "'\n");
- memcpy(&stateid, &st->sc_stateid, sizeof(stateid));
- seq_printf(s, "stateid: %llx\n", stateid);
return 0;
}
@@ -2320,7 +2346,7 @@ static const struct file_operations client_opens_fops = {
static const struct tree_descr client_files[] = {
[0] = {"info", &client_info_fops, S_IRUSR},
- [1] = {"open", &client_opens_fops, S_IRUSR},
+ [1] = {"opens", &client_opens_fops, S_IRUSR},
[2] = {""},
};
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 5a999e4b766e..4db22433be82 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -56,7 +56,7 @@ typedef struct {
stateid_opaque_t si_opaque;
} stateid_t;
-#define STATEID_FMT "(%08x/%08x/%08x/%08x)"
+#define STATEID_FMT "%08x/%08x/%08x/%08x"
#define STATEID_VAL(s) \
(s)->si_opaque.so_clid.cl_boot, \
(s)->si_opaque.so_clid.cl_id, \
diff --git a/fs/seq_file.c b/fs/seq_file.c
index 1dea7a8a5255..bb646e0c5b9c 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -383,6 +383,23 @@ void seq_escape(struct seq_file *m, const char *s, const char *esc)
}
EXPORT_SYMBOL(seq_escape);
+/**
+ * seq_escape_mem - copy data to buffer, escaping some characters
+ *
+ * See string_escape_mem for explanations of the arguments.
+ */
+void seq_escape_mem(struct seq_file *m, const char *src, size_t isz,
+ unsigned int flags, const char *only)
+{
+ char *buf;
+ size_t size = seq_get_buf(m, &buf);
+ int ret;
+
+ ret = string_escape_mem(src, isz, buf, size, flags, only);
+ seq_commit(m, ret < size ? ret : -1);
+}
+EXPORT_SYMBOL(seq_escape_mem);
+
void seq_vprintf(struct seq_file *m, const char *f, va_list args)
{
int len;
diff --git a/include/linux/seq_file.h b/include/linux/seq_file.h
index a121982af0f5..de6ace9e5e6b 100644
--- a/include/linux/seq_file.h
+++ b/include/linux/seq_file.h
@@ -127,6 +127,8 @@ void seq_put_hex_ll(struct seq_file *m, const char *delimiter,
unsigned long long v, unsigned int width);
void seq_escape(struct seq_file *m, const char *s, const char *esc);
+void seq_escape_mem(struct seq_file *m, const char *src, size_t isz,
+ unsigned int flags, const char *only);
void seq_hex_dump(struct seq_file *m, const char *prefix_str, int prefix_type,
int rowsize, int groupsize, const void *buf, size_t len,
diff --git a/include/linux/string_helpers.h b/include/linux/string_helpers.h
index d23c5030901a..54034526aa46 100644
--- a/include/linux/string_helpers.h
+++ b/include/linux/string_helpers.h
@@ -50,6 +50,7 @@ static inline int string_unescape_any_inplace(char *buf)
#define ESCAPE_NP 0x10
#define ESCAPE_ANY_NP (ESCAPE_ANY | ESCAPE_NP)
#define ESCAPE_HEX 0x20
+#define ESCAPE_NOHIGH 0x40
int string_escape_mem(const char *src, size_t isz, char *dst, size_t osz,
unsigned int flags, const char *only);
diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 29c490e5d478..2c3e7b93775e 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -511,8 +511,9 @@ int string_escape_mem(const char *src, size_t isz, char *dst, size_t osz,
* In these cases we just pass through a character to the
* output buffer.
*/
- if ((flags & ESCAPE_NP && isprint(c)) ||
- (is_dict && !strchr(only, c))) {
+ if (( (flags & ESCAPE_NP && isprint(c)) ||
+ (is_dict && !strchr(only, c))) &&
+ !(flags & ESCAPE_NOHIGH && (unsigned char)c > 127)) {
/* do nothing */
} else {
if (flags & ESCAPE_SPACE && escape_space(c, &p, end))
--
2.20.1
From: "J. Bruce Fields" <[email protected]>
Add a new nfsd/clients/#/info file with some basic information about
each NFSv4 client.
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 38 ++++++++++++++-
fs/nfsd/nfsctl.c | 115 +++++++++++++++++++++++++++++++++++++++++---
fs/nfsd/nfsd.h | 4 +-
3 files changed, 148 insertions(+), 9 deletions(-)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index de99bfe3e6e2..928705fc8ff5 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2193,6 +2193,41 @@ find_stateid_by_type(struct nfs4_client *cl, stateid_t *t, char typemask)
return s;
}
+static int client_info_show(struct seq_file *m, void *v)
+{
+ struct inode *inode = m->private;
+ struct nfsdfs_client *nc;
+ struct nfs4_client *clp;
+ u64 clid;
+
+ nc = get_nfsdfs_client(inode);
+ if (!nc)
+ return -ENXIO;
+ clp = container_of(nc, struct nfs4_client, cl_nfsdfs);
+ memcpy(&clid, &clp->cl_clientid, sizeof(clid));
+ seq_printf(m, "clientid: %llx\n", clid);
+ drop_client(clp);
+
+ return 0;
+}
+
+static int client_info_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, client_info_show, inode);
+}
+
+static const struct file_operations client_info_fops = {
+ .open = client_info_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+static const struct tree_descr client_files[] = {
+ [0] = {"info", &client_info_fops, S_IRUSR},
+ [1] = {""},
+};
+
static struct nfs4_client *create_client(struct xdr_netobj name,
struct svc_rqst *rqstp, nfs4_verifier *verf)
{
@@ -2221,7 +2256,8 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
clp->cl_cb_session = NULL;
clp->net = net;
clp->cl_nfsd_dentry = nfsd_client_mkdir(nn, &clp->cl_nfsdfs,
- clp->cl_clientid.cl_id - nn->clientid_base);
+ clp->cl_clientid.cl_id - nn->clientid_base,
+ client_files);
if (!clp->cl_nfsd_dentry) {
free_client(clp);
return NULL;
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index edf4dada42bf..205a56718f3a 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1207,14 +1207,116 @@ static struct dentry *nfsd_mkdir(struct dentry *parent, struct nfsdfs_client *nc
goto out;
}
+static void clear_ncl(struct inode *inode)
+{
+ struct nfsdfs_client *ncl = inode->i_private;
+
+ inode->i_private = NULL;
+ synchronize_rcu();
+ kref_put(&ncl->cl_ref, ncl->cl_release);
+}
+
+
+struct nfsdfs_client *__get_nfsdfs_client(struct inode *inode)
+{
+ struct nfsdfs_client *nc = inode->i_private;
+
+ if (nc)
+ kref_get(&nc->cl_ref);
+ return nc;
+}
+
+struct nfsdfs_client *get_nfsdfs_client(struct inode *inode)
+{
+ struct nfsdfs_client *nc;
+
+ rcu_read_lock();
+ nc = __get_nfsdfs_client(inode);
+ rcu_read_unlock();
+ return nc;
+}
+/* from __rpc_unlink */
+static void nfsdfs_remove_file(struct inode *dir, struct dentry *dentry)
+{
+ int ret;
+
+ clear_ncl(d_inode(dentry));
+ dget(dentry);
+ ret = simple_unlink(dir, dentry);
+ d_delete(dentry);
+ dput(dentry);
+ WARN_ON_ONCE(ret);
+}
+
+static void nfsdfs_remove_files(struct dentry *root)
+{
+ struct dentry *dentry, *tmp;
+
+ list_for_each_entry_safe(dentry, tmp, &root->d_subdirs, d_child) {
+ if (!simple_positive(dentry)) {
+ WARN_ON_ONCE(1); /* I think this can't happen? */
+ continue;
+ }
+ nfsdfs_remove_file(d_inode(root), dentry);
+ }
+}
+
+/* XXX: cut'n'paste from simple_fill_super; figure out if we could share
+ * code instead. */
+static int nfsdfs_create_files(struct dentry *root,
+ const struct tree_descr *files)
+{
+ struct inode *dir = d_inode(root);
+ struct inode *inode;
+ struct dentry *dentry;
+ int i;
+
+ inode_lock(dir);
+ for (i = 0; files->name && files->name[0]; i++, files++) {
+ if (!files->name)
+ continue;
+ dentry = d_alloc_name(root, files->name);
+ if (!dentry)
+ goto out;
+ inode = nfsd_get_inode(d_inode(root)->i_sb,
+ S_IFREG | files->mode);
+ if (!inode) {
+ dput(dentry);
+ goto out;
+ }
+ inode->i_fop = files->ops;
+ inode->i_private = __get_nfsdfs_client(dir);
+ d_add(dentry, inode);
+ fsnotify_create(dir, dentry);
+ }
+ inode_unlock(dir);
+ return 0;
+out:
+ nfsdfs_remove_files(root);
+ inode_unlock(dir);
+ return -ENOMEM;
+}
+
/* on success, returns positive number unique to that client. */
-struct dentry *nfsd_client_mkdir(struct nfsd_net *nn, struct nfsdfs_client *ncl, u32 id)
+struct dentry *nfsd_client_mkdir(struct nfsd_net *nn,
+ struct nfsdfs_client *ncl, u32 id,
+ const struct tree_descr *files)
{
+ struct dentry *dentry;
char name[11];
+ int ret;
sprintf(name, "%u", id);
- return nfsd_mkdir(nn->nfsd_client_dir, ncl, name);
+ dentry = nfsd_mkdir(nn->nfsd_client_dir, ncl, name);
+ if (IS_ERR(dentry)) /* XXX: tossing errors? */
+ return NULL;
+ ret = nfsdfs_create_files(dentry, files);
+ if (ret) {
+ nfsd_client_rmdir(dentry);
+ return NULL;
+ }
+ return dentry;
}
/* Taken from __rpc_rmdir: */
@@ -1222,16 +1324,16 @@ void nfsd_client_rmdir(struct dentry *dentry)
{
struct inode *dir = d_inode(dentry->d_parent);
struct inode *inode = d_inode(dentry);
- struct nfsdfs_client *ncl = inode->i_private;
int ret;
- inode->i_private = NULL;
- synchronize_rcu();
- kref_put(&ncl->cl_ref, ncl->cl_release);
+ inode_lock(dir);
+ nfsdfs_remove_files(dentry);
+ clear_ncl(inode);
dget(dentry);
ret = simple_rmdir(dir, dentry);
WARN_ON_ONCE(ret);
d_delete(dentry);
+ inode_unlock(dir);
}
static int nfsd_fill_super(struct super_block * sb, void * data, int silent)
@@ -1275,7 +1377,6 @@ static int nfsd_fill_super(struct super_block * sb, void * data, int silent)
return ret;
dentry = nfsd_mkdir(sb->s_root, NULL, "clients");
if (IS_ERR(dentry)) {
- /* XXX: test: */
d_genocide(sb->s_root);
shrink_dcache_parent(sb->s_root);
dput(sb->s_root);
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index fe7418c2b88f..a60a2ee2e8c1 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -91,7 +91,9 @@ struct nfsdfs_client {
void (*cl_release)(struct kref *kref);
};
-struct dentry *nfsd_client_mkdir(struct nfsd_net *nn, struct nfsdfs_client *ncl, u32 id);
+struct nfsdfs_client *get_nfsdfs_client(struct inode *);
+struct dentry *nfsd_client_mkdir(struct nfsd_net *nn,
+ struct nfsdfs_client *ncl, u32 id, const struct tree_descr *);
void nfsd_client_rmdir(struct dentry *dentry);
#if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
--
2.20.1
From: "J. Bruce Fields" <[email protected]>
We want clientid's on the wire to be randomized for reasons explained in
ebd7c72c63ac "nfsd: randomize SETCLIENTID reply to help distinguish
servers". But I'd rather have mostly small integers for the clients/
directory.
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/netns.h | 1 +
fs/nfsd/nfs4state.c | 2 +-
fs/nfsd/nfsctl.c | 5 +++--
3 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 46d956676480..888b2891960e 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -121,6 +121,7 @@ struct nfsd_net {
*/
unsigned int max_connections;
+ u32 clientid_base;
u32 clientid_counter;
u32 clverifier_counter;
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index dfcb90743861..de99bfe3e6e2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2221,7 +2221,7 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
clp->cl_cb_session = NULL;
clp->net = net;
clp->cl_nfsd_dentry = nfsd_client_mkdir(nn, &clp->cl_nfsdfs,
- clp->cl_clientid.cl_id);
+ clp->cl_clientid.cl_id - nn->clientid_base);
if (!clp->cl_nfsd_dentry) {
free_client(clp);
return NULL;
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 8a2261cdefee..edf4dada42bf 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1212,7 +1212,7 @@ struct dentry *nfsd_client_mkdir(struct nfsd_net *nn, struct nfsdfs_client *ncl,
{
char name[11];
- sprintf(name, "%d", id++);
+ sprintf(name, "%u", id);
return nfsd_mkdir(nn->nfsd_client_dir, ncl, name);
}
@@ -1350,7 +1350,8 @@ static __net_init int nfsd_init_net(struct net *net)
nn->nfsd4_grace = 90;
nn->somebody_reclaimed = false;
nn->clverifier_counter = prandom_u32();
- nn->clientid_counter = prandom_u32();
+ nn->clientid_base = prandom_u32();
+ nn->clientid_counter = nn->clientid_base + 1;
nn->s2s_cp_cl_id = nn->clientid_counter++;
atomic_set(&nn->ntf_refcnt, 0);
--
2.20.1
From: "J. Bruce Fields" <[email protected]>
Add ip address, full client-provided identifier, and minor version.
There's much more that could possibly be useful but this is a start.
As with open owners, client identifiers are opaque binary data, but some
client happen to include some useful information in ascii.
---
fs/nfsd/nfs4state.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index f53621a65e60..39e1956b1bd2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2208,6 +2208,11 @@ static int client_info_show(struct seq_file *m, void *v)
clp = container_of(nc, struct nfs4_client, cl_nfsdfs);
memcpy(&clid, &clp->cl_clientid, sizeof(clid));
seq_printf(m, "clientid: %llx\n", clid);
+ seq_printf(m, "address: %pISpc\n", (struct sockaddr *)&clp->cl_addr);
+ seq_printf(m, "name: ");
+ seq_escape_mem(m, clp->cl_name.data, clp->cl_name.len,
+ ESCAPE_NP|ESCAPE_HEX|ESCAPE_NOHIGH, NULL);
+ seq_printf(m, "\nminor version: %d\n", clp->cl_minorversion);
drop_client(clp);
return 0;
--
2.20.1
From: "J. Bruce Fields" <[email protected]>
Rename this to a more descriptive name.
Also, I'm going to want a second refcount with a slightly different use.
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 26 +++++++++++++-------------
fs/nfsd/state.h | 2 +-
2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 6a45fb00c5fc..9dab61bbd256 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -137,7 +137,7 @@ static __be32 get_client_locked(struct nfs4_client *clp)
if (is_client_expired(clp))
return nfserr_expired;
- atomic_inc(&clp->cl_refcount);
+ atomic_inc(&clp->cl_rpc_users);
return nfs_ok;
}
@@ -169,7 +169,7 @@ static void put_client_renew_locked(struct nfs4_client *clp)
lockdep_assert_held(&nn->client_lock);
- if (!atomic_dec_and_test(&clp->cl_refcount))
+ if (!atomic_dec_and_test(&clp->cl_rpc_users))
return;
if (!is_client_expired(clp))
renew_client_locked(clp);
@@ -179,7 +179,7 @@ static void put_client_renew(struct nfs4_client *clp)
{
struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
- if (!atomic_dec_and_lock(&clp->cl_refcount, &nn->client_lock))
+ if (!atomic_dec_and_lock(&clp->cl_rpc_users, &nn->client_lock))
return;
if (!is_client_expired(clp))
renew_client_locked(clp);
@@ -1847,7 +1847,7 @@ static struct nfs4_client *alloc_client(struct xdr_netobj name)
clp->cl_name.len = name.len;
INIT_LIST_HEAD(&clp->cl_sessions);
idr_init(&clp->cl_stateids);
- atomic_set(&clp->cl_refcount, 0);
+ atomic_set(&clp->cl_rpc_users, 0);
clp->cl_cb_state = NFSD4_CB_UNKNOWN;
INIT_LIST_HEAD(&clp->cl_idhash);
INIT_LIST_HEAD(&clp->cl_openowners);
@@ -1926,7 +1926,7 @@ unhash_client(struct nfs4_client *clp)
static __be32 mark_client_expired_locked(struct nfs4_client *clp)
{
- if (atomic_read(&clp->cl_refcount))
+ if (atomic_read(&clp->cl_rpc_users))
return nfserr_jukebox;
unhash_client_locked(clp);
return nfs_ok;
@@ -4066,7 +4066,7 @@ static __be32 lookup_clientid(clientid_t *clid,
spin_unlock(&nn->client_lock);
return nfserr_expired;
}
- atomic_inc(&found->cl_refcount);
+ atomic_inc(&found->cl_rpc_users);
spin_unlock(&nn->client_lock);
/* Cache the nfs4_client in cstate! */
@@ -6550,7 +6550,7 @@ nfs4_check_open_reclaim(clientid_t *clid,
static inline void
put_client(struct nfs4_client *clp)
{
- atomic_dec(&clp->cl_refcount);
+ atomic_dec(&clp->cl_rpc_users);
}
static struct nfs4_client *
@@ -6668,7 +6668,7 @@ nfsd_inject_add_lock_to_list(struct nfs4_ol_stateid *lst,
return;
lockdep_assert_held(&nn->client_lock);
- atomic_inc(&clp->cl_refcount);
+ atomic_inc(&clp->cl_rpc_users);
list_add(&lst->st_locks, collect);
}
@@ -6697,7 +6697,7 @@ static u64 nfsd_foreach_client_lock(struct nfs4_client *clp, u64 max,
* Despite the fact that these functions deal
* with 64-bit integers for "count", we must
* ensure that it doesn't blow up the
- * clp->cl_refcount. Throw a warning if we
+ * clp->cl_rpc_users. Throw a warning if we
* start to approach INT_MAX here.
*/
WARN_ON_ONCE(count == (INT_MAX / 2));
@@ -6821,7 +6821,7 @@ nfsd_foreach_client_openowner(struct nfs4_client *clp, u64 max,
if (func) {
func(oop);
if (collect) {
- atomic_inc(&clp->cl_refcount);
+ atomic_inc(&clp->cl_rpc_users);
list_add(&oop->oo_perclient, collect);
}
}
@@ -6829,7 +6829,7 @@ nfsd_foreach_client_openowner(struct nfs4_client *clp, u64 max,
/*
* Despite the fact that these functions deal with
* 64-bit integers for "count", we must ensure that
- * it doesn't blow up the clp->cl_refcount. Throw a
+ * it doesn't blow up the clp->cl_rpc_users. Throw a
* warning if we start to approach INT_MAX here.
*/
WARN_ON_ONCE(count == (INT_MAX / 2));
@@ -6959,7 +6959,7 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
if (dp->dl_time != 0)
continue;
- atomic_inc(&clp->cl_refcount);
+ atomic_inc(&clp->cl_rpc_users);
WARN_ON(!unhash_delegation_locked(dp));
list_add(&dp->dl_recall_lru, victims);
}
@@ -6967,7 +6967,7 @@ static u64 nfsd_find_all_delegations(struct nfs4_client *clp, u64 max,
/*
* Despite the fact that these functions deal with
* 64-bit integers for "count", we must ensure that
- * it doesn't blow up the clp->cl_refcount. Throw a
+ * it doesn't blow up the clp->cl_rpc_users. Throw a
* warning if we start to approach INT_MAX here.
*/
WARN_ON_ONCE(count == (INT_MAX / 2));
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 396c76755b03..aa9f1676e88a 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -346,7 +346,7 @@ struct nfs4_client {
struct nfsd4_clid_slot cl_cs_slot; /* create_session slot */
u32 cl_exchange_flags;
/* number of rpc's in progress over an associated session: */
- atomic_t cl_refcount;
+ atomic_t cl_rpc_users;
struct nfs4_op_map cl_spo_must_allow;
/* for nfs41 callbacks */
--
2.20.1
From: "J. Bruce Fields" <[email protected]>
I plan to expose some information about nfsv4 clients here.
---
fs/nfsd/netns.h | 2 +
fs/nfsd/nfs4state.c | 23 ++++++----
fs/nfsd/nfsctl.c | 108 +++++++++++++++++++++++++++++++++++++++++++-
fs/nfsd/nfsd.h | 9 ++++
fs/nfsd/state.h | 6 ++-
5 files changed, 138 insertions(+), 10 deletions(-)
diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index cce335e1ec98..46d956676480 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -58,6 +58,8 @@ struct nfsd_net {
/* internal mount of the "nfsd" pseudofilesystem: */
struct vfsmount *nfsd_mnt;
+ struct dentry *nfsd_client_dir;
+
/*
* reclaim_str_hashtbl[] holds known client info from previous reset/reboot
* used in reboot/reset lease grace period processing
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 83d0ee329e14..dfcb90743861 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1871,20 +1871,19 @@ static struct nfs4_client *alloc_client(struct xdr_netobj name)
static void __free_client(struct kref *k)
{
- struct nfs4_client *clp = container_of(k, struct nfs4_client, cl_ref);
+ struct nfsdfs_client *c = container_of(k, struct nfsdfs_client, cl_ref);
+ struct nfs4_client *clp = container_of(c, struct nfs4_client, cl_nfsdfs);
free_svc_cred(&clp->cl_cred);
kfree(clp->cl_ownerstr_hashtbl);
kfree(clp->cl_name.data);
idr_destroy(&clp->cl_stateids);
- if (clp->cl_nfsd_dentry)
- nfsd_client_rmdir(clp->cl_nfsd_dentry);
kmem_cache_free(client_slab, clp);
}
void drop_client(struct nfs4_client *clp)
{
- kref_put(&clp->cl_ref, __free_client);
+ kref_put(&clp->cl_nfsdfs.cl_ref, __free_client);
}
static void
@@ -1899,6 +1898,8 @@ free_client(struct nfs4_client *clp)
free_session(ses);
}
rpc_destroy_wait_queue(&clp->cl_cb_waitq);
+ if (clp->cl_nfsd_dentry)
+ nfsd_client_rmdir(clp->cl_nfsd_dentry);
drop_client(clp);
}
@@ -2199,6 +2200,7 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
struct sockaddr *sa = svc_addr(rqstp);
int ret;
struct net *net = SVC_NET(rqstp);
+ struct nfsd_net *nn = net_generic(net, nfsd_net_id);
clp = alloc_client(name);
if (clp == NULL)
@@ -2209,8 +2211,8 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
free_client(clp);
return NULL;
}
-
- kref_init(&clp->cl_ref);
+ gen_clid(clp, nn);
+ kref_init(&clp->cl_nfsdfs.cl_ref);
nfsd4_init_cb(&clp->cl_cb_null, clp, NULL, NFSPROC4_CLNT_CB_NULL);
clp->cl_time = get_seconds();
clear_bit(0, &clp->cl_cb_slot_busy);
@@ -2218,6 +2220,12 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
rpc_copy_addr((struct sockaddr *) &clp->cl_addr, sa);
clp->cl_cb_session = NULL;
clp->net = net;
+ clp->cl_nfsd_dentry = nfsd_client_mkdir(nn, &clp->cl_nfsdfs,
+ clp->cl_clientid.cl_id);
+ if (!clp->cl_nfsd_dentry) {
+ free_client(clp);
+ return NULL;
+ }
return clp;
}
@@ -2661,7 +2669,6 @@ nfsd4_exchange_id(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
new->cl_spo_must_allow.u.words[0] = exid->spo_must_allow[0];
new->cl_spo_must_allow.u.words[1] = exid->spo_must_allow[1];
- gen_clid(new, nn);
add_to_unconfirmed(new);
swap(new, conf);
out_copy:
@@ -3404,7 +3411,7 @@ nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
copy_clid(new, conf);
gen_confirm(new, nn);
} else /* case 4 (new client) or cases 2, 3 (client reboot): */
- gen_clid(new, nn);
+ ;
new->cl_minorversion = 0;
gen_callback(new, setclid, rqstp);
add_to_unconfirmed(new);
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 8d2062428569..8a2261cdefee 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -15,6 +15,7 @@
#include <linux/sunrpc/gss_krb5_enctypes.h>
#include <linux/sunrpc/rpc_pipe_fs.h>
#include <linux/module.h>
+#include <linux/fsnotify.h>
#include "idmap.h"
#include "nfsd.h"
@@ -52,6 +53,7 @@ enum {
NFSD_RecoveryDir,
NFSD_V4EndGrace,
#endif
+ NFSD_MaxReserved
};
/*
@@ -1146,8 +1148,99 @@ static ssize_t write_v4_end_grace(struct file *file, char *buf, size_t size)
* populating the filesystem.
*/
+/* Basically copying rpc_get_inode. */
+static struct inode *nfsd_get_inode(struct super_block *sb, umode_t mode)
+{
+ struct inode *inode = new_inode(sb);
+ if (!inode)
+ return NULL;
+ /* Following advice from simple_fill_super documentation: */
+ inode->i_ino = iunique(sb, NFSD_MaxReserved);
+ inode->i_mode = mode;
+ inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
+ switch (mode & S_IFMT) {
+ case S_IFDIR:
+ inode->i_fop = &simple_dir_operations;
+ inode->i_op = &simple_dir_inode_operations;
+ inc_nlink(inode);
+ default:
+ break;
+ }
+ return inode;
+}
+
+static int __nfsd_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
+{
+ struct inode *inode;
+
+ inode = nfsd_get_inode(dir->i_sb, mode);
+ if (!inode)
+ return -ENOMEM;
+ d_add(dentry, inode);
+ inc_nlink(dir);
+ fsnotify_mkdir(dir, dentry);
+ return 0;
+}
+
+static struct dentry *nfsd_mkdir(struct dentry *parent, struct nfsdfs_client *ncl, char *name)
+{
+ struct inode *dir = parent->d_inode;
+ struct dentry *dentry;
+ int ret = -ENOMEM;
+
+ inode_lock(dir);
+ dentry = d_alloc_name(parent, name);
+ if (!dentry)
+ goto out_err;
+ ret = __nfsd_mkdir(d_inode(parent), dentry, S_IFDIR | 0600);
+ if (ret)
+ goto out_err;
+ if (ncl) {
+ d_inode(dentry)->i_private = ncl;
+ kref_get(&ncl->cl_ref);
+ }
+out:
+ inode_unlock(dir);
+ return dentry;
+out_err:
+ dentry = ERR_PTR(ret);
+ goto out;
+}
+
+/* on success, returns positive number unique to that client. */
+struct dentry *nfsd_client_mkdir(struct nfsd_net *nn, struct nfsdfs_client *ncl, u32 id)
+{
+ char name[11];
+
+ sprintf(name, "%d", id++);
+
+ return nfsd_mkdir(nn->nfsd_client_dir, ncl, name);
+}
+
+/* Taken from __rpc_rmdir: */
+void nfsd_client_rmdir(struct dentry *dentry)
+{
+ struct inode *dir = d_inode(dentry->d_parent);
+ struct inode *inode = d_inode(dentry);
+ struct nfsdfs_client *ncl = inode->i_private;
+ int ret;
+
+ inode->i_private = NULL;
+ synchronize_rcu();
+ kref_put(&ncl->cl_ref, ncl->cl_release);
+ dget(dentry);
+ ret = simple_rmdir(dir, dentry);
+ WARN_ON_ONCE(ret);
+ d_delete(dentry);
+}
+
static int nfsd_fill_super(struct super_block * sb, void * data, int silent)
{
+ struct nfsd_net *nn = net_generic(current->nsproxy->net_ns,
+ nfsd_net_id);
+ struct dentry *dentry;
+ int ret;
+
static const struct tree_descr nfsd_files[] = {
[NFSD_List] = {"exports", &exports_nfsd_operations, S_IRUGO},
[NFSD_Export_features] = {"export_features",
@@ -1177,7 +1270,20 @@ static int nfsd_fill_super(struct super_block * sb, void * data, int silent)
/* last one */ {""}
};
get_net(sb->s_fs_info);
- return simple_fill_super(sb, 0x6e667364, nfsd_files);
+ ret = simple_fill_super(sb, 0x6e667364, nfsd_files);
+ if (ret)
+ return ret;
+ dentry = nfsd_mkdir(sb->s_root, NULL, "clients");
+ if (IS_ERR(dentry)) {
+ /* XXX: test: */
+ d_genocide(sb->s_root);
+ shrink_dcache_parent(sb->s_root);
+ dput(sb->s_root);
+ return PTR_ERR(dentry);
+ }
+ nn->nfsd_client_dir = dentry;
+ return 0;
+
}
static struct dentry *nfsd_mount(struct file_system_type *fs_type,
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index 066899929863..fe7418c2b88f 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -21,6 +21,7 @@
#include <uapi/linux/nfsd/debug.h>
+#include "netns.h"
#include "stats.h"
#include "export.h"
@@ -85,6 +86,14 @@ int nfsd_pool_stats_release(struct inode *, struct file *);
void nfsd_destroy(struct net *net);
+struct nfsdfs_client {
+ struct kref cl_ref;
+ void (*cl_release)(struct kref *kref);
+};
+
+struct dentry *nfsd_client_mkdir(struct nfsd_net *nn, struct nfsdfs_client *ncl, u32 id);
+void nfsd_client_rmdir(struct dentry *dentry);
+
#if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
#ifdef CONFIG_NFSD_V2_ACL
extern const struct svc_version nfsd_acl_version2;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index aa26ae520fb6..5a999e4b766e 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -39,6 +39,7 @@
#include <linux/refcount.h>
#include <linux/sunrpc/svc_xprt.h>
#include "nfsfh.h"
+#include "nfsd.h"
typedef struct {
u32 cl_boot;
@@ -347,9 +348,12 @@ struct nfs4_client {
u32 cl_exchange_flags;
/* number of rpc's in progress over an associated session: */
atomic_t cl_rpc_users;
- struct kref cl_ref;
+ struct nfsdfs_client cl_nfsdfs;
struct nfs4_op_map cl_spo_must_allow;
+ /* debugging info directory under nfsd/clients/ : */
+ struct dentry *cl_nfsd_dentry;
+
/* for nfs41 callbacks */
/* We currently support a single back channel with a single slot */
unsigned long cl_cb_slot_busy;
--
2.20.1
From: "J. Bruce Fields" <[email protected]>
Use a common structure instead of defining our own here.
The only difference is using an int instead of umode_t for the mode
field; I haven't tried to figure out why tree_descr uses int.
Signed-off-by: J. Bruce Fields <[email protected]>
---
net/sunrpc/rpc_pipe.c | 37 ++++++++++++++-----------------------
1 file changed, 14 insertions(+), 23 deletions(-)
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 69663681bf9d..9af9f4c8fa1e 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -460,15 +460,6 @@ static const struct file_operations rpc_info_operations = {
};
-/*
- * Description of fs contents.
- */
-struct rpc_filelist {
- const char *name;
- const struct file_operations *i_fop;
- umode_t mode;
-};
-
static struct inode *
rpc_get_inode(struct super_block *sb, umode_t mode)
{
@@ -648,7 +639,7 @@ static struct dentry *__rpc_lookup_create_exclusive(struct dentry *parent,
* FIXME: This probably has races.
*/
static void __rpc_depopulate(struct dentry *parent,
- const struct rpc_filelist *files,
+ const struct tree_descr *files,
int start, int eof)
{
struct inode *dir = d_inode(parent);
@@ -680,7 +671,7 @@ static void __rpc_depopulate(struct dentry *parent,
}
static void rpc_depopulate(struct dentry *parent,
- const struct rpc_filelist *files,
+ const struct tree_descr *files,
int start, int eof)
{
struct inode *dir = d_inode(parent);
@@ -691,7 +682,7 @@ static void rpc_depopulate(struct dentry *parent,
}
static int rpc_populate(struct dentry *parent,
- const struct rpc_filelist *files,
+ const struct tree_descr *files,
int start, int eof,
void *private)
{
@@ -711,7 +702,7 @@ static int rpc_populate(struct dentry *parent,
case S_IFREG:
err = __rpc_create(dir, dentry,
files[i].mode,
- files[i].i_fop,
+ files[i].ops,
private);
break;
case S_IFDIR:
@@ -1015,10 +1006,10 @@ enum {
RPCAUTH_EOF
};
-static const struct rpc_filelist authfiles[] = {
+static const struct tree_descr authfiles[] = {
[RPCAUTH_info] = {
.name = "info",
- .i_fop = &rpc_info_operations,
+ .ops = &rpc_info_operations,
.mode = S_IFREG | 0400,
},
};
@@ -1076,20 +1067,20 @@ int rpc_remove_client_dir(struct rpc_clnt *rpc_client)
return rpc_rmdir_depopulate(dentry, rpc_clntdir_depopulate);
}
-static const struct rpc_filelist cache_pipefs_files[3] = {
+static const struct tree_descr cache_pipefs_files[3] = {
[0] = {
.name = "channel",
- .i_fop = &cache_file_operations_pipefs,
+ .ops = &cache_file_operations_pipefs,
.mode = S_IFREG | 0600,
},
[1] = {
.name = "content",
- .i_fop = &content_file_operations_pipefs,
+ .ops = &content_file_operations_pipefs,
.mode = S_IFREG | 0400,
},
[2] = {
.name = "flush",
- .i_fop = &cache_flush_operations_pipefs,
+ .ops = &cache_flush_operations_pipefs,
.mode = S_IFREG | 0600,
},
};
@@ -1145,7 +1136,7 @@ enum {
RPCAUTH_RootEOF
};
-static const struct rpc_filelist files[] = {
+static const struct tree_descr files[] = {
[RPCAUTH_lockd] = {
.name = "lockd",
.mode = S_IFDIR | 0555,
@@ -1242,7 +1233,7 @@ void rpc_put_sb_net(const struct net *net)
}
EXPORT_SYMBOL_GPL(rpc_put_sb_net);
-static const struct rpc_filelist gssd_dummy_clnt_dir[] = {
+static const struct tree_descr gssd_dummy_clnt_dir[] = {
[0] = {
.name = "clntXX",
.mode = S_IFDIR | 0555,
@@ -1277,10 +1268,10 @@ rpc_dummy_info_show(struct seq_file *m, void *v)
}
DEFINE_SHOW_ATTRIBUTE(rpc_dummy_info);
-static const struct rpc_filelist gssd_dummy_info_file[] = {
+static const struct tree_descr gssd_dummy_info_file[] = {
[0] = {
.name = "info",
- .i_fop = &rpc_dummy_info_fops,
+ .ops = &rpc_dummy_info_fops,
.mode = S_IFREG | 0400,
},
};
--
2.20.1
From: "J. Bruce Fields" <[email protected]>
Add a nfsd/clients/#/opens file to list some information about all the
opens held by the given client.
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 100 +++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 98 insertions(+), 2 deletions(-)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 928705fc8ff5..829d1e5440d3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -684,7 +684,8 @@ struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl, struct kmem_cache *sla
idr_preload(GFP_KERNEL);
spin_lock(&cl->cl_lock);
- new_id = idr_alloc_cyclic(&cl->cl_stateids, stid, 0, 0, GFP_NOWAIT);
+ /* Reserving 0 for start of file in nfsdfs "opens" file: */
+ new_id = idr_alloc_cyclic(&cl->cl_stateids, stid, 1, 0, GFP_NOWAIT);
spin_unlock(&cl->cl_lock);
idr_preload_end();
if (new_id < 0)
@@ -2223,9 +2224,104 @@ static const struct file_operations client_info_fops = {
.release = single_release,
};
+static void *opens_start(struct seq_file *s, loff_t *pos)
+ __acquires(&clp->cl_lock)
+{
+ struct nfs4_client *clp = s->private;
+ unsigned long id = *pos;
+ void *ret;
+
+ spin_lock(&clp->cl_lock);
+ ret = idr_get_next_ul(&clp->cl_stateids, &id);
+ *pos = id;
+ return ret;
+}
+
+static void *opens_next(struct seq_file *s, void *v, loff_t *pos)
+{
+ struct nfs4_client *clp = s->private;
+ unsigned long id = *pos;
+ void *ret;
+
+ id = *pos;
+ id++;
+ ret = idr_get_next_ul(&clp->cl_stateids, &id);
+ *pos = id;
+ return ret;
+}
+
+static void opens_stop(struct seq_file *s, void *v)
+ __releases(&clp->cl_lock)
+{
+ struct nfs4_client *clp = s->private;
+
+ spin_unlock(&clp->cl_lock);
+}
+
+static int opens_show(struct seq_file *s, void *v)
+{
+ struct nfs4_stid *st = v;
+ struct nfs4_ol_stateid *os;
+ u64 stateid;
+
+ if (st->sc_type != NFS4_OPEN_STID)
+ return 0; /* XXX: or SEQ_SKIP? */
+ os = openlockstateid(st);
+ /* XXX: get info about file, etc., here */
+
+ memcpy(&stateid, &st->sc_stateid, sizeof(stateid));
+ seq_printf(s, "stateid: %llx\n", stateid);
+ return 0;
+}
+
+static struct seq_operations opens_seq_ops = {
+ .start = opens_start,
+ .next = opens_next,
+ .stop = opens_stop,
+ .show = opens_show
+};
+
+static int client_opens_open(struct inode *inode, struct file *file)
+{
+ struct nfsdfs_client *nc;
+ struct seq_file *s;
+ struct nfs4_client *clp;
+ int ret;
+
+ nc = get_nfsdfs_client(inode);
+ if (!nc)
+ return -ENXIO;
+ clp = container_of(nc, struct nfs4_client, cl_nfsdfs);
+
+ ret = seq_open(file, &opens_seq_ops);
+ if (ret)
+ return ret;
+ s = file->private_data;
+ s->private = clp;
+ return 0;
+}
+
+static int client_opens_release(struct inode *inode, struct file *file)
+{
+ struct seq_file *m = file->private_data;
+ struct nfs4_client *clp = m->private;
+
+ /* XXX: alternatively, we could get/drop in seq start/stop */
+ drop_client(clp);
+ return 0;
+}
+
+static const struct file_operations client_opens_fops = {
+ .open = client_opens_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = client_opens_release,
+};
+
static const struct tree_descr client_files[] = {
[0] = {"info", &client_info_fops, S_IRUSR},
- [1] = {""},
+ [1] = {"open", &client_opens_fops, S_IRUSR},
+ [2] = {""},
};
static struct nfs4_client *create_client(struct xdr_netobj name,
--
2.20.1
From: "J. Bruce Fields" <[email protected]>
Keep a second reference count which is what is really used to decide
when to free the client's memory. File objects under nfsd/client/ will
hold these references.
Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 26 +++++++++++++++++++++-----
fs/nfsd/state.h | 1 +
2 files changed, 22 insertions(+), 5 deletions(-)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9dab61bbd256..83d0ee329e14 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1869,6 +1869,24 @@ static struct nfs4_client *alloc_client(struct xdr_netobj name)
return NULL;
}
+static void __free_client(struct kref *k)
+{
+ struct nfs4_client *clp = container_of(k, struct nfs4_client, cl_ref);
+
+ free_svc_cred(&clp->cl_cred);
+ kfree(clp->cl_ownerstr_hashtbl);
+ kfree(clp->cl_name.data);
+ idr_destroy(&clp->cl_stateids);
+ if (clp->cl_nfsd_dentry)
+ nfsd_client_rmdir(clp->cl_nfsd_dentry);
+ kmem_cache_free(client_slab, clp);
+}
+
+void drop_client(struct nfs4_client *clp)
+{
+ kref_put(&clp->cl_ref, __free_client);
+}
+
static void
free_client(struct nfs4_client *clp)
{
@@ -1881,11 +1899,7 @@ free_client(struct nfs4_client *clp)
free_session(ses);
}
rpc_destroy_wait_queue(&clp->cl_cb_waitq);
- free_svc_cred(&clp->cl_cred);
- kfree(clp->cl_ownerstr_hashtbl);
- kfree(clp->cl_name.data);
- idr_destroy(&clp->cl_stateids);
- kmem_cache_free(client_slab, clp);
+ drop_client(clp);
}
/* must be called under the client_lock */
@@ -2195,6 +2209,8 @@ static struct nfs4_client *create_client(struct xdr_netobj name,
free_client(clp);
return NULL;
}
+
+ kref_init(&clp->cl_ref);
nfsd4_init_cb(&clp->cl_cb_null, clp, NULL, NFSPROC4_CLNT_CB_NULL);
clp->cl_time = get_seconds();
clear_bit(0, &clp->cl_cb_slot_busy);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index aa9f1676e88a..aa26ae520fb6 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -347,6 +347,7 @@ struct nfs4_client {
u32 cl_exchange_flags;
/* number of rpc's in progress over an associated session: */
atomic_t cl_rpc_users;
+ struct kref cl_ref;
struct nfs4_op_map cl_spo_must_allow;
/* for nfs41 callbacks */
--
2.20.1
On Thu, 2019-04-25 at 10:04 -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <[email protected]>
>
> The following patches expose information about NFSv4 opens held by knfsd
> on behalf of NFSv4 clients. Those are currently invisible to userspace,
> unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
>
> The approach is to add a new directory /proc/fs/nfsd/clients/ with
> subdirectories for each active NFSv4 client. Each subdirectory has an
> "info" file with some basic information to help identify the client and
> an "opens" directory that lists the opens held by that client.
>
> I got it working by cobbling together some poorly-understood code I
> found in libfs, rpc_pipefs and elsewhere. If anyone wants to wade in
> and tell me what I've got wrong, they're more than welcome, but at this
> stage I'm more curious for feedback on the interface.
>
> I'm also cc'ing people responsible for lsof and util-linux in case they
> have any opinions.
>
> Currently these pseudofiles look like:
>
> # find /proc/fs/nfsd/clients -type f|xargs tail
> ==> /proc/fs/nfsd/clients/3741/opens <==
> 5cc0cd36/6debfb50/00000001/00000001 rw -- fd:10:13649 'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
> 5cc0cd36/6debfb50/00000003/00000001 r- -- fd:10:13650 'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
>
> ==> /proc/fs/nfsd/clients/3741/info <==
> clientid: 6debfb505cc0cd36
> address: 192.168.122.36:0
> name: Linux NFSv4.2 test2.fieldses.org
> minor version: 2
>
> Each line of the "opens" file is tab-delimited and describes one open,
> and the fields are stateid, open access bits, deny bits,
> major:minor:ino, and open owner.
>
Nice work! We've needed this for a long time.
One thing we need to consider here from the get-go though is what sort
of ABI guarantee you want for this format. People _will_ write scripts
that scrape this info, so we should take that into account up front.
> So, some random questions:
>
> - I just copied the major:minor:ino thing from /proc/locks, I
> suspect we would have picked something different to identify
> inodes if /proc/locks were done now. (Mount id and inode?
> Something else?)
>
That does make it easy to correlate with the info in /proc/locks.
We'd have a dentry here by virtue of the nfs4_file. Should we print a
path in addition to this?
> - The open owner is just an opaque blob of binary data, but
> clients may choose to include some useful asci-encoded
> information, so I'm formatting them as strings with non-ascii
> stuff escaped. For example, pynfs usually uses the name of
> the test as the open owner. But as you see above, the ascii
> content of the Linux client's open owners is less useful.
> Also, there's no way I know of to map them back to a file
> description or process or anything else useful on the client,
> so perhaps they're of limited interest.
>
> - I'm not sure about the stateid either. I did think it might
> be useful just as a unique identifier for each line.
> (Actually for that it'd be enough to take just the third of
> those four numbers making up the stateid--maybe that would be
> better.)
>
It'd be ideal to be able to easily correlate this info with what
wireshark displays. Does wireshark display hashes for openowners? I know
it does for stateids. If so, generating the same hash would be really
nice.
That said, waybe it's best to just dump the raw info out here though and
rely on some postprocessing scripts for viewing it?
> In the "info" file, the "name" line is the client identifier/client
> owner provided by the client, which (like the stateowner) is just opaque
> binary data, though as you can see here the Linux client is providing a
> readable ascii string.
>
> There's probably a lot more we could add to that info file eventually.
>
> Other stuff to add next:
>
> - nfsd/clients/#/kill that you can write to to revoke all a
> client's state if it's wedged somehow.
That would also be neat. We have a bit of code to support today that in
the fault injection code, but it'll need some cleanup and wiring it into
a knob here would be better.
> - lists of locks and delegations; lower priority since most of
> that information is already in /proc/locks.
Agreed.
> J. Bruce Fields (10):
> nfsd: persist nfsd filesystem across mounts
> nfsd: rename cl_refcount
> nfsd4: use reference count to free client
> nfsd: add nfsd/clients directory
> nfsd: make client/ directory names small ints
> rpc: replace rpc_filelist by tree_descr
> nfsd4: add a client info file
> nfsd4: add file to display list of client's opens
> nfsd: expose some more information about NFSv4 opens
> nfsd: add more information to client info file
>
> fs/nfsd/netns.h | 6 +
> fs/nfsd/nfs4state.c | 228 ++++++++++++++++++++++++++++++---
> fs/nfsd/nfsctl.c | 225 +++++++++++++++++++++++++++++++-
> fs/nfsd/nfsd.h | 11 ++
> fs/nfsd/state.h | 9 +-
> fs/seq_file.c | 17 +++
> include/linux/seq_file.h | 2 +
> include/linux/string_helpers.h | 1 +
> lib/string_helpers.c | 5 +-
> net/sunrpc/rpc_pipe.c | 37 ++----
> 10 files changed, 491 insertions(+), 50 deletions(-)
>
--
Jeff Layton <[email protected]>
On Thu, 2019-04-25 at 10:04 -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <[email protected]>
>
> Add a nfsd/clients/#/opens file to list some information about all the
> opens held by the given client.
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 100 +++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 98 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 928705fc8ff5..829d1e5440d3 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -684,7 +684,8 @@ struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl, struct kmem_cache *sla
>
> idr_preload(GFP_KERNEL);
> spin_lock(&cl->cl_lock);
> - new_id = idr_alloc_cyclic(&cl->cl_stateids, stid, 0, 0, GFP_NOWAIT);
> + /* Reserving 0 for start of file in nfsdfs "opens" file: */
> + new_id = idr_alloc_cyclic(&cl->cl_stateids, stid, 1, 0, GFP_NOWAIT);
> spin_unlock(&cl->cl_lock);
> idr_preload_end();
> if (new_id < 0)
> @@ -2223,9 +2224,104 @@ static const struct file_operations client_info_fops = {
> .release = single_release,
> };
>
> +static void *opens_start(struct seq_file *s, loff_t *pos)
> + __acquires(&clp->cl_lock)
> +{
> + struct nfs4_client *clp = s->private;
> + unsigned long id = *pos;
> + void *ret;
> +
> + spin_lock(&clp->cl_lock);
> + ret = idr_get_next_ul(&clp->cl_stateids, &id);
> + *pos = id;
> + return ret;
> +}
> +
> +static void *opens_next(struct seq_file *s, void *v, loff_t *pos)
> +{
> + struct nfs4_client *clp = s->private;
> + unsigned long id = *pos;
> + void *ret;
> +
> + id = *pos;
> + id++;
> + ret = idr_get_next_ul(&clp->cl_stateids, &id);
> + *pos = id;
> + return ret;
> +}
> +
> +static void opens_stop(struct seq_file *s, void *v)
> + __releases(&clp->cl_lock)
> +{
> + struct nfs4_client *clp = s->private;
> +
> + spin_unlock(&clp->cl_lock);
> +}
> +
> +static int opens_show(struct seq_file *s, void *v)
> +{
> + struct nfs4_stid *st = v;
> + struct nfs4_ol_stateid *os;
> + u64 stateid;
> +
> + if (st->sc_type != NFS4_OPEN_STID)
> + return 0; /* XXX: or SEQ_SKIP? */
> + os = openlockstateid(st);
> + /* XXX: get info about file, etc., here */
> +
> + memcpy(&stateid, &st->sc_stateid, sizeof(stateid));
> + seq_printf(s, "stateid: %llx\n", stateid);
> + return 0;
> +}
More bikeshedding: should we have a "states" file instead of an "opens"
file and print a different set of output for each stateid type?
> +
> +static struct seq_operations opens_seq_ops = {
> + .start = opens_start,
> + .next = opens_next,
> + .stop = opens_stop,
> + .show = opens_show
> +};
> +
> +static int client_opens_open(struct inode *inode, struct file *file)
> +{
> + struct nfsdfs_client *nc;
> + struct seq_file *s;
> + struct nfs4_client *clp;
> + int ret;
> +
> + nc = get_nfsdfs_client(inode);
> + if (!nc)
> + return -ENXIO;
> + clp = container_of(nc, struct nfs4_client, cl_nfsdfs);
> +
> + ret = seq_open(file, &opens_seq_ops);
> + if (ret)
> + return ret;
> + s = file->private_data;
> + s->private = clp;
> + return 0;
> +}
> +
> +static int client_opens_release(struct inode *inode, struct file *file)
> +{
> + struct seq_file *m = file->private_data;
> + struct nfs4_client *clp = m->private;
> +
> + /* XXX: alternatively, we could get/drop in seq start/stop */
> + drop_client(clp);
> + return 0;
> +}
> +
> +static const struct file_operations client_opens_fops = {
> + .open = client_opens_open,
> + .read = seq_read,
> + .llseek = seq_lseek,
> + .release = client_opens_release,
> +};
> +
> static const struct tree_descr client_files[] = {
> [0] = {"info", &client_info_fops, S_IRUSR},
> - [1] = {""},
> + [1] = {"open", &client_opens_fops, S_IRUSR},
> + [2] = {""},
> };
>
> static struct nfs4_client *create_client(struct xdr_netobj name,
--
Jeff Layton <[email protected]>
On Thu, 2019-04-25 at 10:04 -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <[email protected]>
>
> The following patches expose information about NFSv4 opens held by knfsd
> on behalf of NFSv4 clients. Those are currently invisible to userspace,
> unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
>
> The approach is to add a new directory /proc/fs/nfsd/clients/ with
> subdirectories for each active NFSv4 client. Each subdirectory has an
> "info" file with some basic information to help identify the client and
> an "opens" directory that lists the opens held by that client.
>
> I got it working by cobbling together some poorly-understood code I
> found in libfs, rpc_pipefs and elsewhere. If anyone wants to wade in
> and tell me what I've got wrong, they're more than welcome, but at this
> stage I'm more curious for feedback on the interface.
>
> I'm also cc'ing people responsible for lsof and util-linux in case they
> have any opinions.
>
> Currently these pseudofiles look like:
>
> # find /proc/fs/nfsd/clients -type f|xargs tail
> ==> /proc/fs/nfsd/clients/3741/opens <==
> 5cc0cd36/6debfb50/00000001/00000001 rw -- fd:10:13649 'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
> 5cc0cd36/6debfb50/00000003/00000001 r- -- fd:10:13650 'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
>
> ==> /proc/fs/nfsd/clients/3741/info <==
> clientid: 6debfb505cc0cd36
> address: 192.168.122.36:0
> name: Linux NFSv4.2 test2.fieldses.org
> minor version: 2
>
> Each line of the "opens" file is tab-delimited and describes one open,
> and the fields are stateid, open access bits, deny bits,
> major:minor:ino, and open owner.
>
> So, some random questions:
>
> - I just copied the major:minor:ino thing from /proc/locks, I
> suspect we would have picked something different to identify
> inodes if /proc/locks were done now. (Mount id and inode?
> Something else?)
>
> - The open owner is just an opaque blob of binary data, but
> clients may choose to include some useful asci-encoded
> information, so I'm formatting them as strings with non-ascii
> stuff escaped. For example, pynfs usually uses the name of
> the test as the open owner. But as you see above, the ascii
> content of the Linux client's open owners is less useful.
> Also, there's no way I know of to map them back to a file
> description or process or anything else useful on the client,
> so perhaps they're of limited interest.
>
> - I'm not sure about the stateid either. I did think it might
> be useful just as a unique identifier for each line.
> (Actually for that it'd be enough to take just the third of
> those four numbers making up the stateid--maybe that would be
> better.)
>
> In the "info" file, the "name" line is the client identifier/client
> owner provided by the client, which (like the stateowner) is just opaque
> binary data, though as you can see here the Linux client is providing a
> readable ascii string.
>
> There's probably a lot more we could add to that info file eventually.
>
> Other stuff to add next:
>
> - nfsd/clients/#/kill that you can write to to revoke all a
> client's state if it's wedged somehow.
> - lists of locks and delegations; lower priority since most of
> that information is already in /proc/locks.
>
> --b.
>
> J. Bruce Fields (10):
> nfsd: persist nfsd filesystem across mounts
> nfsd: rename cl_refcount
> nfsd4: use reference count to free client
> nfsd: add nfsd/clients directory
> nfsd: make client/ directory names small ints
> rpc: replace rpc_filelist by tree_descr
> nfsd4: add a client info file
> nfsd4: add file to display list of client's opens
> nfsd: expose some more information about NFSv4 opens
> nfsd: add more information to client info file
>
> fs/nfsd/netns.h | 6 +
> fs/nfsd/nfs4state.c | 228 ++++++++++++++++++++++++++++++---
> fs/nfsd/nfsctl.c | 225 +++++++++++++++++++++++++++++++-
> fs/nfsd/nfsd.h | 11 ++
> fs/nfsd/state.h | 9 +-
> fs/seq_file.c | 17 +++
> include/linux/seq_file.h | 2 +
> include/linux/string_helpers.h | 1 +
> lib/string_helpers.c | 5 +-
> net/sunrpc/rpc_pipe.c | 37 ++----
> 10 files changed, 491 insertions(+), 50 deletions(-)
>
I went through the patches and the code all looks fine to me. The only
real questions I think are what sort of info we want to present here,
and how we'll deal with changes to the format in the future.
Maybe on that latter point, we can reserve the right to add fields to
this info later, but do our best to never remove existing ones? Then
tools could be written to just ignore the fields that they don't
understand.
Reviewed-by: Jeff Layton <[email protected]>
On Thu, Apr 25, 2019 at 01:02:30PM -0400, Jeff Layton wrote:
> On Thu, 2019-04-25 at 10:04 -0400, J. Bruce Fields wrote:
> > From: "J. Bruce Fields" <[email protected]>
> >
> > The following patches expose information about NFSv4 opens held by knfsd
> > on behalf of NFSv4 clients. Those are currently invisible to userspace,
> > unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
> >
> > The approach is to add a new directory /proc/fs/nfsd/clients/ with
> > subdirectories for each active NFSv4 client. Each subdirectory has an
> > "info" file with some basic information to help identify the client and
> > an "opens" directory that lists the opens held by that client.
> >
> > I got it working by cobbling together some poorly-understood code I
> > found in libfs, rpc_pipefs and elsewhere. If anyone wants to wade in
> > and tell me what I've got wrong, they're more than welcome, but at this
> > stage I'm more curious for feedback on the interface.
> >
> > I'm also cc'ing people responsible for lsof and util-linux in case they
> > have any opinions.
> >
> > Currently these pseudofiles look like:
> >
> > # find /proc/fs/nfsd/clients -type f|xargs tail
> > ==> /proc/fs/nfsd/clients/3741/opens <==
> > 5cc0cd36/6debfb50/00000001/00000001 rw -- fd:10:13649 'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
> > 5cc0cd36/6debfb50/00000003/00000001 r- -- fd:10:13650 'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
> >
> > ==> /proc/fs/nfsd/clients/3741/info <==
> > clientid: 6debfb505cc0cd36
> > address: 192.168.122.36:0
> > name: Linux NFSv4.2 test2.fieldses.org
> > minor version: 2
> >
> > Each line of the "opens" file is tab-delimited and describes one open,
> > and the fields are stateid, open access bits, deny bits,
> > major:minor:ino, and open owner.
> >
>
> Nice work! We've needed this for a long time.
>
> One thing we need to consider here from the get-go though is what sort
> of ABI guarantee you want for this format. People _will_ write scripts
> that scrape this info, so we should take that into account up front.
There is a man page for the nfsd filesystem, nfsd(7). I should write up
something to add to that. If people write code without reading that
then we may still end up boxed in, of course, but it's a start.
What I'm hoping we can count on from readers:
- they will ignore any unkown files in clients/#/.
- readers will ignore any lines in clients/#/info starting with
an unrecognized keyword.
- they will ignore any unknown data at the end of
clients/#/opens.
That's in approximate decreasing order of my confidence in those rules
being observed, though I don't think any of those are too much to ask.
> > So, some random questions:
> >
> > - I just copied the major:minor:ino thing from /proc/locks, I
> > suspect we would have picked something different to identify
> > inodes if /proc/locks were done now. (Mount id and inode?
> > Something else?)
> >
>
> That does make it easy to correlate with the info in /proc/locks.
>
> We'd have a dentry here by virtue of the nfs4_file. Should we print a
> path in addition to this?
We could. It won't be 100% reliable, of course (unlinks, renames), but
it could still be convenient for human readers, and an optimization for
non-human readers trying to find an inode.
The filehandle might be a good idea too.
I wonder if there's any issue with line length, or with quantity of data
emitted by a single seq_file show method. The open owner can be up to
4K (after escaping), paths and filehandles can be long too.
> > - The open owner is just an opaque blob of binary data, but
> > clients may choose to include some useful asci-encoded
> > information, so I'm formatting them as strings with non-ascii
> > stuff escaped. For example, pynfs usually uses the name of
> > the test as the open owner. But as you see above, the ascii
> > content of the Linux client's open owners is less useful.
> > Also, there's no way I know of to map them back to a file
> > description or process or anything else useful on the client,
> > so perhaps they're of limited interest.
> >
> > - I'm not sure about the stateid either. I did think it might
> > be useful just as a unique identifier for each line.
> > (Actually for that it'd be enough to take just the third of
> > those four numbers making up the stateid--maybe that would be
> > better.)
>
> It'd be ideal to be able to easily correlate this info with what
> wireshark displays. Does wireshark display hashes for openowners? I know
> it does for stateids. If so, generating the same hash would be really
> nice.
>
> That said, waybe it's best to just dump the raw info out here though and
> rely on some postprocessing scripts for viewing it?
In that case, I think so, as I don't know how committed wireshark is to
the choice of hash.
> > In the "info" file, the "name" line is the client identifier/client
> > owner provided by the client, which (like the stateowner) is just opaque
> > binary data, though as you can see here the Linux client is providing a
> > readable ascii string.
> >
> > There's probably a lot more we could add to that info file eventually.
> >
> > Other stuff to add next:
> >
> > - nfsd/clients/#/kill that you can write to to revoke all a
> > client's state if it's wedged somehow.
>
> That would also be neat. We have a bit of code to support today that in
> the fault injection code, but it'll need some cleanup and wiring it into
> a knob here would be better.
OK, good, I'm working on that. Looks like fault injection gives up if
there are rpc's in process for the given client, whereas here I'd rather
force the expiry. Looks like that needs some straightforward waitqueue
logic to wait for the in-progress rpc's.
--b.
On Thu, Apr 25, 2019 at 02:04:59PM -0400, Jeff Layton wrote:
> More bikeshedding: should we have a "states" file instead of an "opens"
> file and print a different set of output for each stateid type?
Sure. The format of the file could be something like
<stateid> open rw -- <openowner>...
<stateid> lock r 0-EOF <lockowner>...
<stateid> deleg r
I wonder if we could put owners on separate lines and do some
heirarchical thing to show owner-stateid relationships? Hm. That's
kind of appealing but more work.
I was only planning to do opens for the first iteration, and I think
extending later in separate files is slightly safer.
More trivial, but: it'd lengthen lines and make columns line up less
often. But if we include a lot of long variable-length fields then
that's kinda hopeless anyway.
--b.
On Apr 25, 2019, at 4:04 PM, J. Bruce Fields <[email protected]> wrote:
>
> From: "J. Bruce Fields" <[email protected]>
>
> The following patches expose information about NFSv4 opens held by knfsd
> on behalf of NFSv4 clients. Those are currently invisible to userspace,
> unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
>
> The approach is to add a new directory /proc/fs/nfsd/clients/ with
> subdirectories for each active NFSv4 client. Each subdirectory has an
> "info" file with some basic information to help identify the client and
> an "opens" directory that lists the opens held by that client.
>
> I got it working by cobbling together some poorly-understood code I
> found in libfs, rpc_pipefs and elsewhere. If anyone wants to wade in
> and tell me what I've got wrong, they're more than welcome, but at this
> stage I'm more curious for feedback on the interface.
Is this in procfs, sysfs, or a separate NFSD-specific filesystem?
My understanding is that "complex" files are verboten in procfs and sysfs?
We've been going through a lengthy process to move files out of procfs
into sysfs and debugfs as a result (while trying to maintain some kind of
compatibility in the user tools), but if it is possible to use a separate
filesystem to hold all of the stats/parameters I'd much rather do that
than use debugfs (which has become root-access-only in newer kernels).
Cheers, Andreas
> I'm also cc'ing people responsible for lsof and util-linux in case they
> have any opinions.
>
> Currently these pseudofiles look like:
>
> # find /proc/fs/nfsd/clients -type f|xargs tail
> ==> /proc/fs/nfsd/clients/3741/opens <==
> 5cc0cd36/6debfb50/00000001/00000001 rw -- fd:10:13649 'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
> 5cc0cd36/6debfb50/00000003/00000001 r- -- fd:10:13650 'open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x0b\xb7\x89%\xfc\xef'
>
> ==> /proc/fs/nfsd/clients/3741/info <==
> clientid: 6debfb505cc0cd36
> address: 192.168.122.36:0
> name: Linux NFSv4.2 test2.fieldses.org
> minor version: 2
>
> Each line of the "opens" file is tab-delimited and describes one open,
> and the fields are stateid, open access bits, deny bits,
> major:minor:ino, and open owner.
>
> So, some random questions:
>
> - I just copied the major:minor:ino thing from /proc/locks, I
> suspect we would have picked something different to identify
> inodes if /proc/locks were done now. (Mount id and inode?
> Something else?)
>
> - The open owner is just an opaque blob of binary data, but
> clients may choose to include some useful asci-encoded
> information, so I'm formatting them as strings with non-ascii
> stuff escaped. For example, pynfs usually uses the name of
> the test as the open owner. But as you see above, the ascii
> content of the Linux client's open owners is less useful.
> Also, there's no way I know of to map them back to a file
> description or process or anything else useful on the client,
> so perhaps they're of limited interest.
>
> - I'm not sure about the stateid either. I did think it might
> be useful just as a unique identifier for each line.
> (Actually for that it'd be enough to take just the third of
> those four numbers making up the stateid--maybe that would be
> better.)
>
> In the "info" file, the "name" line is the client identifier/client
> owner provided by the client, which (like the stateowner) is just opaque
> binary data, though as you can see here the Linux client is providing a
> readable ascii string.
>
> There's probably a lot more we could add to that info file eventually.
>
> Other stuff to add next:
>
> - nfsd/clients/#/kill that you can write to to revoke all a
> client's state if it's wedged somehow.
> - lists of locks and delegations; lower priority since most of
> that information is already in /proc/locks.
>
> --b.
>
> J. Bruce Fields (10):
> nfsd: persist nfsd filesystem across mounts
> nfsd: rename cl_refcount
> nfsd4: use reference count to free client
> nfsd: add nfsd/clients directory
> nfsd: make client/ directory names small ints
> rpc: replace rpc_filelist by tree_descr
> nfsd4: add a client info file
> nfsd4: add file to display list of client's opens
> nfsd: expose some more information about NFSv4 opens
> nfsd: add more information to client info file
>
> fs/nfsd/netns.h | 6 +
> fs/nfsd/nfs4state.c | 228 ++++++++++++++++++++++++++++++---
> fs/nfsd/nfsctl.c | 225 +++++++++++++++++++++++++++++++-
> fs/nfsd/nfsd.h | 11 ++
> fs/nfsd/state.h | 9 +-
> fs/seq_file.c | 17 +++
> include/linux/seq_file.h | 2 +
> include/linux/string_helpers.h | 1 +
> lib/string_helpers.c | 5 +-
> net/sunrpc/rpc_pipe.c | 37 ++----
> 10 files changed, 491 insertions(+), 50 deletions(-)
>
> --
> 2.20.1
>
Cheers, Andreas
On Apr 25, 2019, at 10:14 PM, J. Bruce Fields <[email protected]> wrote:
>
> On Thu, Apr 25, 2019 at 02:04:59PM -0400, Jeff Layton wrote:
>> More bikeshedding: should we have a "states" file instead of an "opens"
>> file and print a different set of output for each stateid type?
>
> Sure. The format of the file could be something like
>
> <stateid> open rw -- <openowner>...
> <stateid> lock r 0-EOF <lockowner>...
> <stateid> deleg r
>
> I wonder if we could put owners on separate lines and do some
> heirarchical thing to show owner-stateid relationships? Hm. That's
> kind of appealing but more work.
My suggestion here would be to use YAML-formatted output rather than
space/tab separated positional fields. That can still be made human
readable, but also machine parseable and extensible if formatted properly.
Something like the following (use https://yaml-online-parser.appspot.com/
to validate the formatting):
- <stateid>: { state: open, mode: rw, owner: <openowner> }
- <stateid>: { state: lock, mode: r, start: 0, end: EOF, owner: <openowner> }
> I was only planning to do opens for the first iteration, and I think
> extending later in separate files is slightly safer.
>
> More trivial, but: it'd lengthen lines and make columns line up less
> often. But if we include a lot of long variable-length fields then
> that's kinda hopeless anyway.
>
> --b.
Cheers, Andreas
On Thu, Apr 25 2019, Andreas Dilger wrote:
> On Apr 25, 2019, at 4:04 PM, J. Bruce Fields <[email protected]> wrote:
>>
>> From: "J. Bruce Fields" <[email protected]>
>>
>> The following patches expose information about NFSv4 opens held by knfsd
>> on behalf of NFSv4 clients. Those are currently invisible to userspace,
>> unlike locks (/proc/locks) and local proccesses' opens (/proc/<pid>/).
>>
>> The approach is to add a new directory /proc/fs/nfsd/clients/ with
>> subdirectories for each active NFSv4 client. Each subdirectory has an
>> "info" file with some basic information to help identify the client and
>> an "opens" directory that lists the opens held by that client.
>>
>> I got it working by cobbling together some poorly-understood code I
>> found in libfs, rpc_pipefs and elsewhere. If anyone wants to wade in
>> and tell me what I've got wrong, they're more than welcome, but at this
>> stage I'm more curious for feedback on the interface.
>
> Is this in procfs, sysfs, or a separate NFSD-specific filesystem?
> My understanding is that "complex" files are verboten in procfs and sysfs?
> We've been going through a lengthy process to move files out of procfs
> into sysfs and debugfs as a result (while trying to maintain some kind of
> compatibility in the user tools), but if it is possible to use a separate
> filesystem to hold all of the stats/parameters I'd much rather do that
> than use debugfs (which has become root-access-only in newer kernels).
/proc/fs/nfsd is the (standard) mount point for a separate NFSD-specific
filesystem, originally created to replace the nfsd-specific
systemcall.
So the nfsd developers have a fair degree of latitude as to what can go
in there.
But I *don't* think it is a good idea to follow this pattern. Creating
a separate control filesystem for every different module that thinks it
has different needs doesn't scale well. We could end up with dozens of
tiny filesystems that all need to be mounted at just the right place. I
don't think that is healthy for Linus.
Nor do I think we should be stuffing stuff into debugfs that isn't
really for debugging. That isn't healthy either.
If sysfs doesn't meet our needs, then we need to raise that in
appropriate fora and present a clear case and try to build consensus -
because if we see a problem, then it is likely that others do to.
This is all presumably in the context of Lustre and while lustre is
out-of-tree we don't have a lot of leverage. So I wouldn't consider
pursuing anything here until we get back upstream.
NeilBrown
On Sat, Apr 27, 2019 at 09:55:23AM +1000, NeilBrown wrote:
> On Fri, Apr 26 2019, J. Bruce Fields wrote:
> > But it's true that from the start nfsd didn't really fit the model
> > of a single (possibly writeable) attribute per file.
>
> Depends on what you mean by that. Original files where write-only and
> where slightly complex attributes.
Yes I thought it was just those too, but then I looked at the original
commit it also included at least the "exports" file.
> Writing performed an action, like
> adding an entry to the export table (first you add a client, then add a
> client+filesystem to export it).
>
> This idea for a file performing an action, rather than presenting an
> attribute, is much the same as the "bind" and "unbind" files you can
> find in sysfs.
>
> (see also https://lwn.net/Articles/378884/ for examples of sysfs files
> that are not one-attribute-per-file)
I'll give that a re-read, thanks.
I did spend maybe a few minutes looking into basing nfsd code on kernfs
and didn't think it was worth it. I could take a more serious look.
--b.
On 25 Apr 2019, at 10:04, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <[email protected]>
>
> Add open modes, device numbers, inode numbers, and open owners to each
> line of nfsd/clients/#/opens.
>
> Open owners are totally opaque but seem to sometimes have some useful
> ascii strings included, so passing through printable ascii characters
> and escaping the rest seems useful while still being machine-readable.
> ---
> fs/nfsd/nfs4state.c | 40
> ++++++++++++++++++++++++++++------
> fs/nfsd/state.h | 2 +-
> fs/seq_file.c | 17 +++++++++++++++
> include/linux/seq_file.h | 2 ++
> include/linux/string_helpers.h | 1 +
> lib/string_helpers.c | 5 +++--
> 6 files changed, 57 insertions(+), 10 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 829d1e5440d3..f53621a65e60 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -42,6 +42,7 @@
> #include <linux/sunrpc/svcauth_gss.h>
> #include <linux/sunrpc/addr.h>
> #include <linux/jhash.h>
> +#include <linux/string_helpers.h>
> #include "xdr4.h"
> #include "xdr4cb.h"
> #include "vfs.h"
> @@ -2261,16 +2262,41 @@ static void opens_stop(struct seq_file *s,
> void *v)
> static int opens_show(struct seq_file *s, void *v)
> {
> struct nfs4_stid *st = v;
> - struct nfs4_ol_stateid *os;
> - u64 stateid;
> + struct nfs4_ol_stateid *ols;
> + struct nfs4_file *nf;
> + struct file *file;
> + struct inode *inode;
> + struct nfs4_stateowner *oo;
> + unsigned int access, deny;
>
> if (st->sc_type != NFS4_OPEN_STID)
> return 0; /* XXX: or SEQ_SKIP? */
> - os = openlockstateid(st);
> - /* XXX: get info about file, etc., here */
> + ols = openlockstateid(st);
> + oo = ols->st_stateowner;
> + nf = st->sc_file;
> + file = find_any_file(nf);
> + inode = d_inode(file->f_path.dentry);
> +
> + seq_printf(s, STATEID_FMT, STATEID_VAL(&st->sc_stateid));
Should we match the byte order printed with what wireshark/tshark sees?
For example, this will print:
5ccb016e/6d028c97/00000002/00000002 -w -- fd:00:9163439 'open
id:\x00\x00\x00.\x00\x00\x00\x00\x00\x00\x021\x8dp\xbe\xc7'
But, tshark -V prints:
Opcode: OPEN (18)
Status: NFS4_OK (0)
stateid
[StateID Hash: 0x8298]
seqid: 0x00000002
Data: 6e01cb5c978c026d02000000
[Data hash (CRC-32): 0x8391069f]
I think this is the first time we've exposed state ids to users from
knfsd,
I wonder if we should make it easier for everyone that might want to try
to
correlate this information with what they can see in a packet capture.
Ben
Benjamin Coddington <[email protected]> writes:
> On 25 Apr 2019, at 10:04, J. Bruce Fields wrote:
>
>> From: "J. Bruce Fields" <[email protected]>
>>
>> Add open modes, device numbers, inode numbers, and open owners to each
>> line of nfsd/clients/#/opens.
>>
>> Open owners are totally opaque but seem to sometimes have some useful
>> ascii strings included, so passing through printable ascii characters
>> and escaping the rest seems useful while still being machine-readable.
>> ---
>> fs/nfsd/nfs4state.c | 40
>> ++++++++++++++++++++++++++++------
>> fs/nfsd/state.h | 2 +-
>> fs/seq_file.c | 17 +++++++++++++++
>> include/linux/seq_file.h | 2 ++
>> include/linux/string_helpers.h | 1 +
>> lib/string_helpers.c | 5 +++--
>> 6 files changed, 57 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index 829d1e5440d3..f53621a65e60 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -42,6 +42,7 @@
>> #include <linux/sunrpc/svcauth_gss.h>
>> #include <linux/sunrpc/addr.h>
>> #include <linux/jhash.h>
>> +#include <linux/string_helpers.h>
>> #include "xdr4.h"
>> #include "xdr4cb.h"
>> #include "vfs.h"
>> @@ -2261,16 +2262,41 @@ static void opens_stop(struct seq_file *s,
>> void *v)
>> static int opens_show(struct seq_file *s, void *v)
>> {
>> struct nfs4_stid *st = v;
>> - struct nfs4_ol_stateid *os;
>> - u64 stateid;
>> + struct nfs4_ol_stateid *ols;
>> + struct nfs4_file *nf;
>> + struct file *file;
>> + struct inode *inode;
>> + struct nfs4_stateowner *oo;
>> + unsigned int access, deny;
>>
>> if (st->sc_type != NFS4_OPEN_STID)
>> return 0; /* XXX: or SEQ_SKIP? */
>> - os = openlockstateid(st);
>> - /* XXX: get info about file, etc., here */
>> + ols = openlockstateid(st);
>> + oo = ols->st_stateowner;
>> + nf = st->sc_file;
>> + file = find_any_file(nf);
Is there a matching fput() missing somewhere, or did I just not see it...?
>> + inode = d_inode(file->f_path.dentry);
>> +
>> + seq_printf(s, STATEID_FMT, STATEID_VAL(&st->sc_stateid));
>
> Should we match the byte order printed with what wireshark/tshark sees?
^^ +1
Thanks,
Andy
--
Andrew W. Elble
Infrastructure Engineer
Information and Technology Services
Rochester Institute of Technology
tel: (585)-475-2411 | [email protected]
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912
CONFIDENTIALITY NOTE: The information transmitted, including
attachments, is intended only for the person(s) or entity to which it
is addressed and may contain confidential and/or privileged material.
Any review, retransmission, dissemination or other use of, or taking of
any action in reliance upon this information by persons or entities
other than the intended recipient is prohibited. If you received this
in error, please contact the sender and destroy any copies of this
information.
On Thu, May 02, 2019 at 11:58:06AM -0400, Andrew W Elble wrote:
> Benjamin Coddington <[email protected]> writes:
>
> > On 25 Apr 2019, at 10:04, J. Bruce Fields wrote:
> >
> >> From: "J. Bruce Fields" <[email protected]>
> >>
> >> Add open modes, device numbers, inode numbers, and open owners to each
> >> line of nfsd/clients/#/opens.
> >>
> >> Open owners are totally opaque but seem to sometimes have some useful
> >> ascii strings included, so passing through printable ascii characters
> >> and escaping the rest seems useful while still being machine-readable.
> >> ---
> >> fs/nfsd/nfs4state.c | 40
> >> ++++++++++++++++++++++++++++------
> >> fs/nfsd/state.h | 2 +-
> >> fs/seq_file.c | 17 +++++++++++++++
> >> include/linux/seq_file.h | 2 ++
> >> include/linux/string_helpers.h | 1 +
> >> lib/string_helpers.c | 5 +++--
> >> 6 files changed, 57 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> >> index 829d1e5440d3..f53621a65e60 100644
> >> --- a/fs/nfsd/nfs4state.c
> >> +++ b/fs/nfsd/nfs4state.c
> >> @@ -42,6 +42,7 @@
> >> #include <linux/sunrpc/svcauth_gss.h>
> >> #include <linux/sunrpc/addr.h>
> >> #include <linux/jhash.h>
> >> +#include <linux/string_helpers.h>
> >> #include "xdr4.h"
> >> #include "xdr4cb.h"
> >> #include "vfs.h"
> >> @@ -2261,16 +2262,41 @@ static void opens_stop(struct seq_file *s,
> >> void *v)
> >> static int opens_show(struct seq_file *s, void *v)
> >> {
> >> struct nfs4_stid *st = v;
> >> - struct nfs4_ol_stateid *os;
> >> - u64 stateid;
> >> + struct nfs4_ol_stateid *ols;
> >> + struct nfs4_file *nf;
> >> + struct file *file;
> >> + struct inode *inode;
> >> + struct nfs4_stateowner *oo;
> >> + unsigned int access, deny;
> >>
> >> if (st->sc_type != NFS4_OPEN_STID)
> >> return 0; /* XXX: or SEQ_SKIP? */
> >> - os = openlockstateid(st);
> >> - /* XXX: get info about file, etc., here */
> >> + ols = openlockstateid(st);
> >> + oo = ols->st_stateowner;
> >> + nf = st->sc_file;
> >> + file = find_any_file(nf);
>
> Is there a matching fput() missing somewhere, or did I just not see it...?
Oops, fixed.
> >> + inode = d_inode(file->f_path.dentry);
> >> +
> >> + seq_printf(s, STATEID_FMT, STATEID_VAL(&st->sc_stateid));
> >
> > Should we match the byte order printed with what wireshark/tshark sees?
>
> ^^ +1
Yeah, I agree, I'm changing that to just be a "%16phN", which should
give us what wireshark does.
Thanks for the review!
--b.
On Thu, Apr 25, 2019 at 09:18:04PM -0400, J. Bruce Fields wrote:
> On Thu, Apr 25, 2019 at 11:14:23PM +0200, Andreas Dilger wrote:
> > On Apr 25, 2019, at 10:14 PM, J. Bruce Fields <[email protected]> wrote:
> > >
> > > On Thu, Apr 25, 2019 at 02:04:59PM -0400, Jeff Layton wrote:
> > >> More bikeshedding: should we have a "states" file instead of an "opens"
> > >> file and print a different set of output for each stateid type?
> > >
> > > Sure. The format of the file could be something like
> > >
> > > <stateid> open rw -- <openowner>...
> > > <stateid> lock r 0-EOF <lockowner>...
> > > <stateid> deleg r
> > >
> > > I wonder if we could put owners on separate lines and do some
> > > heirarchical thing to show owner-stateid relationships? Hm. That's
> > > kind of appealing but more work.
> >
> > My suggestion here would be to use YAML-formatted output rather than
> > space/tab separated positional fields. That can still be made human
> > readable, but also machine parseable and extensible if formatted properly.
>
> Well, anything we do will be machine-parseable. But I can believe YAML
> would make future extension easier. It doesn't look like it would be
> more complicated to generate. It uses C-style escaping (like \x32) so
> there'd be no change to how we format binary blobs.
>
> The field names make it a tad more verbose but I guess it's not too bad.
OK, I tried changing "opens" to "states" and using YAML. Example output:
- 0x020000006a5fdc5c4ad09d9e01000000: { type: open, access: rw, deny: --, superblock: "fd:10:13649", owner: "open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x0046��QH " }
- 0x010000006a5fdc5c4ad09d9e03000000: { type: open, access: r-, deny: --, superblock: "fd:10:13650", owner: "open id:\x00\x00\x00&\x00\x00\x00\x00\x00\x0046��QH" }
- 0x010000006a5fdc5c4ad09d9e04000000: { type: deleg, access: r, superblock: "fd:10:13650" }
- 0x010000006a5fdc5c4ad09d9e06000000: { type: lock, superblock: "fd:10:13649", owner: "lock id:\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x00\x00" }
The parser Andreas suggested (https://yaml-online-parser.appspot.com/)
accepts these. It also thinks strings are always in a unicode encoding
of some kind, which they aren't. The owners are arbitrary series of
bytes but I'd like at least any ascii parts to be human readable, and
I'm a little stuck on how to do that.
--b.