2018-12-10 17:47:24

by J. Bruce Fields

Subject: listing knfsd-held locks and opens

We've got a long-standing complaint that tools like lsof, when run on an
NFS server, overlook opens and locks held by NFS clients.

The information's all there, it's just a question of how to expose it.

Easiest might be a single flat file like /proc/locks, but I've always
hoped we could do something slightly more structured, using a
subdirectory per NFS client.

Jeff Layton looked into this several years ago. I don't remember if
there was some particular issue or if he just got bogged down in VFS
details.

My concerns are that:

- I'd like the format to be easily expandable. The option to
create new files seems like it would help.
- some of the data we'd like to expose may be kind of cumbersome
to include as a column in a text file. (I'm thinking of the
NFSv4 client identifier, which the protocol allows to be up to
1K of binary data, even if most (all?) clients use shorter
ascii identifiers.)

I'm not sure I'd want to go as far as a sysfs-like one-value-per-file
rule, which seems like overkill?

In a little more detail, as a starting point, I was considering naming
each client directory with a small integer, and including files like:

	info: a text file with
		NFS protocol version
		ascii representation of client address
		krb5 principal if available

	clientid: NFSv4 client ID; file absent for NFSv2/3 clients.

	locks: list of locks, following something like the /proc/locks
		format.

	opens: list of file opens, with access bits, inode numbers,
		device number.
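
To make that concrete, here's roughly what I'd expect for one client
(everything below is invented for illustration, including the
"clients" directory name):

	$ ls /proc/fs/nfsd/clients/17
	clientid  info  locks  opens
	$ cat /proc/fs/nfsd/clients/17/info
	version: 4.1
	address: 192.0.2.17:739
	principal: nfs/[email protected]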

Does that sound reasonable? Any other ideas?

--b.


2018-12-10 17:49:10

by Chuck Lever III

Subject: Re: listing knfsd-held locks and opens



> On Dec 10, 2018, at 12:47 PM, [email protected] wrote:
>
> We've got a long-standing complaint that tools like lsof, when run on an
> NFS server, overlook opens and locks held by NFS clients.
>
> The information's all there, it's just a question of how to expose it.
>
> Easiest might be a single flat file like /proc/locks, but I've always
> hoped we could do something slightly more structured, using a
> subdirectory per NFS client.
>
> Jeff Layton looked into this several years ago. I don't remember if
> there was some particular issue or if he just got bogged down in VFS
> details.
>
> My concerns are that:
>
> - I'd like the format to be easily expandable. The option to
> create new files seems like it would help.
> - some of the data we'd like to expose may be kind of cumbersome
> to include as a column in a text file. (I'm thinking of the
> NFSv4 client identifier, which the protocol allows to be up to
> 1K of binary data, even if most (all?) clients use shorter
> ascii identifiers.)
>
> I'm not sure I'd want to go as far as a sysfs-like one-value-per-file
> rule, which seems like overkill?
>
> In a little more detail, as a starting point, I was considering naming
> each client directory with a small integer, and including files like:
>
> info: a text file with
> NFS protocol version
> ascii representation of client address
> krb5 principal if available
>
> clientid: NFSv4 client ID; file absent for NFSv2/3 clients.
>
> locks: list of locks, following something like the /proc/locks
> format.
>
> opens: list of file opens, with access bits, inode numbers,
> device number.
>
> Does that sound reasonable? Any other ideas?

How do you plan to make this kernel API namespace-aware?

--
Chuck Lever




2018-12-10 18:12:41

by Jeff Layton

Subject: Re: listing knfsd-held locks and opens

On Mon, 2018-12-10 at 12:47 -0500, J. Bruce Fields wrote:
> We've got a long-standing complaint that tools like lsof, when run on an
> NFS server, overlook opens and locks held by NFS clients.
>
> The information's all there, it's just a question of how to expose it.
>
> Easiest might be a single flat file like /proc/locks, but I've always
> hoped we could do something slightly more structured, using a
> subdirectory per NFS client.
>
> Jeff Layton looked into this several years ago. I don't remember if
> there was some particular issue or if he just got bogged down in VFS
> details.
>

I think I had a patch that generated a single flat file for locks, but
you wanted to present a directory or file per-client, and I just never
got around to reworking the earlier patch.

> My concerns are that:
>
> - I'd like the format to be easily expandable. The option to
> create new files seems like it would help.

Yes. We do need to bear in mind that someone will eventually write
scripts to scrape this info. We may want to consider this a formal part
of kernel ABI from the get-go and tread carefully when making changes
after the initial merge.

> - some of the data we'd like to expose may be kind of cumbersome
> to include as a column in a text file. (I'm thinking of the
> NFSv4 client identifier, which the protocol allows to be up to
> 1K of binary data, even if most (all?) clients use shorter
> ascii identifiers.)
>
> I'm not sure I'd want to go as far as a sysfs-like one-value-per-file
> rule, which seems like overkill?
>
> In a little more detail, as a starting point, I was considering naming
> each client directory with a small integer, and including files like:
>
> info: a text file with
> NFS protocol version
> ascii representation of client address
> krb5 principal if available
>
> clientid: NFSv4 client ID; file absent for NFSv2/3 clients.
>
> locks: list of locks, following something like the /proc/locks
> format.
>
> opens: list of file opens, with access bits, inode numbers,
> device number.
>
> Does that sound reasonable? Any other ideas?
>

That sounds like a great start. Some ideas:

The locks file could also list delegations and layouts, but it might be
good to do them in separate files. That would make it cleaner to display
info that is only relevant to those types (recall info, in particular).

You might also consider adding a v4-specific info file. Show things
like "when was the last lease renewal"?
--
Jeff Layton <[email protected]>


2018-12-10 19:00:51

by J. Bruce Fields

Subject: Re: listing knfsd-held locks and opens

On Mon, Dec 10, 2018 at 12:49:02PM -0500, Chuck Lever wrote:
> > On Dec 10, 2018, at 12:47 PM, [email protected] wrote:
> > In a little more detail, as a starting point, I was considering naming
> > each client directory with a small integer, and including files like:
> >
> > info: a text file with
> > NFS protocol version
> > ascii representation of client address
> > krb5 principal if available
> >
> > clientid: NFSv4 client ID; file absent for NFSv2/3 clients.
> >
> > locks: list of locks, following something like the /proc/locks
> > format.
> >
> > opens: list of file opens, with access bits, inode numbers,
> > device number.
> >
> > Does that sound reasonable? Any other ideas?
>
> How do you plan to make this kernel API namespace-aware?

I may have some details wrong, but:

We associate most of nfsd's state with the network namespace. The
"nfsd" pseudofilesystem is where I'm thinking I might put this, and it
inherits the network namespace from the process that calls mount and
stores it in the superblock. So each mount done in a different net
namespace should get its own superblock, its own inodes, etc.

(I think that's how proc works, too?)
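
For reference, the mount path in fs/nfsd/nfsctl.c today is roughly
(lightly paraphrased):

	/* The nfsd filesystem captures the mounting process's network
	 * namespace, and mount_ns() keys the superblock off it, so each
	 * net namespace gets its own superblock, inodes, etc. */
	static struct dentry *nfsd_mount(struct file_system_type *fs_type,
		int flags, const char *dev_name, void *data)
	{
		struct net *net = current->nsproxy->net_ns;

		return mount_ns(fs_type, flags, data, net, net->user_ns,
				nfsd_fill_super);
	}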

So when you set up containerized nfsd, you mount a new nfsd filesystem
in each container. And the list of clients visible there should only be
the ones visible to that namespace.

I guess that means that if you share an export across multiple
containers and want to find all the clients locking a given file, you
have to iterate over all the containers' "nfsd" mounts.

I suspect that's what we have to do, though. I mean, the client
addresses, for example, may not even make sense unless you know which
network namespace they come from.

(On the other hand... how does /proc/locks actually work? Looks to me
like it always lists every lock on the system. Can it really translate
any process on the system into a pid that makes sense in any container?
I'm not following that, from a brief look at the code.)
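
The bit I was looking at, for reference (from locks_show() in
fs/locks.c, quoted loosely, so I may have it wrong):

	/* the pid namespace comes from whichever /proc instance is
	 * being read: */
	struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info;

	if (locks_translate_pid(fl, proc_pidns) == 0)
		return 0;	/* so locks it can't translate just get skipped? */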

--b.

2018-12-10 19:23:12

by J. Bruce Fields

Subject: Re: listing knfsd-held locks and opens

On Mon, Dec 10, 2018 at 01:12:31PM -0500, Jeff Layton wrote:
> On Mon, 2018-12-10 at 12:47 -0500, J. Bruce Fields wrote:
> > We've got a long-standing complaint that tools like lsof, when run on an
> > NFS server, overlook opens and locks held by NFS clients.
> >
> > The information's all there, it's just a question of how to expose it.
> >
> > Easiest might be a single flat file like /proc/locks, but I've always
> > hoped we could do something slightly more structured, using a
> > subdirectory per NFS client.
> >
> > Jeff Layton looked into this several years ago. I don't remember if
> > there was some particular issue or if he just got bogged down in VFS
> > details.
> >
>
> I think I had a patch that generated a single flat file for locks, but
> you wanted to present a directory or file per-client, and I just never
> got around to reworking the earlier patch.

Oh, OK, makes sense.

> That sounds like a great start. Some ideas:
>
> The locks file could also list delegations and layouts, but it might be
> good to do them in separate files. That would make it cleaner to display
> info that is only relevant to those types (recall info, in particular).
>
> You might also consider adding a v4-specific info file. Show things like
> "when was last lease renewal"?

Yes, good ideas. I hope this could be expandable in such a way that we
don't need all of that at the start.

I also had some idea that we might eventually benefit from some
two-way communication. But the only idea I had there was some sort of
"destroy this client now" operation, which is probably less important
for NFSv4 state, since it gets cleaned up automatically on lease expiry.
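
I'm picturing something like a per-client file you could write a
command to. A completely untested sketch; it assumes expire_client()
(today static to fs/nfsd/nfs4state.c, and needing locking that's
hand-waved here) or some equivalent were callable:

	/* hypothetical: write "expire" to a per-client ctl file to tear
	 * down all of that client's state immediately */
	static ssize_t client_ctl_write(struct file *file, const char __user *buf,
					size_t size, loff_t *pos)
	{
		struct nfs4_client *clp = file_inode(file)->i_private;
		char cmd[16];

		if (size == 0 || size >= sizeof(cmd))
			return -EINVAL;
		if (copy_from_user(cmd, buf, size))
			return -EFAULT;
		cmd[size] = '\0';

		if (strncmp(cmd, "expire", 6))
			return -EINVAL;

		expire_client(clp);
		return size;
	}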

--b.

2018-12-10 19:53:48

by J. Bruce Fields

Subject: Re: listing knfsd-held locks and opens

On Mon, Dec 10, 2018 at 02:23:10PM -0500, J. Bruce Fields wrote:
> On Mon, Dec 10, 2018 at 01:12:31PM -0500, Jeff Layton wrote:
> > On Mon, 2018-12-10 at 12:47 -0500, J. Bruce Fields wrote:
> > > We've got a long-standing complaint that tools like lsof, when run on an
> > > NFS server, overlook opens and locks held by NFS clients.
> > >
> > > The information's all there, it's just a question of how to expose it.
> > >
> > > Easiest might be a single flat file like /proc/locks, but I've always
> > > hoped we could do something slightly more structured, using a
> > > subdirectory per NFS client.
> > >
> > > Jeff Layton looked into this several years ago. I don't remember if
> > > there was some particular issue or if he just got bogged down in VFS
> > > details.
> > >
> >
> > I think I had a patch that generated a single flat file for locks, but
> > you wanted to present a directory or file per-client, and I just never
> > got around to reworking the earlier patch.
>
> Oh, OK, makes sense.

(But, um, if anyone has a good starting point to recommend to me here,
I'm interested. E.g. another pseudofs that's a good example to follow.)

--b.

2018-12-10 23:35:06

by Jeff Layton

Subject: Re: listing knfsd-held locks and opens

On Mon, 2018-12-10 at 14:53 -0500, J. Bruce Fields wrote:
> On Mon, Dec 10, 2018 at 02:23:10PM -0500, J. Bruce Fields wrote:
> > On Mon, Dec 10, 2018 at 01:12:31PM -0500, Jeff Layton wrote:
> > > On Mon, 2018-12-10 at 12:47 -0500, J. Bruce Fields wrote:
> > > > We've got a long-standing complaint that tools like lsof, when run on an
> > > > NFS server, overlook opens and locks held by NFS clients.
> > > >
> > > > The information's all there, it's just a question of how to expose it.
> > > >
> > > > Easiest might be a single flat file like /proc/locks, but I've always
> > > > hoped we could do something slightly more structured, using a
> > > > subdirectory per NFS client.
> > > >
> > > > Jeff Layton looked into this several years ago. I don't remember if
> > > > there was some particular issue or if he just got bogged down in VFS
> > > > details.
> > > >
> > >
> > > I think I had a patch that generated a single flat file for locks, but
> > > you wanted to present a directory or file per-client, and I just never
> > > got around to reworking the earlier patch.
> >
> > Oh, OK, makes sense.
>
> (But, um, if anyone has a good starting point to recommend to me here,
> I'm interested. E.g. another pseudofs that's a good example to follow.)
>

I looked for the branch, but I can't find it now. It may be possible to
find my original posting of it on the mailing list, but it has been
years. I'm pretty sure it'd be badly bitrotted by now anyway.

Where do you intend for this to live? Do you plan to build a new
hierarchy under /proc/fs/nfsd, or use something like sysfs or debugfs?

> I also had some idea that we might eventually benefit from some
> two-way communication. But the only idea I had there was some sort of
> "destroy this client now" operation, which is probably less important
> for NFSv4 state, since it gets cleaned up automatically on lease expiry.
>

Per-client cancellation sounds like a nice feature. The fault injection
code had some (less granular) stuff for killing off live clients. It may
be worth going over that.

--
Jeff Layton <[email protected]>