2023-08-08 22:54:46

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH] NFSD: add version field to nfsd_rpc_status_show handler

On Wed, 09 Aug 2023, Chuck Lever wrote:
> On Tue, Aug 08, 2023 at 10:20:44AM -0400, Jeff Layton wrote:
> >
> > It would probably be fairly simple to output well-formed yaml instead.
> > JSON and XML are a bit more of a pain.
>
> If folks don't mind, I would like more structured output like one of
> these self-documenting formats. (I know I said I didn't care before,
> but I'm beginning to care now ;-)

Lustre, which I am somewhat involved with, uses YAML for various things.
If someone else introduced yaml-producing sysfs files to the kernel
first, that might make the path for lustre smoother :-)

Another option is netlink which lustre is stating to use for
configuration and stats. It is a self-describing format. The code
looks verbose, but it is widely used in the kernel and so well supported.

>
> I'm also wondering if we really ought not add another file under
> /proc, which is essentially obsolete. Would /sys/fs/nfsd/yada be
> better for this facility?

It is only under /proc because that is where it is mounted by default :-)
I think it might be sensible to create a node under /sys where all the
content of the nfsd filesystem also appears.
I'm not keen on /sys/fs/nfsd because nfsd isn't a filesystem, it is a
service.
But the /sys/fs seems to be largely unstructured (/proc v2??) so maybe
putting nfsd there is OK. We could claim that /proc/fs/nfsd is a
filesystem :-)

>
> I hesitate to even mention network namespaces...

Please do mention them - I find them too easy to forget about.
/proc/fs/nfsd/ inherits the network namespace from whoever mounts it.
So this can work perfectly.
If we created a mirror in /sys/ we would presumably use the namespace of
the process that opens the file.

Thanks,
NeilBrown


2023-08-09 01:54:48

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH] NFSD: add version field to nfsd_rpc_status_show handler

On Wed, Aug 09, 2023 at 07:45:06AM +1000, NeilBrown wrote:
> On Wed, 09 Aug 2023, Chuck Lever wrote:
> > On Tue, Aug 08, 2023 at 10:20:44AM -0400, Jeff Layton wrote:
> > >
> > > It would probably be fairly simple to output well-formed yaml instead.
> > > JSON and XML are a bit more of a pain.
> >
> > If folks don't mind, I would like more structured output like one of
> > these self-documenting formats. (I know I said I didn't care before,
> > but I'm beginning to care now ;-)
>
> Lustre, which I am somewhat involved with, uses YAML for various things.
> If someone else introduced yaml-producing sysfs files to the kernel
> first, that might make the path for lustre smoother :-)

It worries me that there isn't yet kernel infrastructure for
formating yaml in sysfs files. That broadens the scope of this
work significantly.


> Another option is netlink which lustre is stating to use for
> configuration and stats. It is a self-describing format. The code
> looks verbose, but it is widely used in the kernel and so well supported.

I just spent the last 6 months building a netlink upcall to handle
TLS handshake requests for in-kernel TLS consumers. It is built on
the recently-added yaml netlink specs and code generator. The yaml
netlink specs are kept under:

Documentation/netlink/specs/

Using netlink would give us a lot of infrastructure for this
facility, but I'm not sure it's worth the extra complexity. And it
would /require/ the use of user space tooling (ie, not 'cat') to get
to the information exported from the kernel. <shrug>


> > I'm also wondering if we really ought not add another file under
> > /proc, which is essentially obsolete. Would /sys/fs/nfsd/yada be
> > better for this facility?
>
> It is only under /proc because that is where it is mounted by default :-)
> I think it might be sensible to create a node under /sys where all the
> content of the nfsd filesystem also appears.

There are things in the nfsd filesystem that really belong under
/proc/net/rpc or elsewhere, so IMO such migration needs to be
handled on a case-by-case basis -- different project for another
time.


> I'm not keen on /sys/fs/nfsd because nfsd isn't a filesystem, it is a
> service.

How about /sys/module/nfsd ?


> > I hesitate to even mention network namespaces...
>
> Please do mention them - I find them too easy to forget about.
> /proc/fs/nfsd/ inherits the network namespace from whoever mounts it.
> So this can work perfectly.
> If we created a mirror in /sys/ we would presumably use the namespace of
> the process that opens the file.

I agree: the network namespace of the process that opens the
rpc_status file is just what we want to limit access to in-flight
requests. The current network namespace of each thread is available
via SVC_NET(rqst), so it should be quite simple to display only
in-flight requests that match the opener's namespace.


--
Chuck Lever

2023-08-09 02:11:13

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH] NFSD: add version field to nfsd_rpc_status_show handler

On Wed, 09 Aug 2023, Chuck Lever wrote:
> On Wed, Aug 09, 2023 at 07:45:06AM +1000, NeilBrown wrote:
> > On Wed, 09 Aug 2023, Chuck Lever wrote:
> > > On Tue, Aug 08, 2023 at 10:20:44AM -0400, Jeff Layton wrote:
> > > >
> > > > It would probably be fairly simple to output well-formed yaml instead.
> > > > JSON and XML are a bit more of a pain.
> > >
> > > If folks don't mind, I would like more structured output like one of
> > > these self-documenting formats. (I know I said I didn't care before,
> > > but I'm beginning to care now ;-)
> >
> > Lustre, which I am somewhat involved with, uses YAML for various things.
> > If someone else introduced yaml-producing sysfs files to the kernel
> > first, that might make the path for lustre smoother :-)
>
> It worries me that there isn't yet kernel infrastructure for
> formating yaml in sysfs files. That broadens the scope of this
> work significantly.
>
>
> > Another option is netlink which lustre is stating to use for
> > configuration and stats. It is a self-describing format. The code
> > looks verbose, but it is widely used in the kernel and so well supported.
>
> I just spent the last 6 months building a netlink upcall to handle
> TLS handshake requests for in-kernel TLS consumers. It is built on
> the recently-added yaml netlink specs and code generator. The yaml
> netlink specs are kept under:
>
> Documentation/netlink/specs/
>
> Using netlink would give us a lot of infrastructure for this
> facility, but I'm not sure it's worth the extra complexity. And it
> would /require/ the use of user space tooling (ie, not 'cat') to get
> to the information exported from the kernel. <shrug>
>

I do like the "cat" approach. Unfortunately it doesn't scale and you
never really know when it needs to scale.
The nfsd/rpc cache.c auth cache is still a sore point for me. It works
nicely expect that it breaks for gss because the keys get too big. So
we've had a couple of attempts to "fix" that. The fixes work, but they
are *different*.

The other well known pain point is /proc/mounts. It is really cool that
you can "cat" that, but with thousands of mounts it can take tools like
systemd a long time to find changes.

does any of that matter for collecting stats? Will we hit a wall? I
really don't know. I'd like to think that we won't....

>
> > > I'm also wondering if we really ought not add another file under
> > > /proc, which is essentially obsolete. Would /sys/fs/nfsd/yada be
> > > better for this facility?
> >
> > It is only under /proc because that is where it is mounted by default :-)
> > I think it might be sensible to create a node under /sys where all the
> > content of the nfsd filesystem also appears.
>
> There are things in the nfsd filesystem that really belong under
> /proc/net/rpc or elsewhere, so IMO such migration needs to be
> handled on a case-by-case basis -- different project for another
> time.

abolutely.

>
>
> > I'm not keen on /sys/fs/nfsd because nfsd isn't a filesystem, it is a
> > service.
>
> How about /sys/module/nfsd ?

Not worse than /sys/fs/nfsd. Not really better though.
Everything in /sys/module/foo is about the module as a chunk of code -
except "parameters". I guess we could add "state".

Maybe configfs is the thing ... but I never liked configfs. It seems
like a solution in search of a problem.

I complained that /sys/fs is like the provfs-v2. It is more that
everything other than devices (and block,bus,class) is procfs-v2.
There are little bits of regularity, but no big-picture.

I would probably argue for /sys/services/sunrpc/{nfsd,lockd,nfsv4.1-cb}

Alternately, we could go for /sys/devices/virtual/sunrpc. The virtual
directory contains "workqueue" which is a service in some sense, and
contains subdirectories for specific work-queues.
I'm not sure that all of the nfsd stuff would fit under there...

but maybe I'm over-thinking things.

Thanks,
NeilBrown


>
>
> > > I hesitate to even mention network namespaces...
> >
> > Please do mention them - I find them too easy to forget about.
> > /proc/fs/nfsd/ inherits the network namespace from whoever mounts it.
> > So this can work perfectly.
> > If we created a mirror in /sys/ we would presumably use the namespace of
> > the process that opens the file.
>
> I agree: the network namespace of the process that opens the
> rpc_status file is just what we want to limit access to in-flight
> requests. The current network namespace of each thread is available
> via SVC_NET(rqst), so it should be quite simple to display only
> in-flight requests that match the opener's namespace.
>
>
> --
> Chuck Lever
>