2024-01-25 19:41:16

by Jeffrey Layton

[permalink] [raw]
Subject: Should we establish a new nfsdctl userland program?

The existing rpc.nfsd program was designed during a different time, when
we just didn't require that much control over how it behaved. It's
klunky to work with.

In a response to Chuck's recent RFC patch to add knob to disable
READ_PLUS calls, I mentioned that it might be a good time to make a
clean break from the past and start a new program for controlling nfsd.

Here's what I'm thinking:

Let's build a swiss-army-knife kind of interface like git or virsh:

# nfsdctl?stats <--- fetch the new stats that got merged
# nfsdctl add_listener <--- add a new listen socket, by address or hostname
# nfsdctl set v3 on <--- enable NFSv3
# nfsdctl set splice_read off <--- disable splice reads (per Chuck's recent patch)
# nfsdctl set threads 128 <--- spin up the threads

We could start with just the bare minimum for now (the stats interface),
and then expand on it. Once we're at feature parity with rpc.nfsd, we'd
want to have systemd preferentially use nfsdctl instead of rpc.nfsd to
start and stop the server. systemd will also need to fall back to using
rpc.nfsd if nfsdctl or the netlink program isn't present.

Note that I think this program will have to be a compiled binary vs. a
python script or the like, given that it'll be involved in system
startup.

It turns out that Lorenzo already has a C program that has a lot of the
plumbing we'd need:

https://github.com/LorenzoBianconi/nfsd-netlink

I think it might be good to clean up the interface a bit, build a
manpage and merge that into nfs-utils.

Questions:

1/ one big binary, or smaller nfsdctl-* programs (like git uses)?

2/ should it automagically read in nfs.conf? (I tend to think it should,
but we might want an option to disable that)

3/ should "set threads" activate the server, or just set a count, and
then we do a separate activation step to start it? If we want that, then
we may want to twiddle the proposed netlink interface a bit.

I'm sure other questions will arise as we embark on this too.

Thoughts? Anyone have objections to this idea?
--
Jeff Layton <[email protected]>


2024-01-25 19:56:30

by Cedric Blancher

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

On Thu, 25 Jan 2024 at 20:41, Jeff Layton <[email protected]> wrote:
>
> The existing rpc.nfsd program was designed during a different time, when
> we just didn't require that much control over how it behaved. It's
> klunky to work with.
>
> In a response to Chuck's recent RFC patch to add knob to disable
> READ_PLUS calls, I mentioned that it might be a good time to make a
> clean break from the past and start a new program for controlling nfsd.
>
> Here's what I'm thinking:
>
> Let's build a swiss-army-knife kind of interface like git or virsh:
>
> # nfsdctl stats <--- fetch the new stats that got merged
> # nfsdctl add_listener <--- add a new listen socket, by address or hostname

Absolutely NOT "hostname". Please do not repeat the mistake AGAIN of
separating "host name" and "TCP port", as they are both required to
find the server. Every 10 or 15 years the same mistake is made by the
next generation of software engineers.

https://datatracker.ietf.org/doc/html/rfc1738 clearly defined
"hostport", and that is what should be used here.

Ced
--
Cedric Blancher <[email protected]>
[https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur

2024-01-25 20:25:55

by Chuck Lever

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?



> On Jan 25, 2024, at 2:54 PM, Cedric Blancher <[email protected]> wrote:
>
> On Thu, 25 Jan 2024 at 20:41, Jeff Layton <[email protected]> wrote:
>>
>> The existing rpc.nfsd program was designed during a different time, when
>> we just didn't require that much control over how it behaved. It's
>> klunky to work with.
>>
>> In a response to Chuck's recent RFC patch to add knob to disable
>> READ_PLUS calls, I mentioned that it might be a good time to make a
>> clean break from the past and start a new program for controlling nfsd.
>>
>> Here's what I'm thinking:
>>
>> Let's build a swiss-army-knife kind of interface like git or virsh:
>>
>> # nfsdctl stats <--- fetch the new stats that got merged
>> # nfsdctl add_listener <--- add a new listen socket, by address or hostname
>
> Absolutely NOT "hostname". Please do not repeat the mistake AGAIN of
> separating "host name" and "TCP port", as they are both required to
> find the server. Every 10 or 15 years the same mistake is made by the
> next generation of software engineers.

I don't see how this is a mistake.

> port
> The port number to connect to. Most schemes designate
> protocols that have a default port number. Another port number
> may optionally be supplied, in decimal, separated from the
> host by a colon. If the port is omitted, the colon is as well.

NFS has a default port number. Thus, even RFC 1738 states that
"hostname" is just fine. It means "DNS label and default port".

Most usage scenarios will prefer the shorthand of leaving off the
port. So engineers seem to be designing interfaces for the most
common usage, don't you think?


> https://datatracker.ietf.org/doc/html/rfc1738 clearly defined
> "hostport", and that is what should be used here.

RFC 1738 was published very clearly before the widespread use of
IPv6 addresses, which use a : to separate the components of an
IP address.

Certainly the add_listener subcommand can take a port, but there's
plenty of room to add that in other ways.

For example, we might want:

# nfsdctl add_listener

xprt <udp|tcp|rdma>

host <DNS label> | addr <IPv4 or IPv6 address>

and optionally

port <listener port>

If port is not specified, the default port for the xprt type
is assumed. (eg, 2049 or 20049)

Eventually, also:

# nfsdctl del_listener ... with similar arguments

--
Chuck Lever


2024-01-25 21:12:10

by NeilBrown

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

On Fri, 26 Jan 2024, Jeff Layton wrote:
> The existing rpc.nfsd program was designed during a different time, when
> we just didn't require that much control over how it behaved. It's
> klunky to work with.

How is it clunky?

rpc.nfsd

that starts the service.

rpc.nfsd 0

that stops the service.

Ok, not completely elegant. Maybe

nfsdctl start
nfsdctl stop

would be better.

>
> In a response to Chuck's recent RFC patch to add knob to disable
> READ_PLUS calls, I mentioned that it might be a good time to make a
> clean break from the past and start a new program for controlling nfsd.
>
> Here's what I'm thinking:
>
> Let's build a swiss-army-knife kind of interface like git or virsh:
>
> # nfsdctl stats <--- fetch the new stats that got merged
> # nfsdctl add_listener <--- add a new listen socket, by address or hostname
> # nfsdctl set v3 on <--- enable NFSv3
> # nfsdctl set splice_read off <--- disable splice reads (per Chuck's recent patch)
> # nfsdctl set threads 128 <--- spin up the threads

Sure the "git" style would use

nfsdctl version 3 on
nfsdctl threads 128

Apart from "stats", "start", "stop", I suspect that we developers would
be the only people to actually use this functionality. Until now,
echo > /proc/sys/nfsd/foo
has been enough for most tweeking. Having a proper tool would likely
lower the barrier to entry, which can only be a good thing.

>
> We could start with just the bare minimum for now (the stats interface),
> and then expand on it. Once we're at feature parity with rpc.nfsd, we'd
> want to have systemd preferentially use nfsdctl instead of rpc.nfsd to
> start and stop the server. systemd will also need to fall back to using
> rpc.nfsd if nfsdctl or the netlink program isn't present.

systemd doesn't need a fallback. Systemd always activates
nfs-server.service. We just need to make sure the installed
nfs-server.service matches the installed tools, and as they are
distributed as parts of the same package, that should be trivial.

>
> Note that I think this program will have to be a compiled binary vs. a
> python script or the like, given that it'll be involved in system
> startup.

Agreed.

>
> It turns out that Lorenzo already has a C program that has a lot of the
> plumbing we'd need:
>
> https://github.com/LorenzoBianconi/nfsd-netlink
>
> I think it might be good to clean up the interface a bit, build a
> manpage and merge that into nfs-utils.
>
> Questions:
>
> 1/ one big binary, or smaller nfsdctl-* programs (like git uses)?

/usr/lib/git-core (on my laptop) has 168 entries. Only 29 of them are
NOT symlinks to 'git'.

While I do like the "tool command args" interface, and I like the option
of adding commands by simply creating drop-in tools, I think that core
functionality should go in the core tool.
So: "one big binary" please - with call-out functionality if anyone can
be bothered implementing it.

>
> 2/ should it automagically read in nfs.conf? (I tend to think it should,
> but we might want an option to disable that)

Absolutely definitely. I'm not convinced we need an option to disable
config, but allowing options to over-ride specific configs is sensible.

Most uses of this tool would come from nfs-server.service which would
presumably call
nfsdctl start
which would set everything based on the nfs.conf and thus start the
server. And
nfsdctl stop
which would set the number of threads to zero.

>
> 3/ should "set threads" activate the server, or just set a count, and
> then we do a separate activation step to start it? If we want that, then
> we may want to twiddle the proposed netlink interface a bit.

It might be sensible to have "set max-threads" which doesn't actually
start the service.
I would really REALLY like a dynamic thread pool. It would start at 1
(or maybe 2) and grow on demand up to the max, and idle threads
(inactive for 30 seconds?) would exit. We could then default the max to
some function of memory size and people could mostly ignore the
num-threads setting.

I don't have patches today, but if we are re-doing the interfaces I
would like us to plan the interfaces to support a pool rather than a
fixed number.

>
> I'm sure other questions will arise as we embark on this too.
>
> Thoughts? Anyone have objections to this idea?

I think this is an excellent question to ask. As you say it is a long
time since rpc.nfsd was created, and it has grown incrementally rather
then being clearly designed.

> --
> Jeff Layton <[email protected]>
>

Thanks,
NeilBrown

2024-01-25 23:04:29

by Jeffrey Layton

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

On Fri, 2024-01-26 at 08:11 +1100, NeilBrown wrote:
> On Fri, 26 Jan 2024, Jeff Layton wrote:
> > The existing rpc.nfsd program was designed during a different time, when
> > we just didn't require that much control over how it behaved. It's
> > klunky to work with.
>
> How is it clunky?
>
> rpc.nfsd
>
> that starts the service.
>
> rpc.nfsd 0
>
> that stops the service.
>
> Ok, not completely elegant. Maybe
>
> nfsdctl start
> nfsdctl stop
>
> would be better.
>

It's clunky if you have to script around it. Mostly it's an issue for
people doing clustering and that sort of thing.

> >
> > In a response to Chuck's recent RFC patch to add knob to disable
> > READ_PLUS calls, I mentioned that it might be a good time to make a

Sorry, not READ_PLUS calls here, I meant splice_reads...

> > clean break from the past and start a new program for controlling nfsd.
> >
> > Here's what I'm thinking:
> >
> > Let's build a swiss-army-knife kind of interface like git or virsh:
> >
> > # nfsdctl?stats <--- fetch the new stats that got merged
> > # nfsdctl add_listener <--- add a new listen socket, by address or hostname
> > # nfsdctl set v3 on <--- enable NFSv3
> > # nfsdctl set splice_read off <--- disable splice reads (per Chuck's recent patch)
> > # nfsdctl set threads 128 <--- spin up the threads
>
> Sure the "git" style would use
>
> nfsdctl version 3 on
> nfsdctl threads 128
>

I like following git's example syntactically.

> Apart from "stats", "start", "stop", I suspect that we developers would
> be the only people to actually use this functionality.?
>

Agreed, maybe alongside higher orchestration like containerization or
clustering tech.

> Until now,
> echo > /proc/sys/nfsd/foo
> has been enough for most tweeking. Having a proper tool would likely
> lower the barrier to entry, which can only be a good thing.
>

I think so too. Also, we don't really have that option with netlink. We
need some sort of tool to drive the new interfaces.

> >
> > We could start with just the bare minimum for now (the stats interface),
> > and then expand on it. Once we're at feature parity with rpc.nfsd, we'd
> > want to have systemd preferentially use nfsdctl instead of rpc.nfsd to
> > start and stop the server. systemd will also need to fall back to using
> > rpc.nfsd if nfsdctl or the netlink program isn't present.
>
> systemd doesn't need a fallback. Systemd always activates
> nfs-server.service. We just need to make sure the installed
> nfs-server.service matches the installed tools, and as they are
> distributed as parts of the same package, that should be trivial.
>

The problem is the transition period. There will come a time where
people will have kernels that don't support the new netlink interface,
but newer userland.

We could teach nfsdctl to work with nfsdfs, but I'd rather it just bail
out and say "Sorry, you have to use rpc.nfsd on this old kernel".

Maybe we don't need to worry about plumbing that logic into the systemd
service though, and just having distros do a hard cutover at some point
makes more sense.

> >
> > Note that I think this program will have to be a compiled binary vs. a
> > python script or the like, given that it'll be involved in system
> > startup.
>
> Agreed.
>
> >
> > It turns out that Lorenzo already has a C program that has a lot of the
> > plumbing we'd need:
> >
> > https://github.com/LorenzoBianconi/nfsd-netlink
> >
> > I think it might be good to clean up the interface a bit, build a
> > manpage and merge that into nfs-utils.
> >
> > Questions:
> >
> > 1/ one big binary, or smaller nfsdctl-* programs (like git uses)?
>
> /usr/lib/git-core (on my laptop) has 168 entries. Only 29 of them are
> NOT symlinks to 'git'.
>
> While I do like the "tool command args" interface, and I like the option
> of adding commands by simply creating drop-in tools, I think that core
> functionality should go in the core tool.
> So: "one big binary" please - with call-out functionality if anyone can
> be bothered implementing it.
>

Ok, sounds good to me.

> >
> > 2/ should it automagically read in nfs.conf? (I tend to think it should,
> > but we might want an option to disable that)
>
> Absolutely definitely. I'm not convinced we need an option to disable
> config, but allowing options to over-ride specific configs is sensible.
>
> Most uses of this tool would come from nfs-server.service which would
> presumably call
> nfsdctl start
> which would set everything based on the nfs.conf and thus start the
> server. And
> nfsdctl stop
> which would set the number of threads to zero.
>

Sensible.

> >
> > 3/ should "set threads" activate the server, or just set a count, and
> > then we do a separate activation step to start it? If we want that, then
> > we may want to twiddle the proposed netlink interface a bit.
>
> It might be sensible to have "set max-threads" which doesn't actually
> start the service.
> I would really REALLY like a dynamic thread pool. It would start at 1
> (or maybe 2) and grow on demand up to the max, and idle threads
> (inactive for 30 seconds?) would exit. We could then default the max to
> some function of memory size and people could mostly ignore the
> num-threads setting.
>
> I don't have patches today, but if we are re-doing the interfaces I
> would like us to plan the interfaces to support a pool rather than a
> fixed number.
>

I like that idea too. A dynamic threadpool would be very nice to have.
Since we're dreaming:

Maybe we can set "threads" to a specific value (-1?) that makes it start
the pool at "min_threads" and dynamically size the pool up to
"max_threads" with the load.

> >
> > I'm sure other questions will arise as we embark on this too.
> >
> > Thoughts? Anyone have objections to this idea?
>
> I think this is an excellent question to ask. As you say it is a long
> time since rpc.nfsd was created, and it has grown incrementally rather
> then being clearly designed.


Thanks. I think this is something that has the potential to really make
server administration simpler.

I also wouldn't mind a readline-style shell interface if you just run
nfsdctl without arguments. Like:

# nfsdctl
nfsdctl> threads 128
nfsdctl> version 3 off
nfsdctl> start
nfsdctl> ^d

...but that could be added later too.

--
Jeff Layton <[email protected]>

2024-01-26 00:51:08

by Dan Shelton

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

On Thu, 25 Jan 2024 at 21:25, Chuck Lever III <[email protected]> wrote:
>
>
>
> > On Jan 25, 2024, at 2:54 PM, Cedric Blancher <[email protected]> wrote:
> >
> > On Thu, 25 Jan 2024 at 20:41, Jeff Layton <[email protected]> wrote:
> >>
> >> The existing rpc.nfsd program was designed during a different time, when
> >> we just didn't require that much control over how it behaved. It's
> >> klunky to work with.
> >>
> >> In a response to Chuck's recent RFC patch to add knob to disable
> >> READ_PLUS calls, I mentioned that it might be a good time to make a
> >> clean break from the past and start a new program for controlling nfsd.
> >>
> >> Here's what I'm thinking:
> >>
> >> Let's build a swiss-army-knife kind of interface like git or virsh:
> >>
> >> # nfsdctl stats <--- fetch the new stats that got merged
> >> # nfsdctl add_listener <--- add a new listen socket, by address or hostname
> >
> > Absolutely NOT "hostname". Please do not repeat the mistake AGAIN of
> > separating "host name" and "TCP port", as they are both required to
> > find the server. Every 10 or 15 years the same mistake is made by the
> > next generation of software engineers.
>
> I don't see how this is a mistake.
>
> > port
> > The port number to connect to. Most schemes designate
> > protocols that have a default port number. Another port number
> > may optionally be supplied, in decimal, separated from the
> > host by a colon. If the port is omitted, the colon is as well.
>
> NFS has a default port number. Thus, even RFC 1738 states that
> "hostname" is just fine. It means "DNS label and default port".
>
> Most usage scenarios will prefer the shorthand of leaving off the
> port. So engineers seem to be designing interfaces for the most
> common usage, don't you think?

The most common usage? For small shops that might apply, but not for
big hosters where IPv4 addresses are a scarce resource.
>
>
> > https://datatracker.ietf.org/doc/html/rfc1738 clearly defined
> > "hostport", and that is what should be used here.
>
> RFC 1738 was published very clearly before the widespread use of
> IPv6 addresses,

Read below, you need [ and ] for IPv6 addresses, or be in parser hell.

> which use a : to separate the components of an
> IP address.

I think you take the syntax part too seriously, and not Cedric's
message: Address and port should be specified together, as they are
required to be used together to create the socket. It's part of the
address information.

Instead of fighting over syntax (hostname@port, hostname:port, ...) it
should seriously be considered to include the port.

Also, it might be good for hosters to allow more than one nfsd
instance with seperate exports file, each running on its own
address/port combination. And not force them to use a VM to bloat
resources.

> Certainly the add_listener subcommand can take a port, but there's
> plenty of room to add that in other ways.
>
> For example, we might want:
>
> # nfsdctl add_listener
>
> xprt <udp|tcp|rdma>
>
> host <DNS label> | addr <IPv4 or IPv6 address>
>
> and optionally
>
> port <listener port>

Nope. Just hostname:port or hostname@port. For raw IPv6 addresses you
have to use square brackets anyway, or end up in parser hell.

Dan
--
Dan Shelton - Cluster Specialist Win/Lin/Bsd

2024-01-26 02:37:56

by Chuck Lever

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?


> On Jan 25, 2024, at 7:50 PM, Dan Shelton <[email protected]> wrote:
>
> On Thu, 25 Jan 2024 at 21:25, Chuck Lever III <[email protected]> wrote:
>>
>>> https://datatracker.ietf.org/doc/html/rfc1738 clearly defined
>>> "hostport", and that is what should be used here.
>>
>> RFC 1738 was published very clearly before the widespread use of
>> IPv6 addresses,
>
> Read below, you need [ and ] for IPv6 addresses, or be in parser hell.

I'm well aware of how raw IPv6 presentation addresses are
wrapped.


>> which use a : to separate the components of an
>> IP address.
>
> I think you take the syntax part too seriously, and not Cedric's
> message: Address and port should be specified together, as they are
> required to be used together to create the socket. It's part of the
> address information.

I have no argument against the idea that a listener's bind
address includes both an address and port. Everyone assumes
that fact. Who do you think is going to forget that such that
having it baked into our admin interface would be helpful?

Note in our case: The usual address is ANY. The usual port is
2049. There's literally no benefit to force everyone to specify
both an address or hostname and a port via the user interface,
when the command itself can fill in the correct defaults for
most people.

Look at the mount.nfs command. The address and port are separate
because the address carries the high-order information, but the
port number has a frequently-used default value, or is obtained
via an rpcbind lookup in some cases.

Even in URIs: the port number is optional, and for nfs: URIs,
the default value is 2049.

Truly, this is already the way these admin commands work
nearly everywhere else.

Here are some examples that would be hard to carry off if we
require "hostname@port" :

# nfsdctl add_listener xprt udp port 65535

Gives us an ANY listener on UDP port 65535.

We could even specify the bind address by device name if we
wanted:

# nfsdctl add_listener xprt tcp device eth0 port 4045

Gives us a listener on eth0's assigned address at TCP port
4045.

That seems quite flexible to me, and does not depend on the
administrator providing either an address or hostname.


--
Chuck Lever


2024-01-26 07:47:54

by NeilBrown

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

On Fri, 26 Jan 2024, Dan Shelton wrote:
>
> Also, it might be good for hosters to allow more than one nfsd
> instance with seperate exports file, each running on its own
> address/port combination. And not force them to use a VM to bloat
> resources.

This is possible with containers. If you try to start nfsd in a new
network namespace, it will be a new instance independent of nfsd running
in any other namespace.
You would need a filesystem namespace of the various tools to see the
/etc/exports that you want it to.

I would like it to be easier to configure this, but I'm certain that
namespaces are the right way to create a new instance.
(It might be nice to allow separate NFSD namespaces in the same network
namespace but we'd need clear evidence that using network namespaces
causes genuine problems.)

NeilBrown

2024-02-02 17:16:52

by Lorenzo Bianconi

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

> The existing rpc.nfsd program was designed during a different time, when
> we just didn't require that much control over how it behaved. It's
> klunky to work with.
>
> In a response to Chuck's recent RFC patch to add knob to disable
> READ_PLUS calls, I mentioned that it might be a good time to make a
> clean break from the past and start a new program for controlling nfsd.
>
> Here's what I'm thinking:
>
> Let's build a swiss-army-knife kind of interface like git or virsh:
>
> # nfsdctl?stats <--- fetch the new stats that got merged
> # nfsdctl add_listener <--- add a new listen socket, by address or hostname
> # nfsdctl set v3 on <--- enable NFSv3
> # nfsdctl set splice_read off <--- disable splice reads (per Chuck's recent patch)
> # nfsdctl set threads 128 <--- spin up the threads
>
> We could start with just the bare minimum for now (the stats interface),
> and then expand on it. Once we're at feature parity with rpc.nfsd, we'd
> want to have systemd preferentially use nfsdctl instead of rpc.nfsd to
> start and stop the server. systemd will also need to fall back to using
> rpc.nfsd if nfsdctl or the netlink program isn't present.
>
> Note that I think this program will have to be a compiled binary vs. a
> python script or the like, given that it'll be involved in system
> startup.
>
> It turns out that Lorenzo already has a C program that has a lot of the
> plumbing we'd need:
>
> https://github.com/LorenzoBianconi/nfsd-netlink

This is something I developed just for testing the new interface but I agree we
could start from it.

Regarding the kernel part I addressed the comments I received upstream on v6 and
pushed the code here [0].
How do you guys prefer to proceed? Is the better to post v7 upstream and continue
the discussion in order to have something usable to develop the user-space part or
do you prefer to have something for the user-space first?
I do not have a strong opinion on it.

Regards,
Lorenzo

[0] https://github.com/LorenzoBianconi/nfsd-next/tree/nfsd-next-netlink-new-cmds-public-v7

>
> I think it might be good to clean up the interface a bit, build a
> manpage and merge that into nfs-utils.
>
> Questions:
>
> 1/ one big binary, or smaller nfsdctl-* programs (like git uses)?
>
> 2/ should it automagically read in nfs.conf? (I tend to think it should,
> but we might want an option to disable that)
>
> 3/ should "set threads" activate the server, or just set a count, and
> then we do a separate activation step to start it? If we want that, then
> we may want to twiddle the proposed netlink interface a bit.
>
> I'm sure other questions will arise as we embark on this too.
>
> Thoughts? Anyone have objections to this idea?
> --
> Jeff Layton <[email protected]>
>


Attachments:
(No filename) (2.76 kB)
signature.asc (235.00 B)
Download all attachments

2024-02-02 17:57:25

by Jeffrey Layton

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

On Fri, 2024-02-02 at 18:08 +0100, Lorenzo Bianconi wrote:
> > The existing rpc.nfsd program was designed during a different time, when
> > we just didn't require that much control over how it behaved. It's
> > klunky to work with.
> >
> > In a response to Chuck's recent RFC patch to add knob to disable
> > READ_PLUS calls, I mentioned that it might be a good time to make a
> > clean break from the past and start a new program for controlling nfsd.
> >
> > Here's what I'm thinking:
> >
> > Let's build a swiss-army-knife kind of interface like git or virsh:
> >
> > # nfsdctl?stats <--- fetch the new stats that got merged
> > # nfsdctl add_listener <--- add a new listen socket, by address or hostname
> > # nfsdctl set v3 on <--- enable NFSv3
> > # nfsdctl set splice_read off <--- disable splice reads (per Chuck's recent patch)
> > # nfsdctl set threads 128 <--- spin up the threads
> >
> > We could start with just the bare minimum for now (the stats interface),
> > and then expand on it. Once we're at feature parity with rpc.nfsd, we'd
> > want to have systemd preferentially use nfsdctl instead of rpc.nfsd to
> > start and stop the server. systemd will also need to fall back to using
> > rpc.nfsd if nfsdctl or the netlink program isn't present.
> >
> > Note that I think this program will have to be a compiled binary vs. a
> > python script or the like, given that it'll be involved in system
> > startup.
> >
> > It turns out that Lorenzo already has a C program that has a lot of the
> > plumbing we'd need:
> >
> > ????https://github.com/LorenzoBianconi/nfsd-netlink
>
> This is something I developed just for testing the new interface but I agree we
> could start from it.
>
> Regarding the kernel part I addressed the comments I received upstream on v6 and
> pushed the code here [0].
> How do you guys prefer to proceed? Is the better to post v7 upstream and continue
> the discussion in order to have something usable to develop the user-space part or
> do you prefer to have something for the user-space first?
> I do not have a strong opinion on it.
>
> Regards,
> Lorenzo
>
> [0] https://github.com/LorenzoBianconi/nfsd-next/tree/nfsd-next-netlink-new-cmds-public-v7
>
>

My advice?

Step back and spend some time working on the userland bits before
posting another revision. Experience has shown that you never realize
what sort of warts an interface like this has until you have to work
with it.

You may find that you want to tweak it some once you do, and it's much
easier to do that before we merge anything. This will be part of the
kernel ABI, so once it's in a shipping kernel, we're sort of stuck with
it.

Having a userland program ready to go will allow us to do things like
set up the systemd service for this too, which is primarily how this new
program will be called.
--
Jeff Layton <[email protected]>

2024-02-05 16:46:16

by Lorenzo Bianconi

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

> On Fri, 2024-02-02 at 18:08 +0100, Lorenzo Bianconi wrote:
> > > The existing rpc.nfsd program was designed during a different time, when
> > > we just didn't require that much control over how it behaved. It's
> > > klunky to work with.
> > >
> > > In a response to Chuck's recent RFC patch to add knob to disable
> > > READ_PLUS calls, I mentioned that it might be a good time to make a
> > > clean break from the past and start a new program for controlling nfsd.
> > >
> > > Here's what I'm thinking:
> > >
> > > Let's build a swiss-army-knife kind of interface like git or virsh:
> > >
> > > # nfsdctl?stats <--- fetch the new stats that got merged
> > > # nfsdctl add_listener <--- add a new listen socket, by address or hostname
> > > # nfsdctl set v3 on <--- enable NFSv3
> > > # nfsdctl set splice_read off <--- disable splice reads (per Chuck's recent patch)
> > > # nfsdctl set threads 128 <--- spin up the threads
> > >
> > > We could start with just the bare minimum for now (the stats interface),
> > > and then expand on it. Once we're at feature parity with rpc.nfsd, we'd
> > > want to have systemd preferentially use nfsdctl instead of rpc.nfsd to
> > > start and stop the server. systemd will also need to fall back to using
> > > rpc.nfsd if nfsdctl or the netlink program isn't present.
> > >
> > > Note that I think this program will have to be a compiled binary vs. a
> > > python script or the like, given that it'll be involved in system
> > > startup.
> > >
> > > It turns out that Lorenzo already has a C program that has a lot of the
> > > plumbing we'd need:
> > >
> > > ????https://github.com/LorenzoBianconi/nfsd-netlink
> >
> > This is something I developed just for testing the new interface but I agree we
> > could start from it.
> >
> > Regarding the kernel part I addressed the comments I received upstream on v6 and
> > pushed the code here [0].
> > How do you guys prefer to proceed? Is the better to post v7 upstream and continue
> > the discussion in order to have something usable to develop the user-space part or
> > do you prefer to have something for the user-space first?
> > I do not have a strong opinion on it.
> >
> > Regards,
> > Lorenzo
> >
> > [0] https://github.com/LorenzoBianconi/nfsd-next/tree/nfsd-next-netlink-new-cmds-public-v7
> >
> >
>
> My advice?
>
> Step back and spend some time working on the userland bits before
> posting another revision. Experience has shown that you never realize
> what sort of warts an interface like this has until you have to work
> with it.
>
> You may find that you want to tweak it some once you do, and it's much
> easier to do that before we merge anything. This will be part of the
> kernel ABI, so once it's in a shipping kernel, we're sort of stuck with
> it.
>
> Having a userland program ready to go will allow us to do things like
> set up the systemd service for this too, which is primarily how this new
> program will be called.

I agree on it. In order to proceed I guess we should define a list of
requirements/expected behaviour on this new user-space tool used to
configure nfsd. I am not so familiar with the user-space requirements
for nfsd so I am just copying what you suggested, something like:

$ nfsdctl?stats <--- fetch the new stats that got merged
$ nfsdctl xprt add proto <udp|tcp> host <host> [port <port>] <--- add a new listen socket, by address or hostname
$ nfsdctl proto v3.0 v4.0 v4.1 <--- enable NFSv3 and v4.1
$ nfsdctl set threads 128 <--- spin up the threads

Can we start from [0] to develop them?

Regards,
Lorenzo

[0] https://github.com/LorenzoBianconi/nfsd-netlink

> --
> Jeff Layton <[email protected]>
>


Attachments:
(No filename) (3.79 kB)
signature.asc (235.00 B)
Download all attachments

2024-02-05 17:30:55

by Jeffrey Layton

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

On Mon, 2024-02-05 at 17:46 +0100, Lorenzo Bianconi wrote:
> > On Fri, 2024-02-02 at 18:08 +0100, Lorenzo Bianconi wrote:
> > > > The existing rpc.nfsd program was designed during a different time, when
> > > > we just didn't require that much control over how it behaved. It's
> > > > klunky to work with.
> > > >
> > > > In a response to Chuck's recent RFC patch to add knob to disable
> > > > READ_PLUS calls, I mentioned that it might be a good time to make a
> > > > clean break from the past and start a new program for controlling nfsd.
> > > >
> > > > Here's what I'm thinking:
> > > >
> > > > Let's build a swiss-army-knife kind of interface like git or virsh:
> > > >
> > > > # nfsdctl?stats <--- fetch the new stats that got merged
> > > > # nfsdctl add_listener <--- add a new listen socket, by address or hostname
> > > > # nfsdctl set v3 on <--- enable NFSv3
> > > > # nfsdctl set splice_read off <--- disable splice reads (per Chuck's recent patch)
> > > > # nfsdctl set threads 128 <--- spin up the threads
> > > >
> > > > We could start with just the bare minimum for now (the stats interface),
> > > > and then expand on it. Once we're at feature parity with rpc.nfsd, we'd
> > > > want to have systemd preferentially use nfsdctl instead of rpc.nfsd to
> > > > start and stop the server. systemd will also need to fall back to using
> > > > rpc.nfsd if nfsdctl or the netlink program isn't present.
> > > >
> > > > Note that I think this program will have to be a compiled binary vs. a
> > > > python script or the like, given that it'll be involved in system
> > > > startup.
> > > >
> > > > It turns out that Lorenzo already has a C program that has a lot of the
> > > > plumbing we'd need:
> > > >
> > > > ????https://github.com/LorenzoBianconi/nfsd-netlink
> > >
> > > This is something I developed just for testing the new interface but I agree we
> > > could start from it.
> > >
> > > Regarding the kernel part I addressed the comments I received upstream on v6 and
> > > pushed the code here [0].
> > > How do you guys prefer to proceed? Is the better to post v7 upstream and continue
> > > the discussion in order to have something usable to develop the user-space part or
> > > do you prefer to have something for the user-space first?
> > > I do not have a strong opinion on it.
> > >
> > > Regards,
> > > Lorenzo
> > >
> > > [0] https://github.com/LorenzoBianconi/nfsd-next/tree/nfsd-next-netlink-new-cmds-public-v7
> > >
> > >
> >
> > My advice?
> >
> > Step back and spend some time working on the userland bits before
> > posting another revision. Experience has shown that you never realize
> > what sort of warts an interface like this has until you have to work
> > with it.
> >
> > You may find that you want to tweak it some once you do, and it's much
> > easier to do that before we merge anything. This will be part of the
> > kernel ABI, so once it's in a shipping kernel, we're sort of stuck with
> > it.
> >
> > Having a userland program ready to go will allow us to do things like
> > set up the systemd service for this too, which is primarily how this new
> > program will be called.
>
> I agree on it. In order to proceed I guess we should define a list of
> requirements/expected behaviour on this new user-space tool used to
> configure nfsd. I am not so familiar with the user-space requirements
> for nfsd so I am just copying what you suggested, something like:
>
> $ nfsdctl?stats <--- fetch the new stats that got merged
> $ nfsdctl xprt add proto <udp|tcp> host <host> [port <port>] <--- add a new listen socket, by address or hostname

Those look fine.

All of the commands should display the current state too when run with
no arguments. So running "nfsctl xprt" should dump out all of the
listening sockets. Ditto for the ones below too.

> $ nfsdctl proto v3.0 v4.0 v4.1 <--- enable NFSv3 and v4.1

The above would also enable v4.0 too?

For this we might want to use the +/- syntax, actually. Also, there were
no minorversions before v4, so it probably better not to accept that for
v3:

$ nfsdctl proto +v3 -v4.0 +4.1

So to me, that would mean enable v3 and v4.1, and disable v4.0.
v2 (if supported) and v4.2 would be left unchanged.

> $ nfsdctl set threads 128 <--- spin up the threads
>

Maybe:

$ nfsdctl threads 128

?

> Can we start from [0] to develop them?
>

Yeah, that seems like a good place to start.

> Regards,
> Lorenzo
>
> [0] https://github.com/LorenzoBianconi/nfsd-netlink
--
Jeff Layton <[email protected]>

2024-02-05 17:44:51

by Lorenzo Bianconi

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

> On Mon, 2024-02-05 at 17:46 +0100, Lorenzo Bianconi wrote:
> > > On Fri, 2024-02-02 at 18:08 +0100, Lorenzo Bianconi wrote:
> > > > > The existing rpc.nfsd program was designed during a different time, when
> > > > > we just didn't require that much control over how it behaved. It's
> > > > > klunky to work with.
> > > > >
> > > > > In a response to Chuck's recent RFC patch to add knob to disable
> > > > > READ_PLUS calls, I mentioned that it might be a good time to make a
> > > > > clean break from the past and start a new program for controlling nfsd.
> > > > >
> > > > > Here's what I'm thinking:
> > > > >
> > > > > Let's build a swiss-army-knife kind of interface like git or virsh:
> > > > >
> > > > > # nfsdctl?stats <--- fetch the new stats that got merged
> > > > > # nfsdctl add_listener <--- add a new listen socket, by address or hostname
> > > > > # nfsdctl set v3 on <--- enable NFSv3
> > > > > # nfsdctl set splice_read off <--- disable splice reads (per Chuck's recent patch)
> > > > > # nfsdctl set threads 128 <--- spin up the threads
> > > > >
> > > > > We could start with just the bare minimum for now (the stats interface),
> > > > > and then expand on it. Once we're at feature parity with rpc.nfsd, we'd
> > > > > want to have systemd preferentially use nfsdctl instead of rpc.nfsd to
> > > > > start and stop the server. systemd will also need to fall back to using
> > > > > rpc.nfsd if nfsdctl or the netlink program isn't present.
> > > > >
> > > > > Note that I think this program will have to be a compiled binary vs. a
> > > > > python script or the like, given that it'll be involved in system
> > > > > startup.
> > > > >
> > > > > It turns out that Lorenzo already has a C program that has a lot of the
> > > > > plumbing we'd need:
> > > > >
> > > > > ????https://github.com/LorenzoBianconi/nfsd-netlink
> > > >
> > > > This is something I developed just for testing the new interface but I agree we
> > > > could start from it.
> > > >
> > > > Regarding the kernel part I addressed the comments I received upstream on v6 and
> > > > pushed the code here [0].
> > > > How do you guys prefer to proceed? Is the better to post v7 upstream and continue
> > > > the discussion in order to have something usable to develop the user-space part or
> > > > do you prefer to have something for the user-space first?
> > > > I do not have a strong opinion on it.
> > > >
> > > > Regards,
> > > > Lorenzo
> > > >
> > > > [0] https://github.com/LorenzoBianconi/nfsd-next/tree/nfsd-next-netlink-new-cmds-public-v7
> > > >
> > > >
> > >
> > > My advice?
> > >
> > > Step back and spend some time working on the userland bits before
> > > posting another revision. Experience has shown that you never realize
> > > what sort of warts an interface like this has until you have to work
> > > with it.
> > >
> > > You may find that you want to tweak it some once you do, and it's much
> > > easier to do that before we merge anything. This will be part of the
> > > kernel ABI, so once it's in a shipping kernel, we're sort of stuck with
> > > it.
> > >
> > > Having a userland program ready to go will allow us to do things like
> > > set up the systemd service for this too, which is primarily how this new
> > > program will be called.
> >
> > I agree on it. In order to proceed I guess we should define a list of
> > requirements/expected behaviour on this new user-space tool used to
> > configure nfsd. I am not so familiar with the user-space requirements
> > for nfsd so I am just copying what you suggested, something like:
> >
> > $ nfsdctl?stats <--- fetch the new stats that got merged
> > $ nfsdctl xprt add proto <udp|tcp> host <host> [port <port>] <--- add a new listen socket, by address or hostname
>
> Those look fine.
>
> All of the commands should display the current state too when run with
> no arguments. So running "nfsctl xprt" should dump out all of the
> listening sockets. Ditto for the ones below too.

ack

>
> > $ nfsdctl proto v3.0 v4.0 v4.1 <--- enable NFSv3 and v4.1
>
> The above would also enable v4.0 too?
>
> For this we might want to use the +/- syntax, actually. Also, there were
> no minorversions before v4, so it probably better not to accept that for
> v3:
>
> $ nfsdctl proto +v3 -v4.0 +4.1

according to the previous discussion we agreed to pass to the kernel just the
protocol versions we want to enable (v3.0 v4.0 v4.1 in the previous example)
and disable all the other ones..but I am fine to define a different semantics.
What do you think it is the best one?

>
> So to me, that would mean enable v3 and v4.1, and disable v4.0.
> v2 (if supported) and v4.2 would be left unchanged.
>
> > $ nfsdctl set threads 128 <--- spin up the threads
> >
>
> Maybe:
>
> $ nfsdctl threads 128

ack

>
> ?
>
> > Can we start from [0] to develop them?
> >
>
> Yeah, that seems like a good place to start.

ack, I will work in it.

Regards,
Lorenzo

>
> > Regards,
> > Lorenzo
> >
> > [0] https://github.com/LorenzoBianconi/nfsd-netlink
> --
> Jeff Layton <[email protected]>
>


Attachments:
(No filename) (5.22 kB)
signature.asc (235.00 B)
Download all attachments

2024-02-05 18:03:45

by Jeffrey Layton

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

On Mon, 2024-02-05 at 18:44 +0100, Lorenzo Bianconi wrote:
> > On Mon, 2024-02-05 at 17:46 +0100, Lorenzo Bianconi wrote:
> > > > On Fri, 2024-02-02 at 18:08 +0100, Lorenzo Bianconi wrote:
> > > > > > The existing rpc.nfsd program was designed during a different time, when
> > > > > > we just didn't require that much control over how it behaved. It's
> > > > > > klunky to work with.
> > > > > >
> > > > > > In a response to Chuck's recent RFC patch to add knob to disable
> > > > > > READ_PLUS calls, I mentioned that it might be a good time to make a
> > > > > > clean break from the past and start a new program for controlling nfsd.
> > > > > >
> > > > > > Here's what I'm thinking:
> > > > > >
> > > > > > Let's build a swiss-army-knife kind of interface like git or virsh:
> > > > > >
> > > > > > # nfsdctl?stats <--- fetch the new stats that got merged
> > > > > > # nfsdctl add_listener <--- add a new listen socket, by address or hostname
> > > > > > # nfsdctl set v3 on <--- enable NFSv3
> > > > > > # nfsdctl set splice_read off <--- disable splice reads (per Chuck's recent patch)
> > > > > > # nfsdctl set threads 128 <--- spin up the threads
> > > > > >
> > > > > > We could start with just the bare minimum for now (the stats interface),
> > > > > > and then expand on it. Once we're at feature parity with rpc.nfsd, we'd
> > > > > > want to have systemd preferentially use nfsdctl instead of rpc.nfsd to
> > > > > > start and stop the server. systemd will also need to fall back to using
> > > > > > rpc.nfsd if nfsdctl or the netlink program isn't present.
> > > > > >
> > > > > > Note that I think this program will have to be a compiled binary vs. a
> > > > > > python script or the like, given that it'll be involved in system
> > > > > > startup.
> > > > > >
> > > > > > It turns out that Lorenzo already has a C program that has a lot of the
> > > > > > plumbing we'd need:
> > > > > >
> > > > > > ????https://github.com/LorenzoBianconi/nfsd-netlink
> > > > >
> > > > > This is something I developed just for testing the new interface but I agree we
> > > > > could start from it.
> > > > >
> > > > > Regarding the kernel part I addressed the comments I received upstream on v6 and
> > > > > pushed the code here [0].
> > > > > How do you guys prefer to proceed? Is the better to post v7 upstream and continue
> > > > > the discussion in order to have something usable to develop the user-space part or
> > > > > do you prefer to have something for the user-space first?
> > > > > I do not have a strong opinion on it.
> > > > >
> > > > > Regards,
> > > > > Lorenzo
> > > > >
> > > > > [0] https://github.com/LorenzoBianconi/nfsd-next/tree/nfsd-next-netlink-new-cmds-public-v7
> > > > >
> > > > >
> > > >
> > > > My advice?
> > > >
> > > > Step back and spend some time working on the userland bits before
> > > > posting another revision. Experience has shown that you never realize
> > > > what sort of warts an interface like this has until you have to work
> > > > with it.
> > > >
> > > > You may find that you want to tweak it some once you do, and it's much
> > > > easier to do that before we merge anything. This will be part of the
> > > > kernel ABI, so once it's in a shipping kernel, we're sort of stuck with
> > > > it.
> > > >
> > > > Having a userland program ready to go will allow us to do things like
> > > > set up the systemd service for this too, which is primarily how this new
> > > > program will be called.
> > >
> > > I agree on it. In order to proceed I guess we should define a list of
> > > requirements/expected behaviour on this new user-space tool used to
> > > configure nfsd. I am not so familiar with the user-space requirements
> > > for nfsd so I am just copying what you suggested, something like:
> > >
> > > $ nfsdctl?stats <--- fetch the new stats that got merged
> > > $ nfsdctl xprt add proto <udp|tcp> host <host> [port <port>] <--- add a new listen socket, by address or hostname
> >
> > Those look fine.
> >
> > All of the commands should display the current state too when run with
> > no arguments. So running "nfsctl xprt" should dump out all of the
> > listening sockets. Ditto for the ones below too.
>
> ack
>

I think we might need a "nfsdctl xprt del" too. I know we don't have
that functionality now, but I think it's missing. Otherwise if you
mistakenly set the wrong socket with the interface above, how do you fix
it?

Alternately, we could provide some way to reset the server state
altogether:

# nfsdctl reset

?

> >
> > > $ nfsdctl proto v3.0 v4.0 v4.1 <--- enable NFSv3 and v4.1
> >
> > The above would also enable v4.0 too?
> >
> > For this we might want to use the +/- syntax, actually. Also, there were
> > no minorversions before v4, so it probably better not to accept that for
> > v3:
> >
> > ????$ nfsdctl proto +v3 -v4.0 +4.1
>
> according to the previous discussion we agreed to pass to the kernel just the
> protocol versions we want to enable (v3.0 v4.0 v4.1 in the previous example)
> and disable all the other ones..but I am fine to define a different semantics.
> What do you think it is the best one?
>


The kernel interface should be declarative like you have, but the
command-line interface could be more imperative like this. It could do a
fetch of the currently enabled versions, twiddle them based on the +/-
arguments and then send the new set down to the kernel for it to act on.

One thing we haven't considered here too: Kconfig allows you to compile
out support for some versions. For instance, v2 support is disabled by
default these days.

Maybe we should rethink the kernel interface too: When querying the
versions, we could have it send the full set of all of the versions that
it supports up to userland, and report whether each version is enabled
or disabled.

So if you get a list of versions from the kernel, and v2 isn't in the
list at all, you can assume that the kernel doesn't support it.

> >
> > So to me, that would mean enable v3 and v4.1, and disable v4.0.
> > v2 (if supported) and v4.2 would be left unchanged.
> >
> > > $ nfsdctl set threads 128 <--- spin up the threads
> > >
> >
> > Maybe:
> >
> > ????$ nfsdctl threads 128
>
> ack
>
> >
> > ?
> > ?
> >
> >
> >
> >
> >
> >
> >
> > > Can we start from [0] to develop them?
> > >
> >
> > Yeah, that seems like a good place to start.
>
> ack, I will work in it.
>

Very exciting!

--
Jeff Layton <[email protected]>

2024-02-05 21:14:37

by Chuck Lever

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?


> On Feb 5, 2024, at 1:03 PM, Jeff Layton <[email protected]> wrote:
>
> On Mon, 2024-02-05 at 18:44 +0100, Lorenzo Bianconi wrote:
>>> On Mon, 2024-02-05 at 17:46 +0100, Lorenzo Bianconi wrote:
>>>>
>>>>
>>>> I agree on it. In order to proceed I guess we should define a list of
>>>> requirements/expected behaviour on this new user-space tool used to
>>>> configure nfsd. I am not so familiar with the user-space requirements
>>>> for nfsd so I am just copying what you suggested, something like:
>>>>
>>>> $ nfsdctl stats <--- fetch the new stats that got merged
>>>> $ nfsdctl xprt add proto <udp|tcp> host <host> [port <port>] <--- add a new listen socket, by address or hostname
>>>
>>> Those look fine.
>>>
>>> All of the commands should display the current state too when run with
>>> no arguments. So running "nfsctl xprt" should dump out all of the
>>> listening sockets. Ditto for the ones below too.
>>
>> ack
>>
>
> I think we might need a "nfsdctl xprt del" too. I know we don't have
> that functionality now, but I think it's missing. Otherwise if you
> mistakenly set the wrong socket with the interface above, how do you fix
> it?

Is "nfsdctl xprt" for managing listeners? I think the following
might be more in line with how NFS and RPC deal with transports:

nfsdctl listen netid xxx addr yyyyy [ port zzz ]

or

nfsdctl listen netid xxx dns yyyyy [ port zzz ]

"xxx" would be one of

udp, udp6, tcp, tcp6, rdma, rdma6, local

"addr yyyy" would be an IP presentation address, in an address
family that matches the netid's family. Recall that TI-RPC
considers IPv4 and IPv6 listeners each as separate endpoints.

"dns yyyy" would be a DNS label

Do we need to add a keyword to control whether the kernel
registers the new listening endpoint with the local rpcbind?


> Alternately, we could provide some way to reset the server state
> altogether:
>
> # nfsdctl reset
>
> ?
>
>>>
>>>> $ nfsdctl proto v3.0 v4.0 v4.1 <--- enable NFSv3 and v4.1
>>>
>>> The above would also enable v4.0 too?
>>>
>>> For this we might want to use the +/- syntax, actually. Also, there were
>>> no minorversions before v4, so it probably better not to accept that for
>>> v3:
>>>
>>> $ nfsdctl proto +v3 -v4.0 +4.1

I would prefer not using "proto" since that term is already
overloaded. "version" might be better, and maybe we don't
need the "v" suffix...?


--
Chuck Lever


2024-02-05 21:23:53

by NeilBrown

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

On Tue, 06 Feb 2024, Lorenzo Bianconi wrote:
> > On Fri, 2024-02-02 at 18:08 +0100, Lorenzo Bianconi wrote:
> > > > The existing rpc.nfsd program was designed during a different time, when
> > > > we just didn't require that much control over how it behaved. It's
> > > > klunky to work with.
> > > >
> > > > In a response to Chuck's recent RFC patch to add knob to disable
> > > > READ_PLUS calls, I mentioned that it might be a good time to make a
> > > > clean break from the past and start a new program for controlling nfsd.
> > > >
> > > > Here's what I'm thinking:
> > > >
> > > > Let's build a swiss-army-knife kind of interface like git or virsh:
> > > >
> > > > # nfsdctl stats <--- fetch the new stats that got merged
> > > > # nfsdctl add_listener <--- add a new listen socket, by address or hostname
> > > > # nfsdctl set v3 on <--- enable NFSv3
> > > > # nfsdctl set splice_read off <--- disable splice reads (per Chuck's recent patch)
> > > > # nfsdctl set threads 128 <--- spin up the threads
> > > >
> > > > We could start with just the bare minimum for now (the stats interface),
> > > > and then expand on it. Once we're at feature parity with rpc.nfsd, we'd
> > > > want to have systemd preferentially use nfsdctl instead of rpc.nfsd to
> > > > start and stop the server. systemd will also need to fall back to using
> > > > rpc.nfsd if nfsdctl or the netlink program isn't present.
> > > >
> > > > Note that I think this program will have to be a compiled binary vs. a
> > > > python script or the like, given that it'll be involved in system
> > > > startup.
> > > >
> > > > It turns out that Lorenzo already has a C program that has a lot of the
> > > > plumbing we'd need:
> > > >
> > > >     https://github.com/LorenzoBianconi/nfsd-netlink
> > >
> > > This is something I developed just for testing the new interface but I agree we
> > > could start from it.
> > >
> > > Regarding the kernel part I addressed the comments I received upstream on v6 and
> > > pushed the code here [0].
> > > How do you guys prefer to proceed? Is the better to post v7 upstream and continue
> > > the discussion in order to have something usable to develop the user-space part or
> > > do you prefer to have something for the user-space first?
> > > I do not have a strong opinion on it.
> > >
> > > Regards,
> > > Lorenzo
> > >
> > > [0] https://github.com/LorenzoBianconi/nfsd-next/tree/nfsd-next-netlink-new-cmds-public-v7
> > >
> > >
> >
> > My advice?
> >
> > Step back and spend some time working on the userland bits before
> > posting another revision. Experience has shown that you never realize
> > what sort of warts an interface like this has until you have to work
> > with it.
> >
> > You may find that you want to tweak it some once you do, and it's much
> > easier to do that before we merge anything. This will be part of the
> > kernel ABI, so once it's in a shipping kernel, we're sort of stuck with
> > it.
> >
> > Having a userland program ready to go will allow us to do things like
> > set up the systemd service for this too, which is primarily how this new
> > program will be called.
>
> I agree on it. In order to proceed I guess we should define a list of
> requirements/expected behaviour on this new user-space tool used to
> configure nfsd. I am not so familiar with the user-space requirements
> for nfsd so I am just copying what you suggested, something like:
>
> $ nfsdctl stats <--- fetch the new stats that got merged
> $ nfsdctl xprt add proto <udp|tcp> host <host> [port <port>] <--- add a new listen socket, by address or hostname
> $ nfsdctl proto v3.0 v4.0 v4.1 <--- enable NFSv3 and v4.1
> $ nfsdctl set threads 128 <--- spin up the threads

My preference would be:

nfsdctl start
and
nfsdctl stop

where nfsdctl reads a config file to discover what setting are required.
I cannot see any credible use case for 'xprt' or 'proto' or 'threads'
commands.

Possibly nfsctl would accept config on the command line:
nfsdctl start proto=3,4.1 threads=42 proto=tcp:localhost:2049
or similar.

It would also be helpful to have "nfsdinfo" or similar which has "stats"
and "status" commands. Maybe that could be combined with "nfsdctl", but
extracting info is not a form of "control". Or we could find a more
generic verb. For soft-raid I wrote "mdadm" "adm" for "administer"
which seemed less specific than "control" (mdctl). Though probably the
main reason was that I like palindromes and "mdadm" was a bit easier to
say. nfsdadm ?? (see also udevadm, drdbadm, fsadm ....) But maybe I'm
just too fuss.

In my experience working with our customers and support team, they are
not at all interested in fine control. "systemctl restart nfs-server" is
their second preference when "reboot" isn't appropriate for some reason.

You might be able to convince me that "nfdctl reload" was useful - it
could be called from "systemctl reload nfs-server". That would then
justify kernel interfaces to remove xprts. But having all the
fine-control just increases the required testing needed with little
practical gain.

NeilBrown

2024-02-05 21:34:17

by Chuck Lever

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?



> On Feb 5, 2024, at 2:44 PM, NeilBrown <[email protected]> wrote:
>
> On Tue, 06 Feb 2024, Lorenzo Bianconi wrote:
>>> On Fri, 2024-02-02 at 18:08 +0100, Lorenzo Bianconi wrote:
>>>>> The existing rpc.nfsd program was designed during a different time, when
>>>>> we just didn't require that much control over how it behaved. It's
>>>>> klunky to work with.
>>>>>
>>>>> In a response to Chuck's recent RFC patch to add knob to disable
>>>>> READ_PLUS calls, I mentioned that it might be a good time to make a
>>>>> clean break from the past and start a new program for controlling nfsd.
>>>>>
>>>>> Here's what I'm thinking:
>>>>>
>>>>> Let's build a swiss-army-knife kind of interface like git or virsh:
>>>>>
>>>>> # nfsdctl stats <--- fetch the new stats that got merged
>>>>> # nfsdctl add_listener <--- add a new listen socket, by address or hostname
>>>>> # nfsdctl set v3 on <--- enable NFSv3
>>>>> # nfsdctl set splice_read off <--- disable splice reads (per Chuck's recent patch)
>>>>> # nfsdctl set threads 128 <--- spin up the threads
>>>>>
>>>>> We could start with just the bare minimum for now (the stats interface),
>>>>> and then expand on it. Once we're at feature parity with rpc.nfsd, we'd
>>>>> want to have systemd preferentially use nfsdctl instead of rpc.nfsd to
>>>>> start and stop the server. systemd will also need to fall back to using
>>>>> rpc.nfsd if nfsdctl or the netlink program isn't present.
>>>>>
>>>>> Note that I think this program will have to be a compiled binary vs. a
>>>>> python script or the like, given that it'll be involved in system
>>>>> startup.
>>>>>
>>>>> It turns out that Lorenzo already has a C program that has a lot of the
>>>>> plumbing we'd need:
>>>>>
>>>>> https://github.com/LorenzoBianconi/nfsd-netlink
>>>>
>>>> This is something I developed just for testing the new interface but I agree we
>>>> could start from it.
>>>>
>>>> Regarding the kernel part I addressed the comments I received upstream on v6 and
>>>> pushed the code here [0].
>>>> How do you guys prefer to proceed? Is the better to post v7 upstream and continue
>>>> the discussion in order to have something usable to develop the user-space part or
>>>> do you prefer to have something for the user-space first?
>>>> I do not have a strong opinion on it.
>>>>
>>>> Regards,
>>>> Lorenzo
>>>>
>>>> [0] https://github.com/LorenzoBianconi/nfsd-next/tree/nfsd-next-netlink-new-cmds-public-v7
>>>>
>>>>
>>>
>>> My advice?
>>>
>>> Step back and spend some time working on the userland bits before
>>> posting another revision. Experience has shown that you never realize
>>> what sort of warts an interface like this has until you have to work
>>> with it.
>>>
>>> You may find that you want to tweak it some once you do, and it's much
>>> easier to do that before we merge anything. This will be part of the
>>> kernel ABI, so once it's in a shipping kernel, we're sort of stuck with
>>> it.
>>>
>>> Having a userland program ready to go will allow us to do things like
>>> set up the systemd service for this too, which is primarily how this new
>>> program will be called.
>>
>> I agree on it. In order to proceed I guess we should define a list of
>> requirements/expected behaviour on this new user-space tool used to
>> configure nfsd. I am not so familiar with the user-space requirements
>> for nfsd so I am just copying what you suggested, something like:
>>
>> $ nfsdctl stats <--- fetch the new stats that got merged
>> $ nfsdctl xprt add proto <udp|tcp> host <host> [port <port>] <--- add a new listen socket, by address or hostname
>> $ nfsdctl proto v3.0 v4.0 v4.1 <--- enable NFSv3 and v4.1
>> $ nfsdctl set threads 128 <--- spin up the threads
>
> My preference would be:
>
> nfsdctl start
> and
> nfsdctl stop
>
> where nfsdctl reads a config file to discover what setting are required.
> I cannot see any credible use case for 'xprt' or 'proto' or 'threads'
> commands.
>
> Possibly nfsctl would accept config on the command line:
> nfsdctl start proto=3,4.1 threads=42 proto=tcp:localhost:2049
> or similar.

You've got proto= listed twice here.

I'm more in favor of having more subcommands, each of which do
something simple. Easier to understand, easier to test.


> It would also be helpful to have "nfsdinfo" or similar which has "stats"
> and "status" commands. Maybe that could be combined with "nfsdctl", but
> extracting info is not a form of "control". Or we could find a more
> generic verb. For soft-raid I wrote "mdadm" "adm" for "administer"
> which seemed less specific than "control" (mdctl). Though probably the
> main reason was that I like palindromes and "mdadm" was a bit easier to
> say. nfsdadm ?? (see also udevadm, drdbadm, fsadm ....) But maybe I'm
> just too fuss.
>
> In my experience working with our customers and support team, they are
> not at all interested in fine control.

This is an interface to be used by systemctl. I don't think customers
or support teams would ever need to make use of it directly.


> "systemctl restart nfs-server" is
> their second preference when "reboot" isn't appropriate for some reason.
>
> You might be able to convince me that "nfdctl reload" was useful - it
> could be called from "systemctl reload nfs-server". That would then
> justify kernel interfaces to remove xprts. But having all the
> fine-control just increases the required testing needed with little
> practical gain.
>
> NeilBrown


--
Chuck Lever


2024-02-05 22:06:00

by NeilBrown

[permalink] [raw]
Subject: Re: Should we establish a new nfsdctl userland program?

On Tue, 06 Feb 2024, Chuck Lever III wrote:
>
>
> > On Feb 5, 2024, at 2:44 PM, NeilBrown <[email protected]> wrote:
> >
> > On Tue, 06 Feb 2024, Lorenzo Bianconi wrote:
> >>> On Fri, 2024-02-02 at 18:08 +0100, Lorenzo Bianconi wrote:
> >>>>> The existing rpc.nfsd program was designed during a different time, when
> >>>>> we just didn't require that much control over how it behaved. It's
> >>>>> klunky to work with.
> >>>>>
> >>>>> In a response to Chuck's recent RFC patch to add knob to disable
> >>>>> READ_PLUS calls, I mentioned that it might be a good time to make a
> >>>>> clean break from the past and start a new program for controlling nfsd.
> >>>>>
> >>>>> Here's what I'm thinking:
> >>>>>
> >>>>> Let's build a swiss-army-knife kind of interface like git or virsh:
> >>>>>
> >>>>> # nfsdctl stats <--- fetch the new stats that got merged
> >>>>> # nfsdctl add_listener <--- add a new listen socket, by address or hostname
> >>>>> # nfsdctl set v3 on <--- enable NFSv3
> >>>>> # nfsdctl set splice_read off <--- disable splice reads (per Chuck's recent patch)
> >>>>> # nfsdctl set threads 128 <--- spin up the threads
> >>>>>
> >>>>> We could start with just the bare minimum for now (the stats interface),
> >>>>> and then expand on it. Once we're at feature parity with rpc.nfsd, we'd
> >>>>> want to have systemd preferentially use nfsdctl instead of rpc.nfsd to
> >>>>> start and stop the server. systemd will also need to fall back to using
> >>>>> rpc.nfsd if nfsdctl or the netlink program isn't present.
> >>>>>
> >>>>> Note that I think this program will have to be a compiled binary vs. a
> >>>>> python script or the like, given that it'll be involved in system
> >>>>> startup.
> >>>>>
> >>>>> It turns out that Lorenzo already has a C program that has a lot of the
> >>>>> plumbing we'd need:
> >>>>>
> >>>>> https://github.com/LorenzoBianconi/nfsd-netlink
> >>>>
> >>>> This is something I developed just for testing the new interface but I agree we
> >>>> could start from it.
> >>>>
> >>>> Regarding the kernel part I addressed the comments I received upstream on v6 and
> >>>> pushed the code here [0].
> >>>> How do you guys prefer to proceed? Is the better to post v7 upstream and continue
> >>>> the discussion in order to have something usable to develop the user-space part or
> >>>> do you prefer to have something for the user-space first?
> >>>> I do not have a strong opinion on it.
> >>>>
> >>>> Regards,
> >>>> Lorenzo
> >>>>
> >>>> [0] https://github.com/LorenzoBianconi/nfsd-next/tree/nfsd-next-netlink-new-cmds-public-v7
> >>>>
> >>>>
> >>>
> >>> My advice?
> >>>
> >>> Step back and spend some time working on the userland bits before
> >>> posting another revision. Experience has shown that you never realize
> >>> what sort of warts an interface like this has until you have to work
> >>> with it.
> >>>
> >>> You may find that you want to tweak it some once you do, and it's much
> >>> easier to do that before we merge anything. This will be part of the
> >>> kernel ABI, so once it's in a shipping kernel, we're sort of stuck with
> >>> it.
> >>>
> >>> Having a userland program ready to go will allow us to do things like
> >>> set up the systemd service for this too, which is primarily how this new
> >>> program will be called.
> >>
> >> I agree on it. In order to proceed I guess we should define a list of
> >> requirements/expected behaviour on this new user-space tool used to
> >> configure nfsd. I am not so familiar with the user-space requirements
> >> for nfsd so I am just copying what you suggested, something like:
> >>
> >> $ nfsdctl stats <--- fetch the new stats that got merged
> >> $ nfsdctl xprt add proto <udp|tcp> host <host> [port <port>] <--- add a new listen socket, by address or hostname
> >> $ nfsdctl proto v3.0 v4.0 v4.1 <--- enable NFSv3 and v4.1
> >> $ nfsdctl set threads 128 <--- spin up the threads
> >
> > My preference would be:
> >
> > nfsdctl start
> > and
> > nfsdctl stop
> >
> > where nfsdctl reads a config file to discover what setting are required.
> > I cannot see any credible use case for 'xprt' or 'proto' or 'threads'
> > commands.
> >
> > Possibly nfsctl would accept config on the command line:
> > nfsdctl start proto=3,4.1 threads=42 proto=tcp:localhost:2049
> > or similar.
>
> You've got proto= listed twice here.

Yep - the second was meant to be xprt=

>
> I'm more in favor of having more subcommands, each of which do
> something simple. Easier to understand, easier to test.

Fewer subcommands are easier to test than more.
OK - forget the "config on command line" idea. Just read config from
/etc/nfs.conf and action that.

>
>
> > It would also be helpful to have "nfsdinfo" or similar which has "stats"
> > and "status" commands. Maybe that could be combined with "nfsdctl", but
> > extracting info is not a form of "control". Or we could find a more
> > generic verb. For soft-raid I wrote "mdadm" "adm" for "administer"
> > which seemed less specific than "control" (mdctl). Though probably the
> > main reason was that I like palindromes and "mdadm" was a bit easier to
> > say. nfsdadm ?? (see also udevadm, drdbadm, fsadm ....) But maybe I'm
> > just too fuss.
> >
> > In my experience working with our customers and support team, they are
> > not at all interested in fine control.
>
> This is an interface to be used by systemctl. I don't think customers
> or support teams would ever need to make use of it directly.

If it's to be used by systemctl, then there is zero justification for
anything other than start/reload/stop. We already have
/etc/nfs.conf.d/* for configuring nfsd. Requiring tools to edit
systemctl unit files as well is a retrograde step.

NeilBrown


>
>
> > "systemctl restart nfs-server" is
> > their second preference when "reboot" isn't appropriate for some reason.
> >
> > You might be able to convince me that "nfdctl reload" was useful - it
> > could be called from "systemctl reload nfs-server". That would then
> > justify kernel interfaces to remove xprts. But having all the
> > fine-control just increases the required testing needed with little
> > practical gain.
> >
> > NeilBrown
>
>
> --
> Chuck Lever
>
>
>