2022-03-14 01:59:29

by NeilBrown

Subject: [PATCH v2] nfs.man: document requirements for NFSv4 identity


When mounting an NFS filesystem in a network namespace using v4, some care
must be taken to ensure a unique and stable client identity. Similar
care is needed for NFS-root and other situations.

Add documentation explaining the requirements for the NFS identity in
these situations.

Signed-off-by: NeilBrown <[email protected]>
---

I think I've addressed most of the feedback, but please forgive and remind me
if I missed something.
NeilBrown

utils/mount/nfs.man | 109 +++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 108 insertions(+), 1 deletion(-)

diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
index d9f34df36b42..5f15abe8cf72 100644
--- a/utils/mount/nfs.man
+++ b/utils/mount/nfs.man
@@ -1,7 +1,7 @@
.\"@(#)nfs.5"
.TH NFS 5 "9 October 2012"
.SH NAME
-nfs \- fstab format and options for the
+nfs \- fstab format and configuration for the
.B nfs
file systems
.SH SYNOPSIS
@@ -1844,6 +1844,113 @@ export pathname, but not both, during a remount. For example,
merges the mount option
.B ro
with the mount options already saved on disk for the NFS server mounted at /mnt.
+.SH "NFS CLIENT IDENTIFIER"
+NFSv4 requires that the client present a unique identifier to the server
+to be used to track state such as file locks. By default Linux NFS uses
+the host name, as configured at the time of the first NFS mount,
+together with some fixed content such as the name "Linux NFS" and the
+particular protocol version. When the hostname is guaranteed to be
+unique among all clients which access the same server, this is sufficient.
+If hostname uniqueness cannot be assumed, extra identity information
+must be provided.
+.PP
+Some situations which are known to be problematic with respect to unique
+host names include:
+.IP \- 2
+NFS-root (diskless) clients, where the DHCP server (or equivalent) does
+not provide a unique host name.
+.IP \- 2
+"containers" within a single Linux host. If each container has a separate
+network namespace, but does not use the UTS namespace to provide a unique
+host name, then there can be multiple effective NFS clients with the
+same host name.
+.IP \= 2
+Clients across multiple administrative domains that access a common NFS
+server. If assignment of host name is devolved to separate domains,
+uniqueness cannot be guaranteed, unless a domain name is included in the
+host name.
+.SS "Increasing Client Uniqueness"
+Apart from the host name, which is the preferred way to differentiate
+NFS clients, there are two mechanisms to add uniqueness to the
+client identifier.
+.TP
+.B nfs.nfs4_unique_id
+This module parameter can be set to an arbitrary string at boot time, or
+when the
+.B nfs
+module is loaded. This might be suitable for configuring diskless clients.
+.TP
+.B /sys/fs/nfs/client/net/identifier
+This virtual file (available since Linux 5.3) is local to the network
+name-space in which it is accessed and so can provided uniqueness between
+network namespaces (containers) when the hostname remains uniform.
+.RS
+.PP
+This value is empty on name-space creation.
+If the value is to be set, that should be done before the first
+mount. If the container system has access to some sort of per-container
+identity then that identity, possibly obfuscated as a UUID is privacy is
+needed, can be used. Combining the identity with the name of the
+container systems would also help. For example:
+.RS 4
+echo "ip-netns:`ip netns identify`" \\
+.br
+ > /sys/fs/nfs/client/net/identifier
+.br
+uuidgen --sha1 --namespace @url \\
+.br
+ -N "nfs:`cat /etc/machine-id`" \\
+.br
+ > /sys/fs/nfs/client/net/identifier
+.RE
+If the container system provides no stable name,
+but does have stable storage, then something like
+.RS 4
+[ -s /etc/nfsv4-uuid ] || uuidgen > /etc/nfsv4-uuid &&
+.br
+cat /etc/nfsv4-uuid > /sys/fs/nfs/client/net/identifier
+.RE
+would suffice.
+.PP
+If a container has neither a stable name nor stable (local) storage,
+then it is not possible to provide a stable identifier, so providing
+a random identifier to ensure uniqueness would be best
+.RS 4
+uuidgen > /sys/fs/nfs/client/net/identifier
+.RE
+.RE
+.SS Consequences of poor identity setting
+Any two concurrent clients that might access the same server must have
+different identifiers for correct operation, and any two consecutive
+instances of the same client should have the same identifier for optimal
+crash recovery.
+.PP
+If two different clients present the same identity to a server there are
+two possible scenarios. If the clients use the same credential then the
+server will treat them as the same client which appears to be restarting
+frequently. One client may manage to open some files etc, but as soon
+as the other client does anything the first client will lose access and
+need to re-open everything.
+.PP
+If the clients use different credentials, then the second client to
+establish a connection to the server will be refused access. For
+.B auth=sys
+the credential is based on hostname, so will be the same if the
+identities are the same. With
+.B auth=krb
+the credential is stored in
+.I /etc/krb5.keytab
+and will be the same only if this is copied among hosts.
+.PP
+If the identity is unique but not stable, for example if it is generated
+randomly on each start up of the NFS client, then crash recovery is
+affected. When a client shuts down uncleanly and restarts, the server
+will normally detect this because the same identity is presented with
+different boot time (or "incarnation verifier"), and will discard old
+state. If the client presents a different identifier, then the server
+cannot discard old state until the lease time has expired, and the new
+client may be delayed in opening or locking files that it was
+previously accessing.
.SH FILES
.TP 1.5i
.I /etc/fstab
--
2.35.1


2022-03-14 15:06:32

by Benjamin Coddington

Subject: Re: [PATCH v2] nfs.man: document requirements for NFSv4 identity

On 13 Mar 2022, at 21:04, NeilBrown wrote:

> When mounting an NFS filesystem in a network namespace using v4, some care
> must be taken to ensure a unique and stable client identity. Similar
> care is needed for NFS-root and other situations.
>
> Add documentation explaining the requirements for the NFS identity in
> these situations.
>
> Signed-off-by: NeilBrown <[email protected]>
> ---
>
> I think I've addressed most of the feedback, but please forgive and remind me
> if I missed something.
> NeilBrown
>
> utils/mount/nfs.man | 109 +++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 108 insertions(+), 1 deletion(-)
>
> diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
> index d9f34df36b42..5f15abe8cf72 100644
> --- a/utils/mount/nfs.man
> +++ b/utils/mount/nfs.man
> @@ -1,7 +1,7 @@
> .\"@(#)nfs.5"
> .TH NFS 5 "9 October 2012"
> .SH NAME
> -nfs \- fstab format and options for the
> +nfs \- fstab format and configuration for the
> .B nfs
> file systems
> .SH SYNOPSIS
> @@ -1844,6 +1844,113 @@ export pathname, but not both, during a remount. For example,
> merges the mount option
> .B ro
> with the mount options already saved on disk for the NFS server mounted at /mnt.
> +.SH "NFS CLIENT IDENTIFIER"
> +NFSv4 requires that the client present a unique identifier to the server
> +to be used to track state such as file locks. By default Linux NFS uses
> +the host name, as configured at the time of the first NFS mount,
> +together with some fixed content such as the name "Linux NFS" and the
> +particular protocol version. When the hostname is guaranteed to be
> +unique among all clients which access the same server, this is sufficient.
> +If hostname uniqueness cannot be assumed, extra identity information
> +must be provided.
> +.PP
> +Some situations which are known to be problematic with respect to unique
> +host names include:
> +.IP \- 2
> +NFS-root (diskless) clients, where the DHCP server (or equivalent) does
> +not provide a unique host name.
> +.IP \- 2
> +"containers" within a single Linux host. If each container has a separate
> +network namespace, but does not use the UTS namespace to provide a unique
> +host name, then there can be multiple effective NFS clients with the
> +same host name.
> +.IP \= 2
> +Clients across multiple administrative domains that access a common NFS
> +server. If assignment of host name is devolved to separate domains,
> +uniqueness cannot be guaranteed, unless a domain name is included in the
> +host name.
> +.SS "Increasing Client Uniqueness"
> +Apart from the host name, which is the preferred way to differentiate
> +NFS clients, there are two mechanisms to add uniqueness to the
> +client identifier.
> +.TP
> +.B nfs.nfs4_unique_id
> +This module parameter can be set to an arbitrary string at boot time, or
> +when the
> +.B nfs
> +module is loaded. This might be suitable for configuring diskless clients.
> +.TP
> +.B /sys/fs/nfs/client/net/identifier
> +This virtual file (available since Linux 5.3) is local to the network
> +name-space in which it is accessed and so can provided uniqueness between

+name-space in which it is accessed and so can provided uniqueness between
+name-space in which it is accessed and so can provide uniqueness between
                                                      ^

> +network namespaces (containers) when the hostname remains uniform.
> +.RS
> +.PP
> +This value is empty on name-space creation.
> +If the value is to be set, that should be done before the first
> +mount. If the container system has access to some sort of per-container
> +identity then that identity, possibly obfuscated as a UUID is privacy is

+identity then that identity, possibly obfuscated as a UUID is privacy is
+identity then that identity, possibly obfuscated as a UUID if privacy is
                                                            ^^

> +needed, can be used. Combining the identity with the name of the
> +container systems would also help. For example:
> +.RS 4
> +echo "ip-netns:`ip netns identify`" \\
> +.br
> + > /sys/fs/nfs/client/net/identifier
> +.br
> +uuidgen --sha1 --namespace @url \\
> +.br
> + -N "nfs:`cat /etc/machine-id`" \\
> +.br
> + > /sys/fs/nfs/client/net/identifier
> +.RE
> +If the container system provides no stable name,
> +but does have stable storage, then something like
> +.RS 4
> +[ -s /etc/nfsv4-uuid ] || uuidgen > /etc/nfsv4-uuid &&
> +.br
> +cat /etc/nfsv4-uuid > /sys/fs/nfs/client/net/identifier
> +.RE
> +would suffice.
> +.PP
> +If a container has neither a stable name nor stable (local) storage,
> +then it is not possible to provide a stable identifier, so providing
> +a random identifier to ensure uniqueness would be best
> +.RS 4
> +uuidgen > /sys/fs/nfs/client/net/identifier
> +.RE
> +.RE
> +.SS Consequences of poor identity setting
> +Any two concurrent clients that might access the same server must have
> +different identifiers for correct operation, and any two consecutive
> +instances of the same client should have the same identifier for optimal
> +crash recovery.
> +.PP
> +If two different clients present the same identity to a server there are
> +two possible scenarios. If the clients use the same credential then the
> +server will treat them as the same client which appears to be restarting
> +frequently. One client may manage to open some files etc, but as soon
> +as the other client does anything the first client will lose access and
> +need to re-open everything.
> +.PP
> +If the clients use different credentials, then the second client to
> +establish a connection to the server will be refused access. For
> +.B auth=sys
> +the credential is based on hostname, so will be the same if the
> +identities are the same. With
> +.B auth=krb
> +the credential is stored in
> +.I /etc/krb5.keytab
> +and will be the same only if this is copied among hosts.
> +.PP
> +If the identity is unique but not stable, for example if it is generated
> +randomly on each start up of the NFS client, then crash recovery is
> +affected. When a client shuts down uncleanly and restarts, the server
> +will normally detect this because the same identity is presented with

There's ambiguity on "this", it could be the situation described in the
previous sentence, how about:

+will normally detect this because the same identity is presented with
+will normally detect the unclean restart because the same identity is presented with

Ben

2022-03-14 19:04:59

by Chuck Lever III

Subject: Re: [PATCH v2] nfs.man: document requirements for NFSv4 identity

Hi Neil-

> On Mar 13, 2022, at 9:04 PM, NeilBrown <[email protected]> wrote:
>
>
> When mounting an NFS filesystem in a network namespace using v4, some care
> must be taken to ensure a unique and stable client identity. Similar
> care is needed for NFS-root and other situations.
>
> Add documentation explaining the requirements for the NFS identity in
> these situations.
>
> Signed-off-by: NeilBrown <[email protected]>
> ---
>
> I think I've addressed most of the feedback, but please forgive and remind me
> if I missed something.
> NeilBrown
>
> utils/mount/nfs.man | 109 +++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 108 insertions(+), 1 deletion(-)
>
> diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
> index d9f34df36b42..5f15abe8cf72 100644
> --- a/utils/mount/nfs.man
> +++ b/utils/mount/nfs.man
> @@ -1,7 +1,7 @@
> .\"@(#)nfs.5"
> .TH NFS 5 "9 October 2012"
> .SH NAME
> -nfs \- fstab format and options for the
> +nfs \- fstab format and configuration for the
> .B nfs
> file systems

Suggest "configuration for nfs file systems" (remove "the")


> .SH SYNOPSIS
> @@ -1844,6 +1844,113 @@ export pathname, but not both, during a remount. For example,
> merges the mount option
> .B ro
> with the mount options already saved on disk for the NFS server mounted at /mnt.
> +.SH "NFS CLIENT IDENTIFIER"
> +NFSv4 requires that the client present a unique identifier to the server
> +to be used to track state such as file locks. By default Linux NFS uses
> +the host name, as configured at the time of the first NFS mount,
> +together with some fixed content such as the name "Linux NFS" and the
> +particular protocol version. When the hostname is guaranteed to be
> +unique among all clients which access the same server, this is sufficient.
> +If hostname uniqueness cannot be assumed, extra identity information
> +must be provided.

The last sentence is made ambiguous by the use of passive voice.

Suggest: "When hostname uniqueness cannot be guaranteed, the client
administrator must provide extra identity information."

I have a problem with basing our default uniqueness guarantee on
hostnames "most of the time" hoping it will all work out. There
are simply too many common cases where hostname stability can't be
relied upon. Our sustaining teams will happily tell us this hope
hasn't so far been borne out.

I also don't feel that nfs(5) is an appropriate place for this level
of detail. Documentation/filesystems/nfs/ is more appropriate IMO.
In general, man pages are good for quick summaries, not for
explainers. Here, it reads like "you, a user, are going to have to
do this thing that is like filling out a tax form" -- in reality it
should be information that should be:

- Ignorable by most folks
- Used by distributors to add value by automating set up
- Used for debugging large client installations

Maybe I'm just stating this to understand the purpose of this
patch, but it could also be used as an "Intended audience"
disclaimer in this new section.


> +.PP
> +Some situations which are known to be problematic with respect to unique
> +host names include:

A little wordy.

Suggest: "Situations known to be problematic with respect to unique
hostnames include:"

If this will eventually become part of nfs(5), I would first run
this patch by documentation experts, because they might have a
preference for "hostnames" over "host names" and "namespaces" over
"name-spaces". Usage of these terms throughout this patch is not
consistent.


> +.IP \- 2
> +NFS-root (diskless) clients, where the DHCP server (or equivalent) does
> +not provide a unique host name.

Suggest this addition:

.IP \- 2

Dynamically-assigned hostnames, where the hostname can be changed after
a client reboot, while the client is booted, or if a client often
repeatedly connects to multiple networks (for example if it is moved
from home to an office every day).


> +.IP \- 2
> +"containers" within a single Linux host. If each container has a separate
> +network namespace, but does not use the UTS namespace to provide a unique
> +host name, then there can be multiple effective NFS clients with the
> +same host name.
> +.IP \= 2

.IP \- 2


> +Clients across multiple administrative domains that access a common NFS
> +server. If assignment of host name is devolved to separate domains,

I don't recognize the phrase "assignment is devolved to separate domains".
Can you choose a friendlier way of saying this?


> +uniqueness cannot be guaranteed, unless a domain name is included in the
> +host name.
> +.SS "Increasing Client Uniqueness"
> +Apart from the host name, which is the preferred way to differentiate
> +NFS clients, there are two mechanisms to add uniqueness to the
> +client identifier.
> +.TP
> +.B nfs.nfs4_unique_id
> +This module parameter can be set to an arbitrary string at boot time, or
> +when the
> +.B nfs
> +module is loaded. This might be suitable for configuring diskless clients.

Suggest: "This is suitable for"


> +.TP
> +.B /sys/fs/nfs/client/net/identifier
> +This virtual file (available since Linux 5.3) is local to the network
> +name-space in which it is accessed and so can provided uniqueness between
> +network namespaces (containers) when the hostname remains uniform.

^provided^provide

^between^amongst

and the clause at the end confused me.

Suggest: "in which it is accessed and thus can provide uniqueness
amongst network namespaces (containers)."


> +.RS
> +.PP
> +This value is empty on name-space creation.
> +If the value is to be set, that should be done before the first
> +mount. If the container system has access to some sort of per-container
> +identity then that identity, possibly obfuscated as a UUID is privacy is
> +needed, can be used. Combining the identity with the name of the
> +container systems would also help.

I object to recommending obfuscation via a UUID.

1. This is confusing because there has been no mention of any
persistence requirement so far. At this point, a reader
might think that the client can simply convert the hostname
and netns identifier every time it boots. However this is
only OK to do if these things are guaranteed not to change
during the lifetime of a client. In a world where a majority
of systems get their hostnames dynamically, I think this is
a shaky foundation.

2. There's no requirement that this uniquifier be in the form
of a UUID anywhere in specifications, and the Linux client
itself does not add such a requirement. (You suggested
before that we should start by writing down requirements.
Using a UUID ain't a requirement).

Linux chooses to implement its uniquifer with a UUID because
it is assumed we are using a random UUID (rather than a
name-based or time-based UUID). A random UUID has strong
global uniqueness guarantees, which guarantees the client
identifier will always be unique amongst clients in nearly
all situations for nearly no cost.

If we want to create a good uniquifier here, then combine the
hostname, netns identity, and/or the host's machine-id and then
hash that blob with a known strong digest algorithm like
SHA-256. A man page must not recommend the use of deprecated or
insecure obfuscation mechanisms.

The man page can suggest a random-based UUID as long as it
states plainly that such UUIDs have global uniqueness guarantees
that make them suitable for this purpose. We're using a UUID
for its global uniqueness properties, not because of its
appearance.


> For example:
> +.RS 4
> +echo "ip-netns:`ip netns identify`" \\
> +.br
> + > /sys/fs/nfs/client/net/identifier
> +.br
> +uuidgen --sha1 --namespace @url \\
> +.br
> + -N "nfs:`cat /etc/machine-id`" \\
> +.br
> + > /sys/fs/nfs/client/net/identifier
> +.RE
> +If the container system provides no stable name,
> +but does have stable storage,

Here's the first mention of "stable". It needs some
introduction far above.


> then something like
> +.RS 4
> +[ -s /etc/nfsv4-uuid ] || uuidgen > /etc/nfsv4-uuid &&
> +.br
> +cat /etc/nfsv4-uuid > /sys/fs/nfs/client/net/identifier
> +.RE
> +would suffice.
> +.PP
> +If a container has neither a stable name nor stable (local) storage,
> +then it is not possible to provide a stable identifier, so providing
> +a random identifier to ensure uniqueness would be best
> +.RS 4
> +uuidgen > /sys/fs/nfs/client/net/identifier
> +.RE
> +.RE
> +.SS Consequences of poor identity setting

This section provides context to understand the above technical
recommendations. I suggest this whole section should be moved
to near the opening paragraph.


> +Any two concurrent clients that might access the same server must have
> +different identifiers for correct operation, and any two consecutive
> +instances of the same client should have the same identifier for optimal
> +crash recovery.

Also recovery from network partitions.


> +.PP
> +If two different clients present the same identity to a server there are
> +two possible scenarios. If the clients use the same credential then the
> +server will treat them as the same client which appears to be restarting
> +frequently. One client may manage to open some files etc, but as soon
> +as the other client does anything the first client will lose access and
> +need to re-open everything.

This seems fuzzy.

1. If locks are lost, then there is a substantial risk of data
corruption.

2. Is the client itself supposed to re-open files, or are
applications somehow notified that they need to re-open?
Either of these scenarios is fraught -- I don't believe any
application is coded to expect to have to re-open a file
due to exigent circumstances.


> +.PP
> +If the clients use different credentials, then the second client to
> +establish a connection to the server will be refused access. For
> +.B auth=sys
> +the credential is based on hostname, so will be the same if the
> +identities are the same. With
> +.B auth=krb
> +the credential is stored in
> +.I /etc/krb5.keytab
> +and will be the same only if this is copied among hosts.

This language implies that copying the keytab is a recommended thing
to do. It's not. I mentioned it before because some customers think
it's OK to use the same keytab across their client fleet. But obviously
that will result in lost open and lock state.

I suggest rephrasing this last sentence to describe the negative lease
recovery consequence of two clients happening to share the same host
principal -- as in "This is why you shouldn't share keytabs..."


> +.PP
> +If the identity is unique but not stable, for example if it is generated
> +randomly on each start up of the NFS client, then crash recovery is
> +affected. When a client shuts down uncleanly and restarts, the server
> +will normally detect this because the same identity is presented with
> +different boot time (or "incarnation verifier"), and will discard old
> +state. If the client presents a different identifier, then the server
> +cannot discard old state until the lease time has expired, and the new
> +client may be delayed in opening or locking files that it was
> +previously accessing.
> .SH FILES
> .TP 1.5i
> .I /etc/fstab
> --
> 2.35.1
>

--
Chuck Lever



2022-03-15 12:22:50

by NeilBrown

Subject: Re: [PATCH v2] nfs.man: document requirements for NFSv4 identity


Thanks for the typo fixes Ben - I've applied them to my local copy.

NeilBrown

2022-03-17 01:28:39

by NeilBrown

Subject: Re: [PATCH v2] nfs.man: document requirements for NFSv4 identity

On Tue, 15 Mar 2022, Chuck Lever III wrote:
> Hi Neil-
>
> > On Mar 13, 2022, at 9:04 PM, NeilBrown <[email protected]> wrote:
> >
> >
> > When mounting an NFS filesystem in a network namespace using v4, some care
> > must be taken to ensure a unique and stable client identity. Similar
> > care is needed for NFS-root and other situations.
> >
> > Add documentation explaining the requirements for the NFS identity in
> > these situations.
> >
> > Signed-off-by: NeilBrown <[email protected]>
> > ---
> >
> > I think I've addressed most of the feedback, but please forgive and remind me
> > if I missed something.
> > NeilBrown
> >
> > utils/mount/nfs.man | 109 +++++++++++++++++++++++++++++++++++++++++++-
> > 1 file changed, 108 insertions(+), 1 deletion(-)
> >
> > diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
> > index d9f34df36b42..5f15abe8cf72 100644
> > --- a/utils/mount/nfs.man
> > +++ b/utils/mount/nfs.man
> > @@ -1,7 +1,7 @@
> > .\"@(#)nfs.5"
> > .TH NFS 5 "9 October 2012"
> > .SH NAME
> > -nfs \- fstab format and options for the
> > +nfs \- fstab format and configuration for the
> > .B nfs
> > file systems
>
> Suggest "configuration for nfs file systems" (remove "the")

Agreed.

>
>
> > .SH SYNOPSIS
> > @@ -1844,6 +1844,113 @@ export pathname, but not both, during a remount. For example,
> > merges the mount option
> > .B ro
> > with the mount options already saved on disk for the NFS server mounted at /mnt.
> > +.SH "NFS CLIENT IDENTIFIER"
> > +NFSv4 requires that the client present a unique identifier to the server
> > +to be used to track state such as file locks. By default Linux NFS uses
> > +the host name, as configured at the time of the first NFS mount,
> > +together with some fixed content such as the name "Linux NFS" and the
> > +particular protocol version. When the hostname is guaranteed to be
> > +unique among all clients which access the same server, this is sufficient.
> > +If hostname uniqueness cannot be assumed, extra identity information
> > +must be provided.
>
> The last sentence is made ambiguous by the use of passive voice.
>
> Suggest: "When hostname uniqueness cannot be guaranteed, the client
> administrator must provide extra identity information."

Why must the client administrator do this? Why can't some automated
tool do this? Or some container-building environment.
That's an advantage of the passive voice, you don't need to assign
responsibility for the verb.

>
> I have a problem with basing our default uniqueness guarantee on
> hostnames "most of the time" hoping it will all work out. There
> are simply too many common cases where hostname stability can't be
> relied upon. Our sustaining teams will happily tell us this hope
> hasn't so far been borne out.

Maybe it has not been borne out because there is no documented
requirement for it that we can point people to.
Clearly containers that use NFS are not currently all configured well to do
this. Some change is needed. Maybe adding a unique host name is the
easiest change ... or maybe not.

Surely NFS is not the *only* service that uses the host name.
Encouraging the use of unique host names might benefit others.

The practical reality is that a great many NFS client installations do
currently depend on unique host names - after all, it actually works.
Is it really so unreasonable to try to encourage the exceptions to fit
the common pattern better?

>
> I also don't feel that nfs(5) is an appropriate place for this level
> of detail. Documentation/filesystems/nfs/ is more appropriate IMO.
> In general, man pages are good for quick summaries, not for
> explainers. Here, it reads like "you, a user, are going to have to
> do this thing that is like filling out a tax form" -- in reality it
> should be information that should be:
>
> - Ignorable by most folks
> - Used by distributors to add value by automating set up
> - Used for debugging large client installations

nfs(5) contains sections on TRANSPORT METHODS, DATA AND METADATA
COHERENCE, SECURITY CONSIDERATIONS. Is this section really out of
place?

I could agree that all of these sections belong in "section 7" (Overview,
conventions, and miscellaneous) rather than "section 5" (File formats and
configuration files) but we don't have nfs.7 (yet). I think section 7
is a reasonable fit for your 3 points above.

I don't agree that Documentation/filesystems/nfs/ is sufficient. That
is (from my perspective) primarily of interest to kernel developers.
The whole point of this exercise is that we need to reach people outside
of that group.

>
> Maybe I'm just stating this to understand the purpose of this
> patch, but it could also be used as an "Intended audience"
> disclaimer in this new section.

OK, so the "purpose of this patch" relates in part to a comment you made
earlier, which I include here:

> Since it is just a line or two of code, it might be of little
> harm just to go with separate implementations for now and stop
> talking about it. If it sucks, we can fix the suckage.
>
> Who volunteers to implement this mechanism in mount.nfs ?

I don't think this is the best next step. I think we need to get some
container system developer to contribute here. So far we only have
second hand anecdotes about problems. I think the most concrete is from
Ben suggesting that in at least one container system, using
/etc/machine-id is a good idea.

I don't think we can change nfs-utils (whether mount.nfs or mount.conf
or some other way) to set identity from /etc/machine-id for everyone.
So we need at least for that container system to request that change.

How would they like to do that?

I suggest that we explain the problem to representatives of the various
container communities that we have contact with (Well... "you", more
than "we" as I don't have contacts).

We could use the documentation I provided to clearly present the
problem.
Then ask:
- would you like to just run some shell code (see examples)
- or would you like to provide an /etc/nfs.conf.d/my-container.conf (sketched below)
- or would you like to run a tool that we provide
- or is there already a push to provide unique container hostnames,
and is this the incentive you need to help that push across the
line?
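
For the nfs.conf.d option, I imagine the drop-in might look something
like this - purely illustrative, as nfs-utils has no "identifier" key
today, so both the key name and its placement under [general] are
assumptions:

    # /etc/nfs.conf.d/my-container.conf
    [general]
    # hypothetical key, not implemented in nfs-utils
    identifier = 82b53be5-3d45-4de1-a35f-dd1b6d0638a0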

If we have someone from $CONTAINER_COMMUNITY say "if you do this thing,
then we will use it", then that would be hard to argue with.
If we could get two or three different communities to comment, I expect
the best answer would become a lot more obvious.

But first we, ourselves, need to agree on the document :-)


>
>
> > +.PP
> > +Some situations which are known to be problematic with respect to unique
> > +host names include:
>
> A little wordy.
>
> Suggest: "Situations known to be problematic with respect to unique
> hostnames include:"

Yep.

>
> If this will eventually become part of nfs(5), I would first run
> this patch by documentation experts, because they might have a
> preference for "hostnames" over "host names" and "namespaces" over
> "name-spaces". Usage of these terms throughout this patch is not
> consistent.

I've made it consistently "hostname" and "namespace", which is consistent
with the rest of the document.

>
>
> > +.IP \- 2
> > +NFS-root (diskless) clients, where the DHCP server (or equivalent) does
> > +not provide a unique host name.
>
> Suggest this addition:
>
> .IP \- 2
>
> Dynamically-assigned hostnames, where the hostname can be changed after
> a client reboot, while the client is booted, or if a client
> repeatedly connects to multiple networks (for example if it is moved
> from home to an office every day).

This is a different kettle of fish. The hostname is *always* included
in the identifier. If it isn't stable, then the identifier isn't
stable.

I saw in the history that when you introduced the module parameter it
replaced the hostname. This caused problems in containers (which had
different host names) so Trond changed it so the module parameter
supplemented the hostname.

If hostnames are really so poorly behaved I can see there might be a
case to suppress the hostname, but we don't have that option is current
kernels. Should we add it?

>
>
> > +.IP \- 2
> > +"containers" within a single Linux host. If each container has a separate
> > +network namespace, but does not use the UTS namespace to provide a unique
> > +host name, then there can be multiple effective NFS clients with the
> > +same host name.
> > +.IP \= 2
>
> .IP \- 2

Thanks.

>
>
> > +Clients across multiple administrative domains that access a common NFS
> > +server. If assignment of host name is devolved to separate domains,
>
> I don't recognize the phrase "assignment is devolved to separate domains".
> Can you choose a friendlier way of saying this?
>

If hostnames are not assigned centrally then uniqueness cannot be
guaranteed unless a domain name is included in the hostname.

>
> > +uniqueness cannot be guaranteed, unless a domain name is included in the
> > +host name.
> > +.SS "Increasing Client Uniqueness"
> > +Apart from the host name, which is the preferred way to differentiate
> > +NFS clients, there are two mechanisms to add uniqueness to the
> > +client identifier.
> > +.TP
> > +.B nfs.nfs4_unique_id
> > +This module parameter can be set to an arbitrary string at boot time, or
> > +when the
> > +.B nfs
> > +module is loaded. This might be suitable for configuring diskless clients.
>
> Suggest: "This is suitable for"

OK
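
As an aside, for a diskless client this parameter would typically be
wired up at boot rather than set by hand - either on the kernel command
line or via modprobe configuration. A rough sketch, with a placeholder
UUID:

    # On the kernel command line (e.g. set by the boot loader):
    #     nfs.nfs4_unique_id=648e9ed0-bd71-4ac9-8d95-a0d819a3b232
    # Or via modprobe configuration, read when the nfs module loads:
    echo 'options nfs nfs4_unique_id=648e9ed0-bd71-4ac9-8d95-a0d819a3b232' \
        > /etc/modprobe.d/nfs-identity.conf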

>
>
> > +.TP
> > +.B /sys/fs/nfs/client/net/identifier
> > +This virtual file (available since Linux 5.3) is local to the network
> > +name-space in which it is accessed and so can provided uniqueness between
> > +network namespaces (containers) when the hostname remains uniform.
>
> ^provided^provide
>
> ^between^amongst
>
> and the clause at the end confused me.
>
> Suggest: "in which it is accessed and thus can provide uniqueness
> amongst network namespaces (containers)."

The clause at the end was simply emphasising that the identifier is only
needed if the hostname does not vary across containers. I have removed it.

>
>
> > +.RS
> > +.PP
> > +This value is empty on name-space creation.
> > +If the value is to be set, that should be done before the first
> > +mount. If the container system has access to some sort of per-container
> > +identity then that identity, possibly obfuscated as a UUID is privacy is
> > +needed, can be used. Combining the identity with the name of the
> > +container systems would also help.
>
> I object to recommending obfuscation via a UUID.
>
> 1. This is confusing because there has been no mention of any
> persistence requirement so far. At this point, a reader
> might think that the client can simply convert the hostname
> and netns identifier every time it boots. However this is
> only OK to do if these things are guaranteed not to change
> during the lifetime of a client. In a world where a majority
> of systems get their hostnames dynamically, I think this is
> a shaky foundation.

If the hostname changes after boot (weird concept .. does that really
happen?) that is irrelevant. The hostname is copied at boot by NFS, and
if it is included in the /sys/fs/nfs/client/identifier (which would be
pointless, but not harmful) it has again been copied.

If it is different on subsequent boots, then that is a big problem and
not one that we can currently fix.

....except that non-persistent client identifiers isn't an enormous
problem, just a possible cause of delays.

>
> 2. There's no requirement that this uniquifier be in the form
> of a UUID anywhere in specifications, and the Linux client
> itself does not add such a requirement. (You suggested
> before that we should start by writing down requirements.
> Using a UUID ain't a requirement).

The requirement here is that /etc/machine-id is documented as requiring
obfuscation. uuidgen is a convenient way to provide obfuscation. That
is all I was trying to say.

>
> Linux chooses to implement its uniquifier with a UUID because
> it is assumed we are using a random UUID (rather than a
> name-based or time-based UUID). A random UUID has strong
> global uniqueness guarantees, which guarantees the client
> identifier will always be unique amongst clients in nearly
> all situations for nearly no cost.
>

"Linux chooses" what does that mean? I've lost the thread here, sorry.


> If we want to create a good uniquifier here, then combine the
> hostname, netns identity, and/or the host's machine-id and then
> hash that blob with a known strong digest algorithm like
> SHA-256. A man page must not recommend the use of deprecated or
> insecure obfuscation mechanisms.

I didn't realize the hash that uuidgen uses was deprecated. Is there
some better way to provide an app-specific obfuscation of a string from
the command line?

Maybe
echo nfs-id:`cat /etc/machine-id`| sha256sum

??

>
> The man page can suggest a random-based UUID as long as it
> states plainly that such UUIDs have global uniqueness guarantees
> that make them suitable for this purpose. We're using a UUID
> for its global uniqueness properties, not because of its
> appearance.

So I could use "/etc/nfsv4-identity" instead of "/etc/nfsv4-uuid".
What else should I change/add?

>
>
> > For example:
> > +.RS 4
> > +echo "ip-netns:`ip netns identify`" \\
> > +.br
> > + > /sys/fs/nfs/client/net/identifier
> > +.br
> > +uuidgen --sha1 --namespace @url \\
> > +.br
> > + -N "nfs:`cat /etc/machine-id`" \\
> > +.br
> > + > /sys/fs/nfs/client/net/identifier
> > +.RE
> > +If the container system provides no stable name,
> > +but does have stable storage,
>
> Here's the first mention of "stable". It needs some
> introduction far above.

True. So the first para becomes:

NFSv4 requires that the client present a stable unique identifier to
the server to be used to track state such as file locks. By default
Linux NFS uses the hostname, as configured at the time of the first
NFS mount, together with some fixed content such as the name "Linux
NFS" and the particular protocol version. When the hostname is
guaranteed to be unique among all clients which access the same server,
and stable across reboots, this is sufficient. If hostname uniqueness
cannot be assumed, extra identity information must be provided. If
the hostname is not stable, unclean restarts may suffer unavoidable
delays.

>
>
> > then something like
> > +.RS 4
> > +[ -s /etc/nfsv4-uuid ] || uuidgen > /etc/nfsv4-uuid &&
> > +.br
> > +cat /etc/nfsv4-uuid > /sys/fs/nfs/client/net/identifier
> > +.RE
> > +would suffice.
> > +.PP
> > +If a container has neither a stable name nor stable (local) storage,
> > +then it is not possible to provide a stable identifier, so providing
> > +a random identifier to ensure uniqueness would be best
> > +.RS 4
> > +uuidgen > /sys/fs/nfs/client/net/identifier
> > +.RE
> > +.RE
> > +.SS Consequences of poor identity setting
>
> This section provides context to understand the above technical
> recommendations. I suggest this whole section should be moved
> to near the opening paragraph.

I seem to keep moving things upwards.... something has to come last.
Maybe a "(See below)" at the end of the revised first para?

>
>
> > +Any two concurrent clients that might access the same server must have
> > +different identifiers for correct operation, and any two consecutive
> > +instances of the same client should have the same identifier for optimal
> > +crash recovery.
>
> Also recovery from network partitions.

A network partition doesn't coincide with two consecutive instances of the
same client. There is just one client instance and one server instance.

>
>
> > +.PP
> > +If two different clients present the same identity to a server there are
> > +two possible scenarios. If the clients use the same credential then the
> > +server will treat them as the same client which appears to be restarting
> > +frequently. One client may manage to open some files etc, but as soon
> > +as the other client does anything the first client will lose access and
> > +need to re-open everything.
>
> This seems fuzzy.
>
> 1. If locks are lost, then there is a substantial risk of data
> corruption.
>
> 2. Is the client itself supposed to re-open files, or are
> applications somehow notified that they need to re-open?
> Either of these scenarios is fraught -- I don't believe any
> application is coded to expect to have to re-open a file
> due to exigent circumstances.

I wasn't very happy with the description either. I think we want some
detail, but not too much.

The "re-opening" that I mentioned is the NFS client resubmitting NFS
OPEN requests, not the application having to re-open.
However if the application manages to get a lock, then when the "other"
client connects to the server the application will lose the lock, and
all read/write accesses on the relevant fd will result in EIO (I
think). Clearly bad.

I wanted to say the clients could end up "fighting" with each other -
the EXCHANGE_ID from one destroys the state set up by the other - but that
seems to be too much anthropomorphism.

If two different clients present the same identity to a server there
are two possible scenarios. If the clients use the same credential
then the server will treat them as the same client which appears to
be restarting frequently. The clients will each enter a loop where
they establish state with the server and then find that the state
has been destroyed by the other client and so will need to establish
it again.

???

>
>
> > +.PP
> > +If the clients use different credentials, then the second client to
> > +establish a connection to the server will be refused access. For
> > +.B auth=sys
> > +the credential is based on hostname, so will be the same if the
> > +identities are the same. With
> > +.B auth=krb
> > +the credential is stored in
> > +.I /etc/krb5.keytab
> > +and will be the same only if this is copied among hosts.
>
> This language implies that copying the keytab is a recommended thing
> to do. It's not. I mentioned it before because some customers think
> it's OK to use the same keytab across their client fleet. But obviously
> that will result in lost open and lock state.
>
> I suggest rephrasing this last sentence to describe the negative lease
> recovery consequence of two clients happening to share the same host
> principal -- as in "This is why you shouldn't share keytabs..."
>

How about

.PP
If the clients use different credentials, then the second client to
establish a connection to the server will be refused access, which is a
safer failure mode. For
.B auth=sys
the credential is based on hostname, so will be the same if the
identities are the same. With
.B auth=krb
the credential is stored in
.I /etc/krb5.keytab
so provided this isn't copied among clients, the safer failure mode will result.


??

Thanks for your detailed review!

NeilBrown

>
> > +.PP
> > +If the identity is unique but not stable, for example if it is generated
> > +randomly on each start up of the NFS client, then crash recovery is
> > +affected. When a client shuts down uncleanly and restarts, the server
> > +will normally detect this because the same identity is presented with
> > +different boot time (or "incarnation verifier"), and will discard old
> > +state. If the client presents a different identifier, then the server
> > +cannot discard old state until the lease time has expired, and the new
> > +client may be delayed in opening or locking files that it was
> > +previously accessing.
> > .SH FILES
> > .TP 1.5i
> > .I /etc/fstab
> > --
> > 2.35.1
> >
>
> --
> Chuck Lever
>
>
>
>

2022-03-17 04:05:43

by Chuck Lever III

Subject: Re: [PATCH v2] nfs.man: document requirements for NFSv4 identity

Howdy Neil-

> On Mar 14, 2022, at 8:41 PM, NeilBrown <[email protected]> wrote:
>
> On Tue, 15 Mar 2022, Chuck Lever III wrote:
>> Hi Neil-
>>
>>> On Mar 13, 2022, at 9:04 PM, NeilBrown <[email protected]> wrote:
>>>
>>>
>>> When mounting an NFS filesystem in a network namespace using v4, some care
>>> must be taken to ensure a unique and stable client identity. Similar
>>> care is needed for NFS-root and other situations.
>>>
>>> Add documentation explaining the requirements for the NFS identity in
>>> these situations.
>>>
>>> Signed-off-by: NeilBrown <[email protected]>
>>> ---
>>>
>>> I think I've addressed most of the feedback, but please forgive and remind me
>>> if I missed something.
>>> NeilBrown
>>>
>>> utils/mount/nfs.man | 109 +++++++++++++++++++++++++++++++++++++++++++-
>>> 1 file changed, 108 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
>>> index d9f34df36b42..5f15abe8cf72 100644
>>> --- a/utils/mount/nfs.man
>>> +++ b/utils/mount/nfs.man
>>> @@ -1,7 +1,7 @@
>>> .\"@(#)nfs.5"
>>> .TH NFS 5 "9 October 2012"
>>> .SH NAME
>>> -nfs \- fstab format and options for the
>>> +nfs \- fstab format and configuration for the
>>> .B nfs
>>> file systems
>>
>> Suggest "configuration for nfs file systems" (remove "the")
>
> Agreed.
>
>>
>>
>>> .SH SYNOPSIS
>>> @@ -1844,6 +1844,113 @@ export pathname, but not both, during a remount. For example,
>>> merges the mount option
>>> .B ro
>>> with the mount options already saved on disk for the NFS server mounted at /mnt.
>>> +.SH "NFS CLIENT IDENTIFIER"
>>> +NFSv4 requires that the client present a unique identifier to the server
>>> +to be used to track state such as file locks. By default Linux NFS uses
>>> +the host name, as configured at the time of the first NFS mount,
>>> +together with some fixed content such as the name "Linux NFS" and the
>>> +particular protocol version. When the hostname is guaranteed to be
>>> +unique among all clients which access the same server, this is sufficient.
>>> +If hostname uniqueness cannot be assumed, extra identity information
>>> +must be provided.
>>
>> The last sentence is made ambiguous by the use of passive voice.
>>
>> Suggest: "When hostname uniqueness cannot be guaranteed, the client
>> administrator must provide extra identity information."
>
> Why must the client administrator do this? Why can't some automated
> tool do this? Or some container-building environment.
> That's an advantage of the passive voice, you don't need to assign
> responsibility for the verb.

My point is that in order to provide the needed information,
elevated privilege is required. The current sentence reads as
if J. Random User could be interrupted at some point and asked
for help.

In other words, the documentation should state that this is
an administrative task. Here I'm not advocating for a specific
mechanism to actually perform that task.


>> I have a problem with basing our default uniqueness guarantee on
>> hostnames "most of the time" hoping it will all work out. There
>> are simply too many common cases where hostname stability can't be
>> relied upon. Our sustaining teams will happily tell us this hope
>> hasn't so far been borne out.
>
> Maybe it has not been borne out because there is no documented
> requirement for it that we can point people to.
> Clearly containers that use NFS are not currently all configured well to do
> this. Some change is needed. Maybe adding a unique host name is the
> easiest change ... or maybe not.

You seem to be documenting the client's current behavior.
The tone of the documentation is that this behavior is fine
and works for most people.

It's the second part that I disagree with. Oracle Linux has
bugs documenting this behavior is a problem, and I'm sure
Red Hat does too. The current behavior is broken. It is this
brokeness that we are trying to resolve.

So let me make a stronger statement: we should not
document that broken behavior in nfs(5). Instead, we should
fix that behavior, and then document the golden brown and
delicious behavior. Updating nfs(5) first is putting
DeCarte in front of de horse.


> Surely NFS is not the *only* service that uses the host name.
> Encouraging the use of unique host names might benefit others.

Unless you have specific use cases that might benefit from
ensuring hostname uniqueness, I would beg that you stay
focused on the immediate issue of how the Linux client
constructs its nfs_client_id4 strings.


> The practical reality is that a great many NFS client installations do
> currently depend on unique host names - after all, it actually works.
> Is it really so unreasonable to try to encourage the exceptions to fit
> the common pattern better?

Yes it is unreasonable.

NFS servers typically have a fixed DNS presence. They have
to because clients mount by hostname.

NFS clients, on the other hand, are not under that constraint.
The only time I can think of where a client has to have a
fixed hostname is if a krb5 host principal is involved.

In so many other cases, eg. mobile computing or elastic
services, the client hostname is mutable. I don't think
it's fair to put another constraint on host naming here,
especially one with implications of service denial or
data corruption (see below).


>> Maybe I'm just stating this to understand the purpose of this
>> patch, but it could also be used as an "Intended audience"
>> disclaimer in this new section.
>
> OK, so the "purpose of this patch" relates in part to a comment you made
> earlier, which I include here:
>
>> Since it is just a line or two of code, it might be of little
>> harm just to go with separate implementations for now and stop
>> talking about it. If it sucks, we can fix the suckage.
>>
>> Who volunteers to implement this mechanism in mount.nfs ?
>
> I don't think this is the best next step. I think we need to get some
> container system developer to contribute here. So far we only have
> second hand anecdotes about problems. I think the most concrete is from
> Ben suggesting that in at least one container system, using
> /etc/machine-id is a good idea.
>
> I don't think we can change nfs-utils (whether mount.nfs or mount.conf
> or some other way) to set identity from /etc/machine-id for everyone.
> So we need at least for that container system to request that change.
>
> How would they like to do that?
>
> I suggest that we explain the problem to representatives of the various
> container communities that we have contact with (Well... "you", more
> than "we" as I don't have contacts).

I'm all for involving one or more container experts. But IMO
it's not appropriate to update our man page to do that. Let's
update nfs(5) when we are done with this effort.


> We could use the documentation I provided to clearly present the
> problem.

No doubt, we need a crisp problem statement!


> Then ask:
> - would you like to just run some shell code (see examples)
> - or would you like to provide an /etc/nfs.conf.d/my-container.conf
> - or would you like to run a tool that we provide
> - or is there already a push to provide unique container hostnames,
> and is this the incentive you need to help that push across the
> line?
>
> If we have someone from $CONTAINER_COMMUNITY say "if you do this thing,
> then we will use it", then that would be hard to argue with.
> If we could get two or three different communities to comment, I expect
> the best answer would become a lot more obvious.
>
> But first we, ourselves, need to agree on the document :-)

If the community is seeking help, then a wiki might be a better
place to formulate a problem statement.


>>> +.PP
>>> +Some situations which are known to be problematic with respect to unique
>>> +host names include:
>>
>> A little wordy.
>>
>> Suggest: "Situations known to be problematic with respect to unique
>> hostnames include:"
>
> Yep.
>
>>
>> If this will eventually become part of nfs(5), I would first run
>> this patch by documentation experts, because they might have a
>> preference for "hostnames" over "host names" and "namespaces" over
>> "name-spaces". Usage of these terms throughout this patch is not
>> consistent.
>
> I've made it consistently "hostname" and "namespace", which is consistent
> with the rest of the document.
>
>>
>>
>>> +.IP \- 2
>>> +NFS-root (diskless) clients, where the DHCP server (or equivalent) does
>>> +not provide a unique host name.
>>
>> Suggest this addition:
>>
>> .IP \- 2
>>
>> Dynamically-assigned hostnames, where the hostname can be changed after
>> a client reboot, while the client is booted, or if a client
>> repeatedly connects to multiple networks (for example if it is moved
>> from home to an office every day).
>
> This is a different kettle of fish. The hostname is *always* included
> in the identifier. If it isn't stable, then the identifier isn't
> stable.
>
> I saw in the history that when you introduced the module parameter it
> replaced the hostname. This caused problems in containers (which had
> different host names) so Trond changed it so the module parameter
> supplemented the hostname.
>
> If hostnames are really so poorly behaved I can see there might be a
> case to suppress the hostname, but we don't have that option in current
> kernels. Should we add it?

I claim that it has become problematic to use the hostname in the
nfs_client_id4 string.

25 years ago when NFSv4.0 was being crafted, it was assumed that
client hostnames were unchanging. The original RFC 3010 recommended
adding the hostname, the client IP address, and the server IP
address to the nfs_client_id4 string.

Since then, we've learned that the IP addresses are quite mutable,
and thus not appropriate for a fixed identifier. I argue that the
client's hostname is now the same.

The Linux NFSv4 prototype and subsequent production code used the
local hostname because it's easy to access in the kernel via the
UTS name. That was adequate 20 years ago, but has become less so
over time. You can view this evolution in the commit log.

It doesn't seem that complicated (to me) to divorce the client_id4
string from the local hostname, and the benefits are significant.


>>> +.IP \- 2
>>> +"containers" within a single Linux host. If each container has a separate
>>> +network namespace, but does not use the UTS namespace to provide a unique
>>> +host name, then there can be multiple effective NFS clients with the
>>> +same host name.
>>> +.IP \= 2
>>
>> .IP \- 2
>
> Thanks.
>
>>
>>
>>> +Clients across multiple administrative domains that access a common NFS
>>> +server. If assignment of host name is devolved to separate domains,
>>
>> I don't recognize the phrase "assignment is devolved to separate domains".
>> Can you choose a friendlier way of saying this?
>>
>
> If hostnames are not assigned centrally then uniqueness cannot be
> guaranteed unless a domain name is included in the hostname.

Better, thanks.


>>> +.RS
>>> +.PP
>>> +This value is empty on name-space creation.
>>> +If the value is to be set, that should be done before the first
>>> +mount. If the container system has access to some sort of per-container
>>> +identity then that identity, possibly obfuscated as a UUID is privacy is
>>> +needed, can be used. Combining the identity with the name of the
>>> +container systems would also help.
>>
>> I object to recommending obfuscation via a UUID.
>>
>> 1. This is confusing because there has been no mention of any
>> persistence requirement so far. At this point, a reader
>> might think that the client can simply convert the hostname
>> and netns identifier every time it boots. However this is
>> only OK to do if these things are guaranteed not to change
>> during the lifetime of a client. In a world where a majority
>> of systems get their hostnames dynamically, I think this is
>> a shaky foundation.
>
> If the hostname changes after boot (weird concept .. does that really
> happen?) that is irrelevant.

It really happens. A DHCP lease renewal can do it. Moving to a
new subnet on the same campus might do it. I can open "Device
Settings" on my laptop and change my laptop's hostname on a
whim. Joining a VPN might do it.

A client might have multiple network interfaces, each with a
unique hostname. Which one should be used for the nfs_client_id4
string? RFCs 7931 and 8587 discuss how trunking needs to work:
the upshot is that the client needs to have one consistent
nfs_client_id4 string it presents to all servers (in case of
migration) no matter which network path it uses to access the
server.


> The hostname is copied at boot by NFS, and
> if it is included in the /sys/fs/nfs/client/identifier (which would be
> pointless, but not harmful) it has again been copied.
>
> If it is different on subsequent boots, then that is a big problem and
> not one that we can currently fix.

Yes, we can fix it: don't use the client's hostname but
instead use a separate persistent uniquifier, as has been
proposed.


> ....except that non-persistent client identifiers isn't an enormous
> problem, just a possible cause of delays.

I disagree, it's a significant issue.

- If locks are lost, that is a potential source of data corruption.

- If a lease is stolen, that is a denial of service.

Our customers take this very seriously. The NFS clients's
out-of-the-shrink-wrap default behavior/configuration should be
conservative enough to prevent these issues. Customers store
mission critical data via NFS. Most customers expect NFS to work
reliably without a lot of configuration fuss.


>> 2. There's no requirement that this uniquifier be in the form
>> of a UUID anywhere in specifications, and the Linux client
>> itself does not add such a requirement. (You suggested
>> before that we should start by writing down requirements.
>> Using a UUID ain't a requirement).
>
> The requirement here is that /etc/machine-id is documented as requiring
> obfuscation. uuidgen is a convenient way to provide obfuscation. That
> is all I was trying to say.

Understood, but the words you used have some additional
implications that you might not want.


>> Linux chooses to implement its uniquifier with a UUID because
>> it is assumed we are using a random UUID (rather than a
>> name-based or time-based UUID). A random UUID has strong
>> global uniqueness guarantees, which guarantees the client
>> identifier will always be unique amongst clients in nearly
>> all situations for nearly no cost.
>>
>
> "Linux chooses" what does that mean? I've lost the thread here, sorry.

Try instead: "The documentation regarding the nfs4_unique_id
module parameter suggests the use of a UUID because..."


>> If we want to create a good uniquifier here, then combine the
>> hostname, netns identity, and/or the host's machine-id and then
>> hash that blob with a known strong digest algorithm like
>> SHA-256. A man page must not recommend the use of deprecated or
>> insecure obfuscation mechanisms.
>
> I didn't realize the hash that uuidgen uses was deprecated. Is there
> some better way to provide an app-specific obfuscation of a string from
> the command line?
>
> Maybe
> echo nfs-id:`cat /etc/machine-id`| sha256sum
>
> ??

Something like that, yes. But the scriptlet needs to also
involve the netns identity somehow.
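
For instance (just a sketch; it assumes "ip netns identify" returns
something useful in the environment where this runs):

    # Sketch: digest the machine-id together with the netns identity,
    # and install the result as the client identifier.
    ( cat /etc/machine-id; ip netns identify ) | sha256sum | \
        cut -d' ' -f1 > /sys/fs/nfs/client/net/identifier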


>> The man page can suggest a random-based UUID as long as it
>> states plainly that such UUIDs have global uniqueness guarantees
>> that make them suitable for this purpose. We're using a UUID
>> for its global uniqueness properties, not because of its
>> appearance.
>
> So I could use "/etc/nfsv4-identity" instead of "/etc/nfs4-uuid".

I like. I would prefer not using "uuid" in the name. Ben and
Steve were resistant to that idea, though.


> What else should I change/add?
>
>>
>>
>>> For example:
>>> +.RS 4
>>> +echo "ip-netns:`ip netns identify`" \\
>>> +.br
>>> + > /sys/fs/nfs/client/net/identifier
>>> +.br
>>> +uuidgen --sha1 --namespace @url \\
>>> +.br
>>> + -N "nfs:`cat /etc/machine-id`" \\
>>> +.br
>>> + > /sys/fs/nfs/client/net/identifier
>>> +.RE
>>> +If the container system provides no stable name,
>>> +but does have stable storage,
>>
>> Here's the first mention of "stable". It needs some
>> introduction far above.
>
> True. So the first para becomes:
>
> NFSv4 requires that the client present a stable unique identifier to
> the server to be used to track state such as file locks. By default
> Linux NFS uses the hostname, as configured at the time of the first
> NFS mount, together with some fixed content such as the name "Linux
> NFS" and the particular protocol version. When the hostname is
> guaranteed to be unique among all clients that access the same server,
> and stable across reboots, this is sufficient. If hostname uniqueness
> cannot be assumed, extra identity information must be provided. If
> the hostname is not stable, unclean restarts may suffer unavoidable
> delays.

See above. The impact is more extensive than "unavoidable delays."


>>> then something like
>>> +.RS 4
>>> +[ -s /etc/nfsv4-uuid ] || uuidgen > /etc/nfsv4-uuid &&
>>> +.br
>>> +cat /etc/nfsv4-uuid > /sys/fs/nfs/client/net/identifier
>>> +.RE
>>> +would suffice.
>>> +.PP
>>> +If a container has neither a stable name nor stable (local) storage,
>>> +then it is not possible to provide a stable identifier, so providing
>>> +a random identifier to ensure uniqueness would be best
>>> +.RS 4
>>> +uuidgen > /sys/fs/nfs/client/net/identifier
>>> +.RE
>>> +.RE
>>> +.SS Consequences of poor identity setting
>>
>> This section provides context to understand the above technical
>> recommendations. I suggest this whole section should be moved
>> to near the opening paragraph.
>
> I seem to keep moving things upwards.... something has to come last.
> Maybe a "(See below)" at the end of the revised first para?
>
>>
>>
>>> +Any two concurrent clients that might access the same server must have
>>> +different identifiers for correct operation, and any two consecutive
>>> +instances of the same client should have the same identifier for optimal
>>> +crash recovery.
>>
>> Also recovery from network partitions.
>
> A network partition doesn't coincide with two consecutive instances of the
> same client. There is just one client instance and one server instance.

It's possible for one of the peers to reboot during the network
partition.


>>> +.PP
>>> +If two different clients present the same identity to a server there are
>>> +two possible scenarios. If the clients use the same credential then the
>>> +server will treat them as the same client which appears to be restarting
>>> +frequently. One client may manage to open some files etc, but as soon
>>> +as the other client does anything the first client will lose access and
>>> +need to re-open everything.
>>
>> This seems fuzzy.
>>
>> 1. If locks are lost, then there is a substantial risk of data
>> corruption.
>>
>> 2. Is the client itself supposed to re-open files, or are
>> applications somehow notified that they need to re-open?
>> Either of these scenarios is fraught -- I don't believe any
>> application is coded to expect to have to re-open a file
>> due to exigent circumstances.
>
> I wasn't very happy with the description either. I think we want some
> detail, but not too much.
>
> The "re-opening" that I mentioned is the NFS client resubmitting NFS
> OPEN requests, not the application having to re-open.
> However if the application manages to get a lock, then when the "other"
> client connects to the server the application will lose the lock, and
> all read/write accesses on the relevant fd will result in EIO (I
> think). Clearly bad.
>
> I wanted to say the clients could end up "fighting" with each other -
> the EXCHANGE_ID from one destroys the state set up by the other - but
> that seems to be too much anthropomorphism.
>
> If two different clients present the same identity to a server there
> are two possible scenarios. If the clients use the same credential
> then the server will treat them as the same client which appears to
> be restarting frequently. The clients will each enter a loop where
> they establish state with the server and then find that the state
> has been destroyed by the other client and so will need to establish
> it again.
>
> ???

My colleague Calum coined the term "lease stealing". That might be
a good thing to define somewhere and simply use that term as needed.


>>> +.PP
>>> +If the clients use different credentials, then the second client to
>>> +establish a connection to the server will be refused access. For
>>> +.B auth=sys
>>> +the credential is based on hostname, so will be the same if the
>>> +identities are the same. With
>>> +.B auth=krb
>>> +the credential is stored in
>>> +.I /etc/krb5.keytab
>>> +and will be the same only if this is copied among hosts.
>>
>> This language implies that copying the keytab is a recommended thing
>> to do. It's not. I mentioned it before because some customers think
>> it's OK to use the same keytab across their client fleet. But obviously
>> that will result in lost open and lock state.
>>
>> I suggest rephrasing this last sentence to describe the negative lease
>> recovery consequence of two clients happening to share the same host
>> principal -- as in "This is why you shouldn't share keytabs..."
>>
>
> How about
>
> .PP
> If the clients use different credentials, then the second client to
> establish a connection to the server will be refused access which is a
> safer failure mode. For
> .B auth=sys
> the credential is based on hostname, so will be the same if the
> identities are the same. With
> .B auth=krb
> the credential is stored in
> .I /etc/krb5.keytab
> so provided this isn't copied among clients, the safer failure mode will result.

With
.BR auth=krb5 ,
the client uses the host principal in
.I /etc/krb5.keytab
or in some cases, the lone user principal,
to authenticate lease management operations.
This securely prevents lease stealing.



> ??
>
> Thanks for your detailed review!
>
> NeilBrown
>
>>
>>> +.PP
>>> +If the identity is unique but not stable, for example if it is generated
>>> +randomly on each start up of the NFS client, then crash recovery is
>>> +affected. When a client shuts down uncleanly and restarts, the server
>>> +will normally detect this because the same identity is presented with
>>> +a different boot time (or "incarnation verifier"), and will discard old
>>> +state. If the client presents a different identifier, then the server
>>> +cannot discard old state until the lease time has expired, and the new
>>> +client may be delayed in opening or locking files that it was
>>> +previously accessing.
>>> .SH FILES
>>> .TP 1.5i
>>> .I /etc/fstab
>>> --
>>> 2.35.1
>>>
>>
>> --
>> Chuck Lever

--
Chuck Lever



2022-03-18 07:12:32

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v2] nfs.man: document requirements for NFSv4 identity

On Thu, 17 Mar 2022, Chuck Lever III wrote:
> Howdy Neil-

G'day

> >> The last sentence is made ambiguous by the use of passive voice.
> >>
> >> Suggest: "When hostname uniqueness cannot be guaranteed, the client
> >> administrator must provide extra identity information."
> >
> > Why must the client administrator do this? Why can't some automated
> > tool do this? Or some container-building environment.
> > That's an advantage of the passive voice, you don't need to assign
> > responsibility for the verb.
>
> My point is that in order to provide the needed information,
> elevated privilege is required. The current sentence reads as
> if J. Random User could be interrupted at some point and asked
> for help.
>
> In other words, the documentation should state that this is
> an administrative task. Here I'm not advocating for a specific
> mechanism to actually perform that task.

??? This whole man page is primarily about mount options, particularly
as they appear in /etc/fstab. These are not available to the non-admin.
Why would anyone think this section is any different?

>
>
> >> I have a problem with basing our default uniqueness guarantee on
> >> hostnames "most of the time" hoping it will all work out. There
> >> are simply too many common cases where hostname stability can't be
> >> relied upon. Our sustaining teams will happily tell us this hope
> >> hasn't so far been borne out.
> >
> > Maybe it has not been borne out because there is no documented
> > requirement for it that we can point people to.
> > Clearly containers that use NFS are not currently all configured well to do
> > this. Some change is needed. Maybe adding a unique host name is the
> > easiest change ... or maybe not.
>
> You seem to be documenting the client's current behavior.
> The tone of the documentation is that this behavior is fine
> and works for most people.

It certainly works for a lot of people. Many people are using NFSv4
quite effectively. I'm sure there are people who are having problems
too, but let's not fall for the squeaky wheel fallacy.

>
> It's the second part that I disagree with. Oracle Linux has
> bugs documenting this behavior is a problem, and I'm sure
> Red Hat does too. The current behavior is broken. It is this
> brokeness that we are trying to resolve.

The current behaviour of NFS is NOT broken. Maybe it is not adequately
robust against certain configuration choices. Certainly we should make
it as robust as we reasonably can. But let's not overstate the problem.

>
> So let me make a stronger statement: we should not
> document that broken behavior in nfs(5). Instead, we should
> fix that behavior, and then document the golden brown and
> delicious behavior. Updating nfs(5) first is putting
> Descartes in front of de horse.
>
>
> > Surely NFS is not the *only* service that uses the host name.
> > Encouraging the use of unique host names might benefit others.
>
> Unless you have specific use cases that might benefit from
> ensuring hostname uniqueness, I would beg that you stay
> focused on the immediate issue of how the Linux client
> constructs its nfs_client_id4 strings.
>
>
> > The practical reality is that a great many NFS client installations do
> > currently depend on unique host names - after all, it actually works.
> > Is it really so unreasonable to try to encourage the exceptions to fit
> > the common pattern better?
>
> Yes it is unreasonable.
>
> NFS servers typically have a fixed DNS presence. They have
> to because clients mount by hostname.
>
> NFS clients, on the other hand, are not under that constraint.
> The only time I can think of where a client has to have a
> fixed hostname is if a krb5 host principal is involved.
>
> In so many other cases, eg. mobile computing or elastic
> services, the client hostname is mutable. I don't think
> it's fair to put another constraint on host naming here,
> especially one with implications of service denial or
> data corruption (see below).
>
>
> >> Maybe I'm just stating this to understand the purpose of this
> >> patch, but it could also be used as an "Intended audience"
> >> disclaimer in this new section.
> >
> > OK, so the "purpose of this patch" relates in part to a comment you made
> > earlier, which I include here:
> >
> >> Since it is just a line or two of code, it might be of little
> >> harm just to go with separate implementations for now and stop
> >> talking about it. If it sucks, we can fix the suckage.
> >>
> >> Who volunteers to implement this mechanism in mount.nfs ?
> >
> > I don't think this is the best next step. I think we need to get some
> > container system developer to contribute here. So far we only have
> > second hand anecdotes about problems. I think the most concrete is from
> > Ben suggesting that in at least one container system, using
> > /etc/machine-id is a good idea.
> >
> > I don't think we can change nfs-utils (whether mount.nfs or mount.conf
> > or some other way) to set identity from /etc/machine-id for everyone.
> > So we need at least for that container system to request that change.
> >
> > How would they like to do that?
> >
> > I suggest that we explain the problem to representatives of the various
> > container communities that we have contact with (Well... "you", more
> > than "we" as I don't have contacts).
>
> I'm all for involving one or more container experts. But IMO
> it's not appropriate to update our man page to do that. Let's
> update nfs(5) when we are done with this effort.

Don't let perfect be the enemy of good.
We were making no progress with "fixing" nfs. Documenting "how it works
today" should never be a bad thing. Obviously we can (and must) update
the documentation when we update the behaviour.

But if some concrete behavioural changes can be agreed and implemented
through this discussion, I'm happy for the documentation to land only
after those changes.

>
>
> > We could use the documentation I provided to clearly present the
> > problem.
>
> No doubt, we need a crisp problem statement!
>
>
> > Then ask:
> > - would you like to just run some shell code (see examples)
> > - or would you like to provide an /etc/nfs.conf.d/my-container.conf
> > - or would you like to run a tool that we provide
> > - or is there already a push to provide unique container hostnames,
> > and is this the incentive you need to help that push across the
> > line?
> >
> > If we have someone from $CONTAINER_COMMUNITY say "if you do this thing,
> > then we will use it", then that would be hard to argue with.
> > If we could get two or three different communities to comment, I expect
> > the best answer would become a lot more obvious.
> >
> > But first we, ourselves, need to agree on the document :-)
>
> If the community is seeking help, then a wiki might be a better
> place to formulate a problem statement.
>
>
> >>> +.PP
> >>> +Some situations which are known to be problematic with respect to unique
> >>> +host names include:
> >>
> >> A little wordy.
> >>
> >> Suggest: "Situations known to be problematic with respect to unique
> >> hostnames include:"
> >
> > Yep.
> >
> >>
> >> If this will eventually become part of nfs(5), I would first run
> >> this patch by documentation experts, because they might have a
> >> preference for "hostnames" over "host names" and "namespaces" over
> >> "name-spaces". Usage of these terms throughout this patch is not
> >> consistent.
> >
> > I've made it consistently "hostname" and "namespace", which is
> > consistent with the rest of the document.
> >
> >>
> >>
> >>> +.IP \- 2
> >>> +NFS-root (diskless) clients, where the DHCP server (or equivalent) does
> >>> +not provide a unique host name.
> >>
> >> Suggest this addition:
> >>
> >> .IP \- 2
> >>
> >> Dynamically-assigned hostnames, where the hostname can be changed after
> >> a client reboot, while the client is booted, or if a client often
> >> repeatedly connects to multiple networks (for example if it is moved
> >> from home to an office every day).
> >
> > This is a different kettle of fish. The hostname is *always* included
> > in the identifier. If it isn't stable, then the identifier isn't
> > stable.
> >
> > I saw in the history that when you introduced the module parameter it
> > replaced the hostname. This caused problems in containers (which had
> > different host names) so Trond changed it so the module parameter
> > supplemented the hostname.
> >
> > If hostnames are really so poorly behaved I can see there might be a
> > case to suppress the hostname, but we don't have that option in current
> > kernels. Should we add it?
>
> I claim that it has become problematic to use the hostname in the
> nfs_client_id4 string.

In that case, we should fix it - make it possible to exclude the
hostname from the nfs_client_id4 string. You make a convincing case.
Have you thoughts on how we should implement that?

Add a new bool sysfs attribute, identity_includes_hostname, which
defaults to true (the current behaviour) but can be set to
false? Or should it transparently be set to false when the "identity"
is set?
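
Purely to make the first option concrete, a sketch (the attribute is
only being proposed here; neither line works on current kernels):

    # Hypothetical: exclude the hostname from nfs_client_id4, then
    # supply the identity explicitly.
    echo 0 > /sys/fs/nfs/client/net/identity_includes_hostname
    uuidgen > /sys/fs/nfs/client/net/identifier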

>
> 25 years ago when NFSv4.0 was being crafted, it was assumed that
> client hostnames were unchanging. The original RFC 3010 recommended
> adding the hostname, the client IP address, and the server IP
> address to the nfs_client_id4 string.
>
> Since then, we've learned that the IP addresses are quite mutable,
> and thus not appropriate for a fixed identifier. I argue that the
> client's hostname now falls into the same category.
>
> The Linux NFSv4 prototype and subsequent production code used the
> local hostname because it's easy to access in the kernel via the
> UTS name. That was adequate 20 years ago, but has become less so
> over time. You can view this evolution in the commit log.
>
> It doesn't seem that complicated (to me) to divorce the client_id4
> string from the local hostname, and the benefits are significant.
>
>
> >>> +.IP \- 2
> >>> +"containers" within a single Linux host. If each container has a separate
> >>> +network namespace, but does not use the UTS namespace to provide a unique
> >>> +host name, then there can be multiple effective NFS clients with the
> >>> +same host name.
> >>> +.IP \= 2
> >>
> >> .IP \- 2
> >
> > Thanks.
> >
> >>
> >>
> >>> +Clients across multiple administrative domains that access a common NFS
> >>> +server. If assignment of host name is devolved to separate domains,
> >>
> >> I don't recognize the phrase "assignment is devolved to separate domains".
> >> Can you choose a friendlier way of saying this?
> >>
> >
> > If hostnames are not assigned centrally then uniqueness cannot be
> > guaranteed unless a domain name is included in the hostname.
>
> Better, thanks.
>
>
> >>> +.RS
> >>> +.PP
> >>> +This value is empty on name-space creation.
> >>> +If the value is to be set, that should be done before the first
> >>> +mount. If the container system has access to some sort of per-container
> >>> +identity then that identity, possibly obfuscated as a UUID if privacy is
> >>> +needed, can be used. Combining the identity with the name of the
> >>> +container systems would also help.
> >>
> >> I object to recommending obfuscation via a UUID.
> >>
> >> 1. This is confusing because there has been no mention of any
> >> persistence requirement so far. At this point, a reader
> >> might think that the client can simply convert the hostname
> >> and netns identifier every time it boots. However this is
> >> only OK to do if these things are guaranteed not to change
> >> during the lifetime of a client. In a world where a majority
> >> of systems get their hostnames dynamically, I think this is
> >> a shaky foundation.
> >
> > If the hostname changes after boot (weird concept .. does that really
> > happen?) that is irrelevant.
>
> It really happens. A DHCP lease renewal can do it. Moving to a
> new subnet on the same campus might do it. I can open "Device
> Settings" on my laptop and change my laptop's hostname on a
> whim. Joining a VPN might do it.
>
> A client might have multiple network interfaces, each with a
> unique hostname. Which one should be used for the nfs_client_id4
> string? RFCs 7931 and 8587 discuss how trunking needs to work:
> the upshot is that the client needs to have one consistent
> nfs_client_id4 string it presents to all servers (in case of
> migration) no matter which network path it uses to access the
> server.
>
>
> > The hostname is copied at boot by NFS, and
> > if it is included in the /sys/fs/nfs/client/identifier (which would be
> > pointless, but not harmful) it has again been copied.
> >
> > If it is different on subsequent boots, then that is a big problem and
> > not one that we can currently fix.
>
> Yes, we can fix it: don't use the client's hostname but
> instead use a separate persistent uniquifier, as has been
> proposed.
>
>
> > ....except that non-persistent client identifiers isn't an enormous
> > problem, just a possible cause of delays.
>
> I disagree, it's a significant issue.
>
> - If locks are lost, that is a potential source of data corruption.
>
> - If a lease is stolen, that is a denial of service.
>
> Our customers take this very seriously.

Of course, as they should. Data integrity is paramount. A
non-persistent client identifier doesn't put that at risk - not in and
of itself.

If a client's identifier changed during the lifetime of one instance of
the client, then that would allow locks to be lost. That does NOT
happen just because you happen to change the hostname. The hostname is
copied at first use.
It *could* happen if you changed the module parameter or sysfs identity
after the first mount, but I hope we can agree that that is not a
justifiable action.

A lease can only be "stolen" by a non-unique identifier, not simply by
non-persistent identifiers. But maybe this needs a caveat.

If a set of clients are each given host names from time to time which
are, at any moment in time, unique, but are able to "migrate" from one
client to another, then it would be possible for two clients to both
have performed their first NFS mount when they have some common
hostname X. The "first" was given hostname X at boot time, and it
mounted something. The hostname was subsequently changed to Y, and some
other host booted, got X, and then mounted from the same server. This
would be seriously problematic. I class this as "non-unique" hostnames,
not as a non-persistent identifier.

> The NFS client's
> out-of-the-shrink-wrap default behavior/configuration should be
> conservative enough to prevent these issues. Customers store
> mission critical data via NFS. Most customers expect NFS to work
> reliably without a lot of configuration fuss.

I've been working on the assumption that it is not possible to provide
ideal zero-config behaviour "out-of-the-shrink-wrap". You have hinted
(or more) a few times that this is your goal. Certainly a worthy goal if
possible. Is it possible?

I contend that if there is no common standard for how containers (and
network namespaces in particular) are used, then it is simply not
possible to provide perfect out-of-the-box behaviour. There *must* be
some local configuration that we cannot enforce through the kernel or
through nfs-utils. We can offer, but we cannot enforce. So we must
document.

The very best that we could do would be to provide a random component to
the identifier unless we had a high level of confidence that a unique
identifier had been provided some other way. I don't know how to get
that high level of confidence in a way that doesn't break working
configurations.
Ben suggested defaulting 'identity' to a random string for any network
namespace other than init. I don't think that is cautious enough.
Maybe if we did it when the network namespace is not init, but the UTS
namespace is init. But that feels like a hack and is probably brittle.

Can you suggest *any* way to improve the "out-of-shrink-wrap" behaviour
significantly?

>
>
> >> 2. There's no requirement that this uniquifier be in the form
> >> of a UUID anywhere in specifications, and the Linux client
> >> itself does not add such a requirement. (You suggested
> >> before that we should start by writing down requirements.
> >> Using a UUID ain't a requirement).
> >
> > The requirement here is that /etc/machine-id is documented as requiring
> > obfuscation. uuidgen is a convenient way to provide obfuscation. That
> > is all I was trying to say.
>
> Understood, but the words you used have some additional
> implications that you might not want.
>
>
> >> Linux chooses to implement its uniquifier with a UUID because
> >> it is assumed we are using a random UUID (rather than a
> >> name-based or time-based UUID). A random UUID has strong
> >> global uniqueness guarantees, which guarantees the client
> >> identifier will always be unique amongst clients in nearly
> >> all situations for nearly no cost.
> >>
> >
> > "Linux chooses" what does that mean? I've lost the thread here, sorry.
>
> Try instead: "The documentation regarding the nfs4_unique_id
> module parameter suggests the use of a UUID because..."

Ahhhh... that makes sense now - thanks.
That documentation needs to be updated. It still says "used instead of
a system's node name" while the code currently implements "used together
with ..."
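
For reference, the current wording can be inspected with modinfo:

    # Show the nfs module's parameter descriptions, including
    # nfs4_unique_id.
    modinfo -p nfs | grep nfs4_unique_id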

>
>
> >> If we want to create a good uniquifier here, then combine the
> >> hostname, netns identity, and/or the host's machine-id and then
> >> hash that blob with a known strong digest algorithm like
> >> SHA-256. A man page must not recommend the use of deprecated or
> >> insecure obfuscation mechanisms.
> >
> > I didn't realize the hash that uuidgen uses was deprecated. Is there
> > some better way to provide an app-specific obfuscation of a string from
> > the command line?
> >
> > Maybe
> > echo nfs-id:`cat /etc/machine-id`| sha256sum
> >
> > ??
>
> Something like that, yes. But the scriptlet needs to also
> involve the netns identity somehow.

Hmmm.. the impression I got from Ben was that the container system
ensured that /etc/machine-id was different in different containers. So
there would be no need to add anything. Of course I should make that
explicit in the documentation.

It would be nice if we could always use "ip netns identify", but that
doesn't seem to be generally supported.
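
One widely available fallback might be the namespace inode, though that
is only an assumption about what is acceptable, and the inode is not
stable across reboots (so it helps with uniqueness, not stability):

    # Sketch: a per-netns token when "ip netns identify" is missing;
    # prints something like net:[4026531840].
    readlink /proc/self/ns/net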

>
>
> >> The man page can suggest a random-based UUID as long as it
> >> states plainly that such UUIDs have global uniqueness guarantees
> >> that make them suitable for this purpose. We're using a UUID
> >> for its global uniqueness properties, not because of its
> >> appearance.
> >
> > So I could use "/etc/nfsv4-identity" instead of "/etc/nfs4-uuid".
>
> I like. I would prefer not using "uuid" in the name. Ben and
> Steve were resistant to that idea, though.
>
>
> > What else should I change/add?
> >
> >>
> >>
> >>> For example:
> >>> +.RS 4
> >>> +echo "ip-netns:`ip netns identify`" \\
> >>> +.br
> >>> + > /sys/fs/nfs/client/net/identifier
> >>> +.br
> >>> +uuidgen --sha1 --namespace @url \\
> >>> +.br
> >>> + -N "nfs:`cat /etc/machine-id`" \\
> >>> +.br
> >>> + > /sys/fs/nfs/client/net/identifier
> >>> +.RE
> >>> +If the container system provides no stable name,
> >>> +but does have stable storage,
> >>
> >> Here's the first mention of "stable". It needs some
> >> introduction far above.
> >
> > True. So the first para becomes:
> >
> > NFSv4 requires that the client present a stable unique identifier to
> > the server to be used to track state such as file locks. By default
> > Linux NFS uses the hostname, as configured at the time of the first
> > NFS mount, together with some fixed content such as the name "Linux
> > NFS" and the particular protocol version. When the hostname is
> > guaranteed to be unique among all clients that access the same server,
> > and stable across reboots, this is sufficient. If hostname uniqueness
> > cannot be assumed, extra identity information must be provided. If
> > the hostname is not stable, unclean restarts may suffer unavoidable
> > delays.
>
> See above. The impact is more extensive than "unavoidable delays."
>
>
> >>> then something like
> >>> +.RS 4
> >>> +[ -s /etc/nfsv4-uuid ] || uuidgen > /etc/nfsv4-uuid &&
> >>> +.br
> >>> +cat /etc/nfsv4-uuid > /sys/fs/nfs/client/net/identifier
> >>> +.RE
> >>> +would suffice.
> >>> +.PP
> >>> +If a container has neither a stable name nor stable (local) storage,
> >>> +then it is not possible to provide a stable identifier, so providing
> >>> +a random identifier to ensure uniqueness would be best
> >>> +.RS 4
> >>> +uuidgen > /sys/fs/nfs/client/net/identifier
> >>> +.RE
> >>> +.RE
> >>> +.SS Consequences of poor identity setting
> >>
> >> This section provides context to understand the above technical
> >> recommendations. I suggest this whole section should be moved
> >> to near the opening paragraph.
> >
> > I seem to keep moving things upwards.... something has to come last.
> > Maybe a "(See below)" at the end of the revised first para?
> >
> >>
> >>
> >>> +Any two concurrent clients that might access the same server must have
> >>> +different identifiers for correct operation, and any two consecutive
> >>> +instances of the same client should have the same identifier for optimal
> >>> +crash recovery.
> >>
> >> Also recovery from network partitions.
> >
> > A network partition doesn't coincide with two consecutive instances of the
> > same client. There is just one client instance and one server instance.
>
> It's possible for one of the peers to reboot during the network
> partition.

True, but is that interesting?
There are situations where the client will lose locks no matter what it
does with its identity. These don't have any impact on choices of what
you use for identity. There are also situations where the client won't
lose locks. These are equally irrelevant.

The only relevant situation (with respect to identifier stability) is
when the server reboots, and the client is able to contact the server
during the grace period. If it doesn't use the same identity as it used
before, it can then lose locks.


>
>
> >>> +.PP
> >>> +If two different clients present the same identity to a server there are
> >>> +two possible scenarios. If the clients use the same credential then the
> >>> +server will treat them as the same client which appears to be restarting
> >>> +frequently. One client may manage to open some files etc, but as soon
> >>> +as the other client does anything the first client will lose access and
> >>> +need to re-open everything.
> >>
> >> This seems fuzzy.
> >>
> >> 1. If locks are lost, then there is a substantial risk of data
> >> corruption.
> >>
> >> 2. Is the client itself supposed to re-open files, or are
> >> applications somehow notified that they need to re-open?
> >> Either of these scenarios is fraught -- I don't believe any
> >> application is coded to expect to have to re-open a file
> >> due to exigent circumstances.
> >
> > I wasn't very happy with the description either. I think we want some
> > detail, but not too much.
> >
> > The "re-opening" that I mentioned is the NFS client resubmitting NFS
> > OPEN requests, not the application having to re-open.
> > However if the application manages to get a lock, then when the "other"
> > client connects to the server the application will lose the lock, and
> > all read/write accesses on the relevant fd will result in EIO (I
> > think). Clearly bad.
> >
> > I wanted to say the clients could end up "fighting" with each other -
> > the EXCHANGE_ID from one destroys the state set up by the other - but
> > that seems to be too much anthropomorphism.
> >
> > If two different clients present the same identity to a server there
> > are two possible scenarios. If the clients use the same credential
> > then the server will treat them as the same client which appears to
> > be restarting frequently. The clients will each enter a loop where
> > they establish state with the server and then find that the state
> > has been destroyed by the other client and so will need to establish
> > it again.
> >
> > ???
>
> My colleague Calum coined the term "lease stealing". That might be
> a good thing to define somewhere and simply use that term as needed.
>

.PP
If two different clients present the same identity to a server there are
two possible scenarios. If the clients do not use cryptographic
credentials, or use the same credential, then the server will treat them
as the same client which appears to be restarting frequently. Each
client will effectively "steal" the lease established by the other and
neither will make useful progress.
.PP
If the clients use different cryptographic credentials, then the second
client to establish a connection to the server will be refused access
which is a safer failure mode.
.PP
Cryptographic credentials used to authenticate lease operations will be
the host principal from
.I /etc/krb5.keytab
or in some cases, the lone user principal. These securely prevent lease
stealing.

>
> >>> +.PP
> >>> +If the clients use different credentials, then the second client to
> >>> +establish a connection to the server will be refused access. For
> >>> +.B auth=sys
> >>> +the credential is based on hostname, so will be the same if the
> >>> +identities are the same. With
> >>> +.B auth=krb
> >>> +the credential is stored in
> >>> +.I /etc/krb5.keytab
> >>> +and will be the same only if this is copied among hosts.
> >>
> >> This language implies that copying the keytab is a recommended thing
> >> to do. It's not. I mentioned it before because some customers think
> >> it's OK to use the same keytab across their client fleet. But obviously
> >> that will result in lost open and lock state.
> >>
> >> I suggest rephrasing this last sentence to describe the negative lease
> >> recovery consequence of two clients happening to share the same host
> >> principal -- as in "This is why you shouldn't share keytabs..."
> >>
> >
> > How about
> >
> > .PP
> > If the clients use different credentials, then the second client to
> > establish a connection to the server will be refused access which is a
> > safer failure mode. For
> > .B auth=sys
> > the credential is based on hostname, so will be the same if the
> > identities are the same. With
> > .B auth=krb
> > the credential is stored in
> > .I /etc/krb5.keytab
> > so provided this isn't copied among clients, the safer failure mode will result.
>
> With
> .BR auth=krb5 ,
> the client uses the host principal in
> .I /etc/krb5.keytab
> or in some cases, the lone user principal,
> to authenticate lease management operations.
> This securely prevents lease stealing.
>
>
>
> > ??
> >
> > Thanks for your detailed review!
> >
> > NeilBrown
> >
> >>
> >>> +.PP
> >>> +If the identity is unique but not stable, for example if it is generated
> >>> +randomly on each start up of the NFS client, then crash recovery is
> >>> +affected. When a client shuts down uncleanly and restarts, the server
> >>> +will normally detect this because the same identity is presented with
> >>> +a different boot time (or "incarnation verifier"), and will discard old
> >>> +state. If the client presents a different identifier, then the server
> >>> +cannot discard old state until the lease time has expired, and the new
> >>> +client may be delayed in opening or locking files that it was
> >>> +previously accessing.
> >>> .SH FILES
> >>> .TP 1.5i
> >>> .I /etc/fstab
> >>> --
> >>> 2.35.1
> >>>
> >>
> >> --
> >> Chuck Lever
>
> --
> Chuck Lever
>

NeilBrown

2022-03-18 19:25:42

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH v2] nfs.man: document requirements for NFSv4 identity



> On Mar 17, 2022, at 10:00 PM, NeilBrown <[email protected]> wrote:
>
> On Thu, 17 Mar 2022, Chuck Lever III wrote:
>> Howdy Neil-
>
> G'day
>
>>>> The last sentence is made ambiguous by the use of passive voice.
>>>>
>>>> Suggest: "When hostname uniqueness cannot be guaranteed, the client
>>>> administrator must provide extra identity information."
>>>
>>> Why must the client administrator do this? Why can't some automated
>>> tool do this? Or some container-building environment.
>>> That's an advantage of the passive voice, you don't need to assign
>>> responsibility for the verb.
>>
>> My point is that in order to provide the needed information,
>> elevated privilege is required. The current sentence reads as
>> if J. Random User could be interrupted at some point and asked
>> for help.
>>
>> In other words, the documentation should state that this is
>> an administrative task. Here I'm not advocating for a specific
>> mechanism to actually perform that task.
>
> ??? This whole man page is primarily about mount options, particularly
> as they appear in /etc/fstab. These are not available to the non-admin.
> Why would anyone think this section is any different?

Because the nfs_client_id4 uniquifier is not a mount option and
isn't mentioned anywhere else. It's not going to be familiar to
some. As you and I know, most people are not careful readers.

Do note that nfs(5) is really just an extension of mount(8).
The sections you pointed to earlier (eg, DATA AND METADATA
COHERENCE) are there to provide context explaining how to use
NFS mount options. The patch you have proposed is for an API
and protocol element that have nothing to do with NFS mount
options. That by itself disqualifies a proposed addition to
nfs(5).

I suggest instead constructing an independent man page that
is attached to the /etc file that contains the client ID
uniquifier. Something akin to machine-id(5) ?


>>>> I have a problem with basing our default uniqueness guarantee on
>>>> hostnames "most of the time" hoping it will all work out. There
>>>> are simply too many common cases where hostname stability can't be
>>>> relied upon. Our sustaining teams will happily tell us this hope
>>>> hasn't so far been borne out.
>>>
>>> Maybe it has not been borne out because there is no documented
>>> requirement for it that we can point people to.
>>> Clearly containers that use NFS are not currently all configured well to do
>>> this. Some change is needed. Maybe adding a unique host name is the
>>> easiest change ... or maybe not.
>>
>> You seem to be documenting the client's current behavior.
>> The tone of the documentation is that this behavior is fine
>> and works for most people.
>
> It certainly works for a lot of people. Many people are using NFSv4
> quite effectively. I'm sure there are people who are having problems
> too, but let's not fall for the squeaky wheel fallacy.

For some folks it fails silently and/or requires round trips
with their distributor's call center. I would like not to
discount their experience.


>> It's the second part that I disagree with. Oracle Linux has
>> bugs documenting this behavior is a problem, and I'm sure
>> Red Hat does too. The current behavior is broken. It is this
>> brokeness that we are trying to resolve.
>
> The current behaviour of NFS is NOT broken. Maybe it is not adequately
> robust against certain configuration choices. Certainly we should make
> it as robust as we reasonably can. But let's not overstate the problem.

Years of bug reports suggest I'm not overstating anything.

The plan, for a while now, has been to supplement the use of
the hostname to address this very situation. You are now
suggesting there is nothing to address, which I find difficult
to swallow.


>> So let me make a stronger statement: we should not
>> document that broken behavior in nfs(5). Instead, we should
>> fix that behavior, and then document the golden brown and
>> delicious behavior. Updating nfs(5) first is putting
>> Descartes in front of de horse.
>>
>>
>>> Surely NFS is not the *only* service that uses the host name.
>>> Encouraging the use of unique host names might benefit others.
>>
>> Unless you have specific use cases that might benefit from
>> ensuring hostname uniqueness, I would beg that you stay
>> focused on the immediate issue of how the Linux client
>> constructs its nfs_client_id4 strings.
>>
>>
>>> The practical reality is that a great many NFS client installations do
>>> currently depend on unique host names - after all, it actually works.
>>> Is it really so unreasonable to try to encourage the exceptions to fit
>>> the common pattern better?
>>
>> Yes it is unreasonable.
>>
>> NFS servers typically have a fixed DNS presence. They have
>> to because clients mount by hostname.
>>
>> NFS clients, on the other hand, are not under that constraint.
>> The only time I can think of where a client has to have a
>> fixed hostname is if a krb5 host principal is involved.
>>
>> In so many other cases, eg. mobile computing or elastic
>> services, the client hostname is mutable. I don't think
>> it's fair to put another constraint on host naming here,
>> especially one with implications of service denial or
>> data corruption (see below).
>>
>>
>>>> Maybe I'm just stating this to understand the purpose of this
>>>> patch, but it could also be used as an "Intended audience"
>>>> disclaimer in this new section.
>>>
>>> OK, so the "purpose of this patch" relates in part to a comment you made
>>> earlier, which I include here:
>>>
>>>> Since it is just a line or two of code, it might be of little
>>>> harm just to go with separate implementations for now and stop
>>>> talking about it. If it sucks, we can fix the suckage.
>>>>
>>>> Who volunteers to implement this mechanism in mount.nfs ?
>>>
>>> I don't think this is the best next step. I think we need to get some
>>> container system developer to contribute here. So far we only have
>>> second hand anecdotes about problems. I think the most concrete is from
>>> Ben suggesting that in at least one container system, using
>>> /etc/machine-id is a good idea.
>>>
>>> I don't think we can change nfs-utils (whether mount.nfs or mount.conf
>>> or some other way) to set identity from /etc/machine-id for everyone.
>>> So we need at least for that container system to request that change.
>>>
>>> How would they like to do that?
>>>
>>> I suggest that we explain the problem to representatives of the various
>>> container communities that we have contact with (Well... "you", more
>>> than "we" as I don't have contacts).
>>
>> I'm all for involving one or more container experts. But IMO
>> it's not appropriate to update our man page to do that. Let's
>> update nfs(5) when we are done with this effort.
>
> Don't let perfect be the enemy of good.
> We were making no progress with "fixing" nfs. Documenting "how it works
> today" should never be a bad thing.

To be clear, I don't have a problem with documenting the current
behavior /somewhere else/. I do have a problem documenting it in
nfs(5) as a situation that is fine, given its known shortcomings
and the fact that it will be updated in short order.


> Obviously we can (and must) update
> the documentation when we update the behaviour.
>
> But if some concrete behavioural changes can be agreed and implemented
> through this discussion, I'm happy for the documentation to land only
> after those changes.
>
>>>>> +.IP \- 2
>>>>> +NFS-root (diskless) clients, where the DHCP server (or equivalent) does
>>>>> +not provide a unique host name.
>>>>
>>>> Suggest this addition:
>>>>
>>>> .IP \- 2
>>>>
>>>> Dynamically-assigned hostnames, where the hostname can be changed after
>>>> a client reboot, while the client is booted, or if a client often
>>>> repeatedly connects to multiple networks (for example if it is moved
>>>> from home to an office every day).
>>>
>>> This is a different kettle of fish. The hostname is *always* included
>>> in the identifier. If it isn't stable, then the identifier isn't
>>> stable.
>>>
>>> I saw in the history that when you introduced the module parameter it
>>> replaced the hostname. This caused problems in containers (which had
>>> different host names) so Trond changed it so the module parameter
>>> supplemented the hostname.
>>>
>>> If hostnames are really so poorly behaved I can see there might be a
>>> case to suppress the hostname, but we don't have that option in current
>>> kernels. Should we add it?
>>
>> I claim that it has become problematic to use the hostname in the
>> nfs_client_id4 string.
>
> In that case, we should fix it - make it possible to exclude the
> hostname from the nfs_client_id4 string. You make a convincing case.
> Have you thoughts on how we should implement that?

This functionality has been implemented for some time using either
sysfs or a module parameter. Those APIs supplement the hostname
with whatever string is provided. I don't think we need to
exclude the hostname from the nfs_client_id4 -- in fact some folks
might prefer keeping the hostname in there as an eye-catcher. But
it's simply that the hostname by itself does not provide enough
uniqueness.
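
To illustrate, both existing APIs accept an arbitrary string (the value
here is only an example):

    # Per-netns interface, available since Linux 5.3:
    echo "container-1234" > /sys/fs/nfs/client/net/identifier
    # Module parameter, supplied at load time:
    modprobe nfs nfs4_unique_id=container-1234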

The plan for some time now has been to construct user space mechanisms
to use the sysfs/module parameter APIs to always plug in a uniquifier.
That relieves the hostname uniqueness dependencies as long as those
mechanisms are used as often as possible.

So in other words, today the default is to use the hostname; using
the random uniquifier is an exception. The plan is to make the random
uniquifier the default, and fall back on the hostname if for some
reason the uniquifier initialization mechanism did not work.
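
A sketch of what that default could look like at boot, reusing the
/etc/nfsv4-identity name discussed earlier (the logic is a proposal,
not current code):

    # Sketch: push a persistent uniquifier if one exists; otherwise
    # the kernel keeps its hostname-based behaviour as the fallback.
    if [ -s /etc/nfsv4-identity ]; then
        cat /etc/nfsv4-identity > /sys/fs/nfs/client/net/identifier
    fi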


>>> The hostname is copied at boot by NFS, and
>>> if it is included in the /sys/fs/nfs/client/identifier (which would be
>>> pointless, but not harmful) it has again been copied.
>>>
>>> If it is different on subsequent boots, then that is a big problem and
>>> not one that we can currently fix.
>>
>> Yes, we can fix it: don't use the client's hostname but
>> instead use a separate persistent uniquifier, as has been
>> proposed.
>>
>>
>>> ....except that non-persistent client identifiers isn't an enormous
>>> problem, just a possible cause of delays.
>>
>> I disagree, it's a significant issue.
>>
>> - If locks are lost, that is a potential source of data corruption.
>>
>> - If a lease is stolen, that is a denial of service.
>>
>> Our customers take this very seriously.
>
> Of course, as they should. Data integrity is paramount. A
> non-persistent client identifier doesn't put that at risk - not in and
> of itself.
>
> If a client's identifier changed during the lifetime of one instance of
> the client, then that would allow locks to be lost. That does NOT
> happen just because you happen to change the hostname. The hostname is
> copied at first use.
> It *could* happen if you changed the module parameter or sysfs identity
> after the first mount, but I hope we can agree that that is not a
> justifiable action.
>
> A lease can only be "stolen" by a non-unique identifier, not simply by
> non-persistent identifiers. But maybe this needs a caveat.

In this thread, I refer mostly to issues caused by
nfs_client_id4 non-uniqueness.

This is indeed the class of misbehavior that is significant
to our customer base. Multiple clients might use
"localhost.localdomain" simply because that's the way the
imaging template is built. Or when an image is copied to
create a new guest, the hostname is not changed. Those are
but two examples. In many cases, client administrators
are simply not in control of their hostnames.

In cloud deployments, AUTH_SYS is the norm because managing a
large Kerberos realm is generally onerous. Thus AUTH_SYS plus
a hostname-uniquified nfs_client_id4 is by far the common
case, though it is the most risky one.


> If a set of clients are each given host names from time to time which
> are, at any moment in time, unique, but are able to "migrate" from one
> client to another, then it would be possible for two clients to both
> have performed their first NFS mount when they have some common
> hostname X. The "first" was given hostname X at boot time, and it
> mounted something. The hostname was subsequently changed to Y, and some
> other host booted, got X, and then mounted from the same server. This
> would be seriously problematic. I class this as "non-unique" hostnames,
> not as a non-persistent identifier.
>
>> The NFS client's
>> out-of-the-shrink-wrap default behavior/configuration should be
>> conservative enough to prevent these issues. Customers store
>> mission critical data via NFS. Most customers expect NFS to work
>> reliably without a lot of configuration fuss.
>
> I've been working on the assumption that it is not possible to provide
> ideal zero-config behaviour "out-of-the-shrink-wrap". You have hinted
> (or more) a few times that this is your goal. Certainly a worthy goal if
> possible. Is it possible?
>
> I contend that if there is no common standard for how containers (and
> network namespaces in particular) are used, then it is simply not
> possible to provide perfect out-of-the-box behaviour. There *must* be
> some local configuration that we cannot enforce through the kernel or
> through nfs-utils. We can offer, but we cannot enforce. So we must
> document.
>
> The very best that we could do would be to provide a random component to
> the identifier unless we had a high level of confidence that a unique
> identifier had been provided some other way. I don't know how to get
> that high level of confidence in a way that doesn't break working
> configurations.
> Ben suggested defaulting 'identity' to a random string for any network
> namespace other than init. I don't think that is cautious enough.
> Maybe if we did it when the network namespace is not init, but the UTS
> namespace is init. But that feels like a hack and is probably brittle.
>
> Can you suggest *any* way to improve the "out-of-shrink-wrap" behaviour
> significantly?

Well it sounds like we agree that making the random uniquifier
the default is a good step forward. Because this has been
contentious so far, I think we should strive for something that
is a best effort but clearly a step up. The fallback can use
the hostname. Over time the remaining gaps can be closed.

Here are some suggestions that might make it simpler to implement.

1. Ben's tool manufactures the uniquifier if the file doesn't
already exist. That seems somewhat racy. Instead, why not
make installation utilities responsible for creating the
uniquifier? We need some guarantee that when a VM is cloned,
the uniquifier is replaced, for instance; that's well
outside nfs-utils' sphere of influence.

Document the requirements (a la machine-id(5)) then point
the distributors and Docker folks at that. I think that is
your plan, right? I've done the same with at least one of
Oracle's virtualization products, while waiting for a more
general upstream solution.

Then, either each mount.nfs invocation or some part of
system start-up checks for the uniquifier file and pushes
the uniquifier into the local net namespace. (Doing this
only once at boot has its appeal). If the uniquifier file
does not exist, then the NFS client continues to use a
hostname uniquifier. Over time we find and address the
fallback cases.
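
   The installer-side half of that arrangement might be as small as
   this (file name and hook point are assumptions):

       # Sketch: run at install time and again whenever an image is
       # cloned, so every instance gets a fresh uniquifier.
       uuidgen > /etc/nfsv4-identity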


2. The udev rule mechanism that Trond proposed attempted to
address both init_ns and subsequent namespaces the same way.
Maybe it's time to examine the assumptions there to help
us make more progress.

Use independent mechanisms for the init_ns and for subsequent
net namespaces. Perhaps Ben already suggested this. Looking
back over weeks of this conversation, these two use cases
seem fundamentally different from each other. The init_ns
has to handle NFSROOT, can use the boot command line or the
module parameter to deal with PXE booting and so on. The
Docker case can use whatever works better for them.


3. We don't yet have a way to guarantee that the uniquifier is
in place before the first NFS mount is initiated. Talking
with someone who has deep systemd expertise might help. It
might also help at least in the non-container case if the
uniquifier is provided on the kernel command line, the same
way that root= is specified.
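
   Since module parameters can be given on the kernel command line,
   an NFSROOT client might be handled like this (the UUID is an
   example value):

       # Sketch: boot-time fragment combining NFS root with a fixed
       # client uniquifier:
       #   root=/dev/nfs nfs.nfs4_unique_id=0f6ed503-47aa-4c37-9fca-2e9f1f0f01c2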


4. An alternative for the init_ns case might be to add a
mechanism to initramfs to set the client's uniquifier.
On my clients where containers are not in use, I set the
uniquifier using the module parameter; the module load
config file needs to be added to initramfs before it
takes effect.
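
   A sketch of that approach on a dracut-based system (other
   distributions rebuild the initramfs differently):

       # Sketch: persist the parameter in modprobe.d, then rebuild
       # the initramfs so it takes effect before the first mount.
       echo "options nfs nfs4_unique_id=$(uuidgen)" \
           > /etc/modprobe.d/nfs4-id.conf
       dracut -f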


>>>> If we want to create a good uniquifier here, then combine the
>>>> hostname, netns identity, and/or the host's machine-id and then
>>>> hash that blob with a known strong digest algorithm like
>>>> SHA-256. A man page must not recommend the use of deprecated or
>>>> insecure obfuscation mechanisms.
>>>
>>> I didn't realize the hash that uuidgen uses was deprecated. Is there
>>> some better way to provide an app-specific obfuscation of a string from
>>> the command line?
>>>
>>> Maybe
>>> echo nfs-id:`cat /etc/machine-id`| sha256sum
>>>
>>> ??
>>
>> Something like that, yes. But the scriptlet needs to also
>> involve the netns identity somehow.
>
> Hmmm.. the impression I got from Ben was that the container system
> ensured that /etc/machine-id was different in different containers. So
> there would be no need to add anything. Of course I should make that
> explicit in the documentation.
>
> It would be nice if we could always use "ip netns identify", but that
> doesn't seem to be generally supported.

If containers provide unique machine-ids, a digest of the
machine-id is fine with me.

Note that many implementations don't tolerate a large
nfs_client_id4 string, so keeping the digest size small
might be needed. Using blake2 might be a better choice.
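
For example, coreutils b2sum can emit a truncated BLAKE2 digest (the
128-bit length is an arbitrary choice):

    # Sketch: a shorter digest of the machine-id for use as a
    # uniquifier component.
    echo "nfs-id:`cat /etc/machine-id`" | b2sum -l 128 | \
        awk '{print $1}'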

--
Chuck Lever



2022-03-21 07:35:01

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v2] nfs.man: document requirements for NFSv4 identity

On Sat, 19 Mar 2022, Chuck Lever III wrote:
>
> Here are some suggestions that might make it simpler to implement.
>
> 1. Ben's tool manufactures the uniquifier if the file doesn't
> already exist. That seems somewhat racy. Instead, why not
> make installation utilities responsible for creating the
> uniquifier? We need some guarantee that when a VM is cloned,
> the uniquifier is replaced, for instance; that's well
> outside nfs-utils' sphere of influence.

You say "the file" like that is a well defined concept. It isn't.
In the context of a container we don't even know if there is *any*
stable local storage.
The existence of "the file" is as much outside of nfs-utils' sphere of
influence as the cloning of a VM is.
At least the cloning of a VM is, or soon will be
(https://lwn.net/Articles/887207/), within the sphere of influence of
the NFS kernel module. It will be able to detect the fork and .... do
something. Maybe disable access to all existing mounts and refuse new
mounts until 'identity' has been set.

If NFS had always required identity to be set in a container before
allowing mounts, then the udev approach could work and we would be in a
much better place. But none of us knew that then, and it is too late
for that now (is it?).

This conversation seems to be going around in circles and not getting
anywhere. As I have no direct interest (the SUSE bugzilla has
precisely 1 bug relating to NFS and non-unique hostnames, and the
customer seemed to accept the requirement of unique hostnames) I am
going to bow out. I might post one more attempt at a documentation
update ... or I might not.

Thanks,
NeilBrown

2022-03-22 21:28:59

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH v2] nfs.man: document requirements for NFSv4 identity



> On Mar 14, 2022, at 8:41 PM, NeilBrown <[email protected]> wrote:
>
> On Tue, 15 Mar 2022, Chuck Lever III wrote:
>> Hi Neil-
>>
>>> +.IP \- 2
>>> +NFS-root (diskless) clients, where the DHCP server (or equivalent) does
>>> +not provide a unique host name.
>>
>> Suggest this addition:
>>
>> .IP \- 2
>>
>> Dynamically-assigned hostnames, where the hostname can be changed after
>> a client reboot, while the client is booted, or if a client often
>> repeatedly connects to multiple networks (for example if it is moved
>> from home to an office every day).
>
> This is a different kettle of fish. The hostname is *always* included
> in the identifier. If it isn't stable, then the identifier isn't
> stable.
>
> I saw in the history that when you introduced the module parameter it
> replaced the hostname. This caused problems in containers (which had
> different host names) so Trond changed it so the module parameter
> supplemented the hostname.
>
> If hostnames are really so poorly behaved I can see there might be a
> case to suppress the hostname, but we don't have that option in current
> kernels. Should we add it?

I didn't fully understand this comment before. I assume you are
referring to:

55b592933b7d ("NFSv4: Fix nfs4_init_uniform_client_string for net namespaces")

That will likely break reboot recovery if the container's nodename
changes over a reboot.

My (probably limited) understanding is that using the udev rule to
always add a uniquifier could have helped make it possible to remove
the hostname from the co_ownerid.

For the record, I take back this statement:

> I don't think we need to
> exclude the hostname from the nfs_client_id4 -- in fact some folks
> might prefer keeping the hostname in there as an eye-catcher. But
> it's simply that the hostname by itself does not provide enough
> uniqueness.


Since the nodename can change at inopportune times (like over a
reboot), including it in the co_ownerid string can sometimes be
problematic. But I don't have a better suggestion at this time.


> This conversation seems to be going around in circles and not getting
> anywhere. As I have no direct interest (the SUSE bugzilla has
> precisely 1 bug relating to NFS and non-unique hostnames, and the
> customer seemed to accept the requirement of unique hostnames) I am
> going to bow out. I might post one more attempt at a documentation
> update ... or I might not.

I now agree that the Linux NFS community will need to work with
packagers and distributors, and that we will not arrive at a one-
size-fits-all tool by ourselves.

I'll try to pick up the documentation torch. Thanks for your
efforts so far.


--
Chuck Lever