2022-09-21 18:36:31

by Alan Maxwell

Subject: nfsv4 client idmapper issue

I am reporting an issue, not a fault or bug report.
NFS client: Red Hat 7, kernel 3.10.0-1160.71.1.el7.x86_64.
The issue lies with the NFS client feature whereby, if an NFS server rejects an unmapped uid or gid, the client automatically switches back to using the idmapper.

Our particular configuration of NFSv4 server and client is based on numeric uidNumber and gidNumber communication. The NFS server we are using, OneFS (Dell/EMC/Isilon), has a setting explicitly for this use: "Do not send names". We have this set, and our testing showed it working 100% with our NFS client. The main driver for us using this feature is that our uid's are numeric. That causes issues with commands like chown and apparently NFS setattr. Once we realized that, we set the numeric setting and everything worked as planned.
Our problem with the feature comes from a simple mistake made by an admin:
chgrp groupnotvalid file
When the admin issued a chgrp naming a group that does not exist in the directory service for the NFS server, the NFS server rejected the change. Then the feature kicked in whereby the "client will automatically switch back to using the idmapper." This did change the mount: /proc/self/mountstats now shows caps=0x7fff instead of 0xffff.
The only way to get the mount working as originally configured is to umount and remount the share.
Bottom line: our environment cannot support idmapping. We need the ability to disable it, and for that setting to be forceful rather than something the kernel can decide to re-enable.
We would envision that an invalid chown/chgrp would simply return an error, report that the chown/chgrp was not applied, and leave the NFS mount as is.

Alan Maxwell | Sr. System Programmer | Platinum Infrastructure|20 FedEx Pkwy 1st Fl Vert,Collierville, TN 38017
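
The change in the caps value reported above can be checked mechanically. The following is a minimal Python sketch, assuming that the bit that differs between the two quoted values (0x8000) is the one the client clears when it re-enables the idmapper. The function names are illustrative and do not come from the report.

```python
import re

# Bit that differs between caps=0xffff and caps=0x7fff in the report.
# Assumption: this is the capability the client clears when it falls back.
NUMERIC_ID_BIT = 0x8000

DEVICE_RE = re.compile(r"^device \S+ mounted on (\S+) with fstype (\S+)")
CAPS_RE = re.compile(r"caps=(0x[0-9a-fA-F]+)")


def parse_mount_caps(mountstats_text):
    """Return {mount_point: caps} for each NFS entry that has a caps line."""
    result = {}
    current = None  # mount point of the NFS entry being read, if any
    for line in mountstats_text.splitlines():
        device = DEVICE_RE.match(line)
        if device:
            is_nfs = device.group(2).startswith("nfs")
            current = device.group(1) if is_nfs else None
            continue
        if current is not None and line.strip().startswith("caps:"):
            caps = CAPS_RE.search(line)
            if caps:
                result[current] = int(caps.group(1), 16)
    return result


def fell_back_to_idmapper(caps):
    """True when the assumed numeric-id bit is no longer set."""
    return not (caps & NUMERIC_ID_BIT)
```

Feeding the contents of /proc/self/mountstats to parse_mount_caps and testing each value with fell_back_to_idmapper would flag mounts that need the umount/mount cycle described above.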


2022-09-22 11:43:28

by Benjamin Coddington

Subject: Re: nfsv4 client idmapper issue

On 21 Sep 2022, at 14:13, Alan Maxwell wrote:

> I am reporting an issue, not a fault or bugreport.
> NFS client : Redhat 7 kernel: 3.10.0-1160.71.1.el7.x86_64.
> The issue lies with the feature that nfs client that: if an nfs server
> rejects an unmapped uid or gid, then the client will automatically
> switch back to using the idmapper.
>
> Our particular configuration of nfsv4 server and client are based on
> using numeric uidNumber and gidNumber communication. The nfs server
> we are using , OneFS (Dell/EMC/Isilon) has a setting explicitly for
> this use: "Do not send names". We have this set and our testing
> showed working 100% with our nfs client. The main driver for us using
> this feature is that our uid's are numeric. That causes issues with
> commands like chown and apparently NFS setattr. Once we realized
> that, we set the numeric setting and everything worked as planned.
> Our problem with the feature comes due to a simple mistake made by an
> Admin:
> chgrp groupnotvalid file
> When the admin issued a chgrp, but that group does not exist in the
> directory service for the NFS server, the NFS server rejected the
> change. Then the feature kicked in that "client will automatically
> switch back to using the idmapper. " Which did make changes, the
> /proc/self/mountstats showing the caps=0x7fff instead of 0xffff.
> The only solution to get the mount to work as originally configured is
> to umount/mount the share.
> Bottom line: Our environment can not support idmapping. Having the
> feature to disable it and that disable be forceful and not something
> the kernel can decide to re-enable.
> We would envision that if an invalid chown/chgrp were issued, to
> simply return an error, report that the chown/chgrp were not applied
> and simply leave the nfsmount as is.
>
> Alan Maxwell | Sr. System Programmer | Platinum Infrastructure|20
> FedEx Pkwy 1st Fl Vert,Collierville, TN 38017

Seems like a server bug to me -- if you want to set a numeric group on a
file, the server doesn't need to "look up" the group to see if it exists, it
should just set the value on the underlying filesystem.

What the server is signaling by sending back NFS4ERR_BADOWNER is that it
actually /is/ doing id mapping.

Why doesn't OneFS just set the value when told "Do not send names"?

Ben

2022-09-22 13:49:15

by Alan Maxwell

Subject: RE: [EXTERNAL] Re: nfsv4 client idmapper issue



-----Original Message-----
From: Benjamin Coddington <[email protected]>
Sent: Thursday, September 22, 2022 6:42 AM
To: Alan Maxwell <[email protected]>
Cc: [email protected]
Subject: [EXTERNAL] Re: nfsv4 client idmapper issue

> Seems like a server bug to me -- if you want to set a numeric group on a file, the server doesn't need to "look up" the group to see if it exists, it should just set the value on the underlying filesystem.

How would the server know what gidNumber to assign if the NFS client sent a name?
Is there a method in Red Hat to have the NFS client send only uidNumbers/gidNumbers?

> What the server is signaling by sending back NFS4ERR_BADOWNER is that it actually /is/ doing id mapping.

Doing id mapping, or to give it a better name, id verification, is expected. We hope the server would tell us, "client sent a name I can't verify or look up."

> Why doesn't OneFS just set the value when told "Do not send names"?

The NFS client sends both a bad name and a bad gidNumber. We actually think that should be the case: even with security=sys, there should be validation of users and groups.

> Ben

2022-09-22 14:11:47

by Benjamin Coddington

Subject: Re: [EXTERNAL] nfsv4 client idmapper issue

On 22 Sep 2022, at 9:45, Alan Maxwell wrote:

> How would the server know what gidNumber to assign if the nfs client sent
> a name?

I'm not familiar with this server, but I'm guessing if you have it set to
"Do not send names", then it also will try not to translate uid/gids it
receives. Are you asking a theoretical question?

> Is there a method in Redhat to have the nfsclient only send
> uidNumbers/gidNumbers?

Better to use Red Hat's support for these types of questions because this
list is mostly upstream development work, but I believe that's the point of
nfs4_disable_idmapping, which exists on that kernel.

> Doing id mapping or better name , id verification, is expected. We hope
> the server would tell us, "client sent name I can't verify or lookup"

Right, and that is a signal to the client that the server is not doing the
"Do not send names" thing, rather trying to map values, so the client
changes its behavior.

If you're only sending integer gid values, what does it mean to verify a
group id? If you want your server to treat the values as integer gids, then
it shouldn't return an error that means "I couldn't translate this into a
gid".

> The nfsclient sends both a bad name and bad gidNumber, we actually think
> that should be the case, even and security=sys , there should be
> validation of users and groups.

I'm sorry, I don't understand what you're trying to say here.

Ben
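
To confirm how nfs4_disable_idmapping is currently set on a client, something like the following Python sketch may help. It assumes the parameter is exposed at the usual sysfs location for module parameters and that it reads back as Y/N (or 1/0). Both the path and the helper name are assumptions, not something stated in this thread.

```python
from pathlib import Path

# Assumed sysfs location of the client-side module parameter.
PARAM_PATH = Path("/sys/module/nfs/parameters/nfs4_disable_idmapping")


def idmapping_disabled(path=PARAM_PATH):
    """Return True or False for the parameter, or None if it cannot be read."""
    try:
        raw = Path(path).read_text().strip()
    except OSError:
        # Module not loaded, parameter absent, or no permission to read it.
        return None
    if raw in ("Y", "y", "1"):
        return True
    if raw in ("N", "n", "0"):
        return False
    return None
```

Note that this only reports the configured setting. It does not show whether a particular mount has already fallen back to the idmapper.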

2022-09-22 14:32:58

by Alan Maxwell

Subject: RE: [EXTERNAL] nfsv4 client idmapper issue



-----Original Message-----
From: Benjamin Coddington <[email protected]>
Sent: Thursday, September 22, 2022 9:06 AM
To: Alan Maxwell <[email protected]>
Cc: [email protected]
Subject: Re: [EXTERNAL] nfsv4 client idmapper issue

> On 22 Sep 2022, at 9:45, Alan Maxwell wrote:
>
>> How would the server know what gidNumber to assign if the nfs client
>> sent a name?
>
> I'm not familiar with this server, but I'm guessing if you have it set to "Do not send names", then it also will try not to translate uid/gids it receives. Are you asking a theoretical question?

We have the server set to "Do not send names" because our uid's are fully numeric, and that causes a similar problem with the NFS client.

>> Is there a method in Redhat to have the nfsclient only send
>> uidNumbers/gidNumbers?
>
> Better to use Red Hat's support for these type of questions because this list is mostly upstream development work, but I believe that's the point of nfs4_disable_idmapping which exists on that kernel.

I have an open case with Red Hat; they have told us, "the upstream kernel has this feature, we can't make any corrections."

>> Doing id mapping or better name , id verification, is expected. We
>> hope the server would tell us, "client sent name I can't verify or lookup"
>
> Right, and that is a signal to the client that the server is not doing the "Do not send names" thing, rather trying to map values, so the client changes its behavior.
>
> If you're only sending integer gid values, what does it mean to verify a group id? If you want your server to treat the values as integer gids, then it shouldn't return an error that means "I couldn't translate this into a gid".

Again, we would expect that, ultimately, if the server returns an error, the NFS client should do the same: show an error, not change the configuration and ignore our disable_id_mapping.

>> The nfsclient sends both a bad name and bad gidNumber, we actually
>> think that should be the case, even and security=sys , there should be
>> validation of users and groups.
>
> I'm sorry, I don't understand what you trying to say here.
>
> Ben

2022-09-22 14:42:18

by Benjamin Coddington

Subject: Re: [EXTERNAL] nfsv4 client idmapper issue

On 22 Sep 2022, at 10:18, Alan Maxwell wrote:

> Again, we would expect ultimately if the server returns error, nfs
> client should do same, show and error, not change the configuration
> and ignore our disable_id_mapping.

Not all NFSv4 errors are sent back to userspace, and the meaning of this
error (I am assuming the server returns NFS4ERR_BADOWNER, a wire capture
would verify it) is to tell the client that it was unable to translate the
owner value. As I understand RFC 8881, that's a clear indication to the
client that the server is not treating the values as numeric uid/gid, it is
attempting to map them. That is why the client changes its behavior.

Besides, what error can the chown syscall possibly return in this case?
Let's say the client /doesn't/ re-enable string-based names. What are you
expecting the client to return to userspace?

Ben
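
The behavior described in this message can be pictured with a toy model. The Python sketch below is not kernel code: the class, its method names, and the callback standing in for the server are all hypothetical. Only the NFS4ERR_BADOWNER status and the "switch modes after a rejection" idea come from the discussion.

```python
NFS4_OK = 0
NFS4ERR_BADOWNER = 10039  # status code value from the NFSv4 specification


class ToyClient:
    """Toy model of the fallback discussed in the thread; not kernel code."""

    def __init__(self, send_setattr):
        # send_setattr(owner_string) -> NFSv4 status; stands in for the server.
        self.send_setattr = send_setattr
        # State right after mounting with idmapping disabled.
        self.numeric_ids = True

    def encode_group(self, gid, name):
        # With numeric ids the owner string is just the decimal gid;
        # otherwise it is the name string the idmapper would supply.
        return str(gid) if self.numeric_ids else name

    def chgrp(self, gid, name):
        status = self.send_setattr(self.encode_group(gid, name))
        if status == NFS4ERR_BADOWNER and self.numeric_ids:
            # The server could not translate the owner string, so the client
            # concludes the server is mapping ids, switches modes for the
            # rest of the mount's lifetime, and retries once with a name.
            self.numeric_ids = False
            status = self.send_setattr(self.encode_group(gid, name))
        return status
```

In this model a single rejected request is enough to leave numeric_ids set to False for every later call, which mirrors the persistent change reported at the start of the thread.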

2022-09-22 15:08:14

by Alan Maxwell

Subject: RE: [EXTERNAL] nfsv4 client idmapper issue



-----Original Message-----
From: Benjamin Coddington <[email protected]>
Sent: Thursday, September 22, 2022 9:40 AM
To: Alan Maxwell <[email protected]>
Cc: [email protected]
Subject: Re: [EXTERNAL] nfsv4 client idmapper issue

> On 22 Sep 2022, at 10:18, Alan Maxwell wrote:
>
>> Again, we would expect ultimately if the server returns error, nfs
>> client should do same, show and error, not change the configuration
>> and ignore our disable_id_mapping.
>
> Not all NFSv4 errors are sent back to userspace, and the meaning of this error ( I am assuming the server returns NFS4ERR_BADOWNER, a wire capture would verify it ) is to tell the client that it was unable to translate the owner value. As I understand RFC 8881, that's a clear indication to the client that the server is not treating the values as numeric uid/gid, it is attempting to map them. That is why the client changes its behavior.
>
> Besides, what error can the chown syscall possibly return in this case?
> Let's say the client /doesn't/ re-enable string-based names. What are you expecting the client to return to userspace?

I would expect chgrp to behave similarly to a local file system:
chgrp badgroupname junk
chgrp: invalid group: 'badgroupname'

> Ben

2022-09-22 15:17:04

by Benjamin Coddington

Subject: Re: [EXTERNAL] nfsv4 client idmapper issue

On 22 Sep 2022, at 10:50, Alan Maxwell wrote:

> I would expect chgrp to behave similar to a local file system:
> chgrp badgroupname junk
> chgrp: invalid group: 'badgroupname'

That's not the filesystem giving you that error, that's the chgrp binary
trying to translate the argument into a gid.
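
The point about where that error comes from can be illustrated in a few lines. This is a rough Python sketch of a user-space name-to-gid lookup of the kind a chgrp-style tool performs before it ever asks the filesystem to change anything. The function name and the injectable lookup parameter are illustrative, and the real chgrp may differ in detail.

```python
import grp


def resolve_group(arg, lookup=grp.getgrnam):
    """Translate a chgrp-style argument to a numeric gid in user space."""
    try:
        # Ask the local group database (files, LDAP, ...) for the name.
        return lookup(arg).gr_gid
    except KeyError:
        pass
    if arg.isascii() and arg.isdigit():
        # A plain number is taken as the gid itself.
        return int(arg)
    # No filesystem call has been made at this point.
    raise ValueError(f"invalid group: {arg!r}")
```

Because the failure is raised before any request reaches the filesystem, a local filesystem and an NFS mount would behave the same way for a name the client cannot resolve.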

2022-09-22 19:05:44

by Alan Maxwell

Subject: RE: [EXTERNAL] nfsv4 client idmapper issue

I thought I said "similar", but here's another example:
chgrp root nosuchfile
chgrp: cannot access 'nosuchfile': No such file or directory
So the file system didn't find the file, and chgrp reported it.
These are just examples.
