2012-09-18 01:23:43

by NeilBrown

Subject: Inconsistency when mounting a directory that 'world' cannot access.


Suppose that on an NFS server I have a directory
/foo/bar/baz

which I export, and that /foo/bar does not have world access. e.g.
permissions are '750' and everyone who owns files in there is a member of the
group which owns /foo/bar.

Then with NFSv3 I can
mount server:/foo/bar/baz /somewhere
because the lookup of /foo/bar/baz happens as root on the server in mountd.

With NFSv4 using 'sec=sys' I can only do this if I export with
"no_root_squash", as the lookup happens on the client as root, and if root
were squashed, it wouldn't have access beyond /foo/bar.

But if I use NFSv4 using 'sec=krb5', the lookup happens on the client using a
machine credential which gets mapped to 'nobody/nogroup' (or whatever anonuid
and anongid are set to for the export). So I cannot perform the mount at all.
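
For reference, the three mount attempts described above can be written out
as a small sketch. It only prints the commands and runs nothing. The names
"server", /foo/bar/baz and /somewhere are the placeholders from this message;
the vers= and sec= spellings are the usual nfs(5) mount options and are my
assumption about how each case would be invoked.

```shell
#!/bin/sh
# Print (do not run) the three mount attempts discussed in the message.
show_mount_attempts() {
    export_path="server:/foo/bar/baz"   # placeholder export from the message
    target="/somewhere"                 # placeholder mount point
    # NFSv3: the path lookup is done by mountd on the server, as root.
    echo "mount -t nfs -o vers=3 $export_path $target"
    # NFSv4 + AUTH_SYS: the client sends the lookup as root (subject to root_squash).
    echo "mount -t nfs -o vers=4,sec=sys $export_path $target"
    # NFSv4 + Kerberos: the lookup uses the machine credential, mapped to the anon user.
    echo "mount -t nfs -o vers=4,sec=krb5 $export_path $target"
}

show_mount_attempts
```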

This is - at best - inconsistent and can cause confusion (hey - I was
confused for a while there).

Should something be done? Can anything be done?

I lean towards thinking that the most restrictive behaviour is most correct
(though I have a customer who feels that it is too restrictive).

Should the NFSv4 client always use an anon credential when performing the
'mount'? Is that even possible for auth_sys?
Should rpc.mountd use setfsuid() before doing the path lookup to ensure that
everyone has access to the exported directory?

Or is there some way 'mount' lookups for krb5 could be treated as being
performed by root?

Any ideas?

NeilBrown


PS the reason to want to mount a sub directory instead of just mounting the
top level directory is that "/foo/bar" is rather large - over 1000 home
directories. Whenever anything does an "ls -l" or similar in there (and it
seems that some things do) it causes fairly horrible performance somewhere
(lots of uid to name to uid mappings probably being part of it). Using an
automounter to just mount the bit that is required seems to help.



2012-10-03 16:27:31

by J. Bruce Fields

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
> On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
> > On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> > > On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <[email protected]>
> > > wrote:
> > >
> > > > I guess you're right. So it starts to sound more like: "you have a
> > > > confusing setup. Your export configuration says one thing, and your
> > > > filesystem permissions say another. Under NFSv3 the confusion didn't
> > > > matter, but now it does--time to fix it."
> > > >
> > >
> > > That's the best I could come to - I'm glad to have it confirmed. Thanks!
> > >
> > > It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> > > is in use, and uses 'root' when auth_sys is used (which might be anon if
> > > "root_squash" is active, but might not).
> > > I wonder if it would work to use auth_none for the mount-time lookup, just
> > > for consistency..
> > >
> > > Is the following appropriate? Is there somewhere better to put this caveat?
> >
> > Unfortunately, it's more complicated than this, as it depends on client
> > implementation and configuration details.
> >
> > Something like this would be more accurate but possibly too long:
> >
> > Note that under NFSv2 and NFSv3, the mount path is traversed by
> > mountd acting as root, but under NFSv4 the mount path is looked
> > up using the client's credentials. This means that, for
> > example, if a client mounts using a krb5 credential that the
> > server maps to an "anonymous" user, then the mount will only
> > succeed if that directory and all its parents allow eXecute
> > permissions.
>
> So you're listing this as a "feature" rather than a bug? There should be
> no reason to constrain the pseudofs to use the permission checks from
> the underlying filesystem.

I'd be fine with that.

(That still leaves some subtle v3/v4 difference in the case of mount
paths underneath an export?

What *is* the existing mountd behavior there, exactly? I'm inclined to
think allowing mounts of arbitrary subdirectories is a bug, but maybe
there's some historical reason for it or maybe someone already depends
on it.)

--b.

2012-10-03 03:46:28

by NeilBrown

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <[email protected]>
wrote:

> I guess you're right. So it starts to sound more like: "you have a
> confusing setup. Your export configuration says one thing, and your
> filesystem permissions say another. Under NFSv3 the confusion didn't
> matter, but now it does--time to fix it."
>

That's the best I could come to - I'm glad to have it confirmed. Thanks!

It is unfortunate that Linux NFS uses an anon credential to mount when krb5
is in use, and uses 'root' when auth_sys is used (which might be anon if
"root_squash" is active, but might not).
I wonder if it would work to use auth_none for the mount-time lookup, just
for consistency..

Is the following appropriate? Is there somewhere better to put this caveat?

Thanks,
NeilBrown


diff --git a/utils/exportfs/exports.man b/utils/exportfs/exports.man
index bc1de73..91e4b9c 100644
--- a/utils/exportfs/exports.man
+++ b/utils/exportfs/exports.man
@@ -126,6 +126,10 @@ will be enforced only for access using flavors listed in the immediately
preceding sec= option. The only options that are permitted to vary in
this way are ro, rw, no_root_squash, root_squash, and all_squash.
.PP
+When RPCSEC_GSS is used with NFSv4, a client will only be able to mount a
+directory if that directory and all its ancestors give eXecute access
+to "world".
+.PP
.SS General Options
.BR exportfs
understands the following export options:
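
A rough way to check the condition this caveat describes is to walk from the
directory up to / and look at the "other" execute bit of each component. The
sketch below assumes GNU stat (stat -c %a); the function name check_world_x
is made up for illustration. It looks only at mode bits, so it ignores ACLs
and the case where the anonymous uid or gid happens to own a directory.

```shell
#!/bin/sh
# Report every component from DIR up to / that lacks world (other) execute.
# Returns 0 if all components are traversable by "other", 1 otherwise.
check_world_x() {
    dir=$1
    rc=0
    while :; do
        mode=$(stat -c %a "$dir") || return 2   # e.g. 750, 755, 1777
        other=${mode#"${mode%?}"}               # last octal digit = "other" bits
        if [ $((other & 1)) -eq 0 ]; then
            echo "no world execute: $dir (mode $mode)"
            rc=1
        fi
        case $dir in
            /|.) break ;;                       # reached the top
        esac
        dir=$(dirname "$dir")
    done
    return $rc
}
```

On the server one would call it with an absolute path, for example
"check_world_x /foo/bar/baz"; any line it prints names a directory that an
anonymous credential could not pass through.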



2012-10-02 02:38:13

by NeilBrown

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Mon, 1 Oct 2012 11:43:10 -0400 "J. Bruce Fields" <[email protected]>
wrote:

> On Tue, Sep 18, 2012 at 11:23:29AM +1000, NeilBrown wrote:
> >
> > Suppose that on an NFS server I have a directory
> > /foo/bar/baz
> >
> > which I export, and that /foo/bar does not have world access. e.g.
> > permissions are '750' and everyone who owns files in there is a member of the
> > group which owns /foo/bar.
> >
> > Then with NFSv3 I can
> > mount server:/foo/bar/baz /somewhere
> > because the lookup of /foo/bar/baz happens as root on the server in mountd.
> >
> > With NFSv4 using 'sec=sys' I can only do this if I export with
> > "no_root_squash", as the lookup happens on the client as root, and if root
> > were squashed, it wouldn't have access beyond /foo/bar.
> >
> > But if I use NFSv4 using 'sec=krb5', the lookup happens on the client using a
> > machine credential which gets mapped to 'nobody/nogroup' (or whatever anonuid
> > and anongid are set to for the export). So I cannot perform the mount at all.
> >
> > This is - at best - inconsistent and can cause confusion (hey - I was
> > confused for a while there).
> >
> > Should something be done? Can anything be done?
>
> I think nfsd_lookup_dentry() would need a special exception for the
> NFSEXP_V4ROOT case.

I don't think that would help in general.
If I export /foo and want to mount /foo/bar/baz, then for the last lookup at
least, NFSEXP_V4ROOT isn't set anywhere near.

An exception would need to be made for every 'nfs4_lookup_dentry', provided
it found a directory, or nothing (or a symlink...). Possibly this could only
be done for anonymous credentials (as are used by the nfs4 mount operation).

It would be nice if we could clearly differentiate a mount-time lookup from a
regular lookup, but I don't think the protocol allows for that.

Thanks,
NeilBrown


>
> Looks like the directory permission check is actually done in
> lookup_one_len(), so we'd need to either call something else or
> temporarily swap credentials?
>
> --b.
>
> >
> > I lean towards thinking that the most restrictive behaviour is most correct
> > (though I have a customer who feels that it is too restrictive).
> >
> > Should the NFSv4 client always use an anon credential when performing the
> > 'mount'? Is that even possible for auth_sys?
> > Should rpc.mountd use setfsuid() before doing the path lookup to ensure that
> > everyone has access to the exported directory?
> >
> > Or is there some way 'mount' lookups for krb5 could be treated as being
> > performed by root?
> >
> > Any ideas?



2012-10-08 14:18:49

by J. Bruce Fields

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Mon, Oct 08, 2012 at 08:54:52AM -0500, Malahal Naineni wrote:
> NeilBrown [[email protected]] wrote:
> > Mount with NFSv4 and it takes about the same. However:
> >
> > .....
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2974
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2975
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2976
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2977
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2978
> > drwxr-xr-x 2 u2979 root 4096 Oct 8 16:19 2979
> > drwxr-xr-x 2 u2980 root 4096 Oct 8 16:19 2980
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2981
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2982
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2983
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2984
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2985
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2986
> > ....
> >
> >
> > tcpdump shows the server is returning the right stuff, but something is going
> > wrong on the client. I've tried unmounting/remounting and killing/restarting
> > rpc.idmapd.
>
> As you know 4294967294 is (-2, nfs nobody), I have seen this issue with NFS
> server sending numeric ids by default in AUTH_SYS (commit
> e9541ce8efc22c233a045f091c2b969923709038), but the client can't handle
> them (lack of commit 5cf36cfdc8caa2724738ad0842c5c3dd02f309dc in client
> code).
>
> I hand patched server commit, but my client was an older one. That is
> how I got into my issue. Not sure, if you are running into a similar
> issue.

Oh, could be--but then why would some of the id's still be mapped
correctly?

--b.

2012-10-08 13:59:57

by Malahal Naineni

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

NeilBrown [[email protected]] wrote:
> Mount with NFSv4 and it takes about the same. However:
>
> .....
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2974
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2975
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2976
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2977
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2978
> drwxr-xr-x 2 u2979 root 4096 Oct 8 16:19 2979
> drwxr-xr-x 2 u2980 root 4096 Oct 8 16:19 2980
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2981
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2982
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2983
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2984
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2985
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2986
> ....
>
>
> tcpdump shows the server is returning the right stuff, but something is going
> wrong on the client. I've tried unmounting/remounting and killing/restarting
> rpc.idmapd.

As you know 4294967294 is (-2, nfs nobody), I have seen this issue with NFS
server sending numeric ids by default in AUTH_SYS (commit
e9541ce8efc22c233a045f091c2b969923709038), but the client can't handle
them (lack of commit 5cf36cfdc8caa2724738ad0842c5c3dd02f309dc in client
code).
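
The "-2" reading can be checked with shell arithmetic: masking -2 to 32 bits
gives the number shown in the listing. The helper name to_u32 is only for
illustration, and the sketch assumes a shell with 64-bit arithmetic (bash and
dash both qualify).

```shell
#!/bin/sh
# Reinterpret a signed 32-bit id as the unsigned value ls would print.
to_u32() {
    echo $(( $1 & 0xFFFFFFFF ))
}

to_u32 -2    # prints 4294967294
```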

I hand patched server commit, but my client was an older one. That is
how I got into my issue. Not sure, if you are running into a similar
issue.

Regards, Malahal.


2012-10-08 12:19:23

by J. Bruce Fields

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Mon, Oct 08, 2012 at 05:03:04PM +1100, NeilBrown wrote:
> On Thu, 4 Oct 2012 12:07:39 -0400 "J. Bruce Fields" <[email protected]>
> wrote:
> > It's not the nfsd behavior that bothers me--there's nothing we can do
> > about the fact that access by filehandle can bypass directory
> > permissions.
> >
> > What bothers me is that mountd will apparently allow anyone to do a lookup
> > anywhere in an exported filesystem.
>
> Not anyone - it requires a privileged source port from a known host.
> So it is only "anyone who can get 'root'".

As you know, that's not necessarily a good assumption. And if somebody's
using sec=krb5, they're explicitly saying that they don't trust that
assumption.

> > Getting all the id->name mappings for a 100-entry directory is going to
> > require 100 serialized upcalls to idmapd (and then possibly ldap), and
> > by default it looks like the idmapd cache will go cold after 10
> > minutes.... Not hard to imagine that could be a problem.
> >
> > Running multiple idmapd processes would be easy and might help? Though
> > not if the client's just giving us the getattrs one at a time.
> >
> > Or maybe the problem's somewhere else entirely, but that's a real bug if
> > we aren't giving good performance on /home.
>
> I did some experimenting..
> On both 'client' and 'server':
> for i in `seq 2000 3000`; do echo u$i:x:$i:1000::/nohome:/bin/false; done
> >> /etc/passwd
>
> On server in suitable directory
>
> for i in `seq 2000 3000`; do mkdir $i ; chown u$i $i ; done
>
> Mount that directory onto the client with NFSv3 and "time ls -l" takes a
> little under 4 seconds.
> Mount with NFSv4 and it takes about the same. However:

OK, that's interesting. I wonder what the problem is, then? I can't
think of what else would make /home different.
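
For anyone wanting to replay the quoted experiment without touching
/etc/passwd, here is a scratch-directory variant. It is a sketch under
stated assumptions: the function name setup_experiment and the file name
passwd.extra are made up, the passwd-style lines go to that file instead of
/etc/passwd, and the chown step is left out because it needs root and real
accounts.

```shell
#!/bin/sh
# Create 1001 directories (2000..3000) and matching passwd-style lines
# under a scratch directory, mirroring the loops from the quoted test.
setup_experiment() {
    scratch=$1
    mkdir -p "$scratch/export" || return 1
    for i in $(seq 2000 3000); do
        echo "u$i:x:$i:1000::/nohome:/bin/false"   # same format as the original
        mkdir -p "$scratch/export/$i"
    done > "$scratch/passwd.extra"
}

scratch=$(mktemp -d)
setup_experiment "$scratch"
wc -l < "$scratch/passwd.extra"   # number of generated entries
```

To get the ownership pattern of the real test, the lines in passwd.extra
would still have to be added on both client and server, followed by a chown
of each directory as root.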

> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2974
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2975
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2976
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2977
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2978
> drwxr-xr-x 2 u2979 root 4096 Oct 8 16:19 2979
> drwxr-xr-x 2 u2980 root 4096 Oct 8 16:19 2980
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2981
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2982
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2983
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2984
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2985
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2986
> ....

Oops.

> tcpdump shows the server is returning the right stuff, but something is going
> wrong on the client. I've tried unmounting/remounting and killing/restarting
> rpc.idmapd.
> I had some config problems previously .. is there any chance that these
> unknown entries are in a cache? Any easy way to view or flush the cache?

Not that I know of.

What client version is this, and is it using the new (nfsidmap) or old
(idmapd) idmapper?

--b.

2012-10-02 14:33:35

by J. Bruce Fields

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Tue, Oct 02, 2012 at 12:38:10PM +1000, NeilBrown wrote:
> On Mon, 1 Oct 2012 11:43:10 -0400 "J. Bruce Fields" <[email protected]>
> wrote:
>
> > On Tue, Sep 18, 2012 at 11:23:29AM +1000, NeilBrown wrote:
> > >
> > > Suppose that on an NFS server I have a directory
> > > /foo/bar/baz
> > >
> > > which I export, and that /foo/bar does not have world access. e.g.
> > > permissions are '750' and everyone who owns files in there is a member of the
> > > group which owns /foo/bar.
> > >
> > > Then with NFSv3 I can
> > > mount server:/foo/bar/baz /somewhere
> > > because the lookup of /foo/bar/baz happens as root on the server in mountd.
> > >
> > > With NFSv4 using 'sec=sys' I can only do this if I export with
> > > "no_root_squash", as the lookup happens on the client as root, and if root
> > > were squashed, it wouldn't have access beyond /foo/bar.
> > >
> > > But if I use NFSv4 using 'sec=krb5', the lookup happens on the client using a
> > > machine credential which gets mapped to 'nobody/nogroup' (or whatever anonuid
> > > and anongid are set to for the export). So I cannot perform the mount at all.
> > >
> > > This is - at best - inconsistent and can cause confusion (hey - I was
> > > confused for a while there).
> > >
> > > Should something be done? Can anything be done?
> >
> > I think nfsd_lookup_dentry() would need a special exception for the
> > NFSEXP_V4ROOT case.
>
> I don't think that would help in general.
> If I export /foo and want to mount /foo/bar/baz, then for the last lookup at
> least, NFSEXP_V4ROOT isn't set anywhere near.

I'm not entirely clear why mountd even allows that, but no point trying
to change it now anyway.

> An exception would need to be made for every 'nfs4_lookup_dentry', provided
> it found a directory, or nothing (or a symlink...). Possibly this could only
> be done for anonymous credentials (as are used by the nfs4 mount operation).

Yuch.

> It would be nice if we could clearly differentiate a mount-time lookup from a
> regular lookup, but I don't think the protocol allows for that.

I guess you're right. So it starts to sound more like: "you have a
confusing setup. Your export configuration says one thing, and your
filesystem permissions say another. Under NFSv3 the confusion didn't
matter, but now it does--time to fix it."

--b.

2012-10-08 11:42:45

by Steve Dickson

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.



On 08/10/12 02:03, NeilBrown wrote:
> On Thu, 4 Oct 2012 12:07:39 -0400 "J. Bruce Fields" <[email protected]>
> wrote:
>
>> On Thu, Oct 04, 2012 at 08:46:59AM +1000, NeilBrown wrote:
>>> On Wed, 3 Oct 2012 12:27:28 -0400 "J. Bruce Fields" <[email protected]>
>>> wrote:
>>>
>>>> On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
>>>>> On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
>>>>>> On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
>>>>>>> On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I guess you're right. So it starts to sound more like: "you have a
>>>>>>>> confusing setup. Your export configuration says one thing, and your
>>>>>>>> filesystem permissions say another. Under NFSv3 the confusion didn't
>>>>>>>> matter, but now it does--time to fix it."
>>>>>>>>
>>>>>>>
>>>>>>> That's the best I could come to - I'm glad to have it confirmed. Thanks!
>>>>>>>
>>>>>>> It is unfortunate that Linux NFS uses an anon credential to mount when krb5
>>>>>>> is in use, and uses 'root' when auth_sys is used (which might be anon if
>>>>>>> "root_squash" is active, but might not).
>>>>>>> I wonder if it would work to use auth_none for the mount-time lookup, just
>>>>>>> for consistency..
>>>>>>>
>>>>>>> Is the following appropriate? Is there somewhere better to put this caveat?
>>>>>>
>>>>>> Unfortunately, it's more complicated than this, as it depends on client
>>>>>> implementation and configuration details.
>>>>>>
>>>>>> Something like this would be more accurate but possibly too long:
>>>>>>
>>>>>> Note that under NFSv2 and NFSv3, the mount path is traversed by
>>>>>> mountd acting as root, but under NFSv4 the mount path is looked
>>>>>> up using the client's credentials. This means that, for
>>>>>> example, if a client mounts using a krb5 credential that the
>>>>>> server maps to an "anonymous" user, then the mount will only
>>>>>> succeed if that directory and all its parents allow eXecute
>>>>>> permissions.
>>>>>
>>>>> So you're listing this as a "feature" rather than a bug? There should be
>>>>> no reason to constrain the pseudofs to use the permission checks from
>>>>> the underlying filesystem.
>>>>
>>>> I'd be fine with that.
>>>>
>>>> (That still leaves some subtle v3/v4 difference in the case of mount
>>>> paths underneath an export?
>>>>
>>>> What *is* the existing mountd behavior there, exactly? I'm inclined to
>>>> think allowing mounts of arbitrary subdirectories is a bug, but maybe
>>>> there's some historical reason for it or maybe someone already depends
>>>> on it.)
>>>>
>>>> --b.
>>>
>>> The behaviour is simply that you mount a filehandle (typically belonging to a
>>> directory) and that filehandle can be anything inside any exported filesystem.
>>
>> It's not the nfsd behavior that bothers me--there's nothing we can do
>> about the fact that access by filehandle can bypass directory
>> permissions.
>>
>> What bothers me is that mountd will apparently allow anyone to do a lookup
>> anywhere in an exported filesystem.
>
> Not anyone - it requires a privileged source port from a known host.
> So it is only "anyone who can get 'root'".
>
>>
>> I don't know--maybe I shouldn't be so concerned about the possibility a
>> rogue user could figure out that my "Music" directory includes an
>> unreasonable number of Miles Davis titles.
>>
>>> Yes, please do depend on being able to mount filehandles that aren't the root
>>> of a filesystem.
>>>
>>> The case that brought this issue to my attention involved the server having
>>> a directory containing hundreds of home directories. This directory is
>>> exported.
>>>
>>> If they mount that top level directory they get horrible performance. If
>>> they use an automounter to just mount the homes that are accessed it works
>>> better. They weren't able to explain why but my guess is that some tools
>>> (GUI filesystem browser) would occasionally do the equivalent of "ls -l" of
>>> the top level directory which would hammer nfs-idmapd and probably ldap....
>>> though you would think that would get cached and not be a problem for long.
>>> So maybe it is more subtle than that.
>>
>> Getting all the id->name mappings for a 100-entry directory is going to
>> require 100 serialized upcalls to idmapd (and then possibly ldap), and
>> by default it looks like the idmapd cache will go cold after 10
>> minutes.... Not hard to imagine that could be a problem.
>>
>> Running multiple idmapd processes would be easy and might help? Though
>> not if the client's just giving us the getattrs one at a time.
>>
>> Or maybe the problem's somewhere else entirely, but that's a real bug if
>> we aren't giving good performance on /home.
>
> I did some experimenting..
> On both 'client' and 'server':
> for i in `seq 2000 3000`; do echo u$i:x:$i:1000::/nohome:/bin/false; done
> >> /etc/passwd
>
> On server in suitable directory
>
> for i in `seq 2000 3000`; do mkdir $i ; chown u$i $i ; done
>
> Mount that directory onto the client with NFSv3 and "time ls -l" takes a
> little under 4 seconds.
> Mount with NFSv4 and it takes about the same. However:
>
> .....
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2974
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2975
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2976
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2977
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2978
> drwxr-xr-x 2 u2979 root 4096 Oct 8 16:19 2979
> drwxr-xr-x 2 u2980 root 4096 Oct 8 16:19 2980
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2981
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2982
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2983
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2984
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2985
> drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2986
> ....
>
>
> tcpdump shows the server is returning the right stuff, but something is going
> wrong on the client. I've tried unmounting/remounting and killing/restarting
> rpc.idmapd.
> I had some config problems previously .. is there any chance that these
> unknown entries are in a cache? Any easy way to view or flush the cache?
Assuming you are using the keyring based idmapper, "nfsidmap -cv" will
clear the keyring of user and group ids. See nfsidmap(5).

If you are using rpc.idmapd, I believe
echo `date +'%s'` > /proc/net/rpc/nfs4.idtoname/flush
will do the trick.... The CITI faq
http://www.citi.umich.edu/projects/nfsv4/linux/faq/
has a section on working with this cache...
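
The two suggestions above can be put side by side in a small helper. This is
a sketch: it only prints the command for the chosen idmapper and executes
nothing, and the function name idmap_flush_cmd is made up for illustration.
The commands themselves are the ones given in this message and would need to
be run as root on the client.

```shell
#!/bin/sh
# Print the cache-flush command for the idmapper in use on the client.
idmap_flush_cmd() {
    case $1 in
        keyring)
            # Keyring-based idmapper: clear cached user and group ids.
            echo "nfsidmap -cv" ;;
        idmapd)
            # rpc.idmapd: write the current time to the kernel cache's flush file.
            echo "echo \$(date +%s) > /proc/net/rpc/nfs4.idtoname/flush" ;;
        *)
            echo "usage: idmap_flush_cmd keyring|idmapd" >&2
            return 1 ;;
    esac
}

idmap_flush_cmd keyring
idmap_flush_cmd idmapd
```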

steved.

>
> Of course this is with text-file password lookup. LDAP might be slower but
> I'd be surprised if it was much slower.
>
> NeilBrown
>
>
>
>>
>> --b.
>>
>>> I've built similar setups before. There is something attractive about
>>> everyone's home directory being /home/$USERNAME even though they are on
>>> different servers and different filesystems.
>>>
>>> In the particular problem scenario, local policy requires the 'staff'
>>> directory on the server to not be world-accessible, but they still want to
>>> mount the individual home directories from there onto client machines as
>>> required.
>>> I cannot easily justify that policy, but the point is that it works with
>>> NFSv3 and with AUTH_SYS/no_root_squash, but not with NFSv4/krb5. I don't
>>> think we can fix this inconsistency but maybe we can explain it.
>>>
>>> I think your text is more accurate than mine, but also a little more vague so
>>> the important point may not be immediately obvious. That might be a price we have
>>> to pay for accuracy.
>

2012-10-01 15:43:11

by J. Bruce Fields

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Tue, Sep 18, 2012 at 11:23:29AM +1000, NeilBrown wrote:
>
> Suppose that on an NFS server I have a directory
> /foo/bar/baz
>
> which I export, and that /foo/bar does not have world access. e.g.
> permissions are '750' and everyone who owns files in there is a member of the
> group which owns /foo/bar.
>
> Then with NFSv3 I can
> mount server:/foo/bar/baz /somewhere
> because the lookup of /foo/bar/baz happens as root on the server in mountd.
>
> With NFSv4 using 'sec=sys' I can only do this if I export with
> "no_root_squash", as the lookup happens on the client as root, and if root
> were squashed, it wouldn't have access beyond /foo/bar.
>
> But if I use NFSv4 using 'sec=krb5', the lookup happens on the client using a
> machine credential which gets mapped to 'nobody/nogroup' (or whatever anonuid
> and anongid are set to for the export). So I cannot perform the mount at all.
>
> This is - at best - inconsistent and can cause confusion (hey - I was
> confused for a while there).
>
> Should something be done? Can anything be done?

I think nfsd_lookup_dentry() would need a special exception for the
NFSEXP_V4ROOT case.

Looks like the directory permission check is actually done in
lookup_one_len(), so we'd need to either call something else or
temporarily swap credentials?

--b.

>
> I lean towards thinking that the most restrictive behaviour is most correct
> (though I have a customer who feels that it is too restrictive).
>
> Should the NFSv4 client always use an anon credential when performing the
> 'mount'? Is that even possible for auth_sys?
> Should rpc.mountd use setfsuid() before doing the path lookup to ensure that
> everyone has access to the exported directory?
>
> Or is there some way 'mount' lookups for krb5 could be treated as being
> performed by root?
>
> Any ideas?

2012-10-08 15:30:16

by Malahal Naineni

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

J. Bruce Fields [[email protected]] wrote:
> On Mon, Oct 08, 2012 at 08:54:52AM -0500, Malahal Naineni wrote:
> > NeilBrown [[email protected]] wrote:
> > > Mount with NFSv4 and it takes about the same. However:
> > >
> > > .....
> > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2974
> > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2975
> > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2976
> > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2977
> > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2978
> > > drwxr-xr-x 2 u2979 root 4096 Oct 8 16:19 2979
> > > drwxr-xr-x 2 u2980 root 4096 Oct 8 16:19 2980
> > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2981
> > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2982
> > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2983
> > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2984
> > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2985
> > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2986
> > > ....
> > >
> > >
> > > tcpdump shows the server is returning the right stuff, but something is going
> > > wrong on the client. I've tried unmounting/remounting and killing/restarting
> > > rpc.idmapd.
> >
> > As you know 4294967294 is (-2, nfs nobody), I have seen this issue with NFS
> > server sending numeric ids by default in AUTH_SYS (commit
> > e9541ce8efc22c233a045f091c2b969923709038), but the client can't handle
> > them (lack of commit 5cf36cfdc8caa2724738ad0842c5c3dd02f309dc in client
> > code).
> >
> > I hand patched server commit, but my client was an older one. That is
> > how I got into my issue. Not sure, if you are running into a similar
> > issue.
>
> Oh, could be--but then why would some of the id's still be mapped
> correctly?

Wild guess, those objects are created by client and didn't get their
attributes updated yet from server???

FYI, a co-worker here had RHEL6.3 server and RHEL6.2 client that
exhibited this nobody issue with NFSv4.

Regards, Malahal.


2012-10-03 15:48:45

by Myklebust, Trond

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
> On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> > On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <[email protected]>
> > wrote:
> >
> > > I guess you're right. So it starts to sound more like: "you have a
> > > confusing setup. Your export configuration says one thing, and your
> > > filesystem permissions say another. Under NFSv3 the confusion didn't
> > > matter, but now it does--time to fix it."
> > >
> >
> > That's the best I could come to - I'm glad to have it confirmed. Thanks!
> >
> > It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> > is in use, and uses 'root' when auth_sys is used (which might be anon if
> > "root_squash" is active, but might not).
> > I wonder if it would work to use auth_none for the mount-time lookup, just
> > for consistency..
> >
> > Is the following appropriate? Is there somewhere better to put this caveat?
>
> Unfortunately, it's more complicated than this, as it depends on client
> implementation and configuration details.
>
> Something like this would be more accurate but possibly too long:
>
> 	Note that under NFSv2 and NFSv3, the mount path is traversed by
> 	mountd acting as root, but under NFSv4 the mount path is looked
> 	up using the client's credentials. This means that, for
> 	example, if a client mounts using a krb5 credential that the
> 	server maps to an "anonmyous" user, then the mount will only
> 	succeed if that directory and all its parents allow eXecute
> 	permissions.

So you're listing this as a "feature" rather than a bug? There should be
no reason to constrain the pseudofs to use the permission checks from
the underlying filesystem.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
www.netapp.com

2012-10-09 00:33:38

by NeilBrown

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Mon, 8 Oct 2012 10:26:47 -0500 Malahal Naineni <[email protected]> wrote:

> J. Bruce Fields [[email protected]] wrote:
> > On Mon, Oct 08, 2012 at 08:54:52AM -0500, Malahal Naineni wrote:
> > > NeilBrown [[email protected]] wrote:
> > > > Mount with NFSv4 and it takes about the same. However:
> > > >
> > > > .....
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2974
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2975
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2976
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2977
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2978
> > > > drwxr-xr-x 2 u2979 root 4096 Oct 8 16:19 2979
> > > > drwxr-xr-x 2 u2980 root 4096 Oct 8 16:19 2980
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2981
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2982
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2983
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2984
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2985
> > > > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2986
> > > > ....
> > > >
> > > >
> > > > tcpdump shows the server is returning the write stuff, but something if going
> > > > wrong on the client. I've tried unmounting/remounting and killing/restarting
> > > > rpc.idmapd.
> > >
> > > As you know 4294967294 is (-2, nfs nobody), I have seen this issue with NFS
> > > server sending numeric ids by default in AUTH_SYS (commit
> > > e9541ce8efc22c233a045f091c2b969923709038), but the client can't handle
> > > them (lack of commit 5cf36cfdc8caa2724738ad0842c5c3dd02f309dc in client
> > > code).
> > >
> > > I hand patched server commit, but my client was an older one. That is
> > > how I got into my issue. Not sure, if you are running into a similar
> > > issue.
> >
> > Oh, could be--but then why would some of the id's still be mapped
> > correctly?
>
> Wild guess, those objects are created by client and didn't get their
> attributes updated yet from server???
>
> FYI, a co-worker here had RHEL6.3 server and RHEL6.2 client that
> exhibited this nobody issue with NFSv4.
>
> Regards, Malahal.

I think the original cause of my problem was that I had inconsistent settings
for 'Domain' in 'idmapd.conf'. That seems to have resulted in 'nobody'
entries being cached which I now cannot flush.

I'm running a 3.5 kernel on the client, so the issues you mentioned won't be
affecting me.

Thanks,
NeilBrown


Attachments:
signature.asc (828.00 B)

2012-10-03 22:47:00

by NeilBrown

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Wed, 3 Oct 2012 12:27:28 -0400 "J. Bruce Fields" <[email protected]>
wrote:

> On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
> > On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
> > > On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> > > > On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <[email protected]>
> > > > wrote:
> > > >
> > > > > I guess you're right. So it starts to sound more like: "you have a
> > > > > confusing setup. Your export configuration says one thing, and your
> > > > > filesystem permissions say another. Under NFSv3 the confusion didn't
> > > > > matter, but now it does--time to fix it."
> > > > >
> > > >
> > > > That's the best I could come to - I'm glad to have it confirmed. Thanks!
> > > >
> > > > It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> > > > is in use, and uses 'root' when auth_sys is used (which might be anon if
> > > > "root_squash" is active, but might not).
> > > > I wonder if it would work to use auth_none for the mount-time lookup, just
> > > > for consistency..
> > > >
> > > > Is the following appropriate? Is there somewhere better to put this caveat?
> > >
> > > Unfortunately, it's more complicated than this, as it depends on client
> > > implementation and configuration details.
> > >
> > > Something like this would be more accurate but possibly too long:
> > >
> > > Note that under NFSv2 and NFSv3, the mount path is traversed by
> > > mountd acting as root, but under NFSv4 the mount path is looked
> > > up using the client's credentials. This means that, for
> > > example, if a client mounts using a krb5 credential that the
> > > server maps to an "anonmyous" user, then the mount will only
> > > succeed if that directory and all its parents allow eXecute
> > > permissions.
> >
> > So you're listing this as a "feature" rather than a bug? There should be
> > no reason to constrain the pseudofs to use the permission checks from
> > the underlying filesystem.
>
> I'd be fine with that.
>
> (That still leaves some subtle v3/v4 difference in the case of mount
> paths underneath an export?
>
> What *is* the existing mountd behavior there, exactly? I'm inclined to
> think allowing mounts of arbitrary subdirectories is a bug, but maybe
> there's some historical reason for it or maybe someone already depends
> on it.)
>
> --b.

The behaviour is simply that you mount a filehandle (typically belonging to a
directory) and that filehandle can be anything inside any exported filesystem.
Yes, people do depend on being able to mount filehandles that aren't the root
of a filesystem.

The case that brought this issue to my attention involved the server having
a directory containing hundreds of home directories. This directory is
exported.

If they mount that top level directory they get horrible performance. If
they use an automounter to just mount the homes that are accessed it works
better. They weren't able to explain why but my guess is that some tools
(GUI filesystem browser) would occasionally do the equivalent of "ls -l" of
the top level directory which would hammer nfs-idmapd and probably ldap....
though you would think that would get cached and not be a problem for long.
So maybe it is more subtle than that.

I've built similar setups before. There is something attractive about
everyone's home directory being /home/$USERNAME even though they are on
different servers and different filesystems.

In the particular problem scenario, local policy requires that the 'staff'
directory on the server not be world-accessible, but they still want to
mount the individual home directories from there onto client machines as
required.
I cannot easily justify that policy, but the point is that it works with
NFSv3 and with AUTH_SYS/no_root_squash, but not with NFSv4/krb5. I don't
think we can fix this inconsistency but maybe we can explain it.
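
For anyone wanting to reproduce the comparison, the three cases above can be listed side by side. This is only a sketch: the function name and the /mnt/test mount point are made up for illustration, "server:/foo/bar/baz" is the path from the original report, and the commands are printed rather than run so that each can be tried by hand as root.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the mount invocations compared
# in this thread.  Per the report, the first works, the second works only
# when the export has no_root_squash, and the third fails while /foo/bar
# is not world-accessible.
print_mount_matrix() {
    export_path=${1:-server:/foo/bar/baz}   # path from the original report
    mountpoint=${2:-/mnt/test}              # placeholder mount point
    for opts in vers=3 vers=4,sec=sys vers=4,sec=krb5; do
        echo "mount -t nfs -o $opts $export_path $mountpoint"
    done
}

print_mount_matrix
```

Printing first makes it easy to paste one line at a time and note which step fails.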

I think your text is more accurate than mine, but also a little more vague so
the important point may not be immediately obvious. That might be a price we have
to pay for accuracy.

Thanks,
NeilBrown


Attachments:
signature.asc (828.00 B)

2012-10-08 06:03:00

by NeilBrown

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Thu, 4 Oct 2012 12:07:39 -0400 "J. Bruce Fields" <[email protected]>
wrote:

> On Thu, Oct 04, 2012 at 08:46:59AM +1000, NeilBrown wrote:
> > On Wed, 3 Oct 2012 12:27:28 -0400 "J. Bruce Fields" <[email protected]>
> > wrote:
> >
> > > On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
> > > > On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
> > > > > On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> > > > > > On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > I guess you're right. So it starts to sound more like: "you have a
> > > > > > > confusing setup. Your export configuration says one thing, and your
> > > > > > > filesystem permissions say another. Under NFSv3 the confusion didn't
> > > > > > > matter, but now it does--time to fix it."
> > > > > > >
> > > > > >
> > > > > > That's the best I could come to - I'm glad to have it confirmed. Thanks!
> > > > > >
> > > > > > It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> > > > > > is in use, and uses 'root' when auth_sys is used (which might be anon if
> > > > > > "root_squash" is active, but might not).
> > > > > > I wonder if it would work to use auth_none for the mount-time lookup, just
> > > > > > for consistency..
> > > > > >
> > > > > > Is the following appropriate? Is there somewhere better to put this caveat?
> > > > >
> > > > > Unfortunately, it's more complicated than this, as it depends on client
> > > > > implementation and configuration details.
> > > > >
> > > > > Something like this would be more accurate but possibly too long:
> > > > >
> > > > > Note that under NFSv2 and NFSv3, the mount path is traversed by
> > > > > mountd acting as root, but under NFSv4 the mount path is looked
> > > > > up using the client's credentials. This means that, for
> > > > > example, if a client mounts using a krb5 credential that the
> > > > > server maps to an "anonmyous" user, then the mount will only
> > > > > succeed if that directory and all its parents allow eXecute
> > > > > permissions.
> > > >
> > > > So you're listing this as a "feature" rather than a bug? There should be
> > > > no reason to constrain the pseudofs to use the permission checks from
> > > > the underlying filesystem.
> > >
> > > I'd be fine with that.
> > >
> > > (That still leaves some subtle v3/v4 difference in the case of mount
> > > paths underneath an export?
> > >
> > > What *is* the existing mountd behavior there, exactly? I'm inclined to
> > > think allowing mounts of arbitrary subdirectories is a bug, but maybe
> > > there's some historical reason for it or maybe someone already depends
> > > on it.)
> > >
> > > --b.
> >
> > The behaviour is simple that you mount a filehandle (typically belonging to a
> > directory) and that filehandle can be anything inside any exported filesystem.
>
> It's not the nfsd behavior that bothers me--there's nothing we can do
> about the fact that access by filehandle can bypass directory
> permissions.
>
> What bothers is that mountd will apparently allow anyone to do a lookup
> anywhere in an exported filesystem.

Not anyone - it requires a privileged source port from a known host.
So it is only "anyone who can get 'root'".

>
> I don't know--maybe I shouldn't be so concerned about the possibility a
> rogue user could figure out that my "Music" directory includes an
> unreasonable number of Miles Davis titles.
>
> > Yes, please do depend on being able to mount filehandles that aren't to root
> > of a filesystem.
> >
> > The case the brought this issue to my attention involved the server having
> > a directory containing hundreds of home directories. This directory is
> > exported.
> >
> > If they mount that top level directory they get horrible performance. If
> > they use an automounter to just mount the homes that are accessed it works
> > better. They weren't able to explain why but my guess is that some tools
> > (GUI filesystem browser) would occasionally do the equivalent of "ls -l" of
> > the top level directory which would hammer nfs-idmapd and probably ldap....
> > though you would think that would get cached and not be a problem for long.
> > So maybe it is more subtle than that.
>
> Getting all the id->name mappings for a 100-entry directory is going to
> require a 100 serialized upcalls to idmapd (and then possibly ldap), and
> by default it looks like the idmapd cache will go cold after 10
> minutes.... Not hard to imagine that could be a problem.
>
> Running multiple idmapd process would be easy and might help? Though
> not if the client's just giving us the getattrs one at a time.
>
> Or maybe the problem's somewhere else entirely, but that's a real bug if
> we aren't giving good performance on /home.

I did some experimenting..
On both 'client' and 'server':
for i in `seq 2000 3000`; do echo u$i:x:$i:1000::/nohome:/bin/false; done >> /etc/passwd

On server in suitable directory

for i in `seq 2000 3000`; do mkdir $i ; chown u$i $i ; done

Mount that directory onto the client with NFSv3 and "time ls -l" takes a
little under 4 seconds.
Mount with NFSv4 and it takes about the same. However:

.....
drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2974
drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2975
drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2976
drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2977
drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2978
drwxr-xr-x 2 u2979 root 4096 Oct 8 16:19 2979
drwxr-xr-x 2 u2980 root 4096 Oct 8 16:19 2980
drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2981
drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2982
drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2983
drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2984
drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2985
drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2986
....


tcpdump shows the server is returning the right stuff, but something is going
wrong on the client. I've tried unmounting/remounting and killing/restarting
rpc.idmapd.
I had some config problems previously .. is there any chance that these
unknown entries are in a cache? Any easy way to view or flush the cache?
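
To put a number on how many entries are affected, the owner column of the listing can be counted. A minimal sketch, assuming the listing format shown above (owner in the third column) and that unmapped owners appear as 4294967294; the function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count lines of "ls -l" output (read from stdin)
# whose owner column is the unmapped "nobody" id.  The id defaults to
# 4294967294, i.e. -2 stored as an unsigned 32-bit value.
count_unmapped_owners() {
    awk -v nobody="${1:-4294967294}" '$3 == nobody { n++ } END { print n + 0 }'
}

# Usage (the mount point is a placeholder):
#   ls -l /mnt/test | count_unmapped_owners
```

Running it before and after a remount or a cache flush shows whether the stale entries are actually going away.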

Of course this is with text-file password lookup. LDAP might be slower but
I'd be surprised if it was much slower.

NeilBrown



>
> --b.
>
> > I've built similar setups before. There is something attractive about
> > everyone's home directory being /home/$USERNAME even though they are on
> > different servers and different filesystems.
> >
> > In the particular problem scenario, local policy requires that the 'staff'
> > directory on the server to not be world-accessible, but they still want to
> > mount the individual home directories from there onto client machines as
> > required.
> > I cannot easily justify that policy, but the point is that it works with
> > NFSv3 and with AUTH_SYS/no_root_squash, but not with NFSv4/kerb5. I don't
> > think we can fix this inconsistency but maybe we can explain it.
> >
> > I think your text is more accurate than mine, but also a little more vague so
> > the important may not be immediately obvious. That might be a price we have
> > to pay for accuracy.


Attachments:
signature.asc (828.00 B)

2012-10-09 00:30:48

by NeilBrown

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Mon, 08 Oct 2012 07:42:34 -0400 Steve Dickson <[email protected]> wrote:

>
>
> On 08/10/12 02:03, NeilBrown wrote:
> > On Thu, 4 Oct 2012 12:07:39 -0400 "J. Bruce Fields" <[email protected]>
> > wrote:
> >
> >> On Thu, Oct 04, 2012 at 08:46:59AM +1000, NeilBrown wrote:
> >>> On Wed, 3 Oct 2012 12:27:28 -0400 "J. Bruce Fields" <[email protected]>
> >>> wrote:
> >>>
> >>>> On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
> >>>>> On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
> >>>>>> On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> >>>>>>> On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <[email protected]>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> I guess you're right. So it starts to sound more like: "you have a
> >>>>>>>> confusing setup. Your export configuration says one thing, and your
> >>>>>>>> filesystem permissions say another. Under NFSv3 the confusion didn't
> >>>>>>>> matter, but now it does--time to fix it."
> >>>>>>>>
> >>>>>>>
> >>>>>>> That's the best I could come to - I'm glad to have it confirmed. Thanks!
> >>>>>>>
> >>>>>>> It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> >>>>>>> is in use, and uses 'root' when auth_sys is used (which might be anon if
> >>>>>>> "root_squash" is active, but might not).
> >>>>>>> I wonder if it would work to use auth_none for the mount-time lookup, just
> >>>>>>> for consistency..
> >>>>>>>
> >>>>>>> Is the following appropriate? Is there somewhere better to put this caveat?
> >>>>>>
> >>>>>> Unfortunately, it's more complicated than this, as it depends on client
> >>>>>> implementation and configuration details.
> >>>>>>
> >>>>>> Something like this would be more accurate but possibly too long:
> >>>>>>
> >>>>>> Note that under NFSv2 and NFSv3, the mount path is traversed by
> >>>>>> mountd acting as root, but under NFSv4 the mount path is looked
> >>>>>> up using the client's credentials. This means that, for
> >>>>>> example, if a client mounts using a krb5 credential that the
> >>>>>> server maps to an "anonmyous" user, then the mount will only
> >>>>>> succeed if that directory and all its parents allow eXecute
> >>>>>> permissions.
> >>>>>
> >>>>> So you're listing this as a "feature" rather than a bug? There should be
> >>>>> no reason to constrain the pseudofs to use the permission checks from
> >>>>> the underlying filesystem.
> >>>>
> >>>> I'd be fine with that.
> >>>>
> >>>> (That still leaves some subtle v3/v4 difference in the case of mount
> >>>> paths underneath an export?
> >>>>
> >>>> What *is* the existing mountd behavior there, exactly? I'm inclined to
> >>>> think allowing mounts of arbitrary subdirectories is a bug, but maybe
> >>>> there's some historical reason for it or maybe someone already depends
> >>>> on it.)
> >>>>
> >>>> --b.
> >>>
> >>> The behaviour is simple that you mount a filehandle (typically belonging to a
> >>> directory) and that filehandle can be anything inside any exported filesystem.
> >>
> >> It's not the nfsd behavior that bothers me--there's nothing we can do
> >> about the fact that access by filehandle can bypass directory
> >> permissions.
> >>
> >> What bothers is that mountd will apparently allow anyone to do a lookup
> >> anywhere in an exported filesystem.
> >
> > Not anyone - it requires a privileged source port from a known host.
> > So it is only "anyone who can get 'root'".
> >
> >>
> >> I don't know--maybe I shouldn't be so concerned about the possibility a
> >> rogue user could figure out that my "Music" directory includes an
> >> unreasonable number of Miles Davis titles.
> >>
> >>> Yes, please do depend on being able to mount filehandles that aren't to root
> >>> of a filesystem.
> >>>
> >>> The case the brought this issue to my attention involved the server having
> >>> a directory containing hundreds of home directories. This directory is
> >>> exported.
> >>>
> >>> If they mount that top level directory they get horrible performance. If
> >>> they use an automounter to just mount the homes that are accessed it works
> >>> better. They weren't able to explain why but my guess is that some tools
> >>> (GUI filesystem browser) would occasionally do the equivalent of "ls -l" of
> >>> the top level directory which would hammer nfs-idmapd and probably ldap....
> >>> though you would think that would get cached and not be a problem for long.
> >>> So maybe it is more subtle than that.
> >>
> >> Getting all the id->name mappings for a 100-entry directory is going to
> >> require a 100 serialized upcalls to idmapd (and then possibly ldap), and
> >> by default it looks like the idmapd cache will go cold after 10
> >> minutes.... Not hard to imagine that could be a problem.
> >>
> >> Running multiple idmapd process would be easy and might help? Though
> >> not if the client's just giving us the getattrs one at a time.
> >>
> >> Or maybe the problem's somewhere else entirely, but that's a real bug if
> >> we aren't giving good performance on /home.
> >
> > I did some experimenting..
> > On both 'client' and 'server':
> > for i in `seq 2000 3000`; do echo u$i:x:$i:1000::/nohome:/bin/false; done
> >>> /etc/passwd
> >
> > On server in suitable directory
> >
> > for i in `seq 2000 3000`; do mkdir $i ; chown u$i $i ; done
> >
> > Mount that directory onto the client with NFSv3 and "time ls -l" takes a
> > little under 4 seconds.
> > Mount with NFSv4 and it takes about the same. However:
> >
> > .....
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2974
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2975
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2976
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2977
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2978
> > drwxr-xr-x 2 u2979 root 4096 Oct 8 16:19 2979
> > drwxr-xr-x 2 u2980 root 4096 Oct 8 16:19 2980
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2981
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2982
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2983
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2984
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2985
> > drwxr-xr-x 2 4294967294 root 4096 Oct 8 16:19 2986
> > ....
> >
> >
> > tcpdump shows the server is returning the write stuff, but something if going
> > wrong on the client. I've tried unmounting/remounting and killing/restarting
> > rpc.idmapd.
> > I had some config problems previously .. is there any chance that these
> > unknown entries are in a cache? Any easy way to view or flush the cache?
> Assuming you are using the keyring based idmapper, "nfsidmap -cv" will
> clear the keyring of user and group ids. See nfsidmap(5).

Thanks... though I'm running some ancient system which only has nfs-utils
1.2.5 and so "nfsidmap -cv" returns silently, but does nothing.
That's OK, I have source -- build, copy, test...

# /tmp/nfsidmap -cv
nfsidmap: fopen(/proc/keys) failed: No such file or directory


Hmm, not what I was expecting ... grep grep ahhh:

config KEYS_DEBUG_PROC_KEYS
bool "Enable the /proc/keys file by which keys may be viewed"

# zcat /proc/config.gz | grep KEYS_DEBUG_PROC
# CONFIG_KEYS_DEBUG_PROC_KEYS is not set


That explains it then - we need a debug option set or we cannot flush the
idmap cache. I guess flushing a cache is a debugging operation, but it's a
bit surprising. And in my case: annoying.

Would you expect distros to enable CONFIG_KEYS_DEBUG_PROC_KEYS? If so I'll
get it enabled for SUSE (it is enabled in the 'debug' kernel, but not
'desktop' or 'default'). If not, the man page should maybe say that -c and
-r require a kernel with debugging enabled.
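
Until that is settled, a script that wants to flush the keyring could test for the file first instead of relying on nfsidmap's error. A minimal sketch, assuming (as the fopen error above suggests) that this nfsidmap build reads /proc/keys; the function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report whether the key listing that "nfsidmap -c"
# tried to open above is readable.  The path can be overridden for testing.
idmap_flush_possible() {
    keys_file=${1:-/proc/keys}
    if [ -r "$keys_file" ]; then
        echo "ok: $keys_file is readable"
    else
        echo "missing: $keys_file (kernel may lack CONFIG_KEYS_DEBUG_PROC_KEYS)"
        return 1
    fi
}

# Usage: only attempt the flush when the listing is available.
#   idmap_flush_possible && nfsidmap -cv
```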

But I set up another machine as the client and configured it properly before
testing, and everything works fine and reasonably fast. So my guess that
id lookup for thousands of different ids caused slowness was probably wrong.


Thanks,
NeilBrown


Attachments:
signature.asc (828.00 B)

2012-10-04 16:07:47

by J. Bruce Fields

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Thu, Oct 04, 2012 at 08:46:59AM +1000, NeilBrown wrote:
> On Wed, 3 Oct 2012 12:27:28 -0400 "J. Bruce Fields" <[email protected]>
> wrote:
>
> > On Wed, Oct 03, 2012 at 03:48:43PM +0000, Myklebust, Trond wrote:
> > > On Wed, 2012-10-03 at 11:13 -0400, J. Bruce Fields wrote:
> > > > On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> > > > > On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > I guess you're right. So it starts to sound more like: "you have a
> > > > > > confusing setup. Your export configuration says one thing, and your
> > > > > > filesystem permissions say another. Under NFSv3 the confusion didn't
> > > > > > matter, but now it does--time to fix it."
> > > > > >
> > > > >
> > > > > That's the best I could come to - I'm glad to have it confirmed. Thanks!
> > > > >
> > > > > It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> > > > > is in use, and uses 'root' when auth_sys is used (which might be anon if
> > > > > "root_squash" is active, but might not).
> > > > > I wonder if it would work to use auth_none for the mount-time lookup, just
> > > > > for consistency..
> > > > >
> > > > > Is the following appropriate? Is there somewhere better to put this caveat?
> > > >
> > > > Unfortunately, it's more complicated than this, as it depends on client
> > > > implementation and configuration details.
> > > >
> > > > Something like this would be more accurate but possibly too long:
> > > >
> > > > Note that under NFSv2 and NFSv3, the mount path is traversed by
> > > > mountd acting as root, but under NFSv4 the mount path is looked
> > > > up using the client's credentials. This means that, for
> > > > example, if a client mounts using a krb5 credential that the
> > > > server maps to an "anonmyous" user, then the mount will only
> > > > succeed if that directory and all its parents allow eXecute
> > > > permissions.
> > >
> > > So you're listing this as a "feature" rather than a bug? There should be
> > > no reason to constrain the pseudofs to use the permission checks from
> > > the underlying filesystem.
> >
> > I'd be fine with that.
> >
> > (That still leaves some subtle v3/v4 difference in the case of mount
> > paths underneath an export?
> >
> > What *is* the existing mountd behavior there, exactly? I'm inclined to
> > think allowing mounts of arbitrary subdirectories is a bug, but maybe
> > there's some historical reason for it or maybe someone already depends
> > on it.)
> >
> > --b.
>
> The behaviour is simple that you mount a filehandle (typically belonging to a
> directory) and that filehandle can be anything inside any exported filesystem.

It's not the nfsd behavior that bothers me--there's nothing we can do
about the fact that access by filehandle can bypass directory
permissions.

What bothers me is that mountd will apparently allow anyone to do a lookup
anywhere in an exported filesystem.

I don't know--maybe I shouldn't be so concerned about the possibility a
rogue user could figure out that my "Music" directory includes an
unreasonable number of Miles Davis titles.

> Yes, please do depend on being able to mount filehandles that aren't to root
> of a filesystem.
>
> The case the brought this issue to my attention involved the server having
> a directory containing hundreds of home directories. This directory is
> exported.
>
> If they mount that top level directory they get horrible performance. If
> they use an automounter to just mount the homes that are accessed it works
> better. They weren't able to explain why but my guess is that some tools
> (GUI filesystem browser) would occasionally do the equivalent of "ls -l" of
> the top level directory which would hammer nfs-idmapd and probably ldap....
> though you would think that would get cached and not be a problem for long.
> So maybe it is more subtle than that.

Getting all the id->name mappings for a 100-entry directory is going to
require 100 serialized upcalls to idmapd (and then possibly ldap), and
by default it looks like the idmapd cache will go cold after 10
minutes.... Not hard to imagine that could be a problem.

Running multiple idmapd processes would be easy and might help? Though
not if the client's just giving us the getattrs one at a time.

Or maybe the problem's somewhere else entirely, but that's a real bug if
we aren't giving good performance on /home.

--b.

> I've built similar setups before. There is something attractive about
> everyone's home directory being /home/$USERNAME even though they are on
> different servers and different filesystems.
>
> In the particular problem scenario, local policy requires that the 'staff'
> directory on the server to not be world-accessible, but they still want to
> mount the individual home directories from there onto client machines as
> required.
> I cannot easily justify that policy, but the point is that it works with
> NFSv3 and with AUTH_SYS/no_root_squash, but not with NFSv4/kerb5. I don't
> think we can fix this inconsistency but maybe we can explain it.
>
> I think your text is more accurate than mine, but also a little more vague so
> the important may not be immediately obvious. That might be a price we have
> to pay for accuracy.

2012-10-08 12:20:53

by J. Bruce Fields

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Mon, Oct 08, 2012 at 07:42:34AM -0400, Steve Dickson wrote:
> Assuming you are using the keyring based idmapper, "nfsidmap -cv" will
> clear the keyring of user and group ids. See nfsidmap(5).

Oh, good, I'd missed that....

> If you using rpc.idmapd, I believe
> echo `date +'%s'` > /proc/net/rpc/nfs4.idtoname/flush
> will do the trick.... The CITI faq
> http://www.citi.umich.edu/projects/nfsv4/linux/faq/
> has a section on work with this cache...

No, that's just the server-side cache, but if as Neil says the
on-the-wire replies look correct, then the problem is client-side.

--b.

2012-10-03 15:13:52

by J. Bruce Fields

Subject: Re: Inconsistency when mounting a directory that 'world' cannot access.

On Wed, Oct 03, 2012 at 01:46:29PM +1000, NeilBrown wrote:
> On Tue, 2 Oct 2012 10:33:34 -0400 "J. Bruce Fields" <[email protected]>
> wrote:
>
> > I guess you're right. So it starts to sound more like: "you have a
> > confusing setup. Your export configuration says one thing, and your
> > filesystem permissions say another. Under NFSv3 the confusion didn't
> > matter, but now it does--time to fix it."
> >
>
> That's the best I could come to - I'm glad to have it confirmed. Thanks!
>
> It is unfortunate that Linux NFS uses an anon credential to mount when krb5
> is in use, and uses 'root' when auth_sys is used (which might be anon if
> "root_squash" is active, but might not).
> I wonder if it would work to use auth_none for the mount-time lookup, just
> for consistency..
>
> Is the following appropriate? Is there somewhere better to put this caveat?

Unfortunately, it's more complicated than this, as it depends on client
implementation and configuration details.

Something like this would be more accurate but possibly too long:

Note that under NFSv2 and NFSv3, the mount path is traversed by
mountd acting as root, but under NFSv4 the mount path is looked
up using the client's credentials. This means that, for
example, if a client mounts using a krb5 credential that the
server maps to an "anonymous" user, then the mount will only
succeed if that directory and all its parents allow eXecute
permissions.
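
An admin could check an export path for this on the server with something like the following. It is only a sketch: it assumes GNU stat (the -c option), it looks at mode bits only and ignores ACLs, and it assumes the anonymous user is neither the owner nor in the group of any directory on the path. The function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: walk from the given absolute path up to "/" and
# print every directory that does NOT grant execute (search) permission
# to "other".  Prints nothing when the whole path is world-searchable.
blocked_for_world() {
    dir=$1
    while :; do
        mode=$(stat -c '%a' "$dir") || return 2
        other=${mode#"${mode%?}"}        # last octal digit = "other" bits
        case $other in
            1|3|5|7) ;;                  # odd digit: x bit is set
            *) printf '%s (mode %s)\n' "$dir" "$mode" ;;
        esac
        case $dir in
            /|.) break ;;                # reached the top (or a relative path)
        esac
        dir=$(dirname "$dir")
    done
}

# Usage, with the path from the original report:
#   blocked_for_world /foo/bar/baz
```

With the permissions described in the original report, this would list /foo/bar with mode 750.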

--b.

>
> Thanks,
> NeilBrown
>
>
> diff --git a/utils/exportfs/exports.man b/utils/exportfs/exports.man
> index bc1de73..91e4b9c 100644
> --- a/utils/exportfs/exports.man
> +++ b/utils/exportfs/exports.man
> @@ -126,6 +126,10 @@ will be enforced only for access using flavors listed in the immediately
> preceding sec= option. The only options that are permitted to vary in
> this way are ro, rw, no_root_squash, root_squash, and all_squash.
> .PP
> +When RPCSEC_GSS is used with NFSv4, a client will only be able to mount a
> +directory if that directory and all its ancestors give eXecute access
> +to "world".
> +.PP
> .SS General Options
> .BR exportfs
> understands the following export options: