LinuxLists.cc - v4.0 CB_COMPOUND authentication failures

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 2014-04-08 at 14:11 -0400, Dr Fields James Bruce wrote:
> On Tue, Apr 08, 2014 at 02:08:14PM -0400, Simo Sorce wrote:
> > On Tue, 2014-04-08 at 14:04 -0400, Jeff Layton wrote:
> > > On Tue, 08 Apr 2014 14:01:15 -0400
> > > Simo Sorce <[email protected]> wrote:
> > >
> > > > On Tue, 2014-04-08 at 13:30 -0400, Jeff Layton wrote:
> > > > > On Tue, 08 Apr 2014 13:27:01 -0400
> > > > > Simo Sorce <[email protected]> wrote:
> > > > >
> > > > > > On Tue, 2014-04-08 at 12:44 -0400, Jeff Layton wrote:
> > > > > > >
> > > > > > > I think that's what happens. We only fall back to using AUTH_SYS if
> > > > > > > nfs_create_rpc_client returns -EINVAL. In the event that the security
> > > > > > > negotiation fails, we should get back -EACCES and that should bubble
> > > > > > > back up to userland.
> > > > > > >
> > > > > > > The real problem is that gssd (and also the krb5 libs themselves) will
> > > > > > > try to canonicalize the name. The resulting host portion of the SPN
> > > > > > > may bear no resemblance to the hostname in the device string. In fact,
> > > > > > > if you mount using an IP address then you're pretty much SOL.
> > > > > >
> > > > > > If you mount by IP do you really care about krb5 ? Probably not, maybe
> > > > > > that's a clue we should not even try ...
> > > > > >
> > > > >
> > > > > It's certainly possible that someone passes in an IP address but then
> > > > > says "-o sec=krb5". It has worked in the past, so it's hard to know
> > > > > whether and how many people actually depend on it.
> > > > >
> > > > > > > I haven't tried it yet, but it looks reasonably trivial to fix gssd
> > > > > > > not to bother with DNS at all and just rely on the hostname. That
> > > > > > > won't stop the krb5 libs from doing their canonicalization though. I'm
> > > > > > > not sure if there's some way to ask the krb5 libs to avoid doing that.
> > > > > >
> > > > > > [libdefaults]
> > > > > > rdns = false
> > > > > >
> > > > > > And I think we change the default to false in Fedora/RHEL lately ...
> > > > > >
> > > > > > Simo.
> > > > > >
> > > > >
> > > > > That's a step in the right direction, but I think that the rdns just
> > > > > makes it skip the reverse lookup. AFAIK, the MIT libs will still do
> > > > > getaddrinfo and scrape out the ai_canonname and use that in preference
> > > > > to the hostname you pass in.
> > > >
> > > > That should happen only if you are using a CNAME, not for an A name.
> > > >
> > > > We can open bugs if this is not the case though.
> > > >
> > >
> > > That's still a problem for us then. The current code tries to compare
> > > the host portion of the device string to the SPN that we get in the
> > > callback request. If they don't match, it fails.
> > >
> > > I think what we need to do is fix this the right way -- make rpc.gssd
> > > pass down the acceptor name with the downcall.
> >
> > Why do you need the comparison at all, pardon my ignorance, I do not
> > know very well what its purpose is.
>
> The NFS client wants to verify that a callback came from the server, so
> it needs to know who it originally authenticated to.

As Jeff said, the only good way at this point would be to have rpc.gssd
pass down the acceptor name after it is done with the gssapi calls.

Note that this may still fail, especially in clustered environments where
servers have multiple credentials they can answer with (due to
responding with multiple names), unless the server is careful to always
use the principal the client got tickets for when it calls back.

Although the best solution is to quickly deprecate 4.0 callbacks and try
as hard as possible to move on. 4.0 callbacks are just broken.

> (Though honestly it's unlikely you can do much damage by spoofing
> callbacks.)

Better not to find out :)

Simo.

--
Simo Sorce * Red Hat, Inc * New York

2014-04-08 18:04:53

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 08 Apr 2014 14:01:15 -0400
Simo Sorce <[email protected]> wrote:

> On Tue, 2014-04-08 at 13:30 -0400, Jeff Layton wrote:
> > On Tue, 08 Apr 2014 13:27:01 -0400
> > Simo Sorce <[email protected]> wrote:
> >
> > > On Tue, 2014-04-08 at 12:44 -0400, Jeff Layton wrote:
> > > >
> > > > I think that's what happens. We only fall back to using AUTH_SYS if
> > > > nfs_create_rpc_client returns -EINVAL. In the event that the security
> > > > negotiation fails, we should get back -EACCES and that should bubble
> > > > back up to userland.
> > > >
> > > > The real problem is that gssd (and also the krb5 libs themselves) will
> > > > try to canonicalize the name. The resulting host portion of the SPN
> > > > may bear no resemblance to the hostname in the device string. In fact,
> > > > if you mount using an IP address then you're pretty much SOL.
> > >
> > > If you mount by IP do you really care about krb5 ? Probably not, maybe
> > > that's a clue we should not even try ...
> > >
> >
> > It's certainly possible that someone passes in an IP address but then
> > says "-o sec=krb5". It has worked in the past, so it's hard to know
> > whether and how many people actually depend on it.
> >
> > > > I haven't tried it yet, but it looks reasonably trivial to fix gssd
> > > > not to bother with DNS at all and just rely on the hostname. That
> > > > won't stop the krb5 libs from doing their canonicalization though. I'm
> > > > not sure if there's some way to ask the krb5 libs to avoid doing that.
> > >
> > > [libdefaults]
> > > rdns = false
> > >
> > > And I think we change the default to false in Fedora/RHEL lately ...
> > >
> > > Simo.
> > >
> >
> > That's a step in the right direction, but I think that the rdns just
> > makes it skip the reverse lookup. AFAIK, the MIT libs will still do
> > getaddrinfo and scrape out the ai_canonname and use that in preference
> > to the hostname you pass in.
>
> That should happen only if you are using a CNAME, not for an A name.
>
> We can open bugs if this is not the case though.
>

That's still a problem for us then. The current code tries to compare
the host portion of the device string to the SPN that we get in the
callback request. If they don't match, it fails.

I think what we need to do is fix this the right way -- make rpc.gssd
pass down the acceptor name with the downcall.
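
To make the failure concrete, here is a minimal userspace sketch of the
comparison described above, assuming the callback principal arrives as an
"nfs@host" string. It is not the kernel code: principal_matches_hostname()
only mirrors the quoted logic, and principal_matches_acceptor() is a
hypothetical helper showing what the check could look like if rpc.gssd
passed the acceptor name down with the downcall.

```c
#include <assert.h>
#include <string.h>

/*
 * Userspace re-creation of the strict check quoted in this thread:
 * the callback principal must be "nfs@" followed by exactly the
 * hostname taken from the mount device string.
 */
int principal_matches_hostname(const char *principal, const char *cl_hostname)
{
    assert(principal != NULL && cl_hostname != NULL);

    if (strncmp(principal, "nfs@", 4) != 0)
        return 0;
    return strcmp(principal + 4, cl_hostname) == 0;
}

/*
 * Hypothetical alternative: compare against the acceptor name that
 * rpc.gssd would hand down after context establishment. The "acceptor"
 * argument stands in for a value the downcall does not carry today.
 */
int principal_matches_acceptor(const char *principal, const char *acceptor)
{
    assert(principal != NULL && acceptor != NULL);

    return strcmp(principal, acceptor) == 0;
}
```

With a mount of "foo:/" and a callback principal of "nfs@foo.bar.baz", the
first check returns 0, while the second returns 1 if the stored acceptor
name is "nfs@foo.bar.baz".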

--
Jeff Layton <[email protected]>

2014-04-08 23:31:51

by Frank Filz

Subject: RE: v4.0 CB_COMPOUND authentication failures

> On Tue, 2014-04-08 at 15:44 -0700, Frank Filz wrote:
> > > On Tue, 2014-04-08 at 10:39 -0700, Frank Filz wrote:
> > > > > > If you mount by IP do you really care about krb5 ? Probably
> > > > > > not, maybe that's a clue we should not even try ...
> > > > > >
> > > > >
> > > > > It's certainly possible that someone passes in an IP address but
> > > > > then says "-o sec=krb5". It has worked in the past, so it's hard
> > > > > to know whether and how many people actually depend on it.
> > > >
> > > > Mount by ip is sometimes used with clustered servers, especially
> > > > when they have all their IP addresses in the DNS record. Even
> > > > using a FQDN that just specifies that one IP address probably
> > > > won't work then (since it probably is NOT the hostname used in the
> > > > server credential).
> > >
> > > I do not understand this, using an IP address or a name that resolves
> > > to said IP address is the same.
> >
> > But a name can resolve to a set of IP addresses, often in round-robin
> > fashion. It would cause havoc with NFS server on a cluster if a v3
> > client had locks on 192.168.0.10 (resolved from server.mycompany.com),
> > and then rebooted, and resolved to 192.168.0.9 and sent SM_NOTIFY
> > there). At least that isn't an issue with v4...
>
> Sounds like you configured your DNS wrong; DNS round robin is not the
> right way to do load balancing for NFSv3.

I should add that this is not something I ever set up, just something others have set up and then asked me as a server developer to deal with. I agree that it's not the greatest setup. It may be coming from an HTTP server mindset (where that setup maybe works reasonably).

> Also if you are specifying an IP address you can as well specify an explicit
> name instead, you can achieve that by using a RR CNAME and then make
> sure the client sticks to the resolved A name for the life of the locks.
>
> For a reboot situation you are screwed anyway unless you configure an
> explicit address, in which case you do not have redundancy anyway.
> You just use a DNS name that is resolved into a unique IP address.

Yep, but getting redundancy for v3 is not trivial. It is partially solved by moving the IP address to a surviving cluster node, which has its own pain points.

> > Or worse, tcp connection is dropped due to inactivity, and new
> > connection is made to a different server node. But this could still be
> > an issue with v4...
>
> Same as above.
>
> > The workaround has been to specify specific IP address.
>
> The workaround to what ? Why are you using RR names if clients are going to
> stick to a specific server anyway ? You are working around a problem you
> created yourself, fix the problem! (You can fix it also by setting an entry in
> /etc/hosts, it is semantically identical to specifying an IP address on the
> mount anyway).
>
> > Now I haven't done enough with krb5 recently to know how well that
> > works...
>
> > I guess I'm just offering one reason IP addresses might have been
> > specified on mount...
>
> A bogus reason :-)

But one that folks have done, and can't entirely be discounted, at least to the extent that it currently works...

Frank

2014-04-08 16:40:31

Subject: Re: v4.0 CB_COMPOUND authentication failures

>
> > But, I don't know, I'm frankly confused about our security design for
> > the NFSv4 state.
> >
> > When we insist on krb5 (and checked the server name correctly), and
> > failed without it, then I feel like I understand what we're doing. Once
> > we start trying it and then falling back (as I understand happens for
> > the krb5 state in the auth_sys case) I get confused.
>
> Now you have me confused. I’m aware that we call nfs_create_rpc_client() with a krb5i argument and then fall back to auth_sys if the RPC layer says that we don’t have a running gss daemon or that we can’t load the rpcsec_gss_krb5 module. I’m not aware of us falling back if rpc.gssd is running and tells us that security negotiation failed; we should be returning a mount error in that case.

Oh, good, that sounds fine--I'd forgotten it worked that way.

So the problem occurs just because gssd and/or the kerberos libraries
are allowing us to establish state using a different name and then we're
not accepting it on the return.

So:

On Tue, Apr 08, 2014 at 12:22:51PM -0400, Trond Myklebust wrote:
> How is it not better just to rip out that hostname comparison in the
> back channel?

Rip it out entirely?

At that point anyone who can get a credential in the right realm can
send a recall. The RFC made this a requirement to prevent that.

But we've already decided we don't care about that in the 4.1 case, so,
hey, maybe. I guess it wouldn't bother me.

--b.

2014-04-08 18:08:23

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 2014-04-08 at 14:04 -0400, Jeff Layton wrote:
> On Tue, 08 Apr 2014 14:01:15 -0400
> Simo Sorce <[email protected]> wrote:
>
> > On Tue, 2014-04-08 at 13:30 -0400, Jeff Layton wrote:
> > > On Tue, 08 Apr 2014 13:27:01 -0400
> > > Simo Sorce <[email protected]> wrote:
> > >
> > > > On Tue, 2014-04-08 at 12:44 -0400, Jeff Layton wrote:
> > > > >
> > > > > I think that's what happens. We only fall back to using AUTH_SYS if
> > > > > nfs_create_rpc_client returns -EINVAL. In the event that the security
> > > > > negotiation fails, we should get back -EACCES and that should bubble
> > > > > back up to userland.
> > > > >
> > > > > The real problem is that gssd (and also the krb5 libs themselves) will
> > > > > try to canonicalize the name. The resulting host portion of the SPN
> > > > > may bear no resemblance to the hostname in the device string. In fact,
> > > > > if you mount using an IP address then you're pretty much SOL.
> > > >
> > > > If you mount by IP do you really care about krb5 ? Probably not, maybe
> > > > that's a clue we should not even try ...
> > > >
> > >
> > > It's certainly possible that someone passes in an IP address but then
> > > says "-o sec=krb5". It has worked in the past, so it's hard to know
> > > whether and how many people actually depend on it.
> > >
> > > > > I haven't tried it yet, but it looks reasonably trivial to fix gssd
> > > > > not to bother with DNS at all and just rely on the hostname. That
> > > > > won't stop the krb5 libs from doing their canonicalization though. I'm
> > > > > not sure if there's some way to ask the krb5 libs to avoid doing that.
> > > >
> > > > [libdefaults]
> > > > rdns = false
> > > >
> > > > And I think we change the default to false in Fedora/RHEL lately ...
> > > >
> > > > Simo.
> > > >
> > >
> > > That's a step in the right direction, but I think that the rdns just
> > > makes it skip the reverse lookup. AFAIK, the MIT libs will still do
> > > getaddrinfo and scrape out the ai_canonname and use that in preference
> > > to the hostname you pass in.
> >
> > That should happen only if you are using a CNAME, not for an A name.
> >
> > We can open bugs if this is not the case though.
> >
>
> That's still a problem for us then. The current code tries to compare
> the host portion of the device string to the SPN that we get in the
> callback request. If they don't match, it fails.
>
> I think what we need to do is fix this the right way -- make rpc.gssd
> pass down the acceptor name with the downcall.

Why do you need the comparison at all? Pardon my ignorance, I do not
know very well what its purpose is.

Simo.

--
Simo Sorce * Red Hat, Inc * New York

2014-04-08 15:04:31

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 8 Apr 2014 10:46:52 -0400
Dr Fields James Bruce <[email protected]> wrote:

> On Tue, Apr 08, 2014 at 10:23:37AM -0400, Trond Myklebust wrote:
> >
> > On Apr 8, 2014, at 10:03, J. Bruce Fields <[email protected]> wrote:
> >
> > > On Tue, Apr 08, 2014 at 09:49:03AM -0400, Jeff Layton wrote:
> > >> On Tue, 8 Apr 2014 08:35:01 -0400
> > >> "J. Bruce Fields" <[email protected]> wrote:
> > >>
> > >>> On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> > >>>> I've recently been hunting down some problems with delegation handling
> > > >>>> and have run across a problem with how the client authenticates CB_COMPOUND
> > >>>> requests. I could use some advice on how best to fix it.
> > >>>>
> > >>>> Specifically, check_gss_callback_principal() tries to look up the
> > >>>> callback client and then tries to compare the ticket in it against the
> > >>>> clp->cl_hostname:
> > >>>>
> > >>>> /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
> > >>>>
> > >>>> if (memcmp(p, "nfs@", 4) != 0)
> > >>>> return 0;
> > >>>> p += 4;
> > >>>> if (strcmp(p, clp->cl_hostname) != 0)
> > >>>> return 0;
> > >>>> return 1;
> > >>>>
> > >>>> The problem is that there is no guarantee that those hostnames will be
> > >>>> the same. If, for instance, I mount "foo:/" and the SPN is
> > >>>> "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
> > >>>> request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> > >>>> server.
> > >>>
> > >>> It sounds like a bug to me that the mount is succeeding without the name
> > >>> matching.
> > >>>
> > >>> The security provided by krb5 is much weaker if we don't check that the
> > >>> name provided on the commandline matches what the server authenticates
> > >>> as.
> > >>>
> > >>
> > >> The logic in gssd for this is pretty awful.
> > >>
> > >> It will basically trust DNS if there is no '.' in the hostname that was
> > >> used at mount time. That'll make it take the address and
> > >> reverse-resolve it.
> > >
> > > Argh, OK, I guess this is the compromise Simo made in "Avoid DNS reverse
> > > resolution for server names (take 3)".
> > >
> > >> We could add yet another band-aid and make it so that DNS is never
> > >> trusted. I'll note that for cifs, we took that route. You have to mount
> > >> the canonical name of the server in order to use krb5.
> > >
> > > I wish we could do that, but I suppose it's too harsh to break
> > > already-working fstabs. Maybe we could phase it in somehow.
> > >
> > >>>> Now that we try to use krb5 on the callback channel even when sec=sys
> > >>>> is specified, this is very problematic.
> > >>>
> > >>> And similarly I think the attempt to opportunistically use krb5 for
> > >>> state management should fail and fall back on auth_sys if the server's
> > >>> name doesn't match.
> > >>>
> >
> > This suggestion makes no sense to me at all. How does it help to fall back to using weak security when the strong security checks fail?
>
> It'd fix this particular problem.
>
> But, I don't know, I'm frankly confused about our security design for
> the NFSv4 state.
>
> When we insist on krb5 (and checked the server name correctly), and
> failed without it, then I feel like I understand what we're doing. Once
> we start trying it and then falling back (as I understand happens for
> the krb5 state in the auth_sys case) I get confused.
>
> > >> Like Trond pointed out, the problem is that gssd doesn't give us that
> > >> info currently. We could change it to do that of course, but that
> > >> basically means revving the downcall.
> > >
> > > It might be easier to rev the upcall so that the kernel could ask gssd
> > > to do strict checking? Since it's just a bunch of name=value pairs it
> > > shouldn't be a huge pain to revise.
> >
> > So what would trigger the kernel to ask for strict checking? Do we add a mount option that says “fail if the server doesn’t authenticate itself”? That would be hard to combine with security negotiation, since it only makes sense for RPCSEC_GSS authentication.
>
> I was thinking about only doing it in the state-establishment case.
> (Since we won't know how to authenticate the callbacks in that case.)
>
> But that would screw up krb5 mounts, I guess, never mind.
>
> Using a fqdn implicitly requests strict checking so a mount option would
> seem redundant.
>

So I guess we have two options to fix this:

1) Change gssd to require the canonical fqdn and not rely on name
resolution. Unfortunately, I think the MIT krb5 libs will still
canonicalize the hostnames by default, so this might not actually fix
anything. See:

http://web.mit.edu/kerberos/krb5-devel/doc/admin/princ_dns.html

...or...

2) Loosen or somehow fix the check in check_gss_callback_principal().
One possibility might be to do a dns_resolver upcall for the host
portion of the SPN, and then compare the address with the server's
address. Ugly, but since we already trust DNS implicitly I guess it's
no less secure...
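
The address comparison in option 2 could look roughly like the sketch
below. It assumes the host portion of the SPN has already been resolved to
a socket address by some other means (the dns_resolver upcall itself is
not shown), and same_host_address() and ipv4_from_text() are hypothetical
helper names.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Compare two socket addresses by family and address bytes only.
 * Ports are ignored, since the callback does not come from the
 * server's listening port.
 */
int same_host_address(const struct sockaddr *a, const struct sockaddr *b)
{
    assert(a != NULL && b != NULL);

    if (a->sa_family != b->sa_family)
        return 0;

    if (a->sa_family == AF_INET) {
        const struct sockaddr_in *a4 = (const struct sockaddr_in *)a;
        const struct sockaddr_in *b4 = (const struct sockaddr_in *)b;

        return a4->sin_addr.s_addr == b4->sin_addr.s_addr;
    }

    if (a->sa_family == AF_INET6) {
        const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *)a;
        const struct sockaddr_in6 *b6 = (const struct sockaddr_in6 *)b;

        return memcmp(&a6->sin6_addr, &b6->sin6_addr,
                      sizeof(a6->sin6_addr)) == 0;
    }

    return 0; /* unknown address family: treat as a mismatch */
}

/* Example helper: build an IPv4 sockaddr from text. Returns 0 on success. */
int ipv4_from_text(const char *text, struct sockaddr_in *out)
{
    assert(text != NULL && out != NULL);

    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    return inet_pton(AF_INET, text, &out->sin_addr) == 1 ? 0 : -1;
}
```

A check like this only moves the trust from the hostname string to DNS, so
it is no stronger than the name resolution behind it.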

--
Jeff Layton <[email protected]>

2014-04-08 17:30:50

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 08 Apr 2014 13:27:01 -0400
Simo Sorce <[email protected]> wrote:

> On Tue, 2014-04-08 at 12:44 -0400, Jeff Layton wrote:
> >
> > I think that's what happens. We only fall back to using AUTH_SYS if
> > nfs_create_rpc_client returns -EINVAL. In the event that the security
> > negotiation fails, we should get back -EACCES and that should bubble
> > back up to userland.
> >
> > The real problem is that gssd (and also the krb5 libs themselves) will
> > try to canonicalize the name. The resulting host portion of the SPN
> > may bear no resemblance to the hostname in the device string. In fact,
> > if you mount using an IP address then you're pretty much SOL.
>
> If you mount by IP do you really care about krb5 ? Probably not, maybe
> that's a clue we should not even try ...
>

It's certainly possible that someone passes in an IP address but then
says "-o sec=krb5". It has worked in the past, so it's hard to know
whether and how many people actually depend on it.

> > I haven't tried it yet, but it looks reasonably trivial to fix gssd
> > not to bother with DNS at all and just rely on the hostname. That
> > won't stop the krb5 libs from doing their canonicalization though. I'm
> > not sure if there's some way to ask the krb5 libs to avoid doing that.
>
> [libdefaults]
> rdns = false
>
> And I think we change the default to false in Fedora/RHEL lately ...
>
> Simo.
>

That's a step in the right direction, but I think that the rdns just
makes it skip the reverse lookup. AFAIK, the MIT libs will still do
getaddrinfo and scrape out the ai_canonname and use that in preference
to the hostname you pass in.
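
A small userspace sketch of the getaddrinfo() behaviour in question,
assuming a POSIX resolver: with AI_CANONNAME set, the first result carries
the resolver's canonical name, which may differ from the name that was
passed in. canonical_name() is a hypothetical helper, and this does not
claim to reproduce what the MIT libraries do internally.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Look up "host" and copy the canonical name reported by the resolver
 * into "out". Returns 0 on success or a getaddrinfo() error code.
 */
int canonical_name(const char *host, char *out, size_t outlen)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    assert(host != NULL && out != NULL && outlen > 0);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_CANONNAME; /* ask for the canonical name */

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0)
        return rc;

    /* ai_canonname is only set on the first entry of the result list. */
    snprintf(out, outlen, "%s",
             res->ai_canonname != NULL ? res->ai_canonname : host);
    freeaddrinfo(res);
    return 0;
}
```

Comparing the output for a short name or a CNAME against the name given at
mount time shows whether the two would still match after canonicalization.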

--
Jeff Layton <[email protected]>

2014-04-08 18:11:59

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, Apr 08, 2014 at 02:08:14PM -0400, Simo Sorce wrote:
> On Tue, 2014-04-08 at 14:04 -0400, Jeff Layton wrote:
> > On Tue, 08 Apr 2014 14:01:15 -0400
> > Simo Sorce <[email protected]> wrote:
> >
> > > On Tue, 2014-04-08 at 13:30 -0400, Jeff Layton wrote:
> > > > On Tue, 08 Apr 2014 13:27:01 -0400
> > > > Simo Sorce <[email protected]> wrote:
> > > >
> > > > > On Tue, 2014-04-08 at 12:44 -0400, Jeff Layton wrote:
> > > > > >
> > > > > > I think that's what happens. We only fall back to using AUTH_SYS if
> > > > > > nfs_create_rpc_client returns -EINVAL. In the event that the security
> > > > > > negotiation fails, we should get back -EACCES and that should bubble
> > > > > > back up to userland.
> > > > > >
> > > > > > The real problem is that gssd (and also the krb5 libs themselves) will
> > > > > > try to canonicalize the name. The resulting host portion of the SPN
> > > > > > may bear no resemblance to the hostname in the device string. In fact,
> > > > > > if you mount using an IP address then you're pretty much SOL.
> > > > >
> > > > > If you mount by IP do you really care about krb5 ? Probably not, maybe
> > > > > that's a clue we should not even try ...
> > > > >
> > > >
> > > > It's certainly possible that someone passes in an IP address but then
> > > > says "-o sec=krb5". It has worked in the past, so it's hard to know
> > > > whether and how many people actually depend on it.
> > > >
> > > > > > I haven't tried it yet, but it looks reasonably trivial to fix gssd
> > > > > > not to bother with DNS at all and just rely on the hostname. That
> > > > > > won't stop the krb5 libs from doing their canonicalization though. I'm
> > > > > > not sure if there's some way to ask the krb5 libs to avoid doing that.
> > > > >
> > > > > [libdefaults]
> > > > > rdns = false
> > > > >
> > > > > And I think we change the default to false in Fedora/RHEL lately ...
> > > > >
> > > > > Simo.
> > > > >
> > > >
> > > > That's a step in the right direction, but I think that the rdns just
> > > > makes it skip the reverse lookup. AFAIK, the MIT libs will still do
> > > > getaddrinfo and scrape out the ai_canonname and use that in preference
> > > > to the hostname you pass in.
> > >
> > > That should happen only if you are using a CNAME, not for an A name.
> > >
> > > We can open bugs if this is not the case though.
> > >
> >
> > That's still a problem for us then. The current code tries to compare
> > the host portion of the device string to the SPN that we get in the
> > callback request. If they don't match, it fails.
> >
> > I think what we need to do is fix this the right way -- make rpc.gssd
> > pass down the acceptor name with the downcall.
>
> Why do you need the comparison at all, pardon my ignorance, I do not
> know very well what its purpose is.

The NFS client wants to verify that a callback came from the server, so
it needs to know who it originally authenticated to.

(Though honestly it's unlikely you can do much damage by spoofing
callbacks.)

--b.

2014-04-08 14:23:42

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Apr 8, 2014, at 10:03, J. Bruce Fields <[email protected]> wrote:

> On Tue, Apr 08, 2014 at 09:49:03AM -0400, Jeff Layton wrote:
>> On Tue, 8 Apr 2014 08:35:01 -0400
>> "J. Bruce Fields" <[email protected]> wrote:
>>
>>> On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
>>>> I've recently been hunting down some problems with delegation handling
>>>> and have run across a problem with how the client authenticates CB_COMPOUND
>>>> requests. I could use some advice on how best to fix it.
>>>>
>>>> Specifically, check_gss_callback_principal() tries to look up the
>>>> callback client and then tries to compare the ticket in it against the
>>>> clp->cl_hostname:
>>>>
>>>> /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
>>>>
>>>> if (memcmp(p, "nfs@", 4) != 0)
>>>> return 0;
>>>> p += 4;
>>>> if (strcmp(p, clp->cl_hostname) != 0)
>>>> return 0;
>>>> return 1;
>>>>
>>>> The problem is that there is no guarantee that those hostnames will be
>>>> the same. If, for instance, I mount "foo:/" and the SPN is
>>>> "nfs/foo.bar.baz" that strcmp will return nonzero, and the CB_COMPOUND
>>>> request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
>>>> server.
>>>
>>> It sounds like a bug to me that the mount is succeeding without the name
>>> matching.
>>>
>>> The security provided by krb5 is much weaker if we don't check that the
>>> name provided on the commandline matches what the server authenticates
>>> as.
>>>
>>
>> The logic in gssd for this is pretty awful.
>>
>> It will basically trust DNS if there is no '.' in the hostname that was
>> used at mount time. That'll make it take the address and
>> reverse-resolve it.
>
> Argh, OK, I guess this is the compromise Simo made in "Avoid DNS reverse
> resolution for server names (take 3)".
>
>> We could add yet another band-aid and make it so that DNS is never
>> trusted. I'll note that for cifs, we took that route. You have to mount
>> the canonical name of the server in order to use krb5.
>
> I wish we could do that, but I suppose it's too harsh to break
> already-working fstabs. Maybe we could phase it in somehow.
>
>>>> Now that we try to use krb5 on the callback channel even when sec=sys
>>>> is specified, this is very problematic.
>>>
>>> And similarly I think the attempt to opportunistically use krb5 for
>>> state management should fail and fall back on auth_sys if the server's
>>> name doesn't match.
>>>

This suggestion makes no sense to me at all. How does it help to fall back to using weak security when the strong security checks fail?

>> Like Trond pointed out, the problem is that gssd doesn't give us that
>> info currently. We could change it to do that of course, but that
>> basically means revving the downcall.
>
> It might be easier to rev the upcall so that the kernel could ask gssd
> to do strict checking? Since it's just a bunch of name=value pairs it
> shouldn't be a huge pain to revise.

So what would trigger the kernel to ask for strict checking? Do we add a mount option that says “fail if the server doesn’t authenticate itself”? That would be hard to combine with security negotiation, since it only makes sense for RPCSEC_GSS authentication.

>>>> I think that the ideal thing would be to stash the SPN that we use to
>>>> do the SETCLIENTID call and use that in the comparison above.
>>>> Unfortunately, the rpc_cred doesn't really seem to carry this info and
>>>> I don't see where we get enough information in the rpc.gssd downcall to
>>>> figure out what that SPN should be.
>>>>
>>>> Anyone have thoughts or should we just remove the above check until we
>>>> come up with a better way to do this?
>>>>
>>>> [1]: there's another bug that can cause the client to send a bogus
>>>> reply instead of dropping the request as intended, but that's
>>>> relatively simple to fix.
>>>
>>> So I believe the matching really is a requirement and that it would be
>>> wrong to weaken it.
>>>
>>> It sounds like there's also a server bug here if it's giving out
>>> delegations to a client that isn't responding to callbacks.
>>>
>>
>> The server uses CB_NULL requests to probe the callback port, and those
>> aren't affected by this problem. Worse, since CB_NULL requests don't
>> even contain the callback_ident, we can't even use them to hook up the
>> nfs_client with the SPN used in them.
>
> Ah, got it.
>
> The server should still stop delegations as soon as a CB_RECALL times
> out, though, so at least the problem should clear up after that?
>
> --b.

_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]

2014-04-08 14:03:34

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, Apr 08, 2014 at 09:49:03AM -0400, Jeff Layton wrote:
> On Tue, 8 Apr 2014 08:35:01 -0400
> "J. Bruce Fields" <[email protected]> wrote:
>
> > On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> > > I've recently been hunting down some problems with delegation handling
> > > and have run across a problem with how the client authenticates CB_COMPOUND
> > > requests. I could use some advice on how best to fix it.
> > >
> > > Specifically, check_gss_callback_principal() tries to look up the
> > > callback client and then tries to compare the ticket in it against the
> > > clp->cl_hostname:
> > >
> > > /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
> > >
> > > if (memcmp(p, "nfs@", 4) != 0)
> > > return 0;
> > > p += 4;
> > > if (strcmp(p, clp->cl_hostname) != 0)
> > > return 0;
> > > return 1;
> > >
> > > The problem is that there is no guarantee that those hostnames will be
> > > the same. If, for instance, I mount "foo:/" and the SPN is
> > > "nfs/foo.bar.baz" that strcmp will return nonzero, and the CB_COMPOUND
> > > request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> > > server.
> >
> > It sounds like a bug to me that the mount is succeeding without the name
> > matching.
> >
> > The security provided by krb5 is much weaker if we don't check that the
> > name provided on the commandline matches what the server authenticates
> > as.
> >
>
> The logic in gssd for this is pretty awful.
>
> It will basically trust DNS if there is no '.' in the hostname that was
> used at mount time. That'll make it take the address and
> reverse-resolve it.

Argh, OK, I guess this is the compromise Simo made in "Avoid DNS reverse
resolution for server names (take 3)".

> We could add yet another band-aid and make it so that DNS is never
> trusted. I'll note that for cifs, we took that route. You have to mount
> the canonical name of the server in order to use krb5.

I wish we could do that, but I suppose it's too harsh to break
already-working fstabs. Maybe we could phase it in somehow.

> > > Now that we try to use krb5 on the callback channel even when sec=sys
> > > is specified, this is very problematic.
> >
> > And similarly I think the attempt to opportunistically use krb5 for
> > state management should fail and fall back on auth_sys if the server's
> > name doesn't match.
> >
>
> Like Trond pointed out, the problem is that gssd doesn't give us that
> info currently. We could change it to do that of course, but that
> basically means revving the downcall.

It might be easier to rev the upcall so that the kernel could ask gssd
to do strict checking? Since it's just a bunch of name=value pairs it
shouldn't be a huge pain to revise.

> > > I think that the ideal thing would be to stash the SPN that we use to
> > > do the SETCLIENTID call and use that in the comparison above.
> > > Unfortunately, the rpc_cred doesn't really seem to carry this info and
> > > I don't see where we get enough information in the rpc.gssd downcall to
> > > figure out what that SPN should be.
> > >
> > > Anyone have thoughts or should we just remove the above check until we
> > > come up with a better way to do this?
> > >
> > > [1]: there's another bug that can cause the client to send a bogus
> > > reply instead of dropping the request as intended, but that's
> > > relatively simple to fix.
> >
> > So I believe the matching really is a requirement and that it would be
> > wrong to weaken it.
> >
> > It sounds like there's also a server bug here if it's giving out
> > delegations to a client that isn't responding to callbacks.
> >
>
> The server uses CB_NULL requests to probe the callback port, and those
> aren't affected by this problem. Worse, since CB_NULL requests don't
> even contain the callback_ident, we can't even use them to hook up the
> nfs_client with the SPN used in them.

Ah, got it.

The server should still stop delegations as soon as a CB_RECALL times
out, though, so at least the problem should clear up after that?

--b.
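
For readers following along, the gssd behaviour described in this message can be sketched roughly as below. This is only an illustration of the "no '.' in the hostname" rule as stated in the thread, not the actual rpc.gssd code; the function name is made up, and IP-address literals (which contain dots) are deliberately not handled.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from rpc.gssd: decide whether the name
 * given at mount time would be run through DNS before the service
 * principal is built. Per the thread, a name without a '.' is treated
 * as a short name and DNS is trusted to expand it.
 */
bool mount_name_trusts_dns(const char *mount_hostname)
{
	return strchr(mount_hostname, '.') == NULL;
}
```

Under this rule "foo" goes through DNS while "foo.bar.baz" is used as given, which is why the principal can end up differing from the name in the device string.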

2014-04-08 12:58:02

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, Apr 08, 2014 at 08:42:10AM -0400, Trond Myklebust wrote:
>
> On Apr 8, 2014, at 8:35, J. Bruce Fields <[email protected]> wrote:
>
> > On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> >> I've recently been hunting down some problems with delegation handling
> >> and have run across a problem with how the client authenticates CB_COMPOUND
> >> requests. I could use some advice on how best to fix it.
> >>
> >> Specifically, check_gss_callback_principal() tries to look up the
> >> callback client and then tries to compare the ticket in it against the
> >> clp->cl_hostname:
> >>
> >> /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
> >>
> >> if (memcmp(p, "nfs@", 4) != 0)
> >> return 0;
> >> p += 4;
> >> if (strcmp(p, clp->cl_hostname) != 0)
> >> return 0;
> >> return 1;
> >>
> >> The problem is that there is no guarantee that those hostnames will be
> >> the same. If, for instance, I mount "foo:/" and the SPN is
> >> "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
> >> request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> >> server.
> >
> > It sounds like a bug to me that the mount is succeeding without the name
> > matching.
> >
> > The security provided by krb5 is much weaker if we don't check that the
> > name provided on the commandline matches what the server authenticates
> > as.
>
> Where would the client find that information? I don’t think that rpc.gssd passes that information down to us.

gssd should get the server name passed on the commandline from the info
file or the upcall, if I remember right, and then it's up to gssd to
match names.

I thought gssd was already doing that, but I guess not.

So maybe I'm confused about how this all works.

--b.

>
> >> Now that we try to use krb5 on the callback channel even when sec=sys
> >> is specified, this is very problematic.
> >
> > And similarly I think the attempt to opportunistically use krb5 for
> > state management should fail and fall back on auth_sys if the server's
> > name doesn't match.
> >
> >> I think that the ideal thing would be to stash the SPN that we use to
> >> do the SETCLIENTID call and use that in the comparison above.
> >> Unfortunately, the rpc_cred doesn't really seem to carry this info and
> >> I don't see where we get enough information in the rpc.gssd downcall to
> >> figure out what that SPN should be.
> >>
> >> Anyone have thoughts or should we just remove the above check until we
> >> come up with a better way to do this?
> >>
> >> [1]: there's another bug that can cause the client to send a bogus
> >> reply instead of dropping the request as intended, but that's
> >> relatively simple to fix.
> >
> > So I believe the matching really is a requirement and that it would be
> > wrong to weaken it.
> >
> > It sounds like there's also a server bug here if it's giving out
> > delegations to a client that isn't responding to callbacks.
> >
> > --b.
>
> _________________________________
> Trond Myklebust
> Linux NFS client maintainer, PrimaryData
> [email protected]
>
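
The comparison quoted in this message can be tried in userspace with a simplified stand-in. This is a sketch under assumptions: the function name and the plain-string arguments are illustrative (the kernel function works on the RPC request and the nfs_client), and strncmp is used in place of memcmp so that short inputs are not over-read.

```c
#include <string.h>

/*
 * Simplified stand-in for the check quoted above. Returns 1 when the
 * principal is "nfs@" followed by exactly cl_hostname, 0 otherwise.
 */
int callback_principal_matches(const char *principal, const char *cl_hostname)
{
	/* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
	if (strncmp(principal, "nfs@", 4) != 0)
		return 0;
	return strcmp(principal + 4, cl_hostname) == 0;
}
```

With a mount of "foo:/" and a server principal of "nfs@foo.bar.baz", the second comparison fails and the callback is rejected, which is the case being discussed.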

2014-04-08 16:22:55

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Apr 8, 2014, at 10:46, Dr Fields James Bruce <[email protected]> wrote:

> On Tue, Apr 08, 2014 at 10:23:37AM -0400, Trond Myklebust wrote:
>>
>> On Apr 8, 2014, at 10:03, J. Bruce Fields <[email protected]> wrote:
>>
>>> On Tue, Apr 08, 2014 at 09:49:03AM -0400, Jeff Layton wrote:
>>>> On Tue, 8 Apr 2014 08:35:01 -0400
>>>> "J. Bruce Fields" <[email protected]> wrote:
>>>>
>>>>> On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
>>>>>> I've recently been hunting down some problems with delegation handling
>>>>>> and have run across a problem with how the client authenticates CB_COMPOUND
>>>>>> requests. I could use some advice on how best to fix it.
>>>>>>
>>>>>> Specifically, check_gss_callback_principal() tries to look up the
>>>>>> callback client and then tries to compare the ticket in it against the
>>>>>> clp->cl_hostname:
>>>>>>
>>>>>> /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
>>>>>>
>>>>>> if (memcmp(p, "nfs@", 4) != 0)
>>>>>> return 0;
>>>>>> p += 4;
>>>>>> if (strcmp(p, clp->cl_hostname) != 0)
>>>>>> return 0;
>>>>>> return 1;
>>>>>>
>>>>>> The problem is that there is no guarantee that those hostnames will be
>>>>>> the same. If, for instance, I mount "foo:/" and the SPN is
>>>>>> "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
>>>>>> request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
>>>>>> server.
>>>>>
>>>>> It sounds like a bug to me that the mount is succeeding without the name
>>>>> matching.
>>>>>
>>>>> The security provided by krb5 is much weaker if we don't check that the
>>>>> name provided on the commandline matches what the server authenticates
>>>>> as.
>>>>>
>>>>
>>>> The logic in gssd for this is pretty awful.
>>>>
>>>> It will basically trust DNS if there is no '.' in the hostname that was
>>>> used at mount time. That'll make it take the address and
>>>> reverse-resolve it.
>>>
>>> Argh, OK, I guess this is the compromise Simo made in "Avoid DNS reverse
>>> resolution for server names (take 3)".
>>>
>>>> We could add yet another band-aid and make it so that DNS is never
>>>> trusted. I'll note that for cifs, we took that route. You have to mount
>>>> the canonical name of the server in order to use krb5.
>>>
>>> I wish we could do that, but I suppose it's too harsh to break
>>> already-working fstabs. Maybe we could phase it in somehow.
>>>
>>>>>> Now that we try to use krb5 on the callback channel even when sec=sys
>>>>>> is specified, this is very problematic.
>>>>>
>>>>> And similarly I think the attempt to opportunistically use krb5 for
>>>>> state management should fail and fall back on auth_sys if the server's
>>>>> name doesn't match.
>>>>>
>>
>> This suggestion makes no sense to me at all. How does it help to fall back to using weak security when the strong security checks fail?
>
> It'd fix this particular problem.

It weakens the state management security for no good reason. How is it not better just to rip out that hostname comparison in the back channel?

> But, I don't know, I'm frankly confused about our security design for
> the NFSv4 state.
>
> When we insist on krb5 (and checked the server name correctly), and
> failed without it, then I feel like I understand what we're doing. Once
> we start trying it and then falling back (as I understand happens for
> the krb5 state in the auth_sys case) I get confused.

Now you have me confused. I'm aware that we call nfs_create_rpc_client() with a krb5i argument and then fall back to auth_sys if the RPC layer says that we don't have a running gss daemon or that we can't load the rpcsec_gss_krb5 module. I'm not aware of us falling back if rpc.gssd is running and tells us that security negotiation failed; we should be returning a mount error in that case.

>>>> Like Trond pointed out, the problem is that gssd doesn't give us that
>>>> info currently. We could change it to do that of course, but that
>>>> basically means revving the downcall.
>>>
>>> It might be easier to rev the upcall so that the kernel could ask gssd
>>> to do strict checking? Since it's just a bunch of name=value pairs it
>>> shouldn't be a huge pain to revise.
>>
>> So what would trigger the kernel to ask for strict checking? Do we add a mount option that says "fail if the server doesn't authenticate itself"? That would be hard to combine with security negotiation, since it only makes sense for RPCSEC_GSS authentication.
>
> I was thinking about only doing it in the state-establishment case.
> (Since we won't know how to authenticate the callbacks in that case.)
>
> But that would screw up krb5 mounts, I guess, never mind.
>
> Using a fqdn implicitly requests strict checking so a mount option would
> seem redundant.

_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]

2014-04-08 17:28:16

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 08 Apr 2014 13:25:04 -0400
Simo Sorce <[email protected]> wrote:

> On Tue, 2014-04-08 at 11:13 -0400, Dr Fields James Bruce wrote:
> > On Tue, Apr 08, 2014 at 11:04:20AM -0400, Jeff Layton wrote:
> > > On Tue, 8 Apr 2014 10:46:52 -0400
> > > Dr Fields James Bruce <[email protected]> wrote:
> > >
> > > > On Tue, Apr 08, 2014 at 10:23:37AM -0400, Trond Myklebust wrote:
> > > > >
> > > > > On Apr 8, 2014, at 10:03, J. Bruce Fields <[email protected]> wrote:
> > > > >
> > > > > > On Tue, Apr 08, 2014 at 09:49:03AM -0400, Jeff Layton wrote:
> > > > > >> On Tue, 8 Apr 2014 08:35:01 -0400
> > > > > >> "J. Bruce Fields" <[email protected]> wrote:
> > > > > >>
> > > > > >>> On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> > > > > >>>> I've recently been hunting down some problems with delegation handling
> > > > > > >>>> and have run across a problem with how the client authenticates CB_COMPOUND
> > > > > >>>> requests. I could use some advice on how best to fix it.
> > > > > >>>>
> > > > > >>>> Specifically, check_gss_callback_principal() tries to look up the
> > > > > >>>> callback client and then tries to compare the ticket in it against the
> > > > > >>>> clp->cl_hostname:
> > > > > >>>>
> > > > > >>>> /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
> > > > > >>>>
> > > > > >>>> if (memcmp(p, "nfs@", 4) != 0)
> > > > > >>>> return 0;
> > > > > >>>> p += 4;
> > > > > >>>> if (strcmp(p, clp->cl_hostname) != 0)
> > > > > >>>> return 0;
> > > > > >>>> return 1;
> > > > > >>>>
> > > > > >>>> The problem is that there is no guarantee that those hostnames will be
> > > > > >>>> the same. If, for instance, I mount "foo:/" and the SPN is
> > > > > >>>> "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
> > > > > >>>> request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> > > > > >>>> server.
> > > > > >>>
> > > > > >>> It sounds like a bug to me that the mount is succeeding without the name
> > > > > >>> matching.
> > > > > >>>
> > > > > >>> The security provided by krb5 is much weaker if we don't check that the
> > > > > >>> name provided on the commandline matches what the server authenticates
> > > > > >>> as.
> > > > > >>>
> > > > > >>
> > > > > >> The logic in gssd for this is pretty awful.
> > > > > >>
> > > > > >> It will basically trust DNS if there is no '.' in the hostname that was
> > > > > >> used at mount time. That'll make it take the address and
> > > > > >> reverse-resolve it.
> > > > > >
> > > > > > Argh, OK, I guess this is the compromise Simo made in "Avoid DNS reverse
> > > > > > resolution for server names (take 3)".
> > > > > >
> > > > > >> We could add yet another band-aid and make it so that DNS is never
> > > > > >> trusted. I'll note that for cifs, we took that route. You have to mount
> > > > > >> the canonical name of the server in order to use krb5.
> > > > > >
> > > > > > I wish we could do that, but I suppose it's too harsh to break
> > > > > > already-working fstabs. Maybe we could phase it in somehow.
> > > > > >
> > > > > >>>> Now that we try to use krb5 on the callback channel even when sec=sys
> > > > > >>>> is specified, this is very problematic.
> > > > > >>>
> > > > > >>> And similarly I think the attempt to opportunistically use krb5 for
> > > > > >>> state management should fail and fall back on auth_sys if the server's
> > > > > >>> name doesn't match.
> > > > > >>>
> > > > >
> > > > > This suggestion makes no sense to me at all. How does it help to fall back to using weak security when the strong security checks fail?
> > > >
> > > > It'd fix this particular problem.
> > > >
> > > > But, I don't know, I'm frankly confused about our security design for
> > > > the NFSv4 state.
> > > >
> > > > When we insist on krb5 (and checked the server name correctly), and
> > > > failed without it, then I feel like I understand what we're doing. Once
> > > > we start trying it and then falling back (as I understand happens for
> > > > the krb5 state in the auth_sys case) I get confused.
> > > >
> > > > > >> Like Trond pointed out, the problem is that gssd doesn't give us that
> > > > > >> info currently. We could change it to do that of course, but that
> > > > > >> basically means revving the downcall.
> > > > > >
> > > > > > It might be easier to rev the upcall so that the kernel could ask gssd
> > > > > > to do strict checking? Since it's just a bunch of name=value pairs it
> > > > > > shouldn't be a huge pain to revise.
> > > > >
> > > > > So what would trigger the kernel to ask for strict checking? Do we add a mount option that says “fail if the server doesn’t authenticate itself”? That would be hard to combine with security negotiation, since it only makes sense for RPCSEC_GSS authentication.
> > > >
> > > > I was thinking about only doing it in the state-establishment case.
> > > > (Since we won't know how to authenticate the callbacks in that case.)
> > > >
> > > > But that would screw up krb5 mounts, I guess, never mind.
> > > >
> > > > Using a fqdn implicitly requests strict checking so a mount option would
> > > > seem redundant.
> > > >
> > >
> > > So I guess we have two options to fix this:
> > >
> > > 1) Change gssd to require the canonical fqdn and not rely on name
> > > resolution. Unfortunately, I think the MIT krb5 libs will still
> > > canonicalize the hostnames by default, so this might not actually fix
> > > anything. See:
> > >
> > > http://web.mit.edu/kerberos/krb5-devel/doc/admin/princ_dns.html
> > >
> > > ...or...
> > >
> > > 2) Loosen or somehow fix the check in check_gss_callback_principal().
> > > One possibility might be to do a dns_resolver upcall for the host
> > > portion of the SPN, and then compare the address with the server's
> > > address. Ugly, but since we already trust DNS implicitly I guess it's
> > > no less secure...
> >
> > I thought Kerberos wasn't supposed to require trust in DNS. So I feel
> > confused. Cc'ing Simo in hopes he can set us all straight.
>
> RFC4120 indeed warns against relying on DNS.
> Here[1] is my old writeup about why it shouldn't be done, at least until
> DNSSEC gets deployed, then things may change.
>
> Simo.
>
> [1] https://ssimo.org/blog/id_015.html

There's the additional problem that the MIT krb5 libs will silently
canonicalize the name for you as well. Do you know of any way for a
program to request that that *not* be done?

--
Jeff Layton <[email protected]>
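
Option 2 above (resolve the host portion of the SPN and compare the result with the server's address) might look roughly like the following in userspace. This is only a sketch: the kernel would use a dns_resolver upcall rather than getaddrinfo(), the function name is hypothetical, and only IPv4 is handled to keep it short.

```c
#define _POSIX_C_SOURCE 200112L
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Userspace illustration only. Returns 1 if any IPv4 address that
 * spn_host resolves to equals the address in server, 0 otherwise
 * (including when resolution fails).
 */
int spn_host_resolves_to(const char *spn_host, const struct sockaddr_in *server)
{
	struct addrinfo hints, *res, *ai;
	int match = 0;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_INET;
	hints.ai_socktype = SOCK_STREAM;

	if (getaddrinfo(spn_host, NULL, &hints, &res) != 0)
		return 0;

	/* Walk every returned address; a name may map to several. */
	for (ai = res; ai != NULL; ai = ai->ai_next) {
		const struct sockaddr_in *sin =
			(const struct sockaddr_in *)ai->ai_addr;

		if (sin->sin_addr.s_addr == server->sin_addr.s_addr) {
			match = 1;
			break;
		}
	}
	freeaddrinfo(res);
	return match;
}
```

As noted in the message, a check like this is only as trustworthy as the DNS answers it relies on.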

2014-04-08 22:57:51

Subject: RE: v4.0 CB_COMPOUND authentication failures

On Tue, 2014-04-08 at 15:44 -0700, Frank Filz wrote:
> > On Tue, 2014-04-08 at 10:39 -0700, Frank Filz wrote:
> > > > > If you mount by IP do you really care about krb5 ? Probably not,
> > > > > maybe that's a clue we should not even try ...
> > > > >
> > > >
> > > > It's certainly possible that someone passes in an IP address but
> > > > then says
> > > "-o
> > > > sec=krb5". It has worked in the past, so it's hard to know whether
> > > > and how many people actually depend on it.
> > >
> > > Mount by ip is sometimes used with clustered servers, especially when
> > > they have all their IP addresses in the DNS record. Even using a FQDN
> > > that just specifies that one IP address probably won't work then
> > > (since it probably is NOT the hostname used in the server credential).
> >
> > I do not understand this, using an IP address or a name that resolves to said IP
> > address is the same.
>
> But a name can resolve to a set of IP addresses, often in round-robin
> fashion. It would cause havoc with NFS server on a cluster if a v3
> client had locks on 192.168.0.10 (resolved from server.mycompany.com),
> and then rebooted, and resolved to 192.168.0.9 and sent SM_NOTIFY
> there). At least that isn't an issue with v4...

Sounds like you configured your DNS wrong; DNS round robin is not the right
way to do load balancing for NFSv3.

Also if you are specifying an IP address you can as well specify an
explicit name instead, you can achieve that by using a RR CNAME and then
make sure the client sticks to the resolved A name for the life of the
locks.

For a reboot situation you are screwed anyway unless you configure an
explicit address, in which case you do not have redundancy anyway.
You just use a DNS name that is resolved into a unique IP address.

> Or worse, tcp connection is dropped due to inactivity, and new
> connection is made to a different server node. But this could still be
> an issue with v4...

Same as above.

> The workaround has been to specify specific IP address.

The workaround to what? Why are you using RR names if clients are going
to stick to a specific server anyway? You are working around a problem
you created yourself; fix the problem! (You can also fix it by
setting an entry in /etc/hosts, which is semantically identical to
specifying an IP address on the mount anyway.)

> Now I haven't done enough with krb5 recently to know how well that
> works...

> I guess I'm just offering one reason IP addresses might have been
> specified on mount...

A bogus reason :-)

Simo.

--
Simo Sorce * Red Hat, Inc * New York

2014-04-08 18:06:25

Subject: RE: v4.0 CB_COMPOUND authentication failures

On Tue, 2014-04-08 at 10:39 -0700, Frank Filz wrote:
> > > If you mount by IP do you really care about krb5 ? Probably not, maybe
> > > that's a clue we should not even try ...
> > >
> >
> > It's certainly possible that someone passes in an IP address but then says
> "-o
> > sec=krb5". It has worked in the past, so it's hard to know whether and how
> > many people actually depend on it.
>
> Mount by ip is sometimes used with clustered servers, especially when they
> have all their IP addresses in the DNS record. Even using a FQDN that just
> specifies that one IP address probably won't work then (since it probably is
> NOT the hostname used in the server credential).

I do not understand this, using an IP address or a name that resolves to
said IP address is the same.

As long as the server has a keytab with a key in that name it should
just work fine, even if the hostname on the actual machine is different.

If this does not work it is a bug in rpc.svcgssd/gss-proxy, and should
be fixed, not something to try to work around using IP addresses.

Simo.

--
Simo Sorce * Red Hat, Inc * New York

2014-04-08 16:44:35

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 8 Apr 2014 12:22:51 -0400
Trond Myklebust <[email protected]> wrote:

>
> On Apr 8, 2014, at 10:46, Dr Fields James Bruce <[email protected]> wrote:
>
> > On Tue, Apr 08, 2014 at 10:23:37AM -0400, Trond Myklebust wrote:
> >>
> >> On Apr 8, 2014, at 10:03, J. Bruce Fields <[email protected]> wrote:
> >>
> >>> On Tue, Apr 08, 2014 at 09:49:03AM -0400, Jeff Layton wrote:
> >>>> On Tue, 8 Apr 2014 08:35:01 -0400
> >>>> "J. Bruce Fields" <[email protected]> wrote:
> >>>>
> >>>>> On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> >>>>>> I've recently been hunting down some problems with delegation handling
> >>>>>> and have run across a problem with how the client authenticates CB_COMPOUND
> >>>>>> requests. I could use some advice on how best to fix it.
> >>>>>>
> >>>>>> Specifically, check_gss_callback_principal() tries to look up the
> >>>>>> callback client and then tries to compare the ticket in it against the
> >>>>>> clp->cl_hostname:
> >>>>>>
> >>>>>> /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
> >>>>>>
> >>>>>> if (memcmp(p, "nfs@", 4) != 0)
> >>>>>> return 0;
> >>>>>> p += 4;
> >>>>>> if (strcmp(p, clp->cl_hostname) != 0)
> >>>>>> return 0;
> >>>>>> return 1;
> >>>>>>
> >>>>>> The problem is that there is no guarantee that those hostnames will be
> >>>>>> the same. If, for instance, I mount "foo:/" and the SPN is
> >>>>>> "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
> >>>>>> request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> >>>>>> server.
> >>>>>
> >>>>> It sounds like a bug to me that the mount is succeeding without the name
> >>>>> matching.
> >>>>>
> >>>>> The security provided by krb5 is much weaker if we don't check that the
> >>>>> name provided on the commandline matches what the server authenticates
> >>>>> as.
> >>>>>
> >>>>
> >>>> The logic in gssd for this is pretty awful.
> >>>>
> >>>> It will basically trust DNS if there is no '.' in the hostname that was
> >>>> used at mount time. That'll make it take the address and
> >>>> reverse-resolve it.
> >>>
> >>> Argh, OK, I guess this is the compromise Simo made in "Avoid DNS reverse
> >>> resolution for server names (take 3)".
> >>>
> >>>> We could add yet another band-aid and make it so that DNS is never
> >>>> trusted. I'll note that for cifs, we took that route. You have to mount
> >>>> the canonical name of the server in order to use krb5.
> >>>
> >>> I wish we could do that, but I suppose it's too harsh to break
> >>> already-working fstabs. Maybe we could phase it in somehow.
> >>>
> >>>>>> Now that we try to use krb5 on the callback channel even when sec=sys
> >>>>>> is specified, this is very problematic.
> >>>>>
> >>>>> And similarly I think the attempt to opportunistically use krb5 for
> >>>>> state management should fail and fall back on auth_sys if the server's
> >>>>> name doesn't match.
> >>>>>
> >>
> >> This suggestion makes no sense to me at all. How does it help to fall back to using weak security when the strong security checks fail?
> >
> > It'd fix this particular problem.
>
> It weakens the state management security for no good reason. How is it not better just to rip out that hostname comparison in the back channel?
>
> > But, I don't know, I'm frankly confused about our security design for
> > the NFSv4 state.
> >
> > When we insist on krb5 (and checked the server name correctly), and
> > failed without it, then I feel like I understand what we're doing. Once
> > we start trying it and then falling back (as I understand happens for
> > the krb5 state in the auth_sys case) I get confused.
>
> Now you have me confused. I'm aware that we call nfs_create_rpc_client() with a krb5i argument and then fall back to auth_sys if the RPC layer says that we don't have a running gss daemon or that we can't load the rpcsec_gss_krb5 module. I'm not aware of us falling back if rpc.gssd is running and tells us that security negotiation failed; we should be returning a mount error in that case.
>

I think that's what happens. We only fall back to using AUTH_SYS if
nfs_create_rpc_client returns -EINVAL. In the event that the security
negotiation fails, we should get back -EACCES and that should bubble
back up to userland.

The real problem is that gssd (and also the krb5 libs themselves) will
try to canonicalize the name. The resulting host portion of the SPN may
bear no resemblance to the hostname in the device string. In fact, if
you mount using an IP address then you're pretty much SOL.

I haven't tried it yet, but it looks reasonably trivial to fix gssd
not to bother with DNS at all and just rely on the hostname. That won't
stop the krb5 libs from doing their canonicalization though. I'm not
sure if there's some way to ask the krb5 libs to avoid doing that.

> >>>> Like Trond pointed out, the problem is that gssd doesn't give us that
> >>>> info currently. We could change it to do that of course, but that
> >>>> basically means revving the downcall.
> >>>
> >>> It might be easier to rev the upcall so that the kernel could ask gssd
> >>> to do strict checking? Since it's just a bunch of name=value pairs it
> >>> shouldn't be a huge pain to revise.
> >>
> >> So what would trigger the kernel to ask for strict checking? Do we add a mount option that says "fail if the server doesn't authenticate itself"? That would be hard to combine with security negotiation, since it only makes sense for RPCSEC_GSS authentication.
> >
> > I was thinking about only doing it in the state-establishment case.
> > (Since we won't know how to authenticate the callbacks in that case.)
> >
> > But that would screw up krb5 mounts, I guess, never mind.
> >
> > Using a fqdn implicitly requests strict checking so a mount option would
> > seem redundant.
>
> _________________________________
> Trond Myklebust
> Linux NFS client maintainer, PrimaryData
> [email protected]
>

--
Jeff Layton <[email protected]>

2014-04-08 18:24:28

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 8 Apr 2014 14:03:04 -0400
Trond Myklebust <[email protected]> wrote:

>
> On Apr 8, 2014, at 13:55, Jeff Layton <[email protected]> wrote:
>
> > On Tue, 8 Apr 2014 13:30:21 -0400
> > Trond Myklebust <[email protected]> wrote:
> >
> >>
> >> On Apr 8, 2014, at 12:40, Dr Fields James Bruce <[email protected]> wrote:
> >>>
> >>> On Tue, Apr 08, 2014 at 12:22:51PM -0400, Trond Myklebust wrote:
> >>>> How is it not better just to rip out that hostname comparison in the
> >>>> back channel?
> >>>
> >>> Rip it out entirely?
> >>>
> >>> At that point anyone who can get a credential in the right realm can
> >>> send a recall. RFC made this requirement to prevent that.
> >>>
> >>
> >> OK. Let's examine what RFC3530 and RFC3530bis actually says here:
> >>
> >> Regardless of what security mechanism under RPCSEC_GSS is being used,
> >> the NFS server MUST identify itself in GSS-API via a
> >> GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE
> >> names are of the form:
> >>
> >> service@hostname
> >>
> >> For NFS, the "service" element is
> >>
> >> nfs
> >>
> >> Implementations of security mechanisms will convert nfs@hostname to
> >> various different forms. For Kerberos V5, the following form is
> >> RECOMMENDED:
> >>
> >> nfs/hostname
> >>
> >> For Kerberos V5, nfs/hostname would be a server principal in the
> >> Kerberos Key Distribution Center database. This is the same
> >> principal the client acquired a GSS-API context for when it issued
> >> the SETCLIENTID operation, therefore, the realm name for the server
> >> principal must be the same for the callback as it was for the
> >> SETCLIENTID.
> >>
> >>
> >> So as I read the above, technically the client is supposed to read off the principal name that the server uses to authenticate itself to the SETCLIENTID and check that in the callback. Am I wrong?
> >>
> >> If so, then the steps are:
> >>
> >> 1) Modify process_krb5_upcall() and have the call to gss_inquire_context() also request the context acceptor name
> >> 2) Modify the rpc.gssd downcall to pass that name to the kernel in some format that allows us to retrieve it in the SETCLIENTID call.
> >> 3) Modify the comparison in check_gss_callback_principal()
> >>
> >>
> >
> > Sounds about right, with #2 being the difficult part...
> >
> > One possibility for that would be to add a new "acceptor" pipe in the
> > clnt?? dirs. Teach gssd to write the acceptor name to that pipe
> > before doing the regular downcall.
> >
> > The name written there could then be hung off of the clp. If userland
> > fails to write the name to the pipe before doing the downcall, we'll
> > simply do what we do today (use the cl_hostname).
> >
> > Adding a new pipe is a bit of a pain, but it does sidestep the problem
> > of mismatched kernel and userland...
> >
>
> As far as I can tell, the downcall can be extended. Even the security context is an opaque, so its length is known. If we wanted to append a field after that, we could do so without affecting backward compatibility.
>

Yeah, I think you're right. It looks like gss_pipe_downcall will just
ignore stuff that trails the security context. I'll have a look and
see whether tacking a new field on is feasible.

So in nfs4_proc_setclientid after the call, we can add some code that
copies the new acceptor field out of gss_cred->gss_cl_ctx, and attaches
it to a new field in the nfs_client. Alternately, I wonder if we could
get away with just replacing the clp->cl_hostname with that value?

--
Jeff Layton <[email protected]>
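
The idea of appending a field after the opaque security context can be sketched as follows. The layout here is a hypothetical simplification ([u32 length][context bytes], optionally followed by [u32 length][acceptor name]); the real downcall carries more fields, and the struct and function names are invented for illustration. The point is only that a reader which knows the context length can tell whether anything trails it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layout: [u32 ctx_len][ctx][u32 name_len][name] */
struct parsed_downcall {
	const unsigned char *ctx;
	uint32_t ctx_len;
	const char *acceptor;	/* NULL when the trailing field is absent */
	uint32_t acceptor_len;
};

/* Read a host-endian u32 at *p and advance; fail if it would overrun end. */
static int get_u32(const unsigned char **p, const unsigned char *end,
		   uint32_t *out)
{
	if ((size_t)(end - *p) < sizeof(*out))
		return -1;
	memcpy(out, *p, sizeof(*out));
	*p += sizeof(*out);
	return 0;
}

/* Returns 0 on success, -1 if the buffer is truncated. */
int parse_downcall(const unsigned char *buf, size_t len,
		   struct parsed_downcall *d)
{
	const unsigned char *p = buf, *end = buf + len;

	memset(d, 0, sizeof(*d));
	if (get_u32(&p, end, &d->ctx_len) < 0 ||
	    (size_t)(end - p) < d->ctx_len)
		return -1;
	d->ctx = p;
	p += d->ctx_len;

	/* Old-format message: nothing follows the context. */
	if (p == end)
		return 0;

	/* New-format message: a length-prefixed acceptor name follows. */
	if (get_u32(&p, end, &d->acceptor_len) < 0 ||
	    (size_t)(end - p) < d->acceptor_len)
		return -1;
	d->acceptor = (const char *)p;
	return 0;
}
```

A message from an older gssd parses with acceptor left NULL, so the caller can fall back to the existing behaviour.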

2014-04-08 18:49:13

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 8 Apr 2014 14:45:03 -0400
Trond Myklebust <[email protected]> wrote:

>
> On Apr 8, 2014, at 14:24, Jeff Layton <[email protected]> wrote:
>
> > On Tue, 8 Apr 2014 14:03:04 -0400
> > Trond Myklebust <[email protected]> wrote:
> >
> >>
> >> On Apr 8, 2014, at 13:55, Jeff Layton <[email protected]> wrote:
> >>
> >> As far as I can tell, the downcall can be extended. Even the security context is an opaque, so its length is known. If we wanted to append a field after that, we could do so without affecting backward compatibility.
> >>
> >
> > Yeah, I think you're right. It looks like gss_pipe_downcall will just
> > ignore stuff that trails the security context. I'll have a look and
> > see whether tacking a new field on is feasible.
> >
> > So in nfs4_proc_setclientid after the call, we can add some code that
> > copies the new acceptor field out of gss_cred->gss_cl_ctx, and attaches
> > it to a new field in the nfs_client. Alternately, I wonder if we could
> > get away with just replacing the clp->cl_hostname with that value?
>
> I don't think we want to replace clp->cl_hostname. If someone wants to play around with the "-p" option in rpc.svcgssd, then we may end up with some rather strange hostnames on the client...
>

Ok, fair enough. A new field it is...

--
Jeff Layton <[email protected]>
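
A rough picture of the "new field" approach agreed on here, with the fallback to cl_hostname mentioned earlier in the thread. Everything below is a sketch: the struct is a stand-in for the real nfs_client, and the cl_acceptor field name and the function name are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative subset of a client record; cl_acceptor is hypothetical. */
struct client_sketch {
	const char *cl_hostname;	/* name from the mount device string */
	const char *cl_acceptor;	/* "nfs@host" seen at SETCLIENTID, or NULL */
};

/* Returns 1 if the callback principal is acceptable for this client. */
int callback_principal_ok(const struct client_sketch *clp,
			  const char *principal)
{
	/* Prefer the acceptor name when userland supplied one. */
	if (clp->cl_acceptor != NULL)
		return strcmp(principal, clp->cl_acceptor) == 0;

	/* Otherwise fall back to the existing hostname comparison. */
	if (strncmp(principal, "nfs@", 4) != 0)
		return 0;
	return strcmp(principal + 4, clp->cl_hostname) == 0;
}
```

With the acceptor recorded, a mount of "foo:/" against a server that authenticates as "nfs@foo.bar.baz" would accept callbacks from that same principal and still reject others.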

2014-04-08 18:03:11

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, Apr 08, 2014 at 01:55:12PM -0400, Jeff Layton wrote:
> On Tue, 8 Apr 2014 13:30:21 -0400
> Trond Myklebust <[email protected]> wrote:
>
> >
> > On Apr 8, 2014, at 12:40, Dr Fields James Bruce <[email protected]> wrote:
> > >
> > > On Tue, Apr 08, 2014 at 12:22:51PM -0400, Trond Myklebust wrote:
> > >> How is it not better just to rip out that hostname comparison in the
> > >> back channel?
> > >
> > > Rip it out entirely?
> > >
> > > At that point anyone who can get a credential in the right realm can
> > > send a recall. RFC made this requirement to prevent that.
> > >
> >
> > OK. Let’s examine what RFC3530 and RFC3530bis actually says here:
> >
> > Regardless of what security mechanism under RPCSEC_GSS is being used,
> > the NFS server MUST identify itself in GSS-API via a
> > GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE
> > names are of the form:
> >
> > service@hostname
> >
> > For NFS, the "service" element is
> >
> > nfs
> >
> > Implementations of security mechanisms will convert nfs@hostname to
> > various different forms. For Kerberos V5, the following form is
> > RECOMMENDED:
> >
> > nfs/hostname
> >
> > For Kerberos V5, nfs/hostname would be a server principal in the
> > Kerberos Key Distribution Center database. This is the same
> > principal the client acquired a GSS-API context for when it issued
> > the SETCLIENTID operation, therefore, the realm name for the server
> > principal must be the same for the callback as it was for the
> > SETCLIENTID.
> >
> >
> > So as I read the above, technically the client is supposed to read off the principal name that the server uses to authenticate itself to the SETCLIENTID and check that in the callback. Am I wrong?
> >
> > If so, then the steps are:
> >
> > 1) Modify process_krb5_upcall() and have the call to gss_inquire_context() also request the context acceptor name
> > 2) Modify the rpc.gssd downcall to pass that name to the kernel in some format that allows us to retrieve it in the SETCLIENTID call.
> > 3) Modify the comparison in check_gss_callback_principal()
> >
> >
>
> Sounds about right, with #2 being the difficult part...
>
> One possibility for that would be to add a new "acceptor" pipe in the
> clnt?? dirs. Teach gssd to write the acceptor name to that pipe
> before doing the regular downcall.
>
> The name written there could then be hung off of the clp. If userland
> fails to write the name to the pipe before doing the downcall, we'll
> simply do what we do today (use the cl_hostname).
>
> Adding a new pipe is a bit of a pain, but it does sidestep the problem
> of mismatched kernel and userland...

I guess you could look at gss_pipe_downcall(), pick a case that
currently reliably errors out, and use that.

E.g., the new downcall format could be designed to be > 1024 bytes, and
gssd could fall back on the old call if that failed.
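
Roughly, the userland side of that fallback might look like the sketch below. Everything named here is hypothetical: the downcall_writer callback stands in for the write to the gssd pipe, the two *_kernel_write() functions only simulate an old and a new kernel, and EFBIG is an assumed error code for the oversized case. The 1024-byte limit is the figure mentioned above.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Size limit an old kernel is assumed to enforce on a downcall. */
#define OLD_DOWNCALL_MAX 1024

/* Hypothetical writer callback standing in for write(2) on the gssd pipe. */
typedef ssize_t (*downcall_writer)(const void *buf, size_t len);

/* Simulated old kernel: rejects anything larger than OLD_DOWNCALL_MAX. */
static ssize_t old_kernel_write(const void *buf, size_t len)
{
    (void)buf;
    if (len > OLD_DOWNCALL_MAX) {
        errno = EFBIG; /* assumed error code for this sketch */
        return -1;
    }
    return (ssize_t)len;
}

/* Simulated new kernel: accepts both formats. */
static ssize_t new_kernel_write(const void *buf, size_t len)
{
    (void)buf;
    return (ssize_t)len;
}

/*
 * Try the extended (oversized) downcall first and retry with the legacy
 * format if the kernel rejects it.
 * Returns 1 if the extended format was accepted, 0 if the legacy format
 * was used, -1 if both writes failed.
 */
static int send_downcall(downcall_writer write_fn,
                         const void *ext, size_t ext_len,
                         const void *legacy, size_t legacy_len)
{
    if (write_fn(ext, ext_len) == (ssize_t)ext_len)
        return 1;
    if (write_fn(legacy, legacy_len) == (ssize_t)legacy_len)
        return 0;
    return -1;
}
```

The point of padding the new format past the old limit is that an old kernel fails the write outright instead of misparsing it, so gssd gets an unambiguous signal to retry with the old format.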

--b.

2014-04-08 12:42:13

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Apr 8, 2014, at 8:35, J. Bruce Fields <[email protected]> wrote:

> On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
>> I've recently been hunting down some problems with delegation handling
>> and have run across a problem with how the client authenticates CB_COMPOUND
>> requests. I could use some advice on how best to fix it.
>>
>> Specifically, check_gss_callback_principal() tries to look up the
>> callback client and then tries to compare the ticket in it against the
>> clp->cl_hostname:
>>
>> /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
>>
>> if (memcmp(p, "nfs@", 4) != 0)
>> return 0;
>> p += 4;
>> if (strcmp(p, clp->cl_hostname) != 0)
>> return 0;
>> return 1;
>>
>> The problem is that there is no guarantee that those hostnames will be
>> the same. If, for instance, I mount "foo:/" and the SPN is
>> "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
>> request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
>> server.
>
> It sounds like a bug to me that the mount is succeeding without the name
> matching.
>
> The security provided by krb5 is much weaker if we don't check that the
> name provided on the commandline matches what the server authenticates
> as.

Where would the client find that information? I don't think that rpc.gssd passes that information down to us.

>> Now that we try to use krb5 on the callback channel even when sec=sys
>> is specified, this is very problematic.
>
> And similarly I think the attempt to opportunistically use krb5 for
> state management should fail and fall back on auth_sys if the server's
> name doesn't match.
>
>> I think that the ideal thing would be to stash the SPN that we use to
>> do the SETCLIENTID call and use that in the comparison above.
>> Unfortunately, the rpc_cred doesn't really seem to carry this info and
>> I don't see where we get enough information in the rpc.gssd downcall to
>> figure out what that SPN should be.
>>
>> Anyone have thoughts or should we just remove the above check until we
>> come up with a better way to do this?
>>
>> [1]: there's another bug that can cause the client to send a bogus
>> reply instead of dropping the request as intended, but that's
>> relatively simple to fix.
>
> So I believe the matching really is a requirement and that it would be
> wrong to weaken it.
>
> It sounds like there's also a server bug here if it's giving out
> delegations to a client that isn't responding to callbacks.
>
> --b.

_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]

2014-04-08 18:01:28

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 2014-04-08 at 13:30 -0400, Jeff Layton wrote:
> On Tue, 08 Apr 2014 13:27:01 -0400
> Simo Sorce <[email protected]> wrote:
>
> > On Tue, 2014-04-08 at 12:44 -0400, Jeff Layton wrote:
> > >
> > > I think that's what happens. We only fall back to using AUTH_SYS if
> > > nfs_create_rpc_client returns -EINVAL. In the event that the security
> > > negotiation fails, we should get back -EACCES and that should bubble
> > > back up to userland.
> > >
> > > The real problem is that gssd (and also the krb5 libs themselves) will
> > > try to canonicalize the name. The resulting host portion of the SPN
> > > may bear no resemblance to the hostname in the device string. In fact,
> > > if you mount using an IP address then you're pretty much SOL.
> >
> > If you mount by IP do you really care about krb5 ? Probably not, maybe
> > that's a clue we should not even try ...
> >
>
> It's certainly possible that someone passes in an IP address but then
> says "-o sec=krb5". It has worked in the past, so it's hard to know
> whether and how many people actually depend on it.
>
> > > I haven't tried it yet, but it looks reasonably trivial to fix gssd
> > > not to bother with DNS at all and just rely on the hostname. That
> > > won't stop the krb5 libs from doing their canonicalization though. I'm
> > > not sure if there's some way to ask the krb5 libs to avoid doing that.
> >
> > [libdefaults]
> > rdns = false
> >
> > And I think we change the default to false in Fedora/RHEL lately ...
> >
> > Simo.
> >
>
> That's a step in the right direction, but I think that the rdns just
> makes it skip the reverse lookup. AFAIK, the MIT libs will still do
> getaddrinfo and scrape out the ai_canonname and use that in preference
> to the hostname you pass in.

That should happen only if you are using a CNAME, not for an A name.

We can open bugs if this is not the case though.

Simo.

--
Simo Sorce * Red Hat, Inc * New York

2014-04-08 15:13:55

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, Apr 08, 2014 at 11:04:20AM -0400, Jeff Layton wrote:
> On Tue, 8 Apr 2014 10:46:52 -0400
> Dr Fields James Bruce <[email protected]> wrote:
>
> > On Tue, Apr 08, 2014 at 10:23:37AM -0400, Trond Myklebust wrote:
> > >
> > > On Apr 8, 2014, at 10:03, J. Bruce Fields <[email protected]> wrote:
> > >
> > > > On Tue, Apr 08, 2014 at 09:49:03AM -0400, Jeff Layton wrote:
> > > >> On Tue, 8 Apr 2014 08:35:01 -0400
> > > >> "J. Bruce Fields" <[email protected]> wrote:
> > > >>
> > > >>> On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> > > >>>> I've recently been hunting down some problems with delegation handling
> > > >>>> and have run across a problem with how the client authenticates CB_COMPOUND
> > > >>>> requests. I could use some advice on how best to fix it.
> > > >>>>
> > > >>>> Specifically, check_gss_callback_principal() tries to look up the
> > > >>>> callback client and then tries to compare the ticket in it against the
> > > >>>> clp->cl_hostname:
> > > >>>>
> > > >>>> /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
> > > >>>>
> > > >>>> if (memcmp(p, "nfs@", 4) != 0)
> > > >>>> return 0;
> > > >>>> p += 4;
> > > >>>> if (strcmp(p, clp->cl_hostname) != 0)
> > > >>>> return 0;
> > > >>>> return 1;
> > > >>>>
> > > >>>> The problem is that there is no guarantee that those hostnames will be
> > > >>>> the same. If, for instance, I mount "foo:/" and the SPN is
> > > >>>> "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
> > > >>>> request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> > > >>>> server.
> > > >>>
> > > >>> It sounds like a bug to me that the mount is succeeding without the name
> > > >>> matching.
> > > >>>
> > > >>> The security provided by krb5 is much weaker if we don't check that the
> > > >>> name provided on the commandline matches what the server authenticates
> > > >>> as.
> > > >>>
> > > >>
> > > >> The logic in gssd for this is pretty awful.
> > > >>
> > > >> It will basically trust DNS if there is no '.' in the hostname that was
> > > >> used at mount time. That'll make it take the address and
> > > >> reverse-resolve it.
> > > >
> > > > Argh, OK, I guess this is the compromise Simo made in "Avoid DNS reverse
> > > > resolution for server names (take 3)".
> > > >
> > > >> We could add yet another band-aid and make it so that DNS is never
> > > >> trusted. I'll note that for cifs, we took that route. You have to mount
> > > >> the canonical name of the server in order to use krb5.
> > > >
> > > > I wish we could do that, but I suppose it's too harsh to break
> > > > already-working fstabs. Maybe we could phase it in somehow.
> > > >
> > > >>>> Now that we try to use krb5 on the callback channel even when sec=sys
> > > >>>> is specified, this is very problematic.
> > > >>>
> > > >>> And similarly I think the attempt to opportunistically use krb5 for
> > > >>> state management should fail and fall back on auth_sys if the server's
> > > >>> name doesn't match.
> > > >>>
> > >
> > > This suggestion makes no sense to me at all. How does it help to fall back to using weak security when the strong security checks fail?
> >
> > It'd fix this particular problem.
> >
> > But, I don't know, I'm frankly confused about our security design for
> > the NFSv4 state.
> >
> > When we insist on krb5 (and checked the server name correctly), and
> > failed without it, then I feel like I understand what we're doing. Once
> > we start trying it and then falling back (as I understand happens for
> > the krb5 state in the auth_sys case) I get confused.
> >
> > > >> Like Trond pointed out, the problem is that gssd doesn't give us that
> > > >> info currently. We could change it to do that of course, but that
> > > >> basically means revving the downcall.
> > > >
> > > > It might be easier to rev the upcall so that the kernel could ask gssd
> > > > to do strict checking? Since it's just a bunch of name=value pairs it
> > > > shouldn't be a huge pain to revise.
> > >
> > > So what would trigger the kernel to ask for strict checking? Do we add a mount option that says “fail if the server doesn’t authenticate itself”? That would be hard to combine with security negotiation, since it only makes sense for RPCSEC_GSS authentication.
> >
> > I was thinking about only doing it in the state-establishment case.
> > (Since we won't know how to authenticate the callbacks in that case.)
> >
> > But that would screw up krb5 mounts, I guess, never mind.
> >
> > Using a fqdn implicitly requests strict checking so a mount option would
> > seem redundant.
> >
>
> So I guess we have two options to fix this:
>
> 1) Change gssd to require the canonical fqdn and not rely on name
> resolution. Unfortunately, I think the MIT krb5 libs will still
> canonicalize the hostnames by default, so this might not actually fix
> anything. See:
>
> http://web.mit.edu/kerberos/krb5-devel/doc/admin/princ_dns.html
>
> ...or...
>
> 2) Loosen or somehow fix the check in check_gss_callback_principal().
> One possibility might be to do a dns_resolver upcall for the host
> portion of the SPN, and then compare the address with the server's
> address. Ugly, but since we already trust DNS implicitly I guess it's
> no less secure...

I thought Kerberos wasn't supposed to require trust in DNS. So I feel
confused. Cc'ing Simo in hopes he can set us all straight.

--b.

2014-04-08 17:30:24

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Apr 8, 2014, at 12:40, Dr Fields James Bruce <[email protected]> wrote:
>
> On Tue, Apr 08, 2014 at 12:22:51PM -0400, Trond Myklebust wrote:
>> How is it not better just to rip out that hostname comparison in the
>> back channel?
>
> Rip it out entirely?
>
> At that point anyone who can get a credential in the right realm can
> send a recall. RFC made this requirement to prevent that.
>

OK. Let's examine what RFC3530 and RFC3530bis actually say here:

Regardless of what security mechanism under RPCSEC_GSS is being used,
the NFS server MUST identify itself in GSS-API via a
GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE
names are of the form:

service@hostname

For NFS, the "service" element is

nfs

Implementations of security mechanisms will convert nfs@hostname to
various different forms. For Kerberos V5, the following form is
RECOMMENDED:

nfs/hostname

For Kerberos V5, nfs/hostname would be a server principal in the
Kerberos Key Distribution Center database. This is the same
principal the client acquired a GSS-API context for when it issued
the SETCLIENTID operation, therefore, the realm name for the server
principal must be the same for the callback as it was for the
SETCLIENTID.

So as I read the above, technically the client is supposed to read off the principal name that the server uses to authenticate itself to the SETCLIENTID and check that in the callback. Am I wrong?

If so, then the steps are:

1) Modify process_krb5_upcall() and have the call to gss_inquire_context() also request the context acceptor name
2) Modify the rpc.gssd downcall to pass that name to the kernel in some format that allows us to retrieve it in the SETCLIENTID call.
3) Modify the comparison in check_gss_callback_principal()
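
Since the RFC text gives two spellings of the same identity ("nfs@hostname" for GSS-API, "nfs/hostname" for Kerberos V5), any comparison in step 3 needs both sides in one form. Below is a minimal sketch of that normalization, assuming the acceptor name arrives as a Kerberos-style string such as "nfs/foo.bar.baz@EXAMPLE.COM". That input format is an assumption; the format gssd would actually pass down is not settled here, and krb5_to_hostbased() is a hypothetical helper.

```c
#include <string.h>

/*
 * Convert "service/hostname[@REALM]" into "service@hostname".
 * Returns 0 on success, -1 if the input is malformed or out is too small.
 * Escaped characters in principal names are not handled in this sketch.
 */
static int krb5_to_hostbased(const char *princ, char *out, size_t outlen)
{
    const char *slash = strchr(princ, '/');
    const char *realm;
    size_t svc_len, host_len;

    /* Need a non-empty service component followed by '/'. */
    if (slash == NULL || slash == princ)
        return -1;

    /* The realm, if present, starts at the first '@' after the slash. */
    realm = strchr(slash + 1, '@');
    svc_len = (size_t)(slash - princ);
    host_len = realm ? (size_t)(realm - (slash + 1)) : strlen(slash + 1);

    /* Need a non-empty host and room for "svc@host" plus the NUL. */
    if (host_len == 0 || svc_len + 1 + host_len + 1 > outlen)
        return -1;

    memcpy(out, princ, svc_len);
    out[svc_len] = '@';
    memcpy(out + svc_len + 1, slash + 1, host_len);
    out[svc_len + 1 + host_len] = '\0';
    return 0;
}
```

Dropping the realm here is a simplification; the RFC text above says the realm must be the same for the callback as for the SETCLIENTID, so a real check would have to account for it as well.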

_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]

2014-04-08 14:41:53

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 8 Apr 2014 10:22:00 -0400
Jeff Layton <[email protected]> wrote:

> On Tue, 8 Apr 2014 10:03:33 -0400
> "J. Bruce Fields" <[email protected]> wrote:
>
> > On Tue, Apr 08, 2014 at 09:49:03AM -0400, Jeff Layton wrote:
> > > On Tue, 8 Apr 2014 08:35:01 -0400
> > > "J. Bruce Fields" <[email protected]> wrote:
> > >
> > > > On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> > > > > I've recently been hunting down some problems with delegation handling
> > > > > and have run across a problem with how the client authenticates CB_COMPOUND
> > > > > requests. I could use some advice on how best to fix it.
> > > > >
> > > > > Specifically, check_gss_callback_principal() tries to look up the
> > > > > callback client and then tries to compare the ticket in it against the
> > > > > clp->cl_hostname:
> > > > >
> > > > > /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
> > > > >
> > > > > if (memcmp(p, "nfs@", 4) != 0)
> > > > > return 0;
> > > > > p += 4;
> > > > > if (strcmp(p, clp->cl_hostname) != 0)
> > > > > return 0;
> > > > > return 1;
> > > > >
> > > > > The problem is that there is no guarantee that those hostnames will be
> > > > > the same. If, for instance, I mount "foo:/" and the SPN is
> > > > > "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
> > > > > request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> > > > > server.
> > > >
> > > > It sounds like a bug to me that the mount is succeeding without the name
> > > > matching.
> > > >
> > > > The security provided by krb5 is much weaker if we don't check that the
> > > > name provided on the commandline matches what the server authenticates
> > > > as.
> > > >
> > >
> > > The logic in gssd for this is pretty awful.
> > >
> > > It will basically trust DNS if there is no '.' in the hostname that was
> > > used at mount time. That'll make it take the address and
> > > reverse-resolve it.
> >
> > Argh, OK, I guess this is the compromise Simo made in "Avoid DNS reverse
> > resolution for server names (take 3)".
> >
> > > We could add yet another band-aid and make it so that DNS is never
> > > trusted. I'll note that for cifs, we took that route. You have to mount
> > > the canonical name of the server in order to use krb5.
> >
> > I wish we could do that, but I suppose it's too harsh to break
> > already-working fstabs. Maybe we could phase it in somehow.
> >
> > > > > Now that we try to use krb5 on the callback channel even when sec=sys
> > > > > is specified, this is very problematic.
> > > >
> > > > And similarly I think the attempt to opportunistically use krb5 for
> > > > state management should fail and fall back on auth_sys if the server's
> > > > name doesn't match.
> > > >
> > >
> > > Like Trond pointed out, the problem is that gssd doesn't give us that
> > > info currently. We could change it to do that of course, but that
> > > basically means revving the downcall.
> >
> > It might be easier to rev the upcall so that the kernel could ask gssd
> > to do strict checking? Since it's just a bunch of name=value pairs it
> > shouldn't be a huge pain to revise.
> >
>
> Yeah, that might work, but it will definitely break anyone who's not
> mounting the canonical server name today.
>
> OTOH, if we're going to do that, then we don't really need to rev the
> upcall. Just fix gssd to do this strict checking by default (and maybe
> add a command-line option to allow it to trust DNS like it does today).
>
> > > > > I think that the ideal thing would be to stash the SPN that we use to
> > > > > do the SETCLIENTID call and use that in the comparison above.
> > > > > Unfortunately, the rpc_cred doesn't really seem to carry this info and
> > > > > I don't see where we get enough information in the rpc.gssd downcall to
> > > > > figure out what that SPN should be.
> > > > >
> > > > > Anyone have thoughts or should we just remove the above check until we
> > > > > come up with a better way to do this?
> > > > >
> > > > > [1]: there's another bug that can cause the client to send a bogus
> > > > > reply instead of dropping the request as intended, but that's
> > > > > relatively simple to fix.
> > > >
> > > > So I believe the matching really is a requirement and that it would be
> > > > wrong to weaken it.
> > > >
> > > > It sounds like there's also a server bug here if it's giving out
> > > > delegations to a client that isn't responding to callbacks.
> > > >
> > >
> > > The server uses CB_NULL requests to probe the callback port, and those
> > > aren't affected by this problem. Worse, since CB_NULL requests don't
> > > even contain the callback_ident, we can't even use them to hook up the
> > > nfs_client with the SPN used in them.
> >
> > Ah, got it.
> >
> > The server should still stop delegations as soon as a CB_RECALL times
> > out, though, so at least the problem should clear up after that?
> >
> > --b.
>
> Yes, that seems to be what happens eventually. What I generally see is
> that we get a set of read delegations from the server, eventually the
> server sends a bunch of CB_RECALL requests, which are "dropped" (sort
> of -- I have a patch to really make those be dropped). Eventually ~60s
> later, the client returns the delegations.
>
> I'm a little unclear on what eventually triggers the DELEGRETURNs --
> maybe the server takes down the callback channel? I need to look a
> little closer at that piece...
>

Ahh and FWIW...

What happens is that the RENEW gets a CB_PATH_DOWN error, and the
client then sends back all of the delegations.
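
As a toy model of that behavior (not the real client code; the struct and function names are made up for illustration, and only the NFS4ERR_CB_PATH_DOWN value comes from RFC 3530):

```c
/* Error code defined in RFC 3530. */
#define NFS4ERR_CB_PATH_DOWN 10048

/* Hypothetical, heavily simplified view of the client's delegation state. */
struct client_state_sketch {
    int delegations_held; /* delegations the client currently holds */
    int returned;         /* delegations handed back via DELEGRETURN */
};

/*
 * React to the status of a RENEW: on NFS4ERR_CB_PATH_DOWN, give back
 * every delegation. Returns the number of delegations returned.
 */
static int handle_renew_status(struct client_state_sketch *st, int status)
{
    int n;

    if (status != NFS4ERR_CB_PATH_DOWN)
        return 0;

    n = st->delegations_held;
    st->returned += n;
    st->delegations_held = 0;
    return n;
}
```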

> In any case, now that we have all sorts of server operations blocking on
> delegation callbacks this turns into a bit of a mess on the server and
> contributes to some softlockups that I'm seeing there.
>

--
Jeff Layton <[email protected]>

2014-04-08 22:45:02

by Frank Filz

Subject: RE: v4.0 CB_COMPOUND authentication failures

> On Tue, 2014-04-08 at 10:39 -0700, Frank Filz wrote:
> > > > If you mount by IP do you really care about krb5 ? Probably not,
> > > > maybe that's a clue we should not even try ...
> > > >
> > >
> > > It's certainly possible that someone passes in an IP address but
> > > then says "-o sec=krb5". It has worked in the past, so it's hard to
> > > know whether and how many people actually depend on it.
> >
> > Mount by ip is sometimes used with clustered servers, especially when
> > they have all their IP addresses in the DNS record. Even using a FQDN
> > that just specifies that one IP address probably won't work then
> > (since it probably is NOT the hostname used in the server credential).
>
> I do not understand this, using an IP address or a name that resolves to said IP
> address is the same.

But a name can resolve to a set of IP addresses, often in round-robin fashion. It would cause havoc with an NFS server on a cluster if a v3 client had locks on 192.168.0.10 (resolved from server.mycompany.com), then rebooted, resolved to 192.168.0.9, and sent SM_NOTIFY there. At least that isn't an issue with v4...

Or worse, the TCP connection is dropped due to inactivity, and a new connection is made to a different server node. But this could still be an issue with v4...

The workaround has been to specify a specific IP address.

Now I haven't done enough with krb5 recently to know how well that works...

I guess I'm just offering one reason IP addresses might have been specified on mount...

> As long as the server has a keytab with a key in that name it should just work
> fine, even if the hostname on the actual machine is different.
>
> If this does not work it is a bug in rpc.svcgssd/gss-proxy, and should be fixed,
> not something to try to work around using IP addresses.

Frank

2014-04-08 14:22:12

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 8 Apr 2014 10:03:33 -0400
"J. Bruce Fields" <[email protected]> wrote:

> On Tue, Apr 08, 2014 at 09:49:03AM -0400, Jeff Layton wrote:
> > On Tue, 8 Apr 2014 08:35:01 -0400
> > "J. Bruce Fields" <[email protected]> wrote:
> >
> > > On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> > > > I've recently been hunting down some problems with delegation handling
> > > > and have run across a problem with how the client authenticates CB_COMPOUND
> > > > requests. I could use some advice on how best to fix it.
> > > >
> > > > Specifically, check_gss_callback_principal() tries to look up the
> > > > callback client and then tries to compare the ticket in it against the
> > > > clp->cl_hostname:
> > > >
> > > > /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
> > > >
> > > > if (memcmp(p, "nfs@", 4) != 0)
> > > > return 0;
> > > > p += 4;
> > > > if (strcmp(p, clp->cl_hostname) != 0)
> > > > return 0;
> > > > return 1;
> > > >
> > > > The problem is that there is no guarantee that those hostnames will be
> > > > the same. If, for instance, I mount "foo:/" and the SPN is
> > > > "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
> > > > request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> > > > server.
> > >
> > > It sounds like a bug to me that the mount is succeeding without the name
> > > matching.
> > >
> > > The security provided by krb5 is much weaker if we don't check that the
> > > name provided on the commandline matches what the server authenticates
> > > as.
> > >
> >
> > The logic in gssd for this is pretty awful.
> >
> > It will basically trust DNS if there is no '.' in the hostname that was
> > used at mount time. That'll make it take the address and
> > reverse-resolve it.
>
> Argh, OK, I guess this is the compromise Simo made in "Avoid DNS reverse
> resolution for server names (take 3)".
>
> > We could add yet another band-aid and make it so that DNS is never
> > trusted. I'll note that for cifs, we took that route. You have to mount
> > the canonical name of the server in order to use krb5.
>
> I wish we could do that, but I suppose it's too harsh to break
> already-working fstabs. Maybe we could phase it in somehow.
>
> > > > Now that we try to use krb5 on the callback channel even when sec=sys
> > > > is specified, this is very problematic.
> > >
> > > And similarly I think the attempt to opportunistically use krb5 for
> > > state management should fail and fall back on auth_sys if the server's
> > > name doesn't match.
> > >
> >
> > Like Trond pointed out, the problem is that gssd doesn't give us that
> > info currently. We could change it to do that of course, but that
> > basically means revving the downcall.
>
> It might be easier to rev the upcall so that the kernel could ask gssd
> to do strict checking? Since it's just a bunch of name=value pairs it
> shouldn't be a huge pain to revise.
>

Yeah, that might work, but it will definitely break anyone who's not
mounting the canonical server name today.

OTOH, if we're going to do that, then we don't really need to rev the
upcall. Just fix gssd to do this strict checking by default (and maybe
add a command-line option to allow it to trust DNS like it does today).

> > > > I think that the ideal thing would be to stash the SPN that we use to
> > > > do the SETCLIENTID call and use that in the comparison above.
> > > > Unfortunately, the rpc_cred doesn't really seem to carry this info and
> > > > I don't see where we get enough information in the rpc.gssd downcall to
> > > > figure out what that SPN should be.
> > > >
> > > > Anyone have thoughts or should we just remove the above check until we
> > > > come up with a better way to do this?
> > > >
> > > > [1]: there's another bug that can cause the client to send a bogus
> > > > reply instead of dropping the request as intended, but that's
> > > > relatively simple to fix.
> > >
> > > So I believe the matching really is a requirement and that it would be
> > > wrong to weaken it.
> > >
> > > It sounds like there's also a server bug here if it's giving out
> > > delegations to a client that isn't responding to callbacks.
> > >
> >
> > The server uses CB_NULL requests to probe the callback port, and those
> > aren't affected by this problem. Worse, since CB_NULL requests don't
> > even contain the callback_ident, we can't even use them to hook up the
> > nfs_client with the SPN used in them.
>
> Ah, got it.
>
> The server should still stop delegations as soon as a CB_RECALL times
> out, though, so at least the problem should clear up after that?
>
> --b.

Yes, that seems to be what happens eventually. What I generally see is
that we get a set of read delegations from the server, eventually the
server sends a bunch of CB_RECALL requests, which are "dropped" (sort
of -- I have a patch to really make those be dropped). Eventually ~60s
later, the client returns the delegations.

I'm a little unclear on what eventually triggers the DELEGRETURNs --
maybe the server takes down the callback channel? I need to look a
little closer at that piece...

In any case, now that we have all sorts of server operations blocking on
delegation callbacks this turns into a bit of a mess on the server and
contributes to some softlockups that I'm seeing there.

--
Jeff Layton <[email protected]>

2014-04-08 14:47:21

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, Apr 08, 2014 at 10:41:45AM -0400, Jeff Layton wrote:
> On Tue, 8 Apr 2014 10:22:00 -0400
> Jeff Layton <[email protected]> wrote:
>
> > On Tue, 8 Apr 2014 10:03:33 -0400
> > "J. Bruce Fields" <[email protected]> wrote:
> >
> > > On Tue, Apr 08, 2014 at 09:49:03AM -0400, Jeff Layton wrote:
> > > > On Tue, 8 Apr 2014 08:35:01 -0400
> > > > "J. Bruce Fields" <[email protected]> wrote:
> > > >
> > > > > On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> > > > > > I've recently been hunting down some problems with delegation handling
> > > > > > and have run across a problem with how the client authenticates CB_COMPOUND
> > > > > > requests. I could use some advice on how best to fix it.
> > > > > >
> > > > > > Specifically, check_gss_callback_principal() tries to look up the
> > > > > > callback client and then tries to compare the ticket in it against the
> > > > > > clp->cl_hostname:
> > > > > >
> > > > > > /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
> > > > > >
> > > > > > if (memcmp(p, "nfs@", 4) != 0)
> > > > > > return 0;
> > > > > > p += 4;
> > > > > > if (strcmp(p, clp->cl_hostname) != 0)
> > > > > > return 0;
> > > > > > return 1;
> > > > > >
> > > > > > The problem is that there is no guarantee that those hostnames will be
> > > > > > the same. If, for instance, I mount "foo:/" and the SPN is
> > > > > > "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
> > > > > > request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> > > > > > server.
> > > > >
> > > > > It sounds like a bug to me that the mount is succeeding without the name
> > > > > matching.
> > > > >
> > > > > The security provided by krb5 is much weaker if we don't check that the
> > > > > name provided on the commandline matches what the server authenticates
> > > > > as.
> > > > >
> > > >
> > > > The logic in gssd for this is pretty awful.
> > > >
> > > > It will basically trust DNS if there is no '.' in the hostname that was
> > > > used at mount time. That'll make it take the address and
> > > > reverse-resolve it.
> > >
> > > Argh, OK, I guess this is the compromise Simo made in "Avoid DNS reverse
> > > resolution for server names (take 3)".
> > >
> > > > We could add yet another band-aid and make it so that DNS is never
> > > > trusted. I'll note that for cifs, we took that route. You have to mount
> > > > the canonical name of the server in order to use krb5.
> > >
> > > I wish we could do that, but I suppose it's too harsh to break
> > > already-working fstabs. Maybe we could phase it in somehow.
> > >
> > > > > > Now that we try to use krb5 on the callback channel even when sec=sys
> > > > > > is specified, this is very problematic.
> > > > >
> > > > > And similarly I think the attempt to opportunistically use krb5 for
> > > > > state management should fail and fall back on auth_sys if the server's
> > > > > name doesn't match.
> > > > >
> > > >
> > > > Like Trond pointed out, the problem is that gssd doesn't give us that
> > > > info currently. We could change it to do that of course, but that
> > > > basically means revving the downcall.
> > >
> > > It might be easier to rev the upcall so that the kernel could ask gssd
> > > to do strict checking? Since it's just a bunch of name=value pairs it
> > > shouldn't be a huge pain to revise.
> > >
> >
> > Yeah, that might work, but it will definitely break anyone who's not
> > mounting the canonical server name today.
> >
> > OTOH, if we're going to do that, then we don't really need to rev the
> > upcall. Just fix gssd to do this strict checking by default (and maybe
> > add a command-line option to allow it to trust DNS like it does today).
> >
> > > > > > I think that the ideal thing would be to stash the SPN that we use to
> > > > > > do the SETCLIENTID call and use that in the comparison above.
> > > > > > Unfortunately, the rpc_cred doesn't really seem to carry this info and
> > > > > > I don't see where we get enough information in the rpc.gssd downcall to
> > > > > > figure out what that SPN should be.
> > > > > >
> > > > > > Anyone have thoughts or should we just remove the above check until we
> > > > > > come up with a better way to do this?
> > > > > >
> > > > > > [1]: there's another bug that can cause the client to send a bogus
> > > > > > reply instead of dropping the request as intended, but that's
> > > > > > relatively simple to fix.
> > > > >
> > > > > So I believe the matching really is a requirement and that it would be
> > > > > wrong to weaken it.
> > > > >
> > > > > It sounds like there's also a server bug here if it's giving out
> > > > > delegations to a client that isn't responding to callbacks.
> > > > >
> > > >
> > > > The server uses CB_NULL requests to probe the callback port, and those
> > > > aren't affected by this problem. Worse, since CB_NULL requests don't
> > > > even contain the callback_ident, we can't even use them to hook up the
> > > > nfs_client with the SPN used in them.
> > >
> > > Ah, got it.
> > >
> > > The server should still stop delegations as soon as a CB_RECALL times
> > > out, though, so at least the problem should clear up after that?
> > >
> > > --b.
> >
> > Yes, that seems to be what happens eventually. What I generally see is
> > that we get a set of read delegations from the server, eventually the
> > server sends a bunch of CB_RECALL requests, which are "dropped" (sort
> > of -- I have a patch to really make those be dropped). Eventually ~60s
> > later, the client returns the delegations.
> >
> > I'm a little unclear on what eventually triggers the DELEGRETURNs --
> > maybe the server takes down the callback channel? I need to look a
> > little closer at that piece...
> >
>
> Ahh and FWIW...
>
> What happens is that the RENEW gets a CB_PATH_DOWN error, and the
> client then sends back all of the delegations.

OK, great, so that part is all working as it should.

--b.

2014-04-08 19:01:29

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Apr 8, 2014, at 14:52, Simo Sorce <[email protected]> wrote:

> On Tue, 2014-04-08 at 14:11 -0400, Dr Fields James Bruce wrote:
>> On Tue, Apr 08, 2014 at 02:08:14PM -0400, Simo Sorce wrote:
>>> On Tue, 2014-04-08 at 14:04 -0400, Jeff Layton wrote:
>>>> On Tue, 08 Apr 2014 14:01:15 -0400
>>>> Simo Sorce <[email protected]> wrote:
>>>>
>>>>> On Tue, 2014-04-08 at 13:30 -0400, Jeff Layton wrote:
>>>>>> On Tue, 08 Apr 2014 13:27:01 -0400
>>>>>> Simo Sorce <[email protected]> wrote:
>>>>>>
>>>>>>> On Tue, 2014-04-08 at 12:44 -0400, Jeff Layton wrote:
>>>>>>>>
>>>>>>>> I think that's what happens. We only fall back to using AUTH_SYS if
>>>>>>>> nfs_create_rpc_client returns -EINVAL. In the event that the security
>>>>>>>> negotiation fails, we should get back -EACCES and that should bubble
>>>>>>>> back up to userland.
>>>>>>>>
>>>>>>>> The real problem is that gssd (and also the krb5 libs themselves) will
>>>>>>>> try to canonicalize the name. The resulting host portion of the SPN
>>>>>>>> may bear no resemblance to the hostname in the device string. In fact,
>>>>>>>> if you mount using an IP address then you're pretty much SOL.
>>>>>>>
>>>>>>> If you mount by IP do you really care about krb5 ? Probably not, maybe
>>>>>>> that's a clue we should not even try ...
>>>>>>>
>>>>>>
>>>>>> It's certainly possible that someone passes in an IP address but then
>>>>>> says "-o sec=krb5". It has worked in the past, so it's hard to know
>>>>>> whether and how many people actually depend on it.
>>>>>>
>>>>>>>> I haven't tried it yet, but it looks reasonably trivial to fix gssd
>>>>>>>> not to bother with DNS at all and just rely on the hostname. That
>>>>>>>> won't stop the krb5 libs from doing their canonicalization though. I'm
>>>>>>>> not sure if there's some way to ask the krb5 libs to avoid doing that.
>>>>>>>
>>>>>>> [libdefaults]
>>>>>>> rdns = false
>>>>>>>
>>>>>>> And I think we changed the default to false in Fedora/RHEL lately ...
>>>>>>>
>>>>>>> Simo.
>>>>>>>
>>>>>>
>>>>>> That's a step in the right direction, but I think that the rdns just
>>>>>> makes it skip the reverse lookup. AFAIK, the MIT libs will still do
>>>>>> getaddrinfo and scrape out the ai_canonname and use that in preference
>>>>>> to the hostname you pass in.
>>>>>
>>>>> That should happen only if you are using a CNAME, not for an A name.
>>>>>
>>>>> We can open bugs if this is not the case though.
>>>>>
>>>>
>>>> That's still a problem for us then. The current code tries to compare
>>>> the host portion of the device string to the SPN that we get in the
>>>> callback request. If they don't match, it fails.
>>>>
>>>> I think what we need to do is fix this the right way -- make rpc.gssd
>>>> pass down the acceptor name with the downcall.
>>>
>>> Why do you need the comparison at all, pardon my ignorance, I do not
>>> know very well what its purpose is.
>>
>> The NFS client wants to verify that a callback came from the server, so
>> it needs to know who it originally authenticated to.
>
> As Jeff said, the only good way at this point would be to have rpc.gssd
> pass down the acceptor name after it is done with the gssapi calls.
>
> Note that this may still fail, especially in clustered environments where
> servers have multiple credentials they can answer with (due to
> responding with multiple names). Unless the server is careful in always
> using the principal the client got tickets for when it calls back.

Anything else would be a protocol violation afaics. Please see the quote from RFC3530 that I sent out earlier in this thread.

> Although the best solution is to quickly deprecate 4.0 callbacks and try
> as hard as possible to move on. 4.0 callbacks are just broken.

We have yet to find a volunteer to add RPCSEC_GSS-authenticated callback support to 4.1. :-(

_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]

2014-04-08 18:45:07

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Apr 8, 2014, at 14:24, Jeff Layton <[email protected]> wrote:

> On Tue, 8 Apr 2014 14:03:04 -0400
> Trond Myklebust <[email protected]> wrote:
>
>>
>> On Apr 8, 2014, at 13:55, Jeff Layton <[email protected]> wrote:
>>
>> As far as I can tell, the downcall can be extended. Even the security context is an opaque, so its length is known. If we wanted to append a field after that, we could do so without affecting backward compatibility.
>>
>
> Yeah, I think you're right. It looks like gss_pipe_downcall will just
> ignore stuff that trails the security context. I'll have a look and
> see whether tacking a new field on is feasible.
>
> So in nfs4_proc_setclientid after the call, we can add some code that
> copies the new acceptor field out of gss_cred->gss_cl_ctx, and attaches
> it to a new field in the nfs_client. Alternately, I wonder if we could
> get away with just replacing the clp->cl_hostname with that value?

I don't think we want to replace clp->cl_hostname. If someone wants to play around with the '-p' option in rpc.svcgssd, then we may end up with some rather strange hostnames on the client...

_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]

2014-04-08 17:29:06

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 2014-04-08 at 12:44 -0400, Jeff Layton wrote:
>
> I think that's what happens. We only fall back to using AUTH_SYS if
> nfs_create_rpc_client returns -EINVAL. In the event that the security
> negotiation fails, we should get back -EACCES and that should bubble
> back up to userland.
>
> The real problem is that gssd (and also the krb5 libs themselves) will
> try to canonicalize the name. The resulting host portion of the SPN
> may bear no resemblance to the hostname in the device string. In fact,
> if you mount using an IP address then you're pretty much SOL.

If you mount by IP do you really care about krb5 ? Probably not, maybe
that's a clue we should not even try ...

> I haven't tried it yet, but it looks reasonably trivial to fix gssd
> not to bother with DNS at all and just rely on the hostname. That
> won't stop the krb5 libs from doing their canonicalization though. I'm
> not sure if there's some way to ask the krb5 libs to avoid doing that.

[libdefaults]
rdns = false

And I think we changed the default to false in Fedora/RHEL lately ...

Simo.

--
Simo Sorce * Red Hat, Inc * New York

2014-04-08 17:25:15

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 2014-04-08 at 11:13 -0400, Dr Fields James Bruce wrote:
> On Tue, Apr 08, 2014 at 11:04:20AM -0400, Jeff Layton wrote:
> > On Tue, 8 Apr 2014 10:46:52 -0400
> > Dr Fields James Bruce <[email protected]> wrote:
> >
> > > On Tue, Apr 08, 2014 at 10:23:37AM -0400, Trond Myklebust wrote:
> > > >
> > > > On Apr 8, 2014, at 10:03, J. Bruce Fields <[email protected]> wrote:
> > > >
> > > > > On Tue, Apr 08, 2014 at 09:49:03AM -0400, Jeff Layton wrote:
> > > > >> On Tue, 8 Apr 2014 08:35:01 -0400
> > > > >> "J. Bruce Fields" <[email protected]> wrote:
> > > > >>
> > > > >>> On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> > > > >>>> I've recently been hunting down some problems with delegation handling
> > > > >>>> and have run across a problem with how the client authenticates CB_COMPOUND
> > > > >>>> requests. I could use some advice on how best to fix it.
> > > > >>>>
> > > > >>>> Specifically, check_gss_callback_principal() tries to look up the
> > > > >>>> callback client and then tries to compare the ticket in it against the
> > > > >>>> clp->cl_hostname:
> > > > >>>>
> > > > >>>> /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
> > > > >>>>
> > > > >>>> if (memcmp(p, "nfs@", 4) != 0)
> > > > >>>> return 0;
> > > > >>>> p += 4;
> > > > >>>> if (strcmp(p, clp->cl_hostname) != 0)
> > > > >>>> return 0;
> > > > >>>> return 1;
> > > > >>>>
> > > > >>>> The problem is that there is no guarantee that those hostnames will be
> > > > >>>> the same. If, for instance, I mount "foo:/" and the SPN is
> > > > >>>> "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
> > > > >>>> request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> > > > >>>> server.
> > > > >>>
> > > > >>> It sounds like a bug to me that the mount is succeeding without the name
> > > > >>> matching.
> > > > >>>
> > > > >>> The security provided by krb5 is much weaker if we don't check that the
> > > > >>> name provided on the commandline matches what the server authenticates
> > > > >>> as.
> > > > >>>
> > > > >>
> > > > >> The logic in gssd for this is pretty awful.
> > > > >>
> > > > >> It will basically trust DNS if there is no '.' in the hostname that was
> > > > >> used at mount time. That'll make it take the address and
> > > > >> reverse-resolve it.
> > > > >
> > > > > Argh, OK, I guess this is the compromise Simo made in "Avoid DNS reverse
> > > > > resolution for server names (take 3)".
> > > > >
> > > > >> We could add yet another band-aid and make it so that DNS is never
> > > > >> trusted. I'll note that for cifs, we took that route. You have to mount
> > > > >> the canonical name of the server in order to use krb5.
> > > > >
> > > > > I wish we could do that, but I suppose it's too harsh to break
> > > > > already-working fstabs. Maybe we could phase it in somehow.
> > > > >
> > > > >>>> Now that we try to use krb5 on the callback channel even when sec=sys
> > > > >>>> is specified, this is very problematic.
> > > > >>>
> > > > >>> And similarly I think the attempt to opportunistically use krb5 for
> > > > >>> state management should fail and fall back on auth_sys if the server's
> > > > >>> name doesn't match.
> > > > >>>
> > > >
> > > > This suggestion makes no sense to me at all. How does it help to fall back to using weak security when the strong security checks fail?
> > >
> > > It'd fix this particular problem.
> > >
> > > But, I don't know, I'm frankly confused about our security design for
> > > the NFSv4 state.
> > >
> > > When we insist on krb5 (and checked the server name correctly), and
> > > failed without it, then I feel like I understand what we're doing. Once
> > > we start trying it and then falling back (as I understand happens for
> > > the krb5 state in the auth_sys case) I get confused.
> > >
> > > > >> Like Trond pointed out, the problem is that gssd doesn't give us that
> > > > >> info currently. We could change it to do that of course, but that
> > > > >> basically means revving the downcall.
> > > > >
> > > > > It might be easier to rev the upcall so that the kernel could ask gssd
> > > > > to do strict checking? Since it's just a bunch of name=value pairs it
> > > > > shouldn't be a huge pain to revise.
> > > >
> > > > So what would trigger the kernel to ask for strict checking? Do we add a mount option that says “fail if the server doesn’t authenticate itself”? That would be hard to combine with security negotiation, since it only makes sense for RPCSEC_GSS authentication.
> > >
> > > I was thinking about only doing it in the state-establishment case.
> > > (Since we won't know how to authenticate the callbacks in that case.)
> > >
> > > But that would screw up krb5 mounts, I guess, never mind.
> > >
> > > Using a fqdn implicitly requests strict checking so a mount option would
> > > seem redundant.
> > >
> >
> > So I guess we have two options to fix this:
> >
> > 1) Change gssd to require the canonical fqdn and not rely on name
> > resolution. Unfortunately, I think the MIT krb5 libs will still
> > canonicalize the hostnames by default, so this might not actually fix
> > anything. See:
> >
> > http://web.mit.edu/kerberos/krb5-devel/doc/admin/princ_dns.html
> >
> > ...or...
> >
> > 2) Loosen or somehow fix the check in check_gss_callback_principal().
> > One possibility might be to do a dns_resolver upcall for the host
> > portion of the SPN, and then compare the address with the server's
> > address. Ugly, but since we already trust DNS implicitly I guess it's
> > no less secure...
>
> I thought Kerberos wasn't supposed to require trust in DNS. So I feel
> confused. Cc'ing Simo in hopes he can set us all straight.

RFC4120 indeed warns against relying on DNS.
Here[1] is my old writeup about why it shouldn't be done, at least until
DNSSEC gets deployed, then things may change.

Simo.

[1] https://ssimo.org/blog/id_015.html
--
Simo Sorce * Red Hat, Inc * New York

2014-04-08 12:35:03

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> I've recently been hunting down some problems with delegation handling
> and have run across a problem with how the client authenticates CB_COMPOUND
> requests. I could use some advice on how best to fix it.
>
> Specifically, check_gss_callback_principal() tries to look up the
> callback client and then tries to compare the ticket in it against the
> clp->cl_hostname:
>
> /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
>
> if (memcmp(p, "nfs@", 4) != 0)
> return 0;
> p += 4;
> if (strcmp(p, clp->cl_hostname) != 0)
> return 0;
> return 1;
>
> The problem is that there is no guarantee that those hostnames will be
> the same. If, for instance, I mount "foo:/" and the SPN is
> "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
> request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> server.

It sounds like a bug to me that the mount is succeeding without the name
matching.

The security provided by krb5 is much weaker if we don't check that the
name provided on the commandline matches what the server authenticates
as.

> Now that we try to use krb5 on the callback channel even when sec=sys
> is specified, this is very problematic.

And similarly I think the attempt to opportunistically use krb5 for
state management should fail and fall back on auth_sys if the server's
name doesn't match.

> I think that the ideal thing would be to stash the SPN that we use to
> do the SETCLIENTID call and use that in the comparison above.
> Unfortunately, the rpc_cred doesn't really seem to carry this info and
> I don't see where we get enough information in the rpc.gssd downcall to
> figure out what that SPN should be.
>
> Anyone have thoughts or should we just remove the above check until we
> come up with a better way to do this?
>
> [1]: there's another bug that can cause the client to send a bogus
> reply instead of dropping the request as intended, but that's
> relatively simple to fix.

So I believe the matching really is a requirement and that it would be
wrong to weaken it.

It sounds like there's also a server bug here if it's giving out
delegations to a client that isn't responding to callbacks.

--b.
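
For readers following along, the comparison quoted above can be reproduced in a small userspace sketch. This is not the kernel function itself: the name callback_principal_matches() and the plain string argument standing in for clp->cl_hostname are assumptions made for illustration, and strncmp() is swapped in for the quoted memcmp() so that a short principal string is not read past its end.

```c
#include <assert.h>
#include <string.h>

/*
 * Userspace rendering of the quoted check (illustrative only).
 * "principal" is the name taken from the callback's GSS context and
 * "cl_hostname" stands in for clp->cl_hostname, i.e. the host part of
 * the mount device string. Returns 1 on a match, 0 otherwise.
 */
int callback_principal_matches(const char *principal, const char *cl_hostname)
{
	assert(principal != NULL && cl_hostname != NULL);

	/* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
	if (strncmp(principal, "nfs@", 4) != 0)
		return 0;

	/* The host part must equal the mounted name byte for byte */
	return strcmp(principal + 4, cl_hostname) == 0;
}
```

With a mount of "foo:/" and a server that authenticates as "nfs@foo.bar.baz", callback_principal_matches("nfs@foo.bar.baz", "foo") returns 0, which is the case where the CB_COMPOUND gets rejected.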

2014-04-08 17:58:27

by Frank Filz

Subject: RE: v4.0 CB_COMPOUND authentication failures

> > If you mount by IP do you really care about krb5 ? Probably not, maybe
> > that's a clue we should not even try ...
> >
>
> It's certainly possible that someone passes in an IP address but then says
> "-o sec=krb5". It has worked in the past, so it's hard to know whether and how
> many people actually depend on it.

Mount by ip is sometimes used with clustered servers, especially when they
have all their IP addresses in the DNS record. Even using a FQDN that just
specifies that one IP address probably won't work then (since it probably is
NOT the hostname used in the server credential).

Frank

2014-04-08 17:55:19

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 8 Apr 2014 13:30:21 -0400
Trond Myklebust <[email protected]> wrote:

>
> On Apr 8, 2014, at 12:40, Dr Fields James Bruce <[email protected]> wrote:
> >
> > On Tue, Apr 08, 2014 at 12:22:51PM -0400, Trond Myklebust wrote:
> >> How is it not better just to rip out that hostname comparison in the
> >> back channel?
> >
> > Rip it out entirely?
> >
> > At that point anyone who can get a credential in the right realm can
> > send a recall. RFC made this requirement to prevent that.
> >
>
> OK. Let's examine what RFC3530 and RFC3530bis actually say here:
>
> Regardless of what security mechanism under RPCSEC_GSS is being used,
> the NFS server MUST identify itself in GSS-API via a
> GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE
> names are of the form:
>
> service@hostname
>
> For NFS, the "service" element is
>
> nfs
>
> Implementations of security mechanisms will convert nfs@hostname to
> various different forms. For Kerberos V5, the following form is
> RECOMMENDED:
>
> nfs/hostname
>
> For Kerberos V5, nfs/hostname would be a server principal in the
> Kerberos Key Distribution Center database. This is the same
> principal the client acquired a GSS-API context for when it issued
> the SETCLIENTID operation, therefore, the realm name for the server
> principal must be the same for the callback as it was for the
> SETCLIENTID.
>
>
> So as I read the above, technically the client is supposed to read off the principal name that the server uses to authenticate itself to the SETCLIENTID and check that in the callback. Am I wrong?
>
> If so, then the steps are:
>
> 1) Modify process_krb5_upcall() and have the call to gss_inquire_context() also request the context acceptor name
> 2) Modify the rpc.gssd downcall to pass that name to the kernel in some format that allows us to retrieve it in the SETCLIENTID call.
> 3) Modify the comparison in check_gss_callback_principal()
>
>

Sounds about right, with #2 being the difficult part...

One possibility for that would be to add a new "acceptor" pipe in the
clnt?? dirs. Teach gssd to write the acceptor name to that pipe
before doing the regular downcall.

The name written there could then be hung off of the clp. If userland
fails to write the name to the pipe before doing the downcall, we'll
simply do what we do today (use the cl_hostname).

Adding a new pipe is a bit of a pain, but it does sidestep the problem
of mismatched kernel and userland...
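
A minimal userspace sketch of the fallback described above, assuming the acceptor name ends up in some new per-client field. The struct client_names type, the cl_acceptor field, and callback_principal_ok() are hypothetical names for illustration; only cl_hostname corresponds to something mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the relevant parts of struct nfs_client. */
struct client_names {
	const char *cl_hostname; /* host part of the mount device string */
	const char *cl_acceptor; /* hypothetical: name supplied by gssd, or NULL */
};

/* Returns 1 if the callback principal is acceptable, 0 otherwise. */
int callback_principal_ok(const struct client_names *clp, const char *principal)
{
	assert(clp != NULL && principal != NULL);

	/* If userland told us who accepted the context, require that exact name */
	if (clp->cl_acceptor != NULL)
		return strcmp(principal, clp->cl_acceptor) == 0;

	/* Otherwise do what we do today: compare against "nfs@" + cl_hostname */
	if (strncmp(principal, "nfs@", 4) != 0)
		return 0;
	return clp->cl_hostname != NULL &&
	       strcmp(principal + 4, clp->cl_hostname) == 0;
}
```

An older gssd that never supplies the name leaves cl_acceptor NULL, so behavior is unchanged for that combination of kernel and userland.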

--
Jeff Layton <[email protected]>

2014-04-08 13:49:10

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, 8 Apr 2014 08:35:01 -0400
"J. Bruce Fields" <[email protected]> wrote:

> On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> > I've recently been hunting down some problems with delegation handling
> > and have run across a problem with how the client authenticates CB_COMPOUND
> > requests. I could use some advice on how best to fix it.
> >
> > Specifically, check_gss_callback_principal() tries to look up the
> > callback client and then tries to compare the ticket in it against the
> > clp->cl_hostname:
> >
> > /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
> >
> > if (memcmp(p, "nfs@", 4) != 0)
> > return 0;
> > p += 4;
> > if (strcmp(p, clp->cl_hostname) != 0)
> > return 0;
> > return 1;
> >
> > The problem is that there is no guarantee that those hostnames will be
> > the same. If, for instance, I mount "foo:/" and the SPN is
> > "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
> > request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> > server.
>
> It sounds like a bug to me that the mount is succeeding without the name
> matching.
>
> The security provided by krb5 is much weaker if we don't check that the
> name provided on the commandline matches what the server authenticates
> as.
>

The logic in gssd for this is pretty awful.

It will basically trust DNS if there is no '.' in the hostname that was
used at mount time. That'll make it take the address and
reverse-resolve it.
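
As a rough sketch of the rule just described (not the actual gssd code), the decision boils down to a single test on the name given at mount time. The function name should_canonicalize_via_dns() is made up for illustration, and IP-literal mounts are not modeled here.

```c
#include <assert.h>
#include <string.h>

/*
 * Illustrative model of the heuristic: a name with no '.' is treated
 * as a short name and handed to DNS for canonicalization; a name that
 * contains a '.' is used as given.
 * Returns 1 if DNS would be trusted, 0 if the name is used verbatim.
 */
int should_canonicalize_via_dns(const char *mount_hostname)
{
	assert(mount_hostname != NULL);
	return strchr(mount_hostname, '.') == NULL;
}
```

So mounting "foo:/" leaves the resulting SPN up to whatever DNS returns, while mounting "foo.bar.baz:/" does not.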

We could add yet another band-aid and make it so that DNS is never
trusted. I'll note that for cifs, we took that route. You have to mount
the canonical name of the server in order to use krb5.

> > Now that we try to use krb5 on the callback channel even when sec=sys
> > is specified, this is very problematic.
>
> And similarly I think the attempt to opportunistically use krb5 for
> state management should fail and fall back on auth_sys if the server's
> name doesn't match.
>

Like Trond pointed out, the problem is that gssd doesn't give us that
info currently. We could change it to do that of course, but that
basically means revving the downcall.

> > I think that the ideal thing would be to stash the SPN that we use to
> > do the SETCLIENTID call and use that in the comparison above.
> > Unfortunately, the rpc_cred doesn't really seem to carry this info and
> > I don't see where we get enough information in the rpc.gssd downcall to
> > figure out what that SPN should be.
> >
> > Anyone have thoughts or should we just remove the above check until we
> > come up with a better way to do this?
> >
> > [1]: there's another bug that can cause the client to send a bogus
> > reply instead of dropping the request as intended, but that's
> > relatively simple to fix.
>
> So I believe the matching really is a requirement and that it would be
> wrong to weaken it.
>
> It sounds like there's also a server bug here if it's giving out
> delegations to a client that isn't responding to callbacks.
>

The server uses CB_NULL requests to probe the callback port, and those
aren't affected by this problem. Worse, since CB_NULL requests don't
even contain the callback_ident, we can't even use them to hook up the
nfs_client with the SPN used in them.

--
Jeff Layton <[email protected]>

2014-04-08 18:03:08

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Apr 8, 2014, at 13:55, Jeff Layton <[email protected]> wrote:

> On Tue, 8 Apr 2014 13:30:21 -0400
> Trond Myklebust <[email protected]> wrote:
>
>>
>> On Apr 8, 2014, at 12:40, Dr Fields James Bruce <[email protected]> wrote:
>>>
>>> On Tue, Apr 08, 2014 at 12:22:51PM -0400, Trond Myklebust wrote:
>>>> How is it not better just to rip out that hostname comparison in the
>>>> back channel?
>>>
>>> Rip it out entirely?
>>>
>>> At that point anyone who can get a credential in the right realm can
>>> send a recall. RFC made this requirement to prevent that.
>>>
>>
>> OK. Let's examine what RFC3530 and RFC3530bis actually say here:
>>
>> Regardless of what security mechanism under RPCSEC_GSS is being used,
>> the NFS server MUST identify itself in GSS-API via a
>> GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE
>> names are of the form:
>>
>> service@hostname
>>
>> For NFS, the "service" element is
>>
>> nfs
>>
>> Implementations of security mechanisms will convert nfs@hostname to
>> various different forms. For Kerberos V5, the following form is
>> RECOMMENDED:
>>
>> nfs/hostname
>>
>> For Kerberos V5, nfs/hostname would be a server principal in the
>> Kerberos Key Distribution Center database. This is the same
>> principal the client acquired a GSS-API context for when it issued
>> the SETCLIENTID operation, therefore, the realm name for the server
>> principal must be the same for the callback as it was for the
>> SETCLIENTID.
>>
>>
>> So as I read the above, technically the client is supposed to read off the principal name that the server uses to authenticate itself to the SETCLIENTID and check that in the callback. Am I wrong?
>>
>> If so, then the steps are:
>>
>> 1) Modify process_krb5_upcall() and have the call to gss_inquire_context() also request the context acceptor name
>> 2) Modify the rpc.gssd downcall to pass that name to the kernel in some format that allows us to retrieve it in the SETCLIENTID call.
>> 3) Modify the comparison in check_gss_callback_principal()
>>
>>
>
> Sounds about right, with #2 being the difficult part...
>
> One possibility for that would be to add a new "acceptor" pipe in the
> clnt?? dirs. Teach gssd to write the acceptor name to that pipe
> before doing the regular downcall.
>
> The name written there could then be hung off of the clp. If userland
> fails to write the name to the pipe before doing the downcall, we'll
> simply do what we do today (use the cl_hostname).
>
> Adding a new pipe is a bit of a pain, but it does sidestep the problem
> of mismatched kernel and userland...
>

As far as I can tell, the downcall can be extended. Even the security context is an opaque, so its length is known. If we wanted to append a field after that, we could do so without affecting backward compatibility.
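
To illustrate why a length-prefixed opaque makes this backward compatible, here is a simplified userspace sketch. The buffer layout (a native-endian 32-bit length plus context bytes, optionally followed by a second length-prefixed field) is an assumption for illustration and omits the other fields of the real downcall; struct parsed_downcall, get_opaque() and parse_downcall() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct parsed_downcall {
	const unsigned char *ctx;      /* opaque security context */
	uint32_t ctx_len;
	const unsigned char *acceptor; /* optional trailing field, NULL if absent */
	uint32_t acceptor_len;
};

/*
 * Read a native-endian u32 length followed by that many bytes.
 * Returns a pointer just past the field, or NULL if the buffer is short.
 */
static const unsigned char *get_opaque(const unsigned char *p,
				       const unsigned char *end,
				       const unsigned char **data,
				       uint32_t *len)
{
	uint32_t n;

	if ((size_t)(end - p) < sizeof(n))
		return NULL;
	memcpy(&n, p, sizeof(n));
	p += sizeof(n);
	if ((size_t)(end - p) < n)
		return NULL;
	*data = p;
	*len = n;
	return p + n;
}

/* Returns 0 on success, -1 if the message is truncated. */
int parse_downcall(const unsigned char *buf, size_t buflen,
		   struct parsed_downcall *out)
{
	const unsigned char *p, *end;

	assert(buf != NULL && out != NULL);
	memset(out, 0, sizeof(*out));
	end = buf + buflen;

	p = get_opaque(buf, end, &out->ctx, &out->ctx_len);
	if (p == NULL)
		return -1;

	/* Old userland: nothing trails the context, so we are done */
	if (p == end)
		return 0;

	/* New userland: one more length-prefixed field follows */
	if (get_opaque(p, end, &out->acceptor, &out->acceptor_len) == NULL)
		return -1;
	return 0;
}
```

Because the context carries its own length, a parser can tell where it ends and whether anything follows, so an old-format message and a new-format message are both handled by the same code.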

_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]

2014-04-08 14:46:53

Subject: Re: v4.0 CB_COMPOUND authentication failures

On Tue, Apr 08, 2014 at 10:23:37AM -0400, Trond Myklebust wrote:
>
> On Apr 8, 2014, at 10:03, J. Bruce Fields <[email protected]> wrote:
>
> > On Tue, Apr 08, 2014 at 09:49:03AM -0400, Jeff Layton wrote:
> >> On Tue, 8 Apr 2014 08:35:01 -0400
> >> "J. Bruce Fields" <[email protected]> wrote:
> >>
> >>> On Tue, Apr 08, 2014 at 08:21:40AM -0400, Jeff Layton wrote:
> >>>> I've recently been hunting down some problems with delegation handling
> >>>> and have run across a problem with how the client authenticates CB_COMPOUND
> >>>> requests. I could use some advice on how best to fix it.
> >>>>
> >>>> Specifically, check_gss_callback_principal() tries to look up the
> >>>> callback client and then tries to compare the ticket in it against the
> >>>> clp->cl_hostname:
> >>>>
> >>>> /* Expect a GSS_C_NT_HOSTBASED_NAME like "nfs@serverhostname" */
> >>>>
> >>>> if (memcmp(p, "nfs@", 4) != 0)
> >>>> return 0;
> >>>> p += 4;
> >>>> if (strcmp(p, clp->cl_hostname) != 0)
> >>>> return 0;
> >>>> return 1;
> >>>>
> >>>> The problem is that there is no guarantee that those hostnames will be
> >>>> the same. If, for instance, I mount "foo:/" and the SPN is
> >>>> "nfs/foo.bar.baz" that strcmp will return true, and the CB_COMPOUND
> >>>> request will get tossed out [1]. Ditto if I happen to mount a CNAME of the
> >>>> server.
> >>>
> >>> It sounds like a bug to me that the mount is succeeding without the name
> >>> matching.
> >>>
> >>> The security provided by krb5 is much weaker if we don't check that the
> >>> name provided on the commandline matches what the server authenticates
> >>> as.
> >>>
> >>
> >> The logic in gssd for this is pretty awful.
> >>
> >> It will basically trust DNS if there is no '.' in the hostname that was
> >> used at mount time. That'll make it take the address and
> >> reverse-resolve it.
> >
> > Argh, OK, I guess this is the compromise Simo made in "Avoid DNS reverse
> > resolution for server names (take 3)".
> >
> >> We could add yet another band-aid and make it so that DNS is never
> >> trusted. I'll note that for cifs, we took that route. You have to mount
> >> the canonical name of the server in order to use krb5.
> >
> > I wish we could do that, but I suppose it's too harsh to break
> > already-working fstabs. Maybe we could phase it in somehow.
> >
> >>>> Now that we try to use krb5 on the callback channel even when sec=sys
> >>>> is specified, this is very problematic.
> >>>
> >>> And similarly I think the attempt to opportunistically use krb5 for
> >>> state management should fail and fall back on auth_sys if the server's
> >>> name doesn't match.
> >>>
>
> This suggestion makes no sense to me at all. How does it help to fall back to using weak security when the strong security checks fail?

It'd fix this particular problem.

But, I don't know, I'm frankly confused about our security design for
the NFSv4 state.

When we insist on krb5 (and checked the server name correctly), and
failed without it, then I feel like I understand what we're doing. Once
we start trying it and then falling back (as I understand happens for
the krb5 state in the auth_sys case) I get confused.

> >> Like Trond pointed out, the problem is that gssd doesn't give us that
> >> info currently. We could change it to do that of course, but that
> >> basically means revving the downcall.
> >
> > It might be easier to rev the upcall so that the kernel could ask gssd
> > to do strict checking? Since it's just a bunch of name=value pairs it
> > shouldn't be a huge pain to revise.
>
> So what would trigger the kernel to ask for strict checking? Do we add a mount option that says “fail if the server doesn’t authenticate itself”? That would be hard to combine with security negotiation, since it only makes sense for RPCSEC_GSS authentication.

I was thinking about only doing it in the state-establishment case.
(Since we won't know how to authenticate the callbacks in that case.)

But that would screw up krb5 mounts, I guess, never mind.

Using a fqdn implicitly requests strict checking so a mount option would
seem redundant.

--b.

2014-04-08 18:20:57