2016-08-31 16:36:27

by Matt Garman

Subject: gss context cache and nfsv4

Hi all,

I'm trying to understand the nuances of GSS security contexts with
regards to NFSv4 with sec=krb5 under Linux. How does the in-kernel
caching of these contexts work, i.e. what is the mechanism? And
specifically,

- How long do the in-kernel context caches live?
- Is there any way to query the in-kernel context caches?
- Is it possible for an individual user to have *multiple* in-kernel
context caches?


A simple example, for discussion:

Say host "testhost" is an NFSv4 (sec=krb5p) client. In particular,
the NFS mount is for user home directories. If I login to "testhost"
and run "klist", I see I have two tickets, the TGT and the NFS service
ticket for the home directory mount. Furthermore, my KRB5CC
environment variable is set, and points to
/tmp/krb5cc_uid_randomstring. So far, so good, no surprises.

Now, if I delete the /tmp/krb5cc file, or run kdestroy, then run
klist, it says "klist: No credentials cache found (ticket cache
FILE:/tmp/krb5cc_uid_whatever)". Also not surprising. However, I
still have access to my home directory. Now if I just do nothing on
that terminal, and wait long enough, eventually I'll get "Permission
Denied" on my home directory.

That's somewhat surprising but readily explained by the in-kernel
credentials cache. Also explicitly explained by the NFSv4 FAQ[1]:

6. I am accessing an NFSv4 mount via Kerberos and then I do a
kdestroy, but I am still able to access the NFS data. Why?

The kernel code caches the gssapi context that was negotiated using
the Kerberos credentials. Destroying the credentials does not destroy
the context in the kernel. We plan to change this behavior when moving
to use the new key ring kernel support to store credentials and
contexts.


How long will this in-kernel context persist? I have tried to
determine this experimentally, but it appears to be non-deterministic
(or I haven't designed the right experiment). It almost seems that if
you keep "using" the in-kernel context (e.g. by continuously creating
and destroying random files on the Kerberized NFS share), it lasts
longer than if you just sit idle at the terminal.


Likewise, let's say I have two separate ssh sessions into "testhost".
In this case, I'll have two /tmp/krb5cc_uid_random files. Do I also
have two separate in-kernel context caches, or just one? What happens
if I run kdestroy on one terminal, but not the other? How does that
affect the in-kernel cache(s)?


[1] http://www.citi.umich.edu/projects/nfsv4/linux/faq/

Thanks,
Matt


2016-08-31 18:40:07

by Olga Kornievskaia

Subject: Re: gss context cache and nfsv4

Hi Matt,

On Wed, Aug 31, 2016 at 12:36 PM, Matt Garman <[email protected]> wrote:
> Hi all,
>
> I'm trying to understand the nuances of GSS security contexts with
> regards to NFSv4 with sec=krb5 under Linux. How does the in-kernel
> caching of these contexts work, i.e. what is the mechanism? And
> specifically,
>
> - How long do the in-kernel context caches live?

The lifetime (expressed in seconds) of the GSS context is computed as
the service ticket's end time minus the current time.
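To make that arithmetic concrete, here is a toy sketch (the two epoch
timestamps are made-up values for illustration, not taken from any real
ticket):

```shell
# Toy arithmetic only: the context lifetime handed to the kernel is
# (service ticket end time) - (time now), in seconds.
ticket_end=1477594800   # assumed epoch value, roughly Oct 27 2016
now=1472661387          # assumed epoch value, roughly Aug 31 2016
lifetime=$(( ticket_end - now ))
echo "$lifetime"        # remaining context lifetime in seconds
```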

> - Is there any way to query the in-kernel context caches?

No, there is no way to query them.

> - Is it possible for an individual user to have *multiple* in-kernel
> context caches?
>
>
> A simple example, for discussion:
>
> Say host "testhost" is an NFSv4 (sec=krb5p) client. In particular,
> the NFS mount is for user home directories. If I login to "testhost"
> and run "klist", I see I have two tickets, the TGT and the NFS service
> ticket for the home directory mount. Furthermore, my KRB5CC
> environment variable is set, and points to
> /tmp/krb5cc_uid_randomstring. So far, so good, no surprises.
>
> Now, if I delete the /tmp/krb5cc file, or run kdestroy, then run
> klist, it says "klist: No credentials cache found (ticket cache
> FILE:/tmp/krb5cc_uid_whatever)". Also not surprising. However, I
> still have access to my home directory. Now if I just do nothing on
> that terminal, and wait long enough, eventually I'll get "Permission
> Denied" on my home directory.
>
> That's somewhat surprising but readily explained by the in-kernel
> credentials cache. Also explicitly explained by the NFSv4 FAQ[1]:
>
> 6. I am accessing an NFSv4 mount via Kerberos and then I do a
> kdestroy, but I am still able to access the NFS data. Why?
>
> The kernel code caches the gssapi context that was negotiated using
> the Kerberos credentials. Destroying the credentials does not destroy
> the context in the kernel. We plan to change this behavior when moving
> to use the new key ring kernel support to store credentials and
> contexts.
>
>
> How long will this in-kernel context persist? I have tried to
> determine this experimentally, but it appears to be non-deterministic
> (or I haven't designed the right experiment). It almost seems like if
> you keep "using" the in-kernel context (e.g. continuously create and
> destroy random files on the Kerberized NFS share), that it lasts
> longer. (As opposed to just sitting idle at the terminal.)
>
>
> Likewise, let's say I have two separate ssh sessions into "testhost".
> In this case, I'll have two /tmp/krb5cc_uid_random files. Do I also
> have two separate in-kernel context caches, or just one? What happens
> if I run kdestroy on one terminal, but not the other? How does that
> affect the in-kernel cache(s)?

In the kernel, contexts are cached per uid of the user (and per
security flavor of the mount). Two ssh sessions would share the same
in-kernel GSS context cache. Running kdestroy only destroys the
Kerberos ticket cache (stored either in a file or, on recent Linux, in
the keyring). We are currently looking into providing an application
to clean up the in-kernel context cache (very early-stage work led by
Andy Adamson).
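A toy way to picture that keying (pure shell bookkeeping, not the
actual kernel code; uid 1000 and the context name are made up):

```shell
# The kernel caches one GSS context per (uid, security flavor) pair,
# so two login sessions of the same user hit the same cache entry.
declare -A ctx_cache
ctx_key() { printf '%s:%s' "$1" "$2"; }          # uid, flavor
ctx_cache[$(ctx_key 1000 krb5p)]='ctx-A'         # session 1 negotiates it
session2_ctx=${ctx_cache[$(ctx_key 1000 krb5p)]} # session 2 reuses it
echo "$session2_ctx"
```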

>
>
> [1] http://www.citi.umich.edu/projects/nfsv4/linux/faq/
>
> Thanks,
> Matt

2016-08-31 19:10:29

by Matt Garman

Subject: Re: gss context cache and nfsv4

On Wed, Aug 31, 2016 at 1:40 PM, Olga Kornievskaia <[email protected]> wrote:
> The lifetime (expressed in seconds) of the gss context is determined
> to be the end lifetime of the service ticket - time now.

Based on a simple experiment, I don't think this is true (or I'm
misunderstanding your explanation). I logged into a host that uses
NFSv4 sec=krb5p home directories. klist shows the service ticket for
nfs as not expiring until October 27, 2016 (I have all ticket
lifetimes in Kerberos configured for 70 days).

Now, I do a "kdestroy" and make a note of the time. I then run a
simple loop like this:

# while [ 1 ] ; do date ; ls ; sleep 1m ; done

Twice now I've done this experiment on two different hosts. After
almost exactly an hour, I start getting "Permission denied".

But from your description above, I would expect that I shouldn't see
"Permission denied" until the end of October, right?
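A slightly tighter version of that experiment, wrapped in a function
so the survival time is recorded automatically (the directory argument
and the 60-second poll interval are assumptions; this is a sketch, not
a tested tool):

```shell
# Measure how long NFS access survives after kdestroy.
# Pass the Kerberized directory to probe, e.g. "$HOME".
measure_context_survival() {
    local dir=$1 interval=${2:-60}
    kdestroy
    local start=$(date +%s)
    while ls "$dir" >/dev/null 2>&1; do
        sleep "$interval"
    done
    echo "access lost $(( $(date +%s) - start )) seconds after kdestroy"
}
# Example invocation (run on the NFS client):
# measure_context_survival "$HOME"
```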

2016-08-31 19:41:27

by Olga Kornievskaia

Subject: Re: gss context cache and nfsv4

On Wed, Aug 31, 2016 at 3:10 PM, Matt Garman <[email protected]> wrote:
> On Wed, Aug 31, 2016 at 1:40 PM, Olga Kornievskaia <[email protected]> wrote:
>> The lifetime (expressed in seconds) of the gss context is determined
>> to be the end lifetime of the service ticket - time now.
>
> Based on a simple experiment, I don't think this is true (or I'm
> mis-understanding your explanation). What I did is log into a host
> that uses NFSv4 sec=krb5p home directories. klist shows the service
> ticket for nfs as not expiring until October 27, 2016 (I have all
> ticket lifetimes in Kerberos configured for 70 days).
>
> Now, I do a "kdestroy" and make a note of the time. I then run a
> simple loop like this:
>
> # while [ 1 ] ; do date ; ls ; sleep 1m ; done
>
> Twice now I've done this experiment on two different hosts. After
> almost exactly an hour, I start getting "Permission denied".
>
> But from your description above, I would expect that I shouldn't see
> "Permission denied" until the end of October, right?

I should have asked for what distro/nfs-utils are you using?

In the RHEL/Fedora nfs-utils, the lifetime of the context is obtained
from the gss_inquire_context() call in the GSS krb5 API. In
krb5_gss_inquire_context(), in the krb5 source at
src/lib/gssapi/krb5/inq_context.c, it is set to what I described
before (ticket end time minus the current time).

A server can choose to expire the context at any time by returning a
GSS context error, forcing the client to create a new security
context. What server are you going against? A network trace would be
helpful to check whether the server is returning such an error.
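A sketch of that capture, wrapped in a function (the interface name
and pcap path are assumptions; tshark must be installed to apply the
display filter):

```shell
# Capture client<->server NFS traffic during the failure, then list
# replies whose RPCSEC_GSS major status is nonzero.
capture_gss_errors() {
    local iface=${1:-eth0} pcap=${2:-/tmp/nfs-gss.pcap}
    # Run this while reproducing "Permission denied"; stop with Ctrl-C.
    tcpdump -i "$iface" -s 0 -w "$pcap" port 2049
    tshark -r "$pcap" -Y 'rpc.authgss.major != 0'
}
# Example invocation (run as root on the client):
# capture_gss_errors eth0 /tmp/nfs-gss.pcap
```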

2016-11-02 14:03:49

by Matt Garman

Subject: Re: gss context cache and nfsv4

On Wed, Aug 31, 2016 at 2:41 PM, Olga Kornievskaia <[email protected]> wrote:
> I should have asked for what distro/nfs-utils are you using?
>
> In the RHEL/Fedora nfs-utils distros, lifetime of the context is
> gotten from gss_inquire_context() call from the gss krb5 api. In
> krb5_gss_inquire_context() in krb5 source code in
> src/lib/gssapi/krb5/inq_context.c it's set to what I have set before.
>
> A server can choose to expire the context at any time by returning gss
> context error and force the client to create the new security context.
> What server are you going against? A network trace would be helpful to
> check to see if the server is returning such error.

Bringing this thread back to life...

I am using CentOS (effectively the same as RHEL) 6.5, with nfs-utils
version 1.2.3-39. All clients and servers run the same OS version,
kernel, and nfs-utils.

Just to be clear: in your suggestion of a network trace, do you mean
using something like tcpdump or wireshark to see exactly what is going
on between client and server? Is it sufficient to do this while I am
seeing "permission denied" on the krb5p share?

Since I am using krb5p (note the 'p'), I believe all NFS traffic is
encrypted... so will I actually be able to see anything useful in the
packet capture? Can you elaborate on exactly what I should look for?

Lastly... as a workaround, can I use the "-t" parameter of rpc.gssd?
What if I set that value to match our Kerberos ticket lifetime?
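For reference, a sketch of what that workaround might look like on
CentOS 6 (RPCGSSDARGS in /etc/sysconfig/nfs is how RHEL-family systems
pass options to rpc.gssd; 6048000 seconds = 70 days, matching the
ticket lifetime mentioned earlier, and is an assumption, not a
recommendation):

```shell
# In /etc/sysconfig/nfs: ask rpc.gssd to time out kernel GSS contexts
# after 70 days, matching the Kerberos ticket lifetime.
RPCGSSDARGS="-t 6048000"
```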

Thank you,
Matt

2016-11-02 16:54:15

by Olga Kornievskaia

Subject: Re: gss context cache and nfsv4

On Wed, Nov 2, 2016 at 10:03 AM, Matt Garman <[email protected]> wrote:
> On Wed, Aug 31, 2016 at 2:41 PM, Olga Kornievskaia <[email protected]> wrote:
>> I should have asked for what distro/nfs-utils are you using?
>>
>> In the RHEL/Fedora nfs-utils distros, lifetime of the context is
>> gotten from gss_inquire_context() call from the gss krb5 api. In
>> krb5_gss_inquire_context() in krb5 source code in
>> src/lib/gssapi/krb5/inq_context.c it's set to what I have set before.
>>
>> A server can choose to expire the context at any time by returning gss
>> context error and force the client to create the new security context.
>> What server are you going against? A network trace would be helpful to
>> check to see if the server is returning such error.
>
> Bringing this thread back to life...
>
> I am using CentOS (effectively same as RHEL) 6.5. nfs-utils version
> 1.2.3-39. All clients and servers are same OS version, same kernel,
> same nfs-utils.

That's quite old... but it shouldn't really matter, I guess.

> Just to be clear: in your suggestion of a network trace, do you mean
> using something like tcpdump or wireshark to see exactly what is going
> on between client and server? Is it sufficient to do this while I am
> seeing "permission denied" on the krb5p share?

Yes tcpdump or wireshark and yes during the failure.

I suspect that the server is returning an error indicating it has
expired the context. You should be able to see that even on a krb5p
mount. Use rpc.authgss.major != 0 as your display filter.

> Since I am using krb5p (note the 'p'), I believe all NFS traffic is
> encrypted... so will I actually be able to see anything useful in the
> packet capture? Can you elaborate specifically on what I should look
> for?
>
> Lastly... as a workaround, can I use the "-t" parameter of rpc.gssd?
> What if I set that value to be equivalent to be the same as our
> Kerberos ticket lifetime?

If the server is deciding to expire the context, there is no workaround.

> Thank you,
> Matt