2018-03-27 22:06:26

by M A Young

[permalink] [raw]
Subject: NFS mounts failing when keytab present on client

NFS mounts stopped working on one of my computers after a kernel update
from 4.15.3 to 4.15.4. I traced the problem to the commit
[46e8d06e423c4f35eac7a8b677b713b3ec9b0684] crypto: hash - prevent using
keyed hashes without setting key
and a later kernel with this patch reverted works normally.

The problem seems to be related to kerberos as the mount fails when the
keytab is present, but works if I rename the keytab file. This is true
even though the mount is with sec=sys . The mount should also work with
sec=krb5 but that also fails in the same way. When the mount fails there
are errors in dmesg like
[ 1232.522816] gss_marshal: gss_get_mic FAILED (851968)
[ 1232.522819] RPC: couldn't encode RPC header, exit EIO
[ 1232.522856] gss_marshal: gss_get_mic FAILED (851968)
[ 1232.522857] RPC: couldn't encode RPC header, exit EIO
[ 1232.522863] NFS: nfs4_discover_server_trunking unhandled error -5.
Exiting with error EIO
[ 1232.525039] gss_marshal: gss_get_mic FAILED (851968)
[ 1232.525042] RPC: couldn't encode RPC header, exit EIO

Michael Young


2018-03-27 22:29:54

by Eric Biggers

[permalink] [raw]
Subject: Re: NFS mounts failing when keytab present on client

Hi Michael,

On Tue, Mar 27, 2018 at 11:06:14PM +0100, Michael Young wrote:
> NFS mounts stopped working on one of my computers after a kernel update from
> 4.15.3 to 4.15.4. I traced the problem to the commit
> [46e8d06e423c4f35eac7a8b677b713b3ec9b0684] crypto: hash - prevent using
> keyed hashes without setting key
> and a later kernel with this patch reverted works normally.
>
> The problem seems to be related to kerberos as the mount fails when the
> keytab is present, but works if I rename the keytab file. This is true even
> though the mount is with sec=sys . The mount should also work with sec=krb5
> but that also fails in the same way. When the mount fails there are errors
> in dmesg like
> [ 1232.522816] gss_marshal: gss_get_mic FAILED (851968)
> [ 1232.522819] RPC: couldn't encode RPC header, exit EIO
> [ 1232.522856] gss_marshal: gss_get_mic FAILED (851968)
> [ 1232.522857] RPC: couldn't encode RPC header, exit EIO
> [ 1232.522863] NFS: nfs4_discover_server_trunking unhandled error -5.
> Exiting with error EIO
> [ 1232.525039] gss_marshal: gss_get_mic FAILED (851968)
> [ 1232.525042] RPC: couldn't encode RPC header, exit EIO
>
> Michael Young

Thanks for the bug report. I think the error is coming from
net/sunrpc/auth_gss/gss_krb5_crypto.c. There are two potential problems I see.
The first one, which is definitely a bug, is that make_checksum_hmac_md5()
allocates an HMAC transform and request, then does these crypto API calls:

crypto_ahash_init()
crypto_ahash_setkey()
crypto_ahash_digest()

This is wrong because it makes no sense to init() the HMAC request before the
key has been set, and doubly so when it's calling digest() which is shorthand
for init() + update() + final(). So I think it just needs to be removed. You
can test the following patch:

diff --git a/net/sunrpc/auth_gss/gss_krb5_crypto.c b/net/sunrpc/auth_gss/gss_krb5_crypto.c
index 12649c9fedab..8654494b4d0a 100644
--- a/net/sunrpc/auth_gss/gss_krb5_crypto.c
+++ b/net/sunrpc/auth_gss/gss_krb5_crypto.c
@@ -237,9 +237,6 @@ make_checksum_hmac_md5(struct krb5_ctx *kctx, char *header, int hdrlen,

ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);

- err = crypto_ahash_init(req);
- if (err)
- goto out;
err = crypto_ahash_setkey(hmac_md5, cksumkey, kctx->gk5e->keylength);
if (err)
goto out;

If that's not it, it's also possible that the error is coming from the
crypto_ahash_init() in make_checksum(). That can only happen if 'cksumkey' is
NULL and the hash algorithm is keyed, which implies a logical error as it
doesn't make sense to use a keyed hash algorithm without the key. The callers
do check kctx->gk5e->keyed_cksum which I'd hope would prevent this, though
perhaps kctx->cksum can be NULL.

Eric

2018-03-28 08:08:30

by M A Young

[permalink] [raw]
Subject: Re: NFS mounts failing when keytab present on client

On Tue, 27 Mar 2018, Eric Biggers wrote:

> Hi Michael,
>
> On Tue, Mar 27, 2018 at 11:06:14PM +0100, Michael Young wrote:
> > NFS mounts stopped working on one of my computers after a kernel update from
> > 4.15.3 to 4.15.4. I traced the problem to the commit
> > [46e8d06e423c4f35eac7a8b677b713b3ec9b0684] crypto: hash - prevent using
> > keyed hashes without setting key
> > and a later kernel with this patch reverted works normally.
> >
> > The problem seems to be related to kerberos as the mount fails when the
> > keytab is present, but works if I rename the keytab file. This is true even
> > though the mount is with sec=sys . The mount should also work with sec=krb5
> > but that also fails in the same way. When the mount fails there are errors
> > in dmesg like
> > [ 1232.522816] gss_marshal: gss_get_mic FAILED (851968)
> > [ 1232.522819] RPC: couldn't encode RPC header, exit EIO
> > [ 1232.522856] gss_marshal: gss_get_mic FAILED (851968)
> > [ 1232.522857] RPC: couldn't encode RPC header, exit EIO
> > [ 1232.522863] NFS: nfs4_discover_server_trunking unhandled error -5.
> > Exiting with error EIO
> > [ 1232.525039] gss_marshal: gss_get_mic FAILED (851968)
> > [ 1232.525042] RPC: couldn't encode RPC header, exit EIO
> >
> > Michael Young
>
> Thanks for the bug report. I think the error is coming from
> net/sunrpc/auth_gss/gss_krb5_crypto.c. There are two potential problems I see.
> The first one, which is definitely a bug, is that make_checksum_hmac_md5()
> allocates an HMAC transform and request, then does these crypto API calls:
>
> crypto_ahash_init()
> crypto_ahash_setkey()
> crypto_ahash_digest()
>
> This is wrong because it makes no sense to init() the HMAC request before the
> key has been set, and doubly so when it's calling digest() which is shorthand
> for init() + update() + final(). So I think it just needs to be removed. You
> can test the following patch:
>
> diff --git a/net/sunrpc/auth_gss/gss_krb5_crypto.c b/net/sunrpc/auth_gss/gss_krb5_crypto.c
> index 12649c9fedab..8654494b4d0a 100644
> --- a/net/sunrpc/auth_gss/gss_krb5_crypto.c
> +++ b/net/sunrpc/auth_gss/gss_krb5_crypto.c
> @@ -237,9 +237,6 @@ make_checksum_hmac_md5(struct krb5_ctx *kctx, char *header, int hdrlen,
>
> ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
>
> - err = crypto_ahash_init(req);
> - if (err)
> - goto out;
> err = crypto_ahash_setkey(hmac_md5, cksumkey, kctx->gk5e->keylength);
> if (err)
> goto out;
>
> If that's not it, it's also possible that the error is coming from the
> crypto_ahash_init() in make_checksum(). That can only happen if 'cksumkey' is
> NULL and the hash algorithm is keyed, which implies a logical error as it
> doesn't make sense to use a keyed hash algorithm without the key. The callers
> do check kctx->gk5e->keyed_cksum which I'd hope would prevent this, though
> perhaps kctx->cksum can be NULL.
>
> Eric

The patch fixes the problem.

Michael Young

2018-03-28 15:46:28

by J. Bruce Fields

[permalink] [raw]
Subject: Re: NFS mounts failing when keytab present on client

On Tue, Mar 27, 2018 at 03:29:50PM -0700, Eric Biggers wrote:
> Hi Michael,
>
> On Tue, Mar 27, 2018 at 11:06:14PM +0100, Michael Young wrote:
> > NFS mounts stopped working on one of my computers after a kernel update from
> > 4.15.3 to 4.15.4. I traced the problem to the commit
> > [46e8d06e423c4f35eac7a8b677b713b3ec9b0684] crypto: hash - prevent using
> > keyed hashes without setting key
> > and a later kernel with this patch reverted works normally.
> >
> > The problem seems to be related to kerberos as the mount fails when the
> > keytab is present, but works if I rename the keytab file. This is true even
> > though the mount is with sec=sys . The mount should also work with sec=krb5
> > but that also fails in the same way. When the mount fails there are errors
> > in dmesg like
> > [ 1232.522816] gss_marshal: gss_get_mic FAILED (851968)
> > [ 1232.522819] RPC: couldn't encode RPC header, exit EIO
> > [ 1232.522856] gss_marshal: gss_get_mic FAILED (851968)
> > [ 1232.522857] RPC: couldn't encode RPC header, exit EIO
> > [ 1232.522863] NFS: nfs4_discover_server_trunking unhandled error -5.
> > Exiting with error EIO
> > [ 1232.525039] gss_marshal: gss_get_mic FAILED (851968)
> > [ 1232.525042] RPC: couldn't encode RPC header, exit EIO
> >
> > Michael Young
>
> Thanks for the bug report. I think the error is coming from
> net/sunrpc/auth_gss/gss_krb5_crypto.c. There are two potential problems I see.
> The first one, which is definitely a bug, is that make_checksum_hmac_md5()
> allocates an HMAC transform and request, then does these crypto API calls:
>
> crypto_ahash_init()
> crypto_ahash_setkey()
> crypto_ahash_digest()
>
> This is wrong because it makes no sense to init() the HMAC request before the
> key has been set, and doubly so when it's calling digest() which is shorthand
> for init() + update() + final(). So I think it just needs to be removed. You
> can test the following patch:

When was this introduced?

3b5cf20cf439 "sunrpc: Use skcipher and ahash/shash"
- probably not, assuming the above was still just as wrong with
crypto_hash_{init,setkey,digest} as it is with
crypto_ahash_{init,setkey,digest}

So I'm guessing it was wrong from the start when it was added by
fffdaef2eb4a "gss_krb5: Add support for rc4-hmac encryption" 8 years
ago. Wonder why it took this long to notice? Did something else
change?

--b.


>
> diff --git a/net/sunrpc/auth_gss/gss_krb5_crypto.c b/net/sunrpc/auth_gss/gss_krb5_crypto.c
> index 12649c9fedab..8654494b4d0a 100644
> --- a/net/sunrpc/auth_gss/gss_krb5_crypto.c
> +++ b/net/sunrpc/auth_gss/gss_krb5_crypto.c
> @@ -237,9 +237,6 @@ make_checksum_hmac_md5(struct krb5_ctx *kctx, char *header, int hdrlen,
>
> ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
>
> - err = crypto_ahash_init(req);
> - if (err)
> - goto out;
> err = crypto_ahash_setkey(hmac_md5, cksumkey, kctx->gk5e->keylength);
> if (err)
> goto out;
>
> If that's not it, it's also possible that the error is coming from the
> crypto_ahash_init() in make_checksum(). That can only happen if 'cksumkey' is
> NULL and the hash algorithm is keyed, which implies a logical error as it
> doesn't make sense to use a keyed hash algorithm without the key. The callers
> do check kctx->gk5e->keyed_cksum which I'd hope would prevent this, though
> perhaps kctx->cksum can be NULL.
>
> Eric

2018-03-28 17:48:00

by Eric Biggers

[permalink] [raw]
Subject: Re: NFS mounts failing when keytab present on client

On Wed, Mar 28, 2018 at 09:00:14AM +0100, M A Young wrote:
> On Tue, 27 Mar 2018, Eric Biggers wrote:
>
> > Hi Michael,
> >
> > On Tue, Mar 27, 2018 at 11:06:14PM +0100, Michael Young wrote:
> > > NFS mounts stopped working on one of my computers after a kernel update from
> > > 4.15.3 to 4.15.4. I traced the problem to the commit
> > > [46e8d06e423c4f35eac7a8b677b713b3ec9b0684] crypto: hash - prevent using
> > > keyed hashes without setting key
> > > and a later kernel with this patch reverted works normally.
> > >
> > > The problem seems to be related to kerberos as the mount fails when the
> > > keytab is present, but works if I rename the keytab file. This is true even
> > > though the mount is with sec=sys . The mount should also work with sec=krb5
> > > but that also fails in the same way. When the mount fails there are errors
> > > in dmesg like
> > > [ 1232.522816] gss_marshal: gss_get_mic FAILED (851968)
> > > [ 1232.522819] RPC: couldn't encode RPC header, exit EIO
> > > [ 1232.522856] gss_marshal: gss_get_mic FAILED (851968)
> > > [ 1232.522857] RPC: couldn't encode RPC header, exit EIO
> > > [ 1232.522863] NFS: nfs4_discover_server_trunking unhandled error -5.
> > > Exiting with error EIO
> > > [ 1232.525039] gss_marshal: gss_get_mic FAILED (851968)
> > > [ 1232.525042] RPC: couldn't encode RPC header, exit EIO
> > >
> > > Michael Young
> >
> > Thanks for the bug report. I think the error is coming from
> > net/sunrpc/auth_gss/gss_krb5_crypto.c. There are two potential problems I see.
> > The first one, which is definitely a bug, is that make_checksum_hmac_md5()
> > allocates an HMAC transform and request, then does these crypto API calls:
> >
> > crypto_ahash_init()
> > crypto_ahash_setkey()
> > crypto_ahash_digest()
> >
> > This is wrong because it makes no sense to init() the HMAC request before the
> > key has been set, and doubly so when it's calling digest() which is shorthand
> > for init() + update() + final(). So I think it just needs to be removed. You
> > can test the following patch:
> >
> > diff --git a/net/sunrpc/auth_gss/gss_krb5_crypto.c b/net/sunrpc/auth_gss/gss_krb5_crypto.c
> > index 12649c9fedab..8654494b4d0a 100644
> > --- a/net/sunrpc/auth_gss/gss_krb5_crypto.c
> > +++ b/net/sunrpc/auth_gss/gss_krb5_crypto.c
> > @@ -237,9 +237,6 @@ make_checksum_hmac_md5(struct krb5_ctx *kctx, char *header, int hdrlen,
> >
> > ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
> >
> > - err = crypto_ahash_init(req);
> > - if (err)
> > - goto out;
> > err = crypto_ahash_setkey(hmac_md5, cksumkey, kctx->gk5e->keylength);
> > if (err)
> > goto out;
> >
> > If that's not it, it's also possible that the error is coming from the
> > crypto_ahash_init() in make_checksum(). That can only happen if 'cksumkey' is
> > NULL and the hash algorithm is keyed, which implies a logical error as it
> > doesn't make sense to use a keyed hash algorithm without the key. The callers
> > do check kctx->gk5e->keyed_cksum which I'd hope would prevent this, though
> > perhaps kctx->cksum can be NULL.
> >
> > Eric
>
> The patch fixes the problem.
>
> Michael Young

Okay, thanks for testing! I'll send a formal patch.

Eric

2018-03-28 17:50:55

by Eric Biggers

[permalink] [raw]
Subject: Re: NFS mounts failing when keytab present on client

On Wed, Mar 28, 2018 at 11:46:28AM -0400, J. Bruce Fields wrote:
> On Tue, Mar 27, 2018 at 03:29:50PM -0700, Eric Biggers wrote:
> > Hi Michael,
> >
> > On Tue, Mar 27, 2018 at 11:06:14PM +0100, Michael Young wrote:
> > > NFS mounts stopped working on one of my computers after a kernel update from
> > > 4.15.3 to 4.15.4. I traced the problem to the commit
> > > [46e8d06e423c4f35eac7a8b677b713b3ec9b0684] crypto: hash - prevent using
> > > keyed hashes without setting key
> > > and a later kernel with this patch reverted works normally.
> > >
> > > The problem seems to be related to kerberos as the mount fails when the
> > > keytab is present, but works if I rename the keytab file. This is true even
> > > though the mount is with sec=sys . The mount should also work with sec=krb5
> > > but that also fails in the same way. When the mount fails there are errors
> > > in dmesg like
> > > [ 1232.522816] gss_marshal: gss_get_mic FAILED (851968)
> > > [ 1232.522819] RPC: couldn't encode RPC header, exit EIO
> > > [ 1232.522856] gss_marshal: gss_get_mic FAILED (851968)
> > > [ 1232.522857] RPC: couldn't encode RPC header, exit EIO
> > > [ 1232.522863] NFS: nfs4_discover_server_trunking unhandled error -5.
> > > Exiting with error EIO
> > > [ 1232.525039] gss_marshal: gss_get_mic FAILED (851968)
> > > [ 1232.525042] RPC: couldn't encode RPC header, exit EIO
> > >
> > > Michael Young
> >
> > Thanks for the bug report. I think the error is coming from
> > net/sunrpc/auth_gss/gss_krb5_crypto.c. There are two potential problems I see.
> > The first one, which is definitely a bug, is that make_checksum_hmac_md5()
> > allocates an HMAC transform and request, then does these crypto API calls:
> >
> > crypto_ahash_init()
> > crypto_ahash_setkey()
> > crypto_ahash_digest()
> >
> > This is wrong because it makes no sense to init() the HMAC request before the
> > key has been set, and doubly so when it's calling digest() which is shorthand
> > for init() + update() + final(). So I think it just needs to be removed. You
> > can test the following patch:
>
> When was this introduced?
>
> 3b5cf20cf439 "sunrpc: Use skcipher and ahash/shash"
> - probably not, assuming the above was still just as wrong with
> crypto_hash_{init,setkey,digest} as it is with
> crypto_ahash_{init,setkey,digest}
>
> So I'm guessing it was wrong from the start when it was added by
> fffdaef2eb4a "gss_krb5: Add support for rc4-hmac encryption" 8 years
> ago. Wonder why it took this long to notice? Did something else
> change?
>
> --b.

It was wrong from the start, but the crypto API only recently started enforcing
that the key has to be set before init() or digest() is called. Before that the
code was just doing unnecessary work, at least with the software HMAC
implementation. Though, there are also hardware crypto drivers that implement
HMAC-MD5, and it's not immediately obvious that they handle init() before
setkey() as gracefully as the software implementation.

Eric

2018-03-28 18:03:35

by J. Bruce Fields

[permalink] [raw]
Subject: Re: NFS mounts failing when keytab present on client

On Wed, Mar 28, 2018 at 10:50:51AM -0700, Eric Biggers wrote:
> On Wed, Mar 28, 2018 at 11:46:28AM -0400, J. Bruce Fields wrote:
> > On Tue, Mar 27, 2018 at 03:29:50PM -0700, Eric Biggers wrote:
> > > Hi Michael,
> > >
> > > On Tue, Mar 27, 2018 at 11:06:14PM +0100, Michael Young wrote:
> > > > NFS mounts stopped working on one of my computers after a kernel update from
> > > > 4.15.3 to 4.15.4. I traced the problem to the commit
> > > > [46e8d06e423c4f35eac7a8b677b713b3ec9b0684] crypto: hash - prevent using
> > > > keyed hashes without setting key
> > > > and a later kernel with this patch reverted works normally.
> > > >
> > > > The problem seems to be related to kerberos as the mount fails when the
> > > > keytab is present, but works if I rename the keytab file. This is true even
> > > > though the mount is with sec=sys . The mount should also work with sec=krb5
> > > > but that also fails in the same way. When the mount fails there are errors
> > > > in dmesg like
> > > > [ 1232.522816] gss_marshal: gss_get_mic FAILED (851968)
> > > > [ 1232.522819] RPC: couldn't encode RPC header, exit EIO
> > > > [ 1232.522856] gss_marshal: gss_get_mic FAILED (851968)
> > > > [ 1232.522857] RPC: couldn't encode RPC header, exit EIO
> > > > [ 1232.522863] NFS: nfs4_discover_server_trunking unhandled error -5.
> > > > Exiting with error EIO
> > > > [ 1232.525039] gss_marshal: gss_get_mic FAILED (851968)
> > > > [ 1232.525042] RPC: couldn't encode RPC header, exit EIO
> > > >
> > > > Michael Young
> > >
> > > Thanks for the bug report. I think the error is coming from
> > > net/sunrpc/auth_gss/gss_krb5_crypto.c. There are two potential problems I see.
> > > The first one, which is definitely a bug, is that make_checksum_hmac_md5()
> > > allocates an HMAC transform and request, then does these crypto API calls:
> > >
> > > crypto_ahash_init()
> > > crypto_ahash_setkey()
> > > crypto_ahash_digest()
> > >
> > > This is wrong because it makes no sense to init() the HMAC request before the
> > > key has been set, and doubly so when it's calling digest() which is shorthand
> > > for init() + update() + final(). So I think it just needs to be removed. You
> > > can test the following patch:
> >
> > When was this introduced?
> >
> > 3b5cf20cf439 "sunrpc: Use skcipher and ahash/shash"
> > - probably not, assuming the above was still just as wrong with
> > crypto_hash_{init,setkey,digest} as it is with
> > crypto_ahash_{init,setkey,digest}
> >
> > So I'm guessing it was wrong from the start when it was added by
> > fffdaef2eb4a "gss_krb5: Add support for rc4-hmac encryption" 8 years
> > ago. Wonder why it took this long to notice? Did something else
> > change?
> >
> > --b.
>
> It was wrong from the start, but the crypto API only recently started enforcing
> that the key has to be set before init() or digest() is called. Before that the
> code was just doing unnecessary work, at least with the software HMAC
> implementation. Though, there are also hardware crypto drivers that implement
> HMAC-MD5, and it's not immediately obvious that they handle init() before
> setkey() as gracefully as the software implementation.

Thanks, got it. Do you know how to find a commit id for that change?
It's not entirely fair to blame the crypto change for what was really a
latent nfs bug, but it might still be worth adding a Fixes: line just so
people know where it needs backporting.

--b.

2018-03-28 18:04:22

by J. Bruce Fields

[permalink] [raw]
Subject: Re: NFS mounts failing when keytab present on client

On Wed, Mar 28, 2018 at 02:03:34PM -0400, J. Bruce Fields wrote:
> Thanks, got it. Do you know how to find a commit id for that change?
> It's not entirely fair to blame the crypto change for what was really a
> latent nfs bug, but it might still be worth adding a Fixes: line just so
> people know where it needs backporting.

Whoops, I see you already did that. I should finish my inbox before
responding!

--b.