2016-12-12 17:10:38

by Andy Adamson

Subject: [PATCH Version 2] SVCAUTH update the rsc cache on RPC_GSS_PROC_DESTROY

From: Andy Adamson <[email protected]>

The current code sets the expiry_time on the local copy of the rsc
cache entry - but not on the actual cache entry.
Note that currently, the rsc cache entries are not cleaned up even
when nfsd is stopped.

Update the cache with the new expiry_time of now so that cache_clean will
clean up (free) the context to be destroyed.

Signed-off-by: Andy Adamson <[email protected]>
---
net/sunrpc/auth_gss/svcauth_gss.c | 32 ++++++++++++++++++++++++++++++--
1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 45662d7..6033389 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -1393,6 +1393,26 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}

#endif /* CONFIG_PROC_FS */

+static int rsc_destroy(struct sunrpc_net *sn, struct rsc *rscp)
+{
+ struct rsc new;
+ int ret = -ENOMEM;
+
+ memset(&new, 0, sizeof(struct rsc));
+ if (dup_netobj(&new.handle, &rscp->handle))
+ goto out;
+ new.h.expiry_time = get_seconds();
+ set_bit(CACHE_NEGATIVE, &new.h.flags);
+
+ rscp = rsc_update(sn->rsc_cache, &new, rscp);
+ if (!rscp)
+ goto out;
+ ret = 0;
+out:
+ rsc_free(&new);
+ return ret;
+}
+
/*
* Accept an rpcsec packet.
* If context establishment, punt to user space
@@ -1489,10 +1509,18 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
case RPC_GSS_PROC_DESTROY:
if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
goto auth_err;
- rsci->h.expiry_time = get_seconds();
- set_bit(CACHE_NEGATIVE, &rsci->h.flags);
+ if (rsc_destroy(sn, rsci))
+ goto drop;
+ /**
+ * Balance the cache_put at the end of svcauth_gss_accept.This
+ * will leave the refcount = 1 so that the cache_clean cache_put
+ * will call rsc_put.
+ */
+ cache_get(&rsci->h);
+
if (resv->iov_len + 4 > PAGE_SIZE)
goto drop;
+
svc_putnl(resv, RPC_SUCCESS);
goto complete;
case RPC_GSS_PROC_DATA:
--
1.8.3.1



2016-12-12 21:04:38

by Anna Schumaker

Subject: Re: [PATCH Version 2] SVCAUTH update the rsc cache on RPC_GSS_PROC_DESTROY

Hi Andy,

On 12/12/2016 12:08 PM, [email protected] wrote:
> From: Andy Adamson <[email protected]>
>
> The current code sets the expiry_time on the local copy of the rsc
> cache entry - but not on the actual cache entry.
> Note that currently, the rsc cache entries are not cleaned up even
> when nfsd is stopped.
>
> Update the cache with the new expiry_time of now so that cache_clean will
> clean up (free) the context to be destroyed.
>
> Signed-off-by: Andy Adamson <[email protected]>
> ---
> net/sunrpc/auth_gss/svcauth_gss.c | 32 ++++++++++++++++++++++++++++++--
> 1 file changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
> index 45662d7..6033389 100644
> --- a/net/sunrpc/auth_gss/svcauth_gss.c
> +++ b/net/sunrpc/auth_gss/svcauth_gss.c
> @@ -1393,6 +1393,26 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
>
> #endif /* CONFIG_PROC_FS */
>
> +static int rsc_destroy(struct sunrpc_net *sn, struct rsc *rscp)
> +{
> + struct rsc new;

Can you avoid the memset by initializing new to 0 directly?
struct rsc new = {0};

> + int ret = -ENOMEM;
> +
> + memset(&new, 0, sizeof(struct rsc));
> + if (dup_netobj(&new.handle, &rscp->handle))
> + goto out;
> + new.h.expiry_time = get_seconds();
> + set_bit(CACHE_NEGATIVE, &new.h.flags);

Alternatively, would it make sense to initialize new with these values directly, rather than setting them afterwards?
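
Something like this is what I have in mind (an untested sketch, just for
illustration; a non-constant initializer like get_seconds() is fine for an
automatic variable):

	struct rsc new = {
		.h.expiry_time = get_seconds(),
	};

	set_bit(CACHE_NEGATIVE, &new.h.flags);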

> +
> + rscp = rsc_update(sn->rsc_cache, &new, rscp);
> + if (!rscp)
> + goto out;
> + ret = 0;
> +out:
> + rsc_free(&new);
> + return ret;
> +}
> +
> /*
> * Accept an rpcsec packet.
> * If context establishment, punt to user space
> @@ -1489,10 +1509,18 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
> case RPC_GSS_PROC_DESTROY:
> if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
> goto auth_err;
> - rsci->h.expiry_time = get_seconds();
> - set_bit(CACHE_NEGATIVE, &rsci->h.flags);
> + if (rsc_destroy(sn, rsci))

If you're just checking success / failure here, would it make sense to have rsc_destroy() return a boolean value rather than an error code? Do you expect it could be used anywhere that the error code would make a difference?
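
Roughly what I'm picturing, as an untested sketch (same body as your
rsc_destroy(), only the return type changes, so the caller becomes
"if (!rsc_destroy(sn, rsci)) goto drop;"):

	static bool rsc_destroy(struct sunrpc_net *sn, struct rsc *rscp)
	{
		struct rsc new = {0};
		bool destroyed = false;

		if (dup_netobj(&new.handle, &rscp->handle))
			goto out;
		new.h.expiry_time = get_seconds();
		set_bit(CACHE_NEGATIVE, &new.h.flags);

		/* publish the new expiry/negative state to the cache */
		if (rsc_update(sn->rsc_cache, &new, rscp))
			destroyed = true;
	out:
		rsc_free(&new);
		return destroyed;
	}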

Thanks,
Anna

> + goto drop;
> + /**
> + * Balance the cache_put at the end of svcauth_gss_accept.This
> + * will leave the refcount = 1 so that the cache_clean cache_put
> + * will call rsc_put.
> + */
> + cache_get(&rsci->h);
> +
> if (resv->iov_len + 4 > PAGE_SIZE)
> goto drop;
> +
> svc_putnl(resv, RPC_SUCCESS);
> goto complete;
> case RPC_GSS_PROC_DATA:
>

2016-12-12 21:58:06

by J. Bruce Fields

Subject: Re: [PATCH Version 2] SVCAUTH update the rsc cache on RPC_GSS_PROC_DESTROY

On Mon, Dec 12, 2016 at 12:08:49PM -0500, [email protected] wrote:
> From: Andy Adamson <[email protected]>
>
> The current code sets the expiry_time on the local copy of the rsc
> cache entry - but not on the actual cache entry.

I'm not following. It looks to me like "rsci" in the below was returned
from gss_svc_searchbyctx(), which was returned in turn from
sunrpc_cache_lookup(), which is returning an item from the rsc cache--I
don't see any copying.

> Note that currently, the rsc cache entries are not cleaned up even
> when nfsd is stopped.

So, that sounds like a bug, but I don't understand this explanation yet.

> Update the cache with the new expiry_time of now so that cache_clean will
> clean up (free) the context to be destroyed.
>
> Signed-off-by: Andy Adamson <[email protected]>
> ---
> net/sunrpc/auth_gss/svcauth_gss.c | 32 ++++++++++++++++++++++++++++++--
> 1 file changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
> index 45662d7..6033389 100644
> --- a/net/sunrpc/auth_gss/svcauth_gss.c
> +++ b/net/sunrpc/auth_gss/svcauth_gss.c
> @@ -1393,6 +1393,26 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
>
> #endif /* CONFIG_PROC_FS */
>
> +static int rsc_destroy(struct sunrpc_net *sn, struct rsc *rscp)
> +{
> + struct rsc new;
> + int ret = -ENOMEM;
> +
> + memset(&new, 0, sizeof(struct rsc));
> + if (dup_netobj(&new.handle, &rscp->handle))
> + goto out;
> + new.h.expiry_time = get_seconds();
> + set_bit(CACHE_NEGATIVE, &new.h.flags);
> +
> + rscp = rsc_update(sn->rsc_cache, &new, rscp);
> + if (!rscp)
> + goto out;
> + ret = 0;
> +out:
> + rsc_free(&new);
> + return ret;
> +}
> +
> /*
> * Accept an rpcsec packet.
> * If context establishment, punt to user space
> @@ -1489,10 +1509,18 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
> case RPC_GSS_PROC_DESTROY:
> if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
> goto auth_err;
> - rsci->h.expiry_time = get_seconds();
> - set_bit(CACHE_NEGATIVE, &rsci->h.flags);
> + if (rsc_destroy(sn, rsci))
> + goto drop;
> + /**
> + * Balance the cache_put at the end of svcauth_gss_accept.This
> + * will leave the refcount = 1 so that the cache_clean cache_put
> + * will call rsc_put.
> + */

I'm confused by that comment. If it's right, then it means the refcount
is currently zero, in which case the following line is unsafe.

--b.

> + cache_get(&rsci->h);
> +
> if (resv->iov_len + 4 > PAGE_SIZE)
> goto drop;
> +
> svc_putnl(resv, RPC_SUCCESS);
> goto complete;
> case RPC_GSS_PROC_DATA:
> --
> 1.8.3.1

2016-12-12 22:59:26

by Adamson, Andy

Subject: Re: [PATCH Version 2] SVCAUTH update the rsc cache on RPC_GSS_PROC_DESTROY

The bug is setting new values on an rsc cache entry without calling
rsc_update. When you do this, the “local copy” of the rsc cache entry
(e.g. the one returned by gss_svc_searchbyctx ) gets the new values
(expiry_time and CACHE_NEGATIVE bit set) but the new values are not
propagated back to the cache. So the next time the cache entry is
looked up, e.g. when cache_clean() is called to clean up, the
expiry_time and CACHE_NEGATIVE are not set to the new values on the
cache entry to be destroyed, and cache_clean does not reap the cache
entry.

The fix is to do what this patch does, and call rsc_update on the rsc
entry. With this patch, cache_clean is called and reaps the cache
entries.

BTW: just look at all the other uses of the cache and you will note
that they all call update after setting new values.

It’s just how Neil’s cache code works.

e.g. dns_resolve.c
       key.h.expiry_time = ttl + seconds_since_boot();
…
       if (key.addrlen == 0)
                set_bit(CACHE_NEGATIVE, &key.h.flags);

        item = nfs_dns_update(cd, &key, item);

I also just found a bug, I need a local rsc cache pointer for the
rsc_update return. That plus Anna’s comments will be addressed in
version 3. I’ll explain more in the patch comments.

→Andy

2016-12-12 23:16:08

by Andy Adamson

Subject: Re: [PATCH Version 2] SVCAUTH update the rsc cache on RPC_GSS_PROC_DESTROY

On Mon, Dec 12, 2016 at 4:58 PM, J. Bruce Fields <[email protected]> wrote:
> On Mon, Dec 12, 2016 at 12:08:49PM -0500, [email protected] wrote:
>> From: Andy Adamson <[email protected]>
>>
>> The current code sets the expiry_time on the local copy of the rsc
>> cache entry - but not on the actual cache entry.
>
> I'm not following. It looks to me like "rsci" in the below was returned
> from gss_svc_searchbyctx(), which was returned in turn from
> sunrpc_cache_lookup(), which is returning an item from the rsc cache--I
> don't see any copying.
>
>> Note that currently, the rsc cache entries are not cleaned up even
>> when nfsd is stopped.
>
> So, that sounds like a bug, but I don't understand this explanation yet.
>
>> Update the cache with the new expiry_time of now so that cache_clean will
>> clean up (free) the context to be destroyed.
>>
>> Signed-off-by: Andy Adamson <[email protected]>
>> ---
>> net/sunrpc/auth_gss/svcauth_gss.c | 32 ++++++++++++++++++++++++++++++--
>> 1 file changed, 30 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
>> index 45662d7..6033389 100644
>> --- a/net/sunrpc/auth_gss/svcauth_gss.c
>> +++ b/net/sunrpc/auth_gss/svcauth_gss.c
>> @@ -1393,6 +1393,26 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
>>
>> #endif /* CONFIG_PROC_FS */
>>
>> +static int rsc_destroy(struct sunrpc_net *sn, struct rsc *rscp)
>> +{
>> + struct rsc new;
>> + int ret = -ENOMEM;
>> +
>> + memset(&new, 0, sizeof(struct rsc));
>> + if (dup_netobj(&new.handle, &rscp->handle))
>> + goto out;
>> + new.h.expiry_time = get_seconds();
>> + set_bit(CACHE_NEGATIVE, &new.h.flags);
>> +
>> + rscp = rsc_update(sn->rsc_cache, &new, rscp);
>> + if (!rscp)
>> + goto out;
>> + ret = 0;
>> +out:
>> + rsc_free(&new);
>> + return ret;
>> +}
>> +
>> /*
>> * Accept an rpcsec packet.
>> * If context establishment, punt to user space
>> @@ -1489,10 +1509,18 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
>> case RPC_GSS_PROC_DESTROY:
>> if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
>> goto auth_err;
>> - rsci->h.expiry_time = get_seconds();
>> - set_bit(CACHE_NEGATIVE, &rsci->h.flags);
>> + if (rsc_destroy(sn, rsci))
>> + goto drop;
>> + /**
>> + * Balance the cache_put at the end of svcauth_gss_accept.This
>> + * will leave the refcount = 1 so that the cache_clean cache_put
>> + * will call rsc_put.
>> + */
>
> I'm confused by that comment. If it's right, then it means the refcount
> is currently zero,

Didn't see this comment...

Nope. "Balance the cache_put at the __end__ of svcauth_gss_accept".

Without the cache_get, the count would be 1 when we reach the cache_put
that gets called at the end of svcauth_gss_accept for all cases that
return a non-null cache entry. That cache_put would drop the count to
zero, prompting rsc_put and rsc_free to be called. This in turn makes
the cache_put called by cache_clean() go nuts as the refcount wraps
around and cache_put is then called 2^32 times!!
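
To spell the counting out (my expectation with this patch; the entry
enters the DESTROY case with a refcount of 2, i.e. the reference held
by the cache itself plus the one taken by rsc_lookup in
gss_svc_searchbyctx):

2 -> 1  the cache_put inside sunrpc_cache_update when rsc_destroy calls rsc_update
1 -> 2  the cache_get added in the DESTROY case
2 -> 1  the cache_put at the end of svcauth_gss_accept
1 -> 0  the cache_put in cache_clean once it reaps the entry, which calls rsc_put/rsc_free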

The RPCSEC_GSS_DESTROY case does not call the cache_get in svcauth_gss_accept:

svcauth_gss_accept()

.......

svcdata->rsci = rsci;
cache_get(&rsci->h); <<<<<<<<<<<<<<<<<<<<<<<<<<<
here is the cache get not called in the DESTROY case.
rqstp->rq_cred.cr_flavor = gss_svc_to_pseudoflavor(
rsci->mechctx->mech_type,
GSS_C_QOP_DEFAULT,
gc->gc_svc);
ret = SVC_OK;
goto out;

}
garbage_args:
ret = SVC_GARBAGE;
goto out;
auth_err:
/* Restore write pointer to its original value: */
xdr_ressize_check(rqstp, reject_stat);
ret = SVC_DENIED;
goto out;

complete:
ret = SVC_COMPLETE;
goto out;

drop:
ret = SVC_DROP;
out:
if (rsci)
cache_put(&rsci->h, sn->rsc_cache); <<<<<<<<<<<<
Here is the cache_put at the end
return ret;

}


> in which case the following line is unsafe.

sorry for the unclear comments. I'll do a better job on the next version.


2016-12-13 16:00:30

by J. Bruce Fields

Subject: Re: [PATCH Version 2] SVCAUTH update the rsc cache on RPC_GSS_PROC_DESTROY

On Mon, Dec 12, 2016 at 10:54:33PM +0000, Adamson, Andy wrote:
> The bug is setting new values on an rsc cache entry without calling
> rsc_update. When you do this, the “local copy” of the rsc cache entry
> (e.g. the one returned by gss_svc_searchbyctx ) gets the new values
> (expiry_time and CACHE_NEGATIVE bit set) but the new values are not
> propagated back to the cache.

gss_svc_searchbyctx:

found = rsc_lookup(cd, &rsci);
...
return found;

rsc_lookup:

ch = sunrpc_cache_lookup(cd, &item->h, hash);
if (ch)
return container_of(ch, struct rsc, h);

sunrpc_cache_lookup:

head = &detail->hash_table[hash];
...
hlist_for_each_entry(tmp, head, cache_list) {
...
return tmp;

Definitely looks to me like it's returning a cache entry, not a copy.

Maybe there's some other reason we need to use rsc_update, but that's
not it.

--b.


2016-12-13 16:36:59

by Andy Adamson

Subject: Re: [PATCH Version 2] SVCAUTH update the rsc cache on RPC_GSS_PROC_DESTROY

On Tue, Dec 13, 2016 at 11:00 AM, J. Bruce Fields <[email protected]> wrote:
> On Mon, Dec 12, 2016 at 10:54:33PM +0000, Adamson, Andy wrote:
>> The bug is setting new values on an rsc cache entry without calling
>> rsc_update. When you do this, the “local copy” of the rsc cache entry
>> (e.g. the one returned by gss_svc_searchbyctx ) gets the new values
>> (expiry_time and CACHE_NEGATIVE bit set) but the new values are not
>> propagated back to the cache.
>
> gss_svc_searchbyctx:
>
> found = rsc_lookup(cd, &rsci);
> ...
> return found;
>
> rsc_lookup:
>
> ch = sunrpc_cache_lookup(cd, &item->h, hash);
> if (ch)
> return container_of(ch, struct rsc, h);
>
> sunrpc_cache_lookup:
>
> head = &detail->hash_table[hash];
> ...
> hlist_for_each_entry(tmp, head, cache_list) {
> ...
> return tmp;
>
> Definitely looks to me like it's returning a cache entry, not a copy.

That's right. "Local copy" is probably the wrong term. What I should
say is that sunrpc_cache_lookup returns a cache entry under the
read_lock(), and so writing that cache entry does not save the changes.

>
> Maybe there's some other reason we need to use rsc_update, but that's
> not it.

OK - rsc_update calls sunrpc_cache_update which takes the write_lock
on the cache entry, and so allows changes to be saved.

--Andy


2016-12-14 21:04:36

by J. Bruce Fields

Subject: Re: [PATCH Version 2] SVCAUTH update the rsc cache on RPC_GSS_PROC_DESTROY

On Tue, Dec 13, 2016 at 11:36:11AM -0500, Andy Adamson wrote:
> On Tue, Dec 13, 2016 at 11:00 AM, J. Bruce Fields <[email protected]> wrote:
> > On Mon, Dec 12, 2016 at 10:54:33PM +0000, Adamson, Andy wrote:
> >> The bug is setting new values on an rsc cache entry without calling
> >> rsc_update. When you do this, the “local copy” of the rsc cache entry
> >> (e.g. the one returned by gss_svc_searchbyctx ) gets the new values
> >> (expiry_time and CACHE_NEGATIVE bit set) but the new values are not
> >> propagated back to the cache.
> >
> > gss_svc_searchbyctx:
> >
> > found = rsc_lookup(cd, &rsci);
> > ...
> > return found;
> >
> > rsc_lookup:
> >
> > ch = sunrpc_cache_lookup(cd, &item->h, hash);
> > if (ch)
> > return container_of(ch, struct rsc, h);
> >
> > sunrpc_cache_lookup:
> >
> > head = &detail->hash_table[hash];
> > ...
> > hlist_for_each_entry(tmp, head, cache_list) {
> > ...
> > return tmp;
> >
> > Definitely looks to me like it's returning a cache entry, not a copy.
>
> Thats right. What I call a "local copy". Guess that is wrong. I should
> say that sunrpc_cache_lookup returns a cache entry under the
> read_lock() and so writing that cache entry does not save the changes.

Writes still happen normally, there's just the hypothetical possibility
of races if two processes are writing at the same time. (Don't know if
that's a risk here, I haven't thought it through.)

Maybe it would be easiest for me to understand if you could start with a
description of your reproducer? What do you do, and what results do you
see?

--b.


2016-12-19 16:19:24

by Andy Adamson

Subject: Re: [PATCH Version 2] SVCAUTH update the rsc cache on RPC_GSS_PROC_DESTROY

On Wed, Dec 14, 2016 at 4:04 PM, J. Bruce Fields <[email protected]> wrote:
> On Tue, Dec 13, 2016 at 11:36:11AM -0500, Andy Adamson wrote:
>> On Tue, Dec 13, 2016 at 11:00 AM, J. Bruce Fields <[email protected]> wrote:
>> > On Mon, Dec 12, 2016 at 10:54:33PM +0000, Adamson, Andy wrote:
>> >> The bug is setting new values on an rsc cache entry without calling
>> >> rsc_update. When you do this, the “local copy” of the rsc cache entry
>> >> (e.g. the one returned by gss_svc_searchbyctx ) gets the new values
>> >> (expiry_time and CACHE_NEGATIVE bit set) but the new values are not
>> >> propagated back to the cache.
>> >
>> > gss_svc_searchbyctx:
>> >
>> > found = rsc_lookup(cd, &rsci);
>> > ...
>> > return found;
>> >
>> > rsc_lookup:
>> >
>> > ch = sunrpc_cache_lookup(cd, &item->h, hash);
>> > if (ch)
>> > return container_of(ch, struct rsc, h);
>> >
>> > sunrpc_cache_lookup:
>> >
>> > head = &detail->hash_table[hash];
>> > ...
>> > hlist_for_each_entry(tmp, head, cache_list) {
>> > ...
>> > return tmp;
>> >
>> > Definitely looks to me like it's returning a cache entry, not a copy.
>>
>> Thats right. What I call a "local copy". Guess that is wrong. I should
>> say that sunrpc_cache_lookup returns a cache entry under the
>> read_lock() and so writing that cache entry does not save the changes.
>
> Writes still happen normally, there's just the hypothetical possibility
> of races if two processes are writing at the same time. (Don't know if
> that's a risk here, I haven't thought it through.)

Why have a read_lock/write_lock architecture and then not follow it?
It seems to me that the hypothetical possibility equals a bug.

>
> Maybe it would be easiest for me to understand if you could start with a
> description of your reproducer? What do you do, and what results do you
> see?

Sure. I'm coding the GSSv3 Full Mode labeling prototype, where I add
more bookkeeping to the rsc cache which needs to be freed. So I
instrumented rsc_put calling rsc_free to ensure that my additions were
being freed correctly. Then I found that rsc_put was not called. I am
assuming that the reason for setting the expiry_time to get_seconds()
and setting the CACHE_NEGATIVE bit is to destroy the context. But it
was not happening.

So I instrumented cache_get, cache_put, cache_clean,
svcauth_cache_lookup, cache_check. I also instrumented all other users
of the sunrpc cache put routines to see what they do.
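
The instrumentation is just printk's of roughly this flavor, dropped at
the top of cache_put() and friends where the handle and cache_detail are
the arguments (not my exact hunks; shown only so the handle/refcount
numbers in the logs below have some context):

	/* trace the handle, its refcount, and the per-cache put callback */
	printk("--> cache_put h %p refcount %d cd->cache_put %p\n",
	       h, atomic_read(&h->ref.refcount), cd->cache_put);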

Here is a long log of the current code with printk output. At the end
I have comments. Then a shorter log of the patched code with printk
output and some final comments.

----------- Current Code Behavior -------

Case 1) Current code with writing rsci not using the write_lock:


case RPC_GSS_PROC_DESTROY:
printk("%s RPC_GSS_PROC_DESTROY h %p ref %d expiry %lu\n",
__func__, &rsci->h, atomic_read(&rsci->h.ref.refcount),
rsci->h.expiry_time);
if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
goto auth_err;

rsci->h.expiry_time = get_seconds();
set_bit(CACHE_NEGATIVE, &rsci->h.flags);

printk("%s AFTER SETTING h %p expiry %lu\n\n", __func__,
&rsci->h, rsci->h.expiry_time);

if (resv->iov_len + 4 > PAGE_SIZE)
goto drop;

svc_putnl(resv, RPC_SUCCESS);
goto complete;

case RPC_GSS_PROC_DATA:




svcauth_gss_accept gc_proc 3 rq_proc 0 (one of two
RPC_GSS_PROC_DESTROY requests)

--> sunrpc_cache_lookup from rsc_lookup key ffffc90002b9bc58 hash 940
Note: hash 940. Lookup is from gss_write_verf.

sunrpc_cache_lookup 1 calling cache_get key ffffc90002b9bc58 tmp
ffff880079b3f500
--> cache_get h ffff880079b3f500 refcount 1
Note: refcount is one before cache get, so ends up as 2.

sunrpc_cache_lookup RETURN 1 key ffffc90002b9bc58 tmp ffff880079b3f500
--> rsc_free h ffffc90002b9bc58
Note: the above rsc_free is for the temporary entry created by
sunrpc_cache_lookup

cache_check h ffff880079b3f500 rv 0 h->expiry_time 1481823331
cache_check rqstp NULL rv 0 h ffff880079b3f500
<-- cache_check returns 0 h ffff880079b3f500

svcauth_gss_accept RPC_GSS_PROC_DESTROY h ffff880079b3f500 ref 2
expiry 1481823331 <<<<<
svcauth_gss_accept AFTER SETTING h ffff880079b3f500 expiry 1481819736 <<<<<<<

Note: expiry_time (and CACHE_NEGATIVE) are set.

svcauth_gss_accept calling cache_put h ffff880079b3f500
--> cache_put h ffff880079b3f500 refcount 2 cd->cache_put ffffffffa045c180
Dec 15 11:35:36 rhel7-2-ga-2 kernel: <-- cache_put h ffff880079b3f500 refcount 1

Note: this is the cache_put at the end of svcauth_gss_accept.

Dec 15 11:53:21 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880079b3f500 h->expiry_time 1481819736 hash 940
Dec 15 11:53:21 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880079b3e540 h->expiry_time 1481819736 hash 982

Note: The first cache_clean ran approximately 18 minutes later and
shows the two RPC_GSS_PROC_DESTROY'ed rsc entries with the reset
expiry_time, but does not call cache_put.

....

Dec 15 12:00:33 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff88002666dc80 h->expiry_time 665664692 hash 189
Dec 15 12:00:33 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880075e4aa00 h->expiry_time 665664692 hash 210
Dec 15 12:00:33 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880077995660 h->expiry_time 2576 hash 165

Note: The second cache_clean comes approximately 8 minutes later.
Still no cache_put on the rsc entries.

....


Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880076fcd700 h->expiry_time 2576 hash 18
Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean DELETE h
ffff880076fcd700 from cache_list
Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean CLEANED calling
cache_put h ffff880076fcd700
Dec 15 12:05:38 rhel7-2-ga-2 kernel: --> cache_put h ffff880076fcd700
refcount 1 cd->cache_put ffffffffa03f3690
Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_put h ffff880076fcd700
expiry_time 2576 nextcheck 2576
Dec 15 12:05:38 rhel7-2-ga-2 kernel: svc_export_put PUT h ffff880076fcd700
Dec 15 12:05:38 rhel7-2-ga-2 kernel: <-- cache_put h ffff880076fcd700 refcount 0

Note: approximately 5 minutes later svc_export_put is called by cache_clean

Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880076fcf400 h->expiry_time 2576 hash 78
Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean DELETE h
ffff880076fcf400 from cache_list
Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean CLEANED calling
cache_put h ffff880076fcf400
Dec 15 12:05:38 rhel7-2-ga-2 kernel: --> cache_put h ffff880076fcf400
refcount 1 cd->cache_put ffffffffa03f3690
Dec 15 12:05:38 rhel7-2-ga-2 kernel: svc_export_put PUT h ffff880076fcf400
Dec 15 12:05:38 rhel7-2-ga-2 kernel: <-- cache_put h ffff880076fcf400 refcount 0

Note: another svc_export_put being called:

Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880077995660 h->expiry_time 2576 hash 165
Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean DELETE h
ffff880077995660 from cache_list
Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean CLEANED calling
cache_put h ffff880077995660
Dec 15 12:05:38 rhel7-2-ga-2 kernel: --> cache_put h ffff880077995660
refcount 1 cd->cache_put ffffffffa0390ff0
Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_put h ffff880077995660
expiry_time 2576 nextcheck 2576
Dec 15 12:05:38 rhel7-2-ga-2 kernel: ip_map_put PUT h ffff880077995660
Dec 15 12:05:38 rhel7-2-ga-2 kernel: <-- cache_put h ffff880077995660 refcount 0

Note: And ip_map_put is called.

Still no rsc_put.

shut down NFSD:


Dec 15 12:16:12 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff88002666dc80 h->expiry_time 665664692 hash 189
Dec 15 12:16:12 rhel7-2-ga-2 kernel: cache_clean DELETE h
ffff88002666dc80 from cache_list
Dec 15 12:16:12 rhel7-2-ga-2 kernel: cache_clean CLEANED calling
cache_put h ffff88002666dc80
Dec 15 12:16:12 rhel7-2-ga-2 kernel: --> cache_put h ffff88002666dc80
refcount 1 cd->cache_put ffffffffa03f3390
Dec 15 12:16:12 rhel7-2-ga-2 kernel: expkey_put PUT h ffff88002666dc80
Dec 15 12:16:12 rhel7-2-ga-2 kernel: <-- cache_put h ffff88002666dc80 refcount 0
Dec 15 12:16:12 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880075e4aa00 h->expiry_time 665664692 hash 210
Dec 15 12:16:12 rhel7-2-ga-2 kernel: cache_clean DELETE h
ffff880075e4aa00 from cache_list
Dec 15 12:16:12 rhel7-2-ga-2 kernel: cache_clean CLEANED calling
cache_put h ffff880075e4aa00
Dec 15 12:16:12 rhel7-2-ga-2 kernel: --> cache_put h ffff880075e4aa00
refcount 1 cd->cache_put ffffffffa03f3390
Dec 15 12:16:12 rhel7-2-ga-2 kernel: expkey_put PUT h ffff880075e4aa00
Dec 15 12:16:12 rhel7-2-ga-2 kernel: <-- cache_put h ffff880075e4aa00 refcount 0
Dec 15 12:16:12 rhel7-2-ga-2 systemd: Stopped NFS server and services.
Dec 15 12:16:12 rhel7-2-ga-2 systemd: Stopping NFS Mount Daemon...
Dec 15 12:16:12 rhel7-2-ga-2 systemd: Stopping NFSv4 ID-name mapping service...
Dec 15 12:16:12 rhel7-2-ga-2 systemd: Stopped NFSv4 ID-name mapping service.
Dec 15 12:16:12 rhel7-2-ga-2 rpc.mountd[13897]: Caught signal 15,
un-registering and exiting.
Dec 15 12:16:12 rhel7-2-ga-2 systemd: Stopped NFS Mount Daemon.


NOTE: both svcauth rsc cache entries (hash 940 and hash 982) are still
not reaped. cache_clean continues to be run, and the expiry time is way
in the past.


Dec 15 12:23:39 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880079b3f500 h->expiry_time 1481819736 hash 940
Dec 15 12:23:39 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880079b3e540 h->expiry_time 1481819736 hash 982

....


Dec 15 12:53:49 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880079b3f500 h->expiry_time 1481819736 hash 940
Dec 15 12:53:49 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880079b3e540 h->expiry_time 1481819736 hash 982

...

Dec 15 13:24:00 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880079b3f500 h->expiry_time 1481819736 hash 940
Dec 15 13:24:00 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880079b3e540 h->expiry_time 1481819736 hash 982


I don't know if it ever gets reaped.

  13:24:00 = time of the last cache_clean that did not reap hash 940
- 11:35:36 = time hash 940 expiry_time was set to get_seconds()
-----------
   1:48:24 not reaped.


----------- End of Current Code Behavior -------

I see that:

1) cache_clean is what takes the write_lock and deletes entries from
the cache_list.
2) no cache_detail->cache_put routine deletes the cache entry from the
cache_list, including rsc_put.

So, either I need to figure out how to properly trigger cache_clean to
be called, or I need to grab the write_lock and call hlist_del_init()
to remove the cache entry in the rsc_put routine.

I wrote the rsc_destroy patch to do the first: set up the rsc cache
entry to be destroyed by the first call to cache_clean.

The disputed cache_get (after rsc_destroy in the code below) leaves
the cache entry with a refcount of 1, which allows cache_clean to not
only call rsc_put, but to remove the entry from the cache_list.

Without the disputed cache_get, the cache_put at the end of
svcauth_gss_accept would call rsc_put and rsc_free which would free
the cache entry without removing it from the cache_list. (here I would
need to take the write_lock in rsc_put to delete it).

----------- Start of my patch Code Behavior -------


2) Case 2 with the rsc_destroy patch:
static int rsc_destroy(struct sunrpc_net *sn, struct rsc *rscp)
{
struct rsc new;
int ret = -ENOMEM;

memset(&new, 0, sizeof(struct rsc));
if (dup_netobj(&new.handle, &rscp->handle))
goto out;
new.h.expiry_time = get_seconds();
set_bit(CACHE_NEGATIVE, &new.h.flags);

rscp = rsc_update(sn->rsc_cache, &new, rscp);
if (!rscp)
goto out;
ret = 0;
out:
rsc_free(&new);
return ret;
}

case RPC_GSS_PROC_DESTROY:
printk("%s RPC_GSS_PROC_DESTROY h %p ref %d expiry %lu\n",
__func__, &rsci->h, atomic_read(&rsci->h.ref.refcount),
rsci->h.expiry_time);
if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
goto auth_err;

if (!rsc_destroy(sn, rsci))
goto drop;
/**
* Balance the cache_put at the end of svcauth_gss_accept. This
* will leave the refcount = 1 after the DESTROY request has been
* processed so that the cache_clean cache_put will call rsc_put.
*/
printk("%s calling cache_get h %p ref %d\n", __func__, &rsc=
i->h,
atomic_read(&rsci->h.ref.refcount));

cache_get(&rsci->h); <<<<<<<<<<<<<<< disputed cache_get

if (resv->iov_len + 4 > PAGE_SIZE)
goto drop;

svc_putnl(resv, RPC_SUCCESS);
goto complete;
case RPC_GSS_PROC_DATA:




svcauth_gss_accept gc_proc 3 rq_proc 0 (one of two
RPC_GSS_PROC_DESTROY requests)

--> sunrpc_cache_lookup from rsc_lookup key ffffc90002603c58 hash 940
Note: hash 940

sunrpc_cache_lookup 1 calling cache_get key ffffc90002603c58 tmp
ffff880073596240
--> cache_get h ffff880073596240 refcount 1

sunrpc_cache_lookup RETURN 1 key ffffc90002603c58 tmp ffff880073596240
--> rsc_free h ffffc90002603c58
cache_check h ffff880073596240 rv 0 h->expiry_time 1481810443
cache_check rqstp NULL rv 0 h ffff880073596240
<-- cache_check returns 0 h ffff880073596240


Dec 15 08:00:47 rhel7-2-ga-2 kernel: svcauth_gss_accept
RPC_GSS_PROC_DESTROY h ffff880073596240 ref 2 expiry 1481810443
Note expiry_time

Dec 15 08:00:47 rhel7-2-ga-2 kernel: --> rsc_destroy rsci->h
ffff880073596240 ref 2

Dec 15 08:00:47 rhel7-2-ga-2 kernel: sunrpc_cache_update calling
cache_put h ffff880073596240
Dec 15 08:00:47 rhel7-2-ga-2 kernel: --> cache_put h ffff880073596240
refcount 2 cd->cache_put ffffffffa03d3180
Dec 15 08:00:47 rhel7-2-ga-2 kernel: cache_put h ffff880073596240
expiry_time 0 nextcheck 0
Dec 15 08:00:47 rhel7-2-ga-2 kernel: <-- cache_put h ffff880073596240 refcount 1

Dec 15 08:00:47 rhel7-2-ga-2 kernel: <-- rsc_destroy rsci->h
ffff880073596240 ref 1


Dec 15 08:00:47 rhel7-2-ga-2 kernel: svcauth_gss_accept calling
cache_get h ffff880073596240 ref 1
Dec 15 08:00:47 rhel7-2-ga-2 kernel: --> cache_get h ffff880073596240 refcount 1
Dec 15 08:00:47 rhel7-2-ga-2 kernel: svcauth_gss_accept calling
cache_put h ffff880073596240
Dec 15 08:00:47 rhel7-2-ga-2 kernel: --> cache_put h ffff880073596240
refcount 2 cd->cache_put ffffffffa03d3180
Dec 15 08:00:47 rhel7-2-ga-2 kernel: <-- cache_put h ffff880073596240 refcount 1

Note: this is the cache_put at the end of svcauth_gss_accept.


Dec 15 08:01:03 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff8800705ae000 h->expiry_time 1481806847 hash 940
Note: the expiry_time has been set (used to be 1481810443)

Dec 15 08:01:03 rhel7-2-ga-2 kernel: cache_clean CLEAN h
ffff880073596240 h->expiry_time 0 hash 940
Dec 15 08:01:03 rhel7-2-ga-2 kernel: cache_clean DELETE h
ffff880073596240 from cache_list
Dec 15 08:01:03 rhel7-2-ga-2 kernel: cache_clean CLEANED calling
cache_put h ffff880073596240
Dec 15 08:01:03 rhel7-2-ga-2 kernel: --> cache_put h ffff880073596240
refcount 1 cd->cache_put ffffffffa03d3180
Dec 15 08:01:03 rhel7-2-ga-2 kernel: cache_put h ffff880073596240
expiry_time 0 nextcheck 0
Dec 15 08:01:03 rhel7-2-ga-2 kernel: --> rsc_put PUT h ffff880073596240
Dec 15 08:01:03 rhel7-2-ga-2 kernel: --> rsc_free h ffff880073596240
Dec 15 08:01:03 rhel7-2-ga-2 kernel: RPC: gss_delete_sec_context
deleting ffff880071288f40
Dec 15 08:01:03 rhel7-2-ga-2 kernel: <-- cache_put h ffff880073596240 refcount 0

Note: AFAICS this is the way the code is supposed to work.


-->Andy





2016-12-19 16:30:43

by Andy Adamson

Subject: Re: [PATCH Version 2] SVCAUTH update the rsc cache on RPC_GSS_PROC_DESTROY

Oh - and all I do to reproduce this is:

mount -o sec=krb <server>:<export>

then ls the mountpoint, and umount.

-->Andy

On Mon, Dec 19, 2016 at 11:19 AM, Andy Adamson <[email protected]> wr=
ote:
> On Wed, Dec 14, 2016 at 4:04 PM, J. Bruce Fields <[email protected]> w=
rote:
>> On Tue, Dec 13, 2016 at 11:36:11AM -0500, Andy Adamson wrote:
>>> On Tue, Dec 13, 2016 at 11:00 AM, J. Bruce Fields <[email protected]=
> wrote:
>>> > On Mon, Dec 12, 2016 at 10:54:33PM +0000, Adamson, Andy wrote:
>>> >> The bug is setting new values on an rsc cache entry without calling
>>> >> rsc_update. When you do this, the =E2=80=9Clocal copy=E2=80=9D of th=
e rsc cache entry
>>> >> (e.g. the one returned by gss_svc_searchbyctx ) gets the new values
>>> >> (expiry_time and CACHE_NEGATIVE bit set) but the new values are not
>>> >> propagated back to the cache.
>>> >
>>> > gss_svc_searchbyctx:
>>> >
>>> > found =3D rsc_lookup(cd, &rsci);
>>> > ...
>>> > return found;
>>> >
>>> > rsc_lookup:
>>> >
>>> > ch =3D sunrpc_cache_lookup(cd, &item->h, hash);
>>> > if (ch)
>>> > return container_of(ch, struct rsc, h);
>>> >
>>> > sunrpc_cache_lookup:
>>> >
>>> > head =3D &detail->hash_table[hash];
>>> > ...
>>> > hlist_for_each_entry(tmp, head, cache_list) {
>>> > ...
>>> > return tmp;
>>> >
>>> > Definitely looks to me like it's returning a cache entry, not a copy.
>>>
>>> Thats right. What I call a "local copy". Guess that is wrong. I should
>>> say that sunrpc_cache_lookup returns a cache entry under the
>>> read_lock() and so writing that cache entry does not save the changes.
>>
>> Writes still happen normally, there's just the hypothetical possibility
>> of races if two processes are writing at the same time. (Don't know if
>> that's a risk here, I haven't thought it through.)
>
> Why have a read_lock/write_lock architecture and then not follow it?
> It seems to me that the hypothetical possibility equals a bug.
>
>>
>> Maybe it would be easiest for me to understand if you could start with a
>> description of your reproducer? What do you do, and what results do you
>> see?
>
> Sure. I'm coding the GSSv3 Full Mode labeling prototype, where I add
> more bookkeeping to the rsc cache which needs to be freed. So I
> instrumented rsc_put calling rsc_free to ensure that my additions were
> being freed correctly. Then I found that rsc_put was not called. I am
> assuming that the reason for setting the expiry_time to get_seconds()
> and setting the CACHE_NEGATIVE bit is to destroy the context. But it
> was not happening.
>
> So I instrumented cache_get, cache_put, cache_clean,
> sunrpc_cache_lookup, cache_check. I also instrumented all other users
> of the sunrpc cache put routines to see what they do.
>
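> For reference, the instrumentation behind these logs is nothing fancier
> than printk lines dropped into the existing helpers. As a rough sketch
> (the exact printk wording differs in places), the rsc put path is just
> this plus the tracing line:
>
> static void rsc_put(struct kref *ref)
> {
>         struct rsc *rsci = container_of(ref, struct rsc, h.ref);
>
>         /* matches the "--> rsc_put PUT h ..." lines in the logs below */
>         printk("--> rsc_put PUT h %p\n", &rsci->h);
>         rsc_free(rsci);
>         kfree(rsci);
> }
>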
> Here is a long log of the current code with printk output. At the end
> I have comments. Then a shorter log of the patched code with printk
> output and some final comments.
>
> ----------- Current Code Behavior -------
>
> Case 1) Current code with writing rsci not using the write_lock:
>
>
> case RPC_GSS_PROC_DESTROY:
> printk("%s RPC_GSS_PROC_DESTROY h %p ref %d expiry %lu\n"=
,
> __func__, &rsci->h, atomic_read(&rsci->h.ref.refc=
ount),
> rsci->h.expiry_time);
> if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
> goto auth_err;
>
> rsci->h.expiry_time = get_seconds();
> set_bit(CACHE_NEGATIVE, &rsci->h.flags);
>
> printk("%s AFTER SETTING h %p expiry %lu\n\n", __func__,
> &rsci->h, rsci->h.expiry_time);
>
> if (resv->iov_len + 4 > PAGE_SIZE)
> goto drop;
>
> svc_putnl(resv, RPC_SUCCESS);
> goto complete;
>
> case RPC_GSS_PROC_DATA:
>
>
>
>
> svcauth_gss_accept gc_proc 3 rq_proc 0 (one of two
> RPC_GSS_PROC_DESTROY requests)
>
> --> sunrpc_cache_lookup from rsc_lookup key ffffc90002b9bc58 hash 940
> Note: hash 940. Lookup is from gss_write_verf.
>
> sunrpc_cache_lookup 1 calling cache_get key ffffc90002b9bc58 tmp
> ffff880079b3f500
> --> cache_get h ffff880079b3f500 refcount 1
> Note: refcount is one before cache get, so ends up as 2.
>
> sunrpc_cache_lookup RETURN 1 key ffffc90002b9bc58 tmp ffff880079b3f500
> --> rsc_free h ffffc90002b9bc58
> Note: the above rsc_free is for the temporary entry created by
> sunrpc_cache_lookup
>
> cache_check h ffff880079b3f500 rv 0 h->expiry_time 1481823331
> cache_check rqstp NULL rv 0 h ffff880079b3f500
> <-- cache_check returns 0 h ffff880079b3f500
>
> svcauth_gss_accept RPC_GSS_PROC_DESTROY h ffff880079b3f500 ref 2
> expiry 1481823331 <<<<<
> svcauth_gss_accept AFTER SETTING h ffff880079b3f500 expiry 1481819736 <<<<<<<
>
> Note: expiry_time (and CACHE_NEGATIVE) are set.
>
> svcauth_gss_accept calling cache_put h ffff880079b3f500
> --> cache_put h ffff880079b3f500 refcount 2 cd->cache_put ffffffffa045c180
> Dec 15 11:35:36 rhel7-2-ga-2 kernel: <-- cache_put h ffff880079b3f500 refcount 1
>
> Note: this is the cache_put at the end of svcauth_gss_accept.
>
> Dec 15 11:53:21 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880079b3f500 h->expiry_time 1481819736 hash 940
> Dec 15 11:53:21 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880079b3e540 h->expiry_time 1481819736 hash 982
>
> Note: The first cache_clean ran approximately 18 minutes later and
> shows the two RPC_GSS_PROC_DESTROY'ed rsc entries with the reset
> expiry_time, but does not call cache_put.
>
> ....
>
> Dec 15 12:00:33 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff88002666dc80 h->expiry_time 665664692 hash 189
> Dec 15 12:00:33 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880075e4aa00 h->expiry_time 665664692 hash 210
> Dec 15 12:00:33 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880077995660 h->expiry_time 2576 hash 165
>
> Note: The second cache_clean comes approximately 8 minutes later.
> Still no cach_put on the rsc entries.
>
> ....
>
>
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880076fcd700 h->expiry_time 2576 hash 18
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean DELETE h
> ffff880076fcd700 from cache_list
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean CLEANED calling
> cache_put h ffff880076fcd700
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: --> cache_put h ffff880076fcd700
> refcount 1 cd->cache_put ffffffffa03f3690
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_put h ffff880076fcd700
> expiry_time 2576 nextcheck 2576
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: svc_export_put PUT h ffff880076fcd700
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: <-- cache_put h ffff880076fcd700 refcount 0
>
> Note: approximately 5 minutes later svc_export_put is called by cache_clean
>
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880076fcf400 h->expiry_time 2576 hash 78
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean DELETE h
> ffff880076fcf400 from cache_list
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean CLEANED calling
> cache_put h ffff880076fcf400
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: --> cache_put h ffff880076fcf400
> refcount 1 cd->cache_put ffffffffa03f3690
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: svc_export_put PUT h ffff880076fcf400
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: <-- cache_put h ffff880076fcf400 refcount 0
>
> Note: another svc_export_put being called:
>
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880077995660 h->expiry_time 2576 hash 165
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean DELETE h
> ffff880077995660 from cache_list
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_clean CLEANED calling
> cache_put h ffff880077995660
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: --> cache_put h ffff880077995660
> refcount 1 cd->cache_put ffffffffa0390ff0
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: cache_put h ffff880077995660
> expiry_time 2576 nextcheck 2576
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: ip_map_put PUT h ffff880077995660
> Dec 15 12:05:38 rhel7-2-ga-2 kernel: <-- cache_put h ffff880077995660 refcount 0
>
> Note: And ip_map_put is called.
>
> Still no rsc_put.
>
> shut down NFSD:
>
>
> Dec 15 12:16:12 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff88002666dc80 h->expiry_time 665664692 hash 189
> Dec 15 12:16:12 rhel7-2-ga-2 kernel: cache_clean DELETE h
> ffff88002666dc80 from cache_list
> Dec 15 12:16:12 rhel7-2-ga-2 kernel: cache_clean CLEANED calling
> cache_put h ffff88002666dc80
> Dec 15 12:16:12 rhel7-2-ga-2 kernel: --> cache_put h ffff88002666dc80
> refcount 1 cd->cache_put ffffffffa03f3390
> Dec 15 12:16:12 rhel7-2-ga-2 kernel: expkey_put PUT h ffff88002666dc80
> Dec 15 12:16:12 rhel7-2-ga-2 kernel: <-- cache_put h ffff88002666dc80 refcount 0
> Dec 15 12:16:12 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880075e4aa00 h->expiry_time 665664692 hash 210
> Dec 15 12:16:12 rhel7-2-ga-2 kernel: cache_clean DELETE h
> ffff880075e4aa00 from cache_list
> Dec 15 12:16:12 rhel7-2-ga-2 kernel: cache_clean CLEANED calling
> cache_put h ffff880075e4aa00
> Dec 15 12:16:12 rhel7-2-ga-2 kernel: --> cache_put h ffff880075e4aa00
> refcount 1 cd->cache_put ffffffffa03f3390
> Dec 15 12:16:12 rhel7-2-ga-2 kernel: expkey_put PUT h ffff880075e4aa00
> Dec 15 12:16:12 rhel7-2-ga-2 kernel: <-- cache_put h ffff880075e4aa00 refcount 0
> Dec 15 12:16:12 rhel7-2-ga-2 systemd: Stopped NFS server and services.
> Dec 15 12:16:12 rhel7-2-ga-2 systemd: Stopping NFS Mount Daemon...
> Dec 15 12:16:12 rhel7-2-ga-2 systemd: Stopping NFSv4 ID-name mapping service...
> Dec 15 12:16:12 rhel7-2-ga-2 systemd: Stopped NFSv4 ID-name mapping service.
> Dec 15 12:16:12 rhel7-2-ga-2 rpc.mountd[13897]: Caught signal 15,
> un-registering and exiting.
> Dec 15 12:16:12 rhel7-2-ga-2 systemd: Stopped NFS Mount Daemon.
>
>
> NOTE: both svcauth rsc cache entries = hash 940 and hash 982 are still
> not reaped. cache_clean continues to be run, the expiry time is way in
> the past.
>
>
> Dec 15 12:23:39 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880079b3f500 h->expiry_time 1481819736 hash 940
> Dec 15 12:23:39 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880079b3e540 h->expiry_time 1481819736 hash 982
>
> ....
>
>
> Dec 15 12:53:49 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880079b3f500 h->expiry_time 1481819736 hash 940
> Dec 15 12:53:49 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880079b3e540 h->expiry_time 1481819736 hash 982
>
> ...
>
> Dec 15 13:24:00 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880079b3f500 h->expiry_time 1481819736 hash 940
> Dec 15 13:24:00 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880079b3e540 h->expiry_time 1481819736 hash 982
>
>
> I don't know if it ever gets reaped.
>
> 13:24:00 = time last cache_clean did not reap hash 940
> - 11:35:36 = time hash 940 expiry_time set to get_seconds()
> -----------
> 1:48:24 not reaped.
>
>
> ----------- End of Current Code Behavior -------
>
> I see that:
>
> 1) cache_clean is what takes the write_lock and deletes entries from
> the cache_list.
> 2) no cache_detail->cache_put routine deletes the cache entry from the
> cache_list, including rsc_put.
>
> so, either I need to figure out how to properly trigger cache_clean to
> be called, or I need to grab the write_lock and call hlist_del_init()
> to remove the cache entries in the rsc_put routine.
>
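> For the record, the second option would look something like the sketch
> below. It is not part of the patch; it only illustrates what "grab the
> write_lock and call hlist_del_init()" would mean. It assumes the
> cache_detail is reachable from wherever the helper is called (rsc_put
> itself only receives the kref), that hash_lock is still the rwlock
> protecting the hash chains, and that the entries counter is kept in
> step the same way cache_clean keeps it:
>
> static void rsc_unhash(struct cache_detail *cd, struct cache_head *h)
> {
>         write_lock(&cd->hash_lock);
>         if (hlist_unhashed(&h->cache_list)) {
>                 write_unlock(&cd->hash_lock);
>                 return;
>         }
>         /* mirror the unhash step that cache_clean performs */
>         hlist_del_init(&h->cache_list);
>         cd->entries--;
>         write_unlock(&cd->hash_lock);
>         /* drop the reference the hash chain held */
>         cache_put(h, cd);
> }
>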
> I wrote the rsc_destroy patch to do the first: set up the rsc cache
> entry to be destroyed by the first call to cache_clean.
>
> The disputed cache_get (after rsc_destroy in the code below) leaves
> the cache entry with a refcount of 1, which allows cache_clean to not
> only call rsc_put, but to remove the entry from the cache_list.
>
> Without the disputed cache_get, the cache_put at the end of
> svcauth_gss_accept would call rsc_put and rsc_free which would free
> the cache entry without removing it from the cache_list. (here I would
> need to take the write_lock in rsc_put to delete it).
>
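> To make the refcount bookkeeping easy to follow, here is a tiny userspace
> model of the counts as they appear in the patched log below. The get/put
> calls are just labels for events in that log, not kernel calls, and the
> starting count of 1 stands for the reference the hash chain itself holds:
>
> #include <assert.h>
> #include <stdio.h>
>
> static int refcount = 1;       /* the hash chain holds one reference */
>
> static void get(const char *ev) { printf("%-30s ref %d\n", ev, ++refcount); }
> static void put(const char *ev) { printf("%-30s ref %d\n", ev, --refcount); }
>
> int main(void)
> {
>         get("rsc_lookup");                /* 2: lookup at the top of accept */
>         put("sunrpc_cache_update");       /* 1: rsc_destroy's update puts the old entry */
>         get("disputed cache_get");        /* 2 */
>         put("end of svcauth_gss_accept"); /* 1 */
>         put("cache_clean");               /* 0: rsc_put/rsc_free run here */
>         assert(refcount == 0);
>         return 0;
> }
>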
> ----------- Start of my patch Code Behavior -------
>
>
> 2) Case 2 with the rsc_destroy patch:
> static int rsc_destroy(struct sunrpc_net *sn, struct rsc *rscp)
> {
> struct rsc new;
> int ret = -ENOMEM;
>
> memset(&new, 0, sizeof(struct rsc));
> if (dup_netobj(&new.handle, &rscp->handle))
> goto out;
> new.h.expiry_time = get_seconds();
> set_bit(CACHE_NEGATIVE, &new.h.flags);
>
> rscp = rsc_update(sn->rsc_cache, &new, rscp);
> if (!rscp)
> goto out;
> ret = 0;
> out:
> rsc_free(&new);
> return ret;
> }
>
> case RPC_GSS_PROC_DESTROY:
> printk("%s RPC_GSS_PROC_DESTROY h %p ref %d expiry %lu\n"=
,
> __func__, &rsci->h, atomic_read(&rsci->h.ref.refc=
ount),
> rsci->h.expiry_time);
> if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
> goto auth_err;
>
> if (rsc_destroy(sn, rsci))
> goto drop;
> /**
> * Balance the cache_put at the end of svcauth_gss_accept.This
> * will leave the refcount = 1 after the DESTROY request has been
> * processed so that the cache_clean cache_put will call rsc_put.
> */
> printk("%s calling cache_get h %p ref %d\n", __func__, &r=
sci->h,
> atomic_read(&rsci->h.ref.refcount));
>
> cache_get(&rsci->h); <<<<<<<<<<<<<<< disputed cache_get
>
> if (resv->iov_len + 4 > PAGE_SIZE)
> goto drop;
>
> svc_putnl(resv, RPC_SUCCESS);
> goto complete;
> case RPC_GSS_PROC_DATA:
>
>
>
>
> svcauth_gss_accept gc_proc 3 rq_proc 0 (one of two
> RPC_GSS_PROC_DESTROY requests)
>
> --> sunrpc_cache_lookup from rsc_lookup key ffffc90002603c58 hash 940
> Note: hash 940
>
> sunrpc_cache_lookup 1 calling cache_get key ffffc90002603c58 tmp
> ffff880073596240
> --> cache_get h ffff880073596240 refcount 1
>
> sunrpc_cache_lookup RETURN 1 key ffffc90002603c58 tmp ffff880073596240
> --> rsc_free h ffffc90002603c58
> cache_check h ffff880073596240 rv 0 h->expiry_time 1481810443
> cache_check rqstp NULL rv 0 h ffff880073596240
> <-- cache_check returns 0 h ffff880073596240
>
>
> Dec 15 08:00:47 rhel7-2-ga-2 kernel: svcauth_gss_accept
> RPC_GSS_PROC_DESTROY h ffff880073596240 ref 2 expiry 1481810443
> Note expiry_time
>
> Dec 15 08:00:47 rhel7-2-ga-2 kernel: --> rsc_destroy rsci->h
> ffff880073596240 ref 2
>
> Dec 15 08:00:47 rhel7-2-ga-2 kernel: sunrpc_cache_update calling
> cache_put h ffff880073596240
> Dec 15 08:00:47 rhel7-2-ga-2 kernel: --> cache_put h ffff880073596240
> refcount 2 cd->cache_put ffffffffa03d3180
> Dec 15 08:00:47 rhel7-2-ga-2 kernel: cache_put h ffff880073596240
> expiry_time 0 nextcheck 0
> Dec 15 08:00:47 rhel7-2-ga-2 kernel: <-- cache_put h ffff880073596240 refcount 1
>
> Dec 15 08:00:47 rhel7-2-ga-2 kernel: <-- rsc_destroy rsci->h
> ffff880073596240 ref 1
>
>
> Dec 15 08:00:47 rhel7-2-ga-2 kernel: svcauth_gss_accept calling
> cache_get h ffff880073596240 ref 1
> Dec 15 08:00:47 rhel7-2-ga-2 kernel: --> cache_get h ffff880073596240 refcount 1
> Dec 15 08:00:47 rhel7-2-ga-2 kernel: svcauth_gss_accept calling
> cache_put h ffff880073596240
> Dec 15 08:00:47 rhel7-2-ga-2 kernel: --> cache_put h ffff880073596240
> refcount 2 cd->cache_put ffffffffa03d3180
> Dec 15 08:00:47 rhel7-2-ga-2 kernel: <-- cache_put h ffff880073596240 refcount 1
>
> Note: this is the cache_put at the end of svcauth_gss_accept.
>
>
> Dec 15 08:01:03 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff8800705ae000 h->expiry_time 1481806847 hash 940
> Note: the expiry_time has been set (used to be 1481810443)
>
> Dec 15 08:01:03 rhel7-2-ga-2 kernel: cache_clean CLEAN h
> ffff880073596240 h->expiry_time 0 hash 940
> Dec 15 08:01:03 rhel7-2-ga-2 kernel: cache_clean DELETE h
> ffff880073596240 from cache_list
> Dec 15 08:01:03 rhel7-2-ga-2 kernel: cache_clean CLEANED calling
> cache_put h ffff880073596240
> Dec 15 08:01:03 rhel7-2-ga-2 kernel: --> cache_put h ffff880073596240
> refcount 1 cd->cache_put ffffffffa03d3180
> Dec 15 08:01:03 rhel7-2-ga-2 kernel: cache_put h ffff880073596240
> expiry_time 0 nextcheck 0
> Dec 15 08:01:03 rhel7-2-ga-2 kernel: --> rsc_put PUT h ffff880073596240
> Dec 15 08:01:03 rhel7-2-ga-2 kernel: --> rsc_free h ffff880073596240
> Dec 15 08:01:03 rhel7-2-ga-2 kernel: RPC: gss_delete_sec_context
> deleting ffff880071288f40
> Dec 15 08:01:03 rhel7-2-ga-2 kernel: <-- cache_put h ffff880073596240 refcount 0
>
> Note: AFAICS this is the way the code is supposed to work.
>
>
> -->Andy
>
>
>
>
>>
>> --b.
>>
>>>
>>> >
>>> > Maybe there's some other reason we need to use rsc_update, but that's
>>> > not it.
>>>
>>> OK - rsc_update calls sunrpc_cache_update which takes the write_lock
>>> on the cache entry, and so allows changes to be saved.
>>>
>>> --Andy
>>>
>>> >
>>> > --b.
>>> >
>>> >> So the next time the cache entry is
>>> >> looked up, e.g. when cache_clean() is called to clean up, the
>>> >> expiry_time and CACHE_NEGATIVE are not set to the new values on the
>>> >> cache entry to be destroyed, and cache_clean does not reap the cache
>>> >> entry.
>>> >
>>> >>
>>> >> The fix is to do what this patch does, and call rsc_update on the rsc
>>> >> entry. With this patch, cache_clean is called and reaps the cache entries.
>>> >>
>>> >> BTW: just look at all the other uses of the cache and you will note
>>> >> that they all call update after setting new values.
>>> >>
>>> >> It's just how Neil's cache code works.
>>> >>
>>> >> e.g. dns_resolve.c
>>> >> key.h.expiry_time = ttl + seconds_since_boot();
>>> >> ...
>>> >> if (key.addrlen == 0)
>>> >> set_bit(CACHE_NEGATIVE, &key.h.flags);
>>> >>
>>> >> item = nfs_dns_update(cd, &key, item);
>>> >>
>>> >>
>>> >>
>>> >> I also just found a bug, I need a local rsc cache pointer for the
>>> >> rsc_update return. That plus Anna's comments will be addressed in
>>> >> version 3. I'll explain more in the patch comments.
>>> >>
>>> >> -->Andy
>>> >>
>>> >> On 12/12/16, 4:58 PM, "J. Bruce Fields" <[email protected]> wrote:
>>> >>
>>> >> On Mon, Dec 12, 2016 at 12:08:49PM -0500, [email protected] wrote:
>>> >> > From: Andy Adamson <[email protected]>
>>> >> >
>>> >> > The current code sets the expiry_time on the local copy of the rsc
>>> >> > cache entry - but not on the actual cache entry.
>>> >>
>>> >> I'm not following. It looks to me like "rsci" in the below was returned
>>> >> from gss_svc_searchbyctx(), which was returned in turn from
>>> >> sunrpc_cache_lookup(), which is returning an item from the rsc cache--I
>>> >> don't see any copying.
>>> >>
>>> >> > Note that currently, the rsc cache entries are not cleaned up even
>>> >> > when nfsd is stopped.
>>> >>
>>> >> So, that sounds like a bug, but I don't understand this explanation yet.
>>> >>
>>> >> > Update the cache with the new expiry_time of now so that cache_clean will
>>> >> > clean up (free) the context to be destroyed.
>>> >> >
>>> >> > Signed-off-by: Andy Adamson <[email protected]>
>>> >> > ---
>>> >> > net/sunrpc/auth_gss/svcauth_gss.c | 32 ++++++++++++++++++++++++++++++--
>>> >> > 1 file changed, 30 insertions(+), 2 deletions(-)
>>> >> >
>>> >> > diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
>>> >> > index 45662d7..6033389 100644
>>> >> > --- a/net/sunrpc/auth_gss/svcauth_gss.c
>>> >> > +++ b/net/sunrpc/auth_gss/svcauth_gss.c
>>> >> > @@ -1393,6 +1393,26 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
>>> >> >
>>> >> > #endif /* CONFIG_PROC_FS */
>>> >> >
>>> >> > +static int rsc_destroy(struct sunrpc_net *sn, struct rsc *rscp)
>>> >> > +{
>>> >> > + struct rsc new;
>>> >> > + int ret = -ENOMEM;
>>> >> > +
>>> >> > + memset(&new, 0, sizeof(struct rsc));
>>> >> > + if (dup_netobj(&new.handle, &rscp->handle))
>>> >> > + goto out;
>>> >> > + new.h.expiry_time = get_seconds();
>>> >> > + set_bit(CACHE_NEGATIVE, &new.h.flags);
>>> >> > +
>>> >> > + rscp = rsc_update(sn->rsc_cache, &new, rscp);
>>> >> > + if (!rscp)
>>> >> > + goto out;
>>> >> > + ret = 0;
>>> >> > +out:
>>> >> > + rsc_free(&new);
>>> >> > + return ret;
>>> >> > +}
>>> >> > +
>>> >> > /*
>>> >> > * Accept an rpcsec packet.
>>> >> > * If context establishment, punt to user space
>>> >> > @@ -1489,10 +1509,18 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
>>> >> > case RPC_GSS_PROC_DESTROY:
>>> >> > if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
>>> >> > goto auth_err;
>>> >> > - rsci->h.expiry_time = get_seconds();
>>> >> > - set_bit(CACHE_NEGATIVE, &rsci->h.flags);
>>> >> > + if (rsc_destroy(sn, rsci))
>>> >> > + goto drop;
>>> >> > + /**
>>> >> > + * Balance the cache_put at the end of svcauth_gss_accept.This
>>> >> > + * will leave the refcount = 1 so that the cache_clean cache_put
>>> >> > + * will call rsc_put.
>>> >> > + */
>>> >>
>>> >> I'm confused by that comment. If it's right, then it means the refcount
>>> >> is currently zero, in which case the following line is unsafe.
>>> >>
>>> >> --b.
>>> >>
>>> >> > + cache_get(&rsci->h);
>>> >> > +
>>> >> > if (resv->iov_len + 4 > PAGE_SIZE)
>>> >> > goto drop;
>>> >> > +
>>> >> > svc_putnl(resv, RPC_SUCCESS);
>>> >> > goto complete;
>>> >> > case RPC_GSS_PROC_DATA:
>>> >> > --
>>> >> > 1.8.3.1
>>> >>
>>> >>