When the first time pynfs runs after rpc/nfsd startup, always get the warning,
"Got error: Connection closed"
I found the problem is caused by,
1. A new startup of nfsd, rpc.mountd, etc,
2. A rpc request from client (pynfs test, or normal mounting),
3. An ip_map cache is created but invalid, so upcall to rpc.mountd,
4. rpc.mountd process the ip_map upcall, before write the valid data to nfsd,
do auth_reload(), and check_useipaddr(),
5. For the first time, old_use_ipaddr = -1, it causes rpc.mountd do write_flush that doing cache_clean,
6. The ip_map cache will be treat as expired and clean,
7. When rpc.mountd write the valid data to nfsd, a new ip_map is created
and updated, the cache_check of old ip_map(doing the upcall) will
return -ETIMEDOUT.
8. RPC layer return SVC_CLOSE and close the xprt after commit 4d712ef1db05
"svcauth_gss: Close connection when dropping an incoming message"
NeilBrown suggest in another email,
"If CACHE_VALID is not set, then there is no data in the cache item,
so there is nothing to expire. So it would be nice if cache items that
don't have CACHE_VALID are never treated as expired."
v2, change the checking of CACHE_PENDING to CACHE_VALID
Signed-off-by: Kinglong Mee <[email protected]>
---
include/linux/sunrpc/cache.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 62a60ee..782024e 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -204,8 +204,11 @@ static inline void cache_put(struct cache_head *h, struct cache_detail *cd)
kref_put(&h->ref, cd->cache_put);
}
-static inline int cache_is_expired(struct cache_detail *detail, struct cache_head *h)
+static inline bool cache_is_expired(struct cache_detail *detail, struct cache_head *h)
{
+ if (!test_bit(CACHE_VALID, &h->flags))
+ return false;
+
return (h->expiry_time < seconds_since_boot()) ||
(detail->flush_time >= h->last_refresh);
}
--
2.9.3
On Mon, Feb 06 2017, Kinglong Mee wrote:
> When the first time pynfs runs after rpc/nfsd startup, always get the warning,
>
> "Got error: Connection closed"
>
> I found the problem is caused by,
> 1. A new startup of nfsd, rpc.mountd, etc,
> 2. A rpc request from client (pynfs test, or normal mounting),
> 3. An ip_map cache is created but invalid, so upcall to rpc.mountd,
> 4. rpc.mountd process the ip_map upcall, before write the valid data to nfsd,
> do auth_reload(), and check_useipaddr(),
> 5. For the first time, old_use_ipaddr = -1, it causes rpc.mountd do write_flush that doing cache_clean,
> 6. The ip_map cache will be treat as expired and clean,
> 7. When rpc.mountd write the valid data to nfsd, a new ip_map is created
> and updated, the cache_check of old ip_map(doing the upcall) will
> return -ETIMEDOUT.
> 8. RPC layer return SVC_CLOSE and close the xprt after commit 4d712ef1db05
> "svcauth_gss: Close connection when dropping an incoming message"
>
> NeilBrown suggest in another email,
>
> "If CACHE_VALID is not set, then there is no data in the cache item,
> so there is nothing to expire. So it would be nice if cache items that
> don't have CACHE_VALID are never treated as expired."
>
> v2, change the checking of CACHE_PENDING to CACHE_VALID
>
> Signed-off-by: Kinglong Mee <[email protected]>
> ---
> include/linux/sunrpc/cache.h | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
> index 62a60ee..782024e 100644
> --- a/include/linux/sunrpc/cache.h
> +++ b/include/linux/sunrpc/cache.h
> @@ -204,8 +204,11 @@ static inline void cache_put(struct cache_head *h, struct cache_detail *cd)
> kref_put(&h->ref, cd->cache_put);
> }
>
> -static inline int cache_is_expired(struct cache_detail *detail, struct cache_head *h)
> +static inline bool cache_is_expired(struct cache_detail *detail, struct cache_head *h)
> {
> + if (!test_bit(CACHE_VALID, &h->flags))
> + return false;
> +
> return (h->expiry_time < seconds_since_boot()) ||
> (detail->flush_time >= h->last_refresh);
> }
> --
> 2.9.3
Thanks for this.
I think it would be best if this patch were applied *after* the change
to cache_purge(), to avoid possible issues if a git-bisect lands between
these two.
Apart from that:\
Reviewed-by: NeilBrown <[email protected]>
Thanks,
NeilBrown