2018-03-28 16:18:31

by Trond Myklebust

Subject: [PATCH] nfsd: Do not refuse to serve out of cache

Currently the knfsd replay cache appears to try to refuse replying to
retries that come within 200ms of the cache entry being created. That
makes limited sense in today's world of high speed TCP.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfsd/cache.h | 5 -----
fs/nfsd/nfscache.c | 6 ++----
2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h
index 046b3f048757..b7559c6f2b97 100644
--- a/fs/nfsd/cache.h
+++ b/fs/nfsd/cache.h
@@ -67,11 +67,6 @@ enum {
RC_REPLBUFF,
};

-/*
- * If requests are retransmitted within this interval, they're dropped.
- */
-#define RC_DELAY (HZ/5)
-
/* Cache entries expire after this time period */
#define RC_EXPIRE (120 * HZ)

diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
index 334f2ad60704..637f87c39183 100644
--- a/fs/nfsd/nfscache.c
+++ b/fs/nfsd/nfscache.c
@@ -394,7 +394,6 @@ nfsd_cache_lookup(struct svc_rqst *rqstp)
__wsum csum;
u32 hash = nfsd_cache_hash(xid);
struct nfsd_drc_bucket *b = &drc_hashtbl[hash];
- unsigned long age;
int type = rqstp->rq_cachetype;
int rtn = RC_DOIT;

@@ -461,12 +460,11 @@ nfsd_cache_lookup(struct svc_rqst *rqstp)
found_entry:
nfsdstats.rchits++;
/* We found a matching entry which is either in progress or done. */
- age = jiffies - rp->c_timestamp;
lru_put_end(b, rp);

rtn = RC_DROPIT;
- /* Request being processed or excessive rexmits */
- if (rp->c_state == RC_INPROG || age < RC_DELAY)
+ /* Request being processed */
+ if (rp->c_state == RC_INPROG)
goto out;

/* From the hall of fame of impractical attacks:
--
2.14.3



2018-03-28 19:20:49

by Jeffrey Layton

Subject: Re: [PATCH] nfsd: Do not refuse to serve out of cache

On Wed, 2018-03-28 at 12:18 -0400, Trond Myklebust wrote:
> Currently the knfsd replay cache appears to try to refuse replying to
> retries that come within 200ms of the cache entry being created. That
> makes limited sense in today's world of high speed TCP.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfsd/cache.h | 5 -----
> fs/nfsd/nfscache.c | 6 ++----
> 2 files changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h
> index 046b3f048757..b7559c6f2b97 100644
> --- a/fs/nfsd/cache.h
> +++ b/fs/nfsd/cache.h
> @@ -67,11 +67,6 @@ enum {
> RC_REPLBUFF,
> };
>
> -/*
> - * If requests are retransmitted within this interval, they're dropped.
> - */
> -#define RC_DELAY (HZ/5)
> -
> /* Cache entries expire after this time period */
> #define RC_EXPIRE (120 * HZ)
>
> diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
> index 334f2ad60704..637f87c39183 100644
> --- a/fs/nfsd/nfscache.c
> +++ b/fs/nfsd/nfscache.c
> @@ -394,7 +394,6 @@ nfsd_cache_lookup(struct svc_rqst *rqstp)
> __wsum csum;
> u32 hash = nfsd_cache_hash(xid);
> struct nfsd_drc_bucket *b = &drc_hashtbl[hash];
> - unsigned long age;
> int type = rqstp->rq_cachetype;
> int rtn = RC_DOIT;
>
> @@ -461,12 +460,11 @@ nfsd_cache_lookup(struct svc_rqst *rqstp)
> found_entry:
> nfsdstats.rchits++;
> /* We found a matching entry which is either in progress or done. */
> - age = jiffies - rp->c_timestamp;
> lru_put_end(b, rp);
>
> rtn = RC_DROPIT;
> - /* Request being processed or excessive rexmits */
> - if (rp->c_state == RC_INPROG || age < RC_DELAY)
> + /* Request being processed */
> + if (rp->c_state == RC_INPROG)
> goto out;
>
> /* From the hall of fame of impractical attacks:

That condition always looked a bit suspicious to me.

Acked-by: Jeff Layton <[email protected]>

2018-03-28 20:10:43

by J. Bruce Fields

Subject: Re: [PATCH] nfsd: Do not refuse to serve out of cache

Applying, thanks.

On Wed, Mar 28, 2018 at 03:20:45PM -0400, Jeff Layton wrote:
> On Wed, 2018-03-28 at 12:18 -0400, Trond Myklebust wrote:
> > Currently the knfsd replay cache appears to try to refuse replying to
> > retries that come within 200ms of the cache entry being created. That
> > makes limited sense in today's world of high speed TCP.

Trond gave me some helpful context in person; I may tack that onto the
changelog:

After a TCP disconnection, a client can very easily reconnect
and retry an rpc in less than 200ms. If this logic drops that
retry, however, the client may be quite slow to retry again.
This logic is original to the first reply cache implementation
in 2.1, and may have made more sense for UDP clients that
retried much more frequently.

We're still dropping on finding the original request still in
progress, which can cause the same problem, though it's less
likely.

Note svc_check_conn_limits is often the cause of those
disconnections. We may want to fix that some day.

--b.
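
For readers following along, the behaviour change described above can be
modelled in userspace roughly as follows. This is only a sketch, not the
kernel code: the enum names are borrowed from fs/nfsd/cache.h, but their
values here are illustrative. RC_REPLY and the helpers rc_delay_jiffies(),
model_jiffies_to_ms(), verdict_before() and verdict_after() are assumptions
made up for this example. HZ is passed in as a parameter because userspace
has no such constant, and the later checks in nfsd_cache_lookup() (anything
past the "goto out" touched by the patch) are left out.

```c
#include <assert.h>

/* Cache entry states; names follow the patch, values are illustrative. */
enum rc_state { RC_UNUSED, RC_INPROG, RC_DONE };

/* Lookup verdicts; RC_REPLY is assumed here to mean "answer from cache". */
enum rc_verdict { RC_DROPIT, RC_REPLY, RC_DOIT };

/* The removed RC_DELAY was HZ/5 jiffies; hz stands in for the kernel's HZ. */
unsigned long rc_delay_jiffies(unsigned long hz)
{
	return hz / 5;
}

/* Hypothetical helper: convert a jiffies count to milliseconds. */
unsigned long model_jiffies_to_ms(unsigned long j, unsigned long hz)
{
	return j * 1000 / hz;
}

/*
 * Before the patch: a retransmit that hits the cache is dropped when the
 * original request is still in progress OR the entry is younger than
 * RC_DELAY.
 */
enum rc_verdict verdict_before(enum rc_state state, unsigned long age,
			       unsigned long hz)
{
	if (state == RC_INPROG || age < rc_delay_jiffies(hz))
		return RC_DROPIT;
	return RC_REPLY;
}

/*
 * After the patch: only an in-progress original causes a drop; the age
 * of the entry no longer matters.
 */
enum rc_verdict verdict_after(enum rc_state state)
{
	if (state == RC_INPROG)
		return RC_DROPIT;
	return RC_REPLY;
}
```

With hz = 1000, a retry that arrives 50 jiffies (50ms) after the entry was
created, with the original already completed, is dropped by verdict_before()
but answered from cache by verdict_after(). A retry that finds the original
still in progress is dropped in both cases, which is the remaining problem
mentioned above.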

> >
> > Signed-off-by: Trond Myklebust <[email protected]>
> > ---
> > fs/nfsd/cache.h | 5 -----
> > fs/nfsd/nfscache.c | 6 ++----
> > 2 files changed, 2 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h
> > index 046b3f048757..b7559c6f2b97 100644
> > --- a/fs/nfsd/cache.h
> > +++ b/fs/nfsd/cache.h
> > @@ -67,11 +67,6 @@ enum {
> > RC_REPLBUFF,
> > };
> >
> > -/*
> > - * If requests are retransmitted within this interval, they're dropped.
> > - */
> > -#define RC_DELAY (HZ/5)
> > -
> > /* Cache entries expire after this time period */
> > #define RC_EXPIRE (120 * HZ)
> >
> > diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
> > index 334f2ad60704..637f87c39183 100644
> > --- a/fs/nfsd/nfscache.c
> > +++ b/fs/nfsd/nfscache.c
> > @@ -394,7 +394,6 @@ nfsd_cache_lookup(struct svc_rqst *rqstp)
> > __wsum csum;
> > u32 hash = nfsd_cache_hash(xid);
> > struct nfsd_drc_bucket *b = &drc_hashtbl[hash];
> > - unsigned long age;
> > int type = rqstp->rq_cachetype;
> > int rtn = RC_DOIT;
> >
> > @@ -461,12 +460,11 @@ nfsd_cache_lookup(struct svc_rqst *rqstp)
> > found_entry:
> > nfsdstats.rchits++;
> > /* We found a matching entry which is either in progress or done. */
> > - age = jiffies - rp->c_timestamp;
> > lru_put_end(b, rp);
> >
> > rtn = RC_DROPIT;
> > - /* Request being processed or excessive rexmits */
> > - if (rp->c_state == RC_INPROG || age < RC_DELAY)
> > + /* Request being processed */
> > + if (rp->c_state == RC_INPROG)
> > goto out;
> >
> > /* From the hall of fame of impractical attacks:
>
> That condition always looked a bit suspicious to me.
>
> Acked-by: Jeff Layton <[email protected]>