2006-09-01 04:39:41

by NeilBrown

[permalink] [raw]
Subject: [PATCH 016 of 19] knfsd: match GRANTED_RES replies using cookies


From: Olaf Kirch <[email protected]>

When we send a GRANTED_MSG call, we current copy the NLM cookie
provided in the original LOCK call - because in 1996, some broken
clients seemed to rely on this bug. However, this means the cookies
are not unique, so that when the client's GRANTED_RES message comes
back, we cannot simply match it based on the cookie, but have to
use the client's IP address in addition. Which breaks when you have
a multi-homed NFS client.

The X/Open spec explicitly mentions that clients should not expect the
same cookie; so one may hope that any clients that were broken in 1996
have either been fixed or rendered obsolete.

Signed-off-by: Olaf Kirch <[email protected]>
Signed-off-by: Neil Brown <[email protected]>

### Diffstat output
./fs/lockd/svc4proc.c | 2 +-
./fs/lockd/svclock.c | 24 +++++++++++++-----------
./fs/lockd/svcproc.c | 2 +-
./include/linux/lockd/lockd.h | 2 +-
4 files changed, 16 insertions(+), 14 deletions(-)

diff .prev/fs/lockd/svc4proc.c ./fs/lockd/svc4proc.c
--- .prev/fs/lockd/svc4proc.c 2006-09-01 12:11:01.000000000 +1000
+++ ./fs/lockd/svc4proc.c 2006-09-01 12:17:21.000000000 +1000
@@ -455,7 +455,7 @@ nlm4svc_proc_granted_res(struct svc_rqst

dprintk("lockd: GRANTED_RES called\n");

- nlmsvc_grant_reply(rqstp, &argp->cookie, argp->status);
+ nlmsvc_grant_reply(&argp->cookie, argp->status);
return rpc_success;
}


diff .prev/fs/lockd/svclock.c ./fs/lockd/svclock.c
--- .prev/fs/lockd/svclock.c 2006-09-01 12:11:01.000000000 +1000
+++ ./fs/lockd/svclock.c 2006-09-01 12:17:21.000000000 +1000
@@ -139,19 +139,19 @@ static inline int nlm_cookie_match(struc
* Find a block with a given NLM cookie.
*/
static inline struct nlm_block *
-nlmsvc_find_block(struct nlm_cookie *cookie, struct sockaddr_in *sin)
+nlmsvc_find_block(struct nlm_cookie *cookie)
{
struct nlm_block *block;

list_for_each_entry(block, &nlm_blocked, b_list) {
- if (nlm_cookie_match(&block->b_call->a_args.cookie,cookie)
- && nlm_cmp_addr(sin, &block->b_host->h_addr))
+ if (nlm_cookie_match(&block->b_call->a_args.cookie,cookie))
goto found;
}

return NULL;

found:
+ dprintk("nlmsvc_find_block(%s): block=%p\n", nlmdbg_cookie2a(cookie), block);
kref_get(&block->b_count);
return block;
}
@@ -165,6 +165,11 @@ found:
* request, but (as I found out later) that's because some implementations
* do just this. Never mind the standards comittees, they support our
* logging industries.
+ *
+ * 10 years later: I hope we can safely ignore these old and broken
+ * clients by now. Let's fix this so we can uniquely identify an incoming
+ * GRANTED_RES message by cookie, without having to rely on the client's IP
+ * address. --okir
*/
static inline struct nlm_block *
nlmsvc_create_block(struct svc_rqst *rqstp, struct nlm_file *file,
@@ -197,7 +202,7 @@ nlmsvc_create_block(struct svc_rqst *rqs
/* Set notifier function for VFS, and init args */
call->a_args.lock.fl.fl_flags |= FL_SLEEP;
call->a_args.lock.fl.fl_lmops = &nlmsvc_lock_operations;
- call->a_args.cookie = *cookie; /* see above */
+ nlmclnt_next_cookie(&call->a_args.cookie);

dprintk("lockd: created block %p...\n", block);

@@ -640,17 +645,14 @@ static const struct rpc_call_ops nlmsvc_
* block.
*/
void
-nlmsvc_grant_reply(struct svc_rqst *rqstp, struct nlm_cookie *cookie, u32 status)
+nlmsvc_grant_reply(struct nlm_cookie *cookie, u32 status)
{
struct nlm_block *block;
- struct nlm_file *file;

- dprintk("grant_reply: looking for cookie %x, host (%08x), s=%d \n",
- *(unsigned int *)(cookie->data),
- ntohl(rqstp->rq_addr.sin_addr.s_addr), status);
- if (!(block = nlmsvc_find_block(cookie, &rqstp->rq_addr)))
+ dprintk("grant_reply: looking for cookie %x, s=%d \n",
+ *(unsigned int *)(cookie->data), status);
+ if (!(block = nlmsvc_find_block(cookie)))
return;
- file = block->b_file;

if (block) {
if (status == NLM_LCK_DENIED_GRACE_PERIOD) {

diff .prev/fs/lockd/svcproc.c ./fs/lockd/svcproc.c
--- .prev/fs/lockd/svcproc.c 2006-09-01 12:11:01.000000000 +1000
+++ ./fs/lockd/svcproc.c 2006-09-01 12:17:21.000000000 +1000
@@ -484,7 +484,7 @@ nlmsvc_proc_granted_res(struct svc_rqst

dprintk("lockd: GRANTED_RES called\n");

- nlmsvc_grant_reply(rqstp, &argp->cookie, argp->status);
+ nlmsvc_grant_reply(&argp->cookie, argp->status);
return rpc_success;
}


diff .prev/include/linux/lockd/lockd.h ./include/linux/lockd/lockd.h
--- .prev/include/linux/lockd/lockd.h 2006-09-01 12:17:03.000000000 +1000
+++ ./include/linux/lockd/lockd.h 2006-09-01 12:17:21.000000000 +1000
@@ -193,7 +193,7 @@ u32 nlmsvc_cancel_blocked(struct nlm_
unsigned long nlmsvc_retry_blocked(void);
void nlmsvc_traverse_blocks(struct nlm_host *, struct nlm_file *,
nlm_host_match_fn_t match);
-void nlmsvc_grant_reply(struct svc_rqst *, struct nlm_cookie *, u32);
+void nlmsvc_grant_reply(struct nlm_cookie *, u32);

/*
* File handling for the server personality

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2006-09-01 16:03:57

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 016 of 19] knfsd: match GRANTED_RES replies using cookies

On Fri, 2006-09-01 at 14:39 +1000, NeilBrown wrote:
> From: Olaf Kirch <[email protected]>
>
> When we send a GRANTED_MSG call, we current copy the NLM cookie
> provided in the original LOCK call - because in 1996, some broken
> clients seemed to rely on this bug. However, this means the cookies
> are not unique, so that when the client's GRANTED_RES message comes
> back, we cannot simply match it based on the cookie, but have to
> use the client's IP address in addition. Which breaks when you have
> a multi-homed NFS client.
>
> The X/Open spec explicitly mentions that clients should not expect the
> same cookie; so one may hope that any clients that were broken in 1996
> have either been fixed or rendered obsolete.

Vetoed. The reason why we need the client's IP address as an argument
for nlmsvc_find_block() is 'cos the cookie value is unique to the
_client_ only.

IOW: when we go searching a global list of blocks for a given cookie
value that was sent to us by a given client, then we want to know that
we're only searching through that particular client's blocks.

A better way, BTW, would be to get rid of the global list nlm_blocked,
and just move the list of blocks into a field in the nlm_host for each
client.

> Signed-off-by: Olaf Kirch <[email protected]>
> Signed-off-by: Neil Brown <[email protected]>
>
> ### Diffstat output
> ./fs/lockd/svc4proc.c | 2 +-
> ./fs/lockd/svclock.c | 24 +++++++++++++-----------
> ./fs/lockd/svcproc.c | 2 +-
> ./include/linux/lockd/lockd.h | 2 +-
> 4 files changed, 16 insertions(+), 14 deletions(-)
>
> diff .prev/fs/lockd/svc4proc.c ./fs/lockd/svc4proc.c
> --- .prev/fs/lockd/svc4proc.c 2006-09-01 12:11:01.000000000 +1000
> +++ ./fs/lockd/svc4proc.c 2006-09-01 12:17:21.000000000 +1000
> @@ -455,7 +455,7 @@ nlm4svc_proc_granted_res(struct svc_rqst
>
> dprintk("lockd: GRANTED_RES called\n");
>
> - nlmsvc_grant_reply(rqstp, &argp->cookie, argp->status);
> + nlmsvc_grant_reply(&argp->cookie, argp->status);
> return rpc_success;
> }
>
>
> diff .prev/fs/lockd/svclock.c ./fs/lockd/svclock.c
> --- .prev/fs/lockd/svclock.c 2006-09-01 12:11:01.000000000 +1000
> +++ ./fs/lockd/svclock.c 2006-09-01 12:17:21.000000000 +1000
> @@ -139,19 +139,19 @@ static inline int nlm_cookie_match(struc
> * Find a block with a given NLM cookie.
> */
> static inline struct nlm_block *
> -nlmsvc_find_block(struct nlm_cookie *cookie, struct sockaddr_in *sin)
> +nlmsvc_find_block(struct nlm_cookie *cookie)
> {
> struct nlm_block *block;
>
> list_for_each_entry(block, &nlm_blocked, b_list) {
> - if (nlm_cookie_match(&block->b_call->a_args.cookie,cookie)
> - && nlm_cmp_addr(sin, &block->b_host->h_addr))
> + if (nlm_cookie_match(&block->b_call->a_args.cookie,cookie))
> goto found;
> }
>
> return NULL;
>
> found:
> + dprintk("nlmsvc_find_block(%s): block=%p\n", nlmdbg_cookie2a(cookie), block);
> kref_get(&block->b_count);
> return block;
> }
> @@ -165,6 +165,11 @@ found:
> * request, but (as I found out later) that's because some implementations
> * do just this. Never mind the standards comittees, they support our
> * logging industries.
> + *
> + * 10 years later: I hope we can safely ignore these old and broken
> + * clients by now. Let's fix this so we can uniquely identify an incoming
> + * GRANTED_RES message by cookie, without having to rely on the client's IP
> + * address. --okir
> */
> static inline struct nlm_block *
> nlmsvc_create_block(struct svc_rqst *rqstp, struct nlm_file *file,
> @@ -197,7 +202,7 @@ nlmsvc_create_block(struct svc_rqst *rqs
> /* Set notifier function for VFS, and init args */
> call->a_args.lock.fl.fl_flags |= FL_SLEEP;
> call->a_args.lock.fl.fl_lmops = &nlmsvc_lock_operations;
> - call->a_args.cookie = *cookie; /* see above */
> + nlmclnt_next_cookie(&call->a_args.cookie);
>
> dprintk("lockd: created block %p...\n", block);
>
> @@ -640,17 +645,14 @@ static const struct rpc_call_ops nlmsvc_
> * block.
> */
> void
> -nlmsvc_grant_reply(struct svc_rqst *rqstp, struct nlm_cookie *cookie, u32 status)
> +nlmsvc_grant_reply(struct nlm_cookie *cookie, u32 status)
> {
> struct nlm_block *block;
> - struct nlm_file *file;
>
> - dprintk("grant_reply: looking for cookie %x, host (%08x), s=%d \n",
> - *(unsigned int *)(cookie->data),
> - ntohl(rqstp->rq_addr.sin_addr.s_addr), status);
> - if (!(block = nlmsvc_find_block(cookie, &rqstp->rq_addr)))
> + dprintk("grant_reply: looking for cookie %x, s=%d \n",
> + *(unsigned int *)(cookie->data), status);
> + if (!(block = nlmsvc_find_block(cookie)))
> return;
> - file = block->b_file;
>
> if (block) {
> if (status == NLM_LCK_DENIED_GRACE_PERIOD) {
>
> diff .prev/fs/lockd/svcproc.c ./fs/lockd/svcproc.c
> --- .prev/fs/lockd/svcproc.c 2006-09-01 12:11:01.000000000 +1000
> +++ ./fs/lockd/svcproc.c 2006-09-01 12:17:21.000000000 +1000
> @@ -484,7 +484,7 @@ nlmsvc_proc_granted_res(struct svc_rqst
>
> dprintk("lockd: GRANTED_RES called\n");
>
> - nlmsvc_grant_reply(rqstp, &argp->cookie, argp->status);
> + nlmsvc_grant_reply(&argp->cookie, argp->status);
> return rpc_success;
> }
>
>
> diff .prev/include/linux/lockd/lockd.h ./include/linux/lockd/lockd.h
> --- .prev/include/linux/lockd/lockd.h 2006-09-01 12:17:03.000000000 +1000
> +++ ./include/linux/lockd/lockd.h 2006-09-01 12:17:21.000000000 +1000
> @@ -193,7 +193,7 @@ u32 nlmsvc_cancel_blocked(struct nlm_
> unsigned long nlmsvc_retry_blocked(void);
> void nlmsvc_traverse_blocks(struct nlm_host *, struct nlm_file *,
> nlm_host_match_fn_t match);
> -void nlmsvc_grant_reply(struct svc_rqst *, struct nlm_cookie *, u32);
> +void nlmsvc_grant_reply(struct nlm_cookie *, u32);
>
> /*
> * File handling for the server personality
>
> -------------------------------------------------------------------------
> Using Tomcat but need to do more? Need to support web services, security?
> Get stuff done quickly with pre-integrated technology to make your job easier
> Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
> _______________________________________________
> NFS maillist - [email protected]
> https://lists.sourceforge.net/lists/listinfo/nfs


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-09-04 09:09:43

by Olaf Kirch

[permalink] [raw]
Subject: Re: [PATCH 016 of 19] knfsd: match GRANTED_RES replies using cookies

On Fri, Sep 01, 2006 at 12:03:38PM -0400, Trond Myklebust wrote:
> Vetoed. The reason why we need the client's IP address as an argument
> for nlmsvc_find_block() is 'cos the cookie value is unique to the
> _client_ only.

In NLM, a cookie can be used to identify the asynchronous reply to the
original request. Previously, there was a hack in the code that
sends GRANT replies to reuse the original cookie from the client's
LOCK request. The protocol spec explicitly says you must not rely
on this behavior; the only reason I added this kludge was that some
HPUX and SunOS versions did just that.

The down side of that kludge is that we are using a client-provided cookie
in one of our calls - which means we keep our fingers crossed it doesn't
collide with the cookie we generate ourselves. And to reduce the risk,
we also check the client IP when searching the nlm_blocked list. But it
is incorrect, and a kludge, and the longer I look at this code the
more I'm amazed it hasn't blown up.

This patch changes the code so that the only cookies we ever use
to look up something are those we generate ourselves, so they
are globally unique. As a consequence, we can stop using the client's
IP when searching the list.

> IOW: when we go searching a global list of blocks for a given cookie
> value that was sent to us by a given client, then we want to know that
> we're only searching through that particular client's blocks.

This is no longer true after applying this change.

> A better way, BTW, would be to get rid of the global list nlm_blocked,
> and just move the list of blocks into a field in the nlm_host for each
> client.

Given an incoming NLM_GRANTED_RES, how can you look up the nlm_host
and the pending NLM_GRANTED_MSG?
The reply may not come from any IP address you know of. The whole
reason for introducing this cookie nonsense in the NLM specification
was that the RPC layer doesn't always give you enough clues to
match a callback to the original message.

So this is really a bugfix which you *do* want to apply.

Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[email protected] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-09-05 16:12:57

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 016 of 19] knfsd: match GRANTED_RES replies using cookies

On Mon, 2006-09-04 at 11:09 +0200, Olaf Kirch wrote:
> On Fri, Sep 01, 2006 at 12:03:38PM -0400, Trond Myklebust wrote:
> > Vetoed. The reason why we need the client's IP address as an argument
> > for nlmsvc_find_block() is 'cos the cookie value is unique to the
> > _client_ only.
>
> In NLM, a cookie can be used to identify the asynchronous reply to the
> original request. Previously, there was a hack in the code that
> sends GRANT replies to reuse the original cookie from the client's
> LOCK request. The protocol spec explicitly says you must not rely
> on this behavior; the only reason I added this kludge was that some
> HPUX and SunOS versions did just that.
>
> The down side of that kludge is that we are using a client-provided cookie
> in one of our calls - which means we keep our fingers crossed it doesn't
> collide with the cookie we generate ourselves. And to reduce the risk,
> we also check the client IP when searching the nlm_blocked list. But it
> is incorrect, and a kludge, and the longer I look at this code the
> more I'm amazed it hasn't blown up.
>
> This patch changes the code so that the only cookies we ever use
> to look up something are those we generate ourselves, so they
> are globally unique. As a consequence, we can stop using the client's
> IP when searching the list.
>
> > IOW: when we go searching a global list of blocks for a given cookie
> > value that was sent to us by a given client, then we want to know that
> > we're only searching through that particular client's blocks.
>
> This is no longer true after applying this change.

Sorry, I missed that change. I see your point now.

> > A better way, BTW, would be to get rid of the global list nlm_blocked,
> > and just move the list of blocks into a field in the nlm_host for each
> > client.
>
> Given an incoming NLM_GRANTED_RES, how can you look up the nlm_host
> and the pending NLM_GRANTED_MSG?
> The reply may not come from any IP address you know of. The whole
> reason for introducing this cookie nonsense in the NLM specification
> was that the RPC layer doesn't always give you enough clues to
> match a callback to the original message.

Right. The question is how many clients out there do still rely on the
current behaviour?

Looking at Brent's 'NFS Illustrated', I see that he notes that for
NLM_GRANTED, the cookie is "An opaque value that is normally the same as
the client sent in the LOCK request, though the client cannot depend on
it". Which sounds like weasel words for "some clients still do depend on
it".

Cheers
Trond


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-09-05 17:39:44

by Olaf Kirch

[permalink] [raw]
Subject: Re: [PATCH 016 of 19] knfsd: match GRANTED_RES replies using cookies

On Tue, Sep 05, 2006 at 12:12:30PM -0400, Trond Myklebust wrote:
> Right. The question is how many clients out there do still rely on the
> current behaviour?

Right. I know that old SunOS releases did, as did HPUX-6.5 or whatever
it was. But that was ages ago.

Suse has had this code change for a while now in SL10.1 (6 months) as
well as SLES10, and I haven't heard of any interoperability problems,
neither from partners or users, nor from our own QA.

> Looking at Brent's 'NFS Illustrated', I see that he notes that for
> NLM_GRANTED, the cookie is "An opaque value that is normally the same as
> the client sent in the LOCK request, though the client cannot depend on
> it". Which sounds like weasel words for "some clients still do depend on
> it".

The spec says the same (except it doesn't mention that the cookie is
"normally the same", it just explicitly states that the client must not
rely on the cookie in the GRANT call being the same as the one in the
LOCK call).

My guess is that this was an implementation "shortcut" in the original
Sun code that metastased into all the derived implementations, and when
they discovered it was a bad bug that could lead to lock mix-up, this
sloppiness had spread all over the Unix landscape and they needed to
put stern words into the standard...

In summary, I think more than 10 years after the publication of the
X/Open NLM spec it is acceptable to remove that kludge.

Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[email protected] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs