2002-07-19 08:20:04

by Olaf Kirch

[permalink] [raw]
Subject: Locking patches (generic & nfs)

Hi,

I've been investigating an NFS locking problem a customer
of SuSE has had between an OpenServer machine (oh boy)
acting as the NFS client and a Linux box acting as the server.

In the process of debugging this, I came across a number of
bugs in the 2.4.18 kernel.


fs/locks.c:
When a program locks the entire file, and then does an unlock
of just the first byte in the file, the kernel will not modify
the existing lock because of an overflow/signedness problem.

fs/lockd/svclock.c, include/linux/lockd.h:
Consider the following scenario:
client A locks a file
client B requests a conflicting lock, and asks
for "blocking" mode.
lockd creates a "struct block" and attaches
it to the existing lock
client A unlocks the file
This causes a call to nlmsvc_notify_blocked,
which puts the blocked lock onto a list
of locks which sould be retried, setting
the b_when field to 0.
The next time lockd comes around to inspecting this
list, it should notice that the lock can now be granted,
and send a NLM_GRANTED message to client B.

However, due to a signedness problem, the lock is
appended to the *end* of the list, where it's never
picked up.

fs/lockd/svcproc.c:
There's an interoperability problem with OpenServer and
probably other lockd implementations when it comes to
handling of blocked locks.

The way Linux clients deal with blocked locks goes
like this

C->S: lock this range, block if already taken
(1) S->C: blocked
...
(some other client removes the conflicting lock)
(2) S->C: the lock has been granted
C->S: ack
(3) C->S: lock this range, block if already taken
S->C: granted

At (1), the server records the fact that there's a blocking
lock request, and uses it at (2) to find out whom to
notify that the previously blocked request can now
be granted. When the client then follows up with a
LOCK call, the server notices that there's a blocked
lock around and destroys it.

Now OSR and maybe other lockd implementations do not
follow up on the GRANTED callback with another LOCK
call. According to the NLM spec this is sufficient,
because the GRANTED callback actually says "the lock
has been granted". The reason the Linux client does an
additional LOCK call is for stability (the NLM protocol is
full of race conditions).

However, for this to work properly, the Linux lockd must interpret
the client's response to the GRANTED callback. When receiving
this "ack" (in fact, it's a GRANTED_RES call), it must look
up the corresponding blocked lock and take if off the
list of blocked locks. If it doesn't, server and client
get out of sync wrt to who is blocking on what lock, and
start timing out).
(If you want details of what's exactly going wrong, mail me
for a packet trace).

At any rate, the above means that lockd needs to handle
GRANTED_MSG properly. The functionality to do so is already
there; it's just the handling of the RPC call itself that
wasn't there (or has been removed for some reason).

The second patch does this (even though for NFSv2 only; the
NFSv3 case is analogous).

I'm also attaching the test program I used.

Cheers
Olaf
--
Olaf Kirch | Anyone who has had to work with X.509 has probably
[email protected] | experienced what can best be described as
---------------+ ISO water torture. -- Peter Gutmann


Attachments:
(No filename) (3.26 kB)
linux-2.4.18-locks.patch (2.45 kB)
linux-2.4.18-lockd2.patch (1.42 kB)
mlock.c (1.59 kB)
Download all attachments

2002-07-23 14:56:46

by Trond Myklebust

[permalink] [raw]
Subject: Re: Locking patches (generic & nfs)


Hi Olaf,

>>>>> " " == Olaf Kirch <[email protected]> writes:


> --- linux/fs/lockd/svclock.c.locks Mon Jun 17 13:32:21 2002
> +++ linux/fs/lockd/svclock.c Mon Jun 17 13:37:36 2002
> @@ -62,8 +62,8 @@
> nlmsvc_remove_block(block);
> bp = &nlm_blocked; if (when != NLM_NEVER) {
> - if ((when += jiffies) == NLM_NEVER)
> - when ++;
> + if ((when += jiffies) > NLM_NEVER)
> + when = NLM_NEVER;
> while ((b = *bp) &&
> time_before_eq(b->b_when,when))
> bp = &b->b_next;
> } else

I disagree. As it stands, NLM_NEVER == (~(unsigned long)0), and "when"
is unsigned long, so the only thing we need to protect against is if
we hit the 'magic value' NLM_NEVER. Note that the time_before_eq()
comparison ensures that we cope well with jiffy wraparound etc, so the
entry should *not* in fact get put at the end of the list as you
claimed.

With the above change (plus your change to set NLM_NEVER=0x7fffffff),
we end up never retrying locks that just happen to have been put on
the list at a time when the value of 'jiffies' happens to be > 0x7fffffff.


-

The other fix for fs/locks.c looks reasonable AFAICS (but perhaps
Matthew wants to take a look?)

-

Concerning the fix implementing GRANTED_RES: I fully agree we need
it. I've just never had the time, and it's the sort of thing that
the Connectathon tests don't keep nagging at you with ;-)...

Patrice Dumas recently did some work on implementing this both for
NLMv1,2,3 and NLM4, so I was planning on integrating his changes into
2.4.20.

Cheers,
Trond


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2002-07-23 15:06:45

by Olaf Kirch

[permalink] [raw]
Subject: Re: Locking patches (generic & nfs)

On Tue, Jul 23, 2002 at 04:56:32PM +0200, Trond Myklebust wrote:
> I disagree. As it stands, NLM_NEVER == (~(unsigned long)0), and "when"
> is unsigned long, so the only thing we need to protect against is if
> we hit the 'magic value' NLM_NEVER. Note that the time_before_eq()
> comparison ensures that we cope well with jiffy wraparound etc, so the
> entry should *not* in fact get put at the end of the list as you
> claimed.
>
> With the above change (plus your change to set NLM_NEVER=0x7fffffff),
> we end up never retrying locks that just happen to have been put on
> the list at a time when the value of 'jiffies' happens to be > 0x7fffffff.

But as it is today, all blocked locks get inserted at the end of the
list because time_before_eq does a signed comparison! With the unpatched
code, when you have a blocked lock, and the conflicting lock is removed,
lockd will never send out a GRANTED_MSG. Because the blocked lock is at the
end of the list, and never picked up.

That's the real reason for changing NLM_NEVER to the largest _signed_
quantity. And if you do that, you need to deal with jiffy wraparound.
Maybe the way I did it is not optiomal, I concede. But you can't leave
it at ~0UL.

> Patrice Dumas recently did some work on implementing this both for
> NLMv1,2,3 and NLM4, so I was planning on integrating his changes into
> 2.4.20.

As you can see from the patch, it's not really much you need to add.
The functionality is all there, one only needs to decode the GRANTED_RES
call rather than dropping it.

Olaf
--
Olaf Kirch | Anyone who has had to work with X.509 has probably
[email protected] | experienced what can best be described as
---------------+ ISO water torture. -- Peter Gutmann


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2002-07-23 15:27:53

by Trond Myklebust

[permalink] [raw]
Subject: Re: Locking patches (generic & nfs)

On Tuesday 23 July 2002 17:06, Olaf Kirch wrote:

> But as it is today, all blocked locks get inserted at the end of the
> list because time_before_eq does a signed comparison! With the unpatched
> code, when you have a blocked lock, and the conflicting lock is removed,
> lockd will never send out a GRANTED_MSG. Because the blocked lock is at the
> end of the list, and never picked up.

Fair enough: I see the bug now.

So why could we not do something like the following instead? This just ensures
that we always leave the NLM_NEVER stuff at the end of the list which should
suffice to keep nlmsvc_retry_blocked() happy.

Cheers,
Trond


Attachments:
fix_svclock.dif (397.00 B)

2002-07-23 16:47:22

by Dumas Patrice

[permalink] [raw]
Subject: Re: Locking patches (generic & nfs)

Hi,

> > Patrice Dumas recently did some work on implementing this both for
> > NLMv1,2,3 and NLM4, so I was planning on integrating his changes into
> > 2.4.20.
>
> As you can see from the patch, it's not really much you need to add.
> The functionality is all there, one only needs to decode the GRANTED_RES
> call rather than dropping it.

This is basically all what is needed, however there are some
more issues raised when the server decode the GRANTED_RES calls I tried
to answer in other patches:

- if 2 clients use the same cookie, the blocks will be mixed up

- if the linux client GRANTED_RES is lost, it will get a GRANTED_MSG
twice, the second GRANTED_RES will have a status set to nlm_lck_denied
because in nlmclnt_grant the lock request won't be found. This in
turn will cause the server to unlock the lock, while the client thinks
it is granted and potential file corruption. I made a fix on the
client side.

It doesn't solve the problem on the server side, but it seems to me that
the server side behaviour is the right behaviour (except in very rare cases
where there is a race condition I will try to solve in a next patch but
it is very rare and not urgent in my opinion).

Pat


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2002-07-25 12:50:51

by Bill Davidsen

[permalink] [raw]
Subject: Re: Locking patches (generic & nfs)

On Fri, 19 Jul 2002, Olaf Kirch wrote:

> I've been investigating an NFS locking problem a customer
> of SuSE has had between an OpenServer machine (oh boy)
> acting as the NFS client and a Linux box acting as the server.
>
> In the process of debugging this, I came across a number of
> bugs in the 2.4.18 kernel.

When NFSv4 gets in the new kernel I invite you to put your eyeballs on
that! Good catch!

Clearly not a lot of users go through this code or it would have shown up
sooner.

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.



-------------------------------------------------------
This sf.net email is sponsored by: Jabber - The world's fastest growing
real-time communications platform! Don't just IM. Build it in!
http://www.jabber.com/osdn/xim
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs