2011-08-04 10:30:21

by Frank van Maarseveen

Subject: [NLM] fcntl(F_SETLKW) yields -ENOLCK when grace period expires.

Both client and server run 2.6.39.3, using NFSv3 over UDP (without the
relock_filesystem patch proposed earlier).

A second client has an exclusive lock on a file on the server. The
client under test calls fcntl(F_SETLKW) to wait for the same exclusive
lock. Wireshark sees NLM V4 LOCK calls resulting in NLM_BLOCKED.

Next the server is rebooted. The second client recovers the lock
correctly. The client under test now receives NLM_DENIED_GRACE_PERIOD for
every NLM V4 LOCK request resulting from the waiting fcntl(F_SETLKW). When
this changes to NLM_BLOCKED after grace period expiration the fcntl
returns -ENOLCK ("No locks available.") instead of continuing to wait.

server:/proc/locks shows two entries for the file after the -ENOLCK. When
the second client gives up its lock because the program running there
is killed, one entry in server:/proc/locks remains indefinitely: as a
result no NFS client can lock the file anymore.
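
For readers who want to reproduce the scenario, the waiting side can be sketched roughly as below. This is only an illustration, not the program used for the report: the helper name wait_for_exclusive_lock() and the path passed to it are hypothetical. It requests a blocking, exclusive, whole-file lock (l_start 0, l_len 0, matching the "pos:0-0" lock in the trace) and returns -errno if fcntl() gives up instead of waiting.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Block until an exclusive whole-file lock on `path` is granted.
 * Returns the open descriptor (>= 0) on success, or -errno on failure.
 * `path` is a hypothetical file on the NFS mount under test.
 */
static int wait_for_exclusive_lock(const char *path)
{
	struct flock fl;
	int fd, err;

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0)
		return -errno;

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;	/* exclusive lock */
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* 0 means "to end of file": the whole file */

	/* F_SETLKW is expected to sleep until granted; retry on signals. */
	while (fcntl(fd, F_SETLKW, &fl) < 0) {
		if (errno == EINTR)
			continue;
		err = errno;	/* ENOLCK here would match the report above */
		close(fd);
		return -err;
	}
	return fd;
}
```

Run on the client under test while the second client holds the lock, a return value of -ENOLCK after the server's grace period ends would correspond to the behaviour described above; a correct client would simply keep sleeping in fcntl().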

--
Frank


2011-08-04 16:43:15

by Frank van Maarseveen

Subject: Re: [NLM] fcntl(F_SETLKW) yields -ENOLCK when grace period expires.

On Thu, Aug 04, 2011 at 12:34:52PM -0400, J. Bruce Fields wrote:
> On Thu, Aug 04, 2011 at 12:30:19PM +0200, Frank van Maarseveen wrote:
> > Both client- and server run 2.6.39.3, NFSv3 over UDP (without the
> > relock_filesystem patch proposed earlier).
> >
> > A second client has an exclusive lock on a file on the server. The
> > client under test calls fcntl(F_SETLKW) to wait for the same exclusive
> > lock. Wireshark sees NLM V4 LOCK calls resulting in NLM_BLOCKED.
> >
> > Next the server is rebooted. The second client recovers the lock
> > correctly. The client under test now receives NLM_DENIED_GRACE_PERIOD for
> > every NLM V4 LOCK request resulting from the waiting fcntl(F_SETLKW). When
> > this changes to NLM_BLOCKED after grace period expiration the fcntl
> > returns -ENOLCK ("No locks available.") instead of continuing to wait.
>
> So that sounds like a client bug, and correct behavior from the server
> (assuming the second client was still holding the lock throughout).

yes.

>
> > server:/proc/locks shows two entries for the file after the -ENOLCK. When
> > the second client gives up its lock because the program running there
> > is killed, one entry in server:/proc/locks remains indefinitely: as a
> > result no NFS client can lock the file anymore.
>
> But that sounds like a server bug--what do the two entries look like?

I think the server assumes correct client behavior; the client under
test resulted in a '->' prefixed entry. The fcntl at the client just
shouldn't have returned yet.

>
> Also, what filesystem are you exporting?

ext4

--
Frank

2011-08-04 16:34:53

by J. Bruce Fields

Subject: Re: [NLM] fcntl(F_SETLKW) yields -ENOLCK when grace period expires.

On Thu, Aug 04, 2011 at 12:30:19PM +0200, Frank van Maarseveen wrote:
> Both client- and server run 2.6.39.3, NFSv3 over UDP (without the
> relock_filesystem patch proposed earlier).
>
> A second client has an exclusive lock on a file on the server. The
> client under test calls fcntl(F_SETLKW) to wait for the same exclusive
> lock. Wireshark sees NLM V4 LOCK calls resulting in NLM_BLOCKED.
>
> Next the server is rebooted. The second client recovers the lock
> correctly. The client under test now receives NLM_DENIED_GRACE_PERIOD for
> every NLM V4 LOCK request resulting from the waiting fcntl(F_SETLKW). When
> this changes to NLM_BLOCKED after grace period expiration the fcntl
> returns -ENOLCK ("No locks available.") instead of continuing to wait.

So that sounds like a client bug, and correct behavior from the server
(assuming the second client was still holding the lock throughout).

> server:/proc/locks shows two entries for the file after the -ENOLCK. When
> the second client gives up its lock because the program running there
> is killed, one entry in server:/proc/locks remains indefinitely: as a
> result no NFS client can lock the file anymore.

But that sounds like a server bug--what do the two entries look like?

Also, what filesystem are you exporting?

--b.

2011-08-04 17:24:54

by Frank van Maarseveen

Subject: Re: [NLM] fcntl(F_SETLKW) yields -ENOLCK when grace period expires.

On Thu, Aug 04, 2011 at 12:49:13PM -0400, J. Bruce Fields wrote:
> On Thu, Aug 04, 2011 at 06:43:13PM +0200, Frank van Maarseveen wrote:
> > On Thu, Aug 04, 2011 at 12:34:52PM -0400, J. Bruce Fields wrote:
> > > On Thu, Aug 04, 2011 at 12:30:19PM +0200, Frank van Maarseveen wrote:
> > > > Both client- and server run 2.6.39.3, NFSv3 over UDP (without the
> > > > relock_filesystem patch proposed earlier).
> > > >
> > > > A second client has an exclusive lock on a file on the server. The
> > > > client under test calls fcntl(F_SETLKW) to wait for the same exclusive
> > > > lock. Wireshark sees NLM V4 LOCK calls resulting in NLM_BLOCKED.
> > > >
> > > > Next the server is rebooted. The second client recovers the lock
> > > > correctly. The client under test now receives NLM_DENIED_GRACE_PERIOD for
> > > > every NLM V4 LOCK request resulting from the waiting fcntl(F_SETLKW). When
> > > > this changes to NLM_BLOCKED after grace period expiration the fcntl
> > > > returns -ENOLCK ("No locks available.") instead of continuing to wait.
> > >
> > > So that sounds like a client bug, and correct behavior from the server
> > > (assuming the second client was still holding the lock throughout).
> >
> > yes.
> >
> > >
> > > > server:/proc/locks shows two entries for the file after the -ENOLCK. When
> > > > the second client gives up its lock because the program running there
> > > > is killed, one entry in server:/proc/locks remains indefinitely: as a
> > > > result no NFS client can lock the file anymore.
> > >
> > > But that sounds like a server bug--what do the two entries look like?
> >
> > I think the server assumes correct client behavior; the client under
> > test resulted in a '->' prefixed entry. The fcntl at the client just
> > shouldn't have returned yet.
>
> Oh, right, so did you see a granted callback returned to the client?

Hmm no, maybe it is a server bug. These are the final request and reply
(which result in the incorrect -ENOLCK for F_SETLKW at the client under
test), decoded by wireshark:

No. Time Source Destination Protocol Info
529 225.386189 172.17.1.124 172.17.1.49 NLM V4 LOCK Call (Reply In 530) FH:0xb17f38ea svid:10 pos:0-0

Frame 529: 246 bytes on wire (1968 bits), 246 bytes captured (1968 bits)
Network Lock Manager Protocol
[Program Version: 4]
[V4 Procedure: LOCK (2)]
cookie: <DATA>
length: 4
contents: <DATA>
block: Yes
exclusive: Yes
lock
caller_name: lokka.tasking.nl
length: 16
contents: lokka.tasking.nl
fh
length: 28
[hash (CRC-32): 0xb17f38ea]
decode type as: unknown
filehandle: 01000601e66f5c256cb3414eba710fcd882a67201b000000...
owner: <DATA>
length: 19
contents: <DATA>
fill bytes: opaque data
svid: 10
l_offset: 0
l_len: 0
reclaim: No
state: 87

No. Time Source Destination Protocol Info
530 225.386368 172.17.1.49 172.17.1.124 NLM V4 LOCK Reply (Call In 529) NLM_BLOCKED

Frame 530: 78 bytes on wire (624 bits), 78 bytes captured (624 bits)
Network Lock Manager Protocol
[Program Version: 4]
[V4 Procedure: LOCK (2)]
cookie: <DATA>
length: 4
contents: <DATA>
stat: NLM_BLOCKED (3)
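
As a reading aid for the "stat" field in the decode above, here is a small sketch of the NLM version 4 status codes. The numeric values are those of the protocol's nlm4_stats enumeration; the helper names nlm4_stat_name() and nlm4_should_keep_waiting() are made up for this illustration, and the second one only encodes the expectation argued in this thread (a blocking request should keep waiting on BLOCKED or DENIED_GRACE_PERIOD), not what any particular client implements.

```c
#include <stdint.h>

/* Status codes of the NLM version 4 protocol (nlm4_stats). */
enum nlm4_stat {
	NLM4_GRANTED = 0,
	NLM4_DENIED = 1,
	NLM4_DENIED_NOLOCKS = 2,
	NLM4_BLOCKED = 3,
	NLM4_DENIED_GRACE_PERIOD = 4,
	NLM4_DEADLCK = 5,
	NLM4_ROFS = 6,
	NLM4_STALE_FH = 7,
	NLM4_FBIG = 8,
	NLM4_FAILED = 9
};

/* Hypothetical helper: map a status code to the name Wireshark prints. */
static const char *nlm4_stat_name(uint32_t stat)
{
	switch (stat) {
	case NLM4_GRANTED:		return "NLM_GRANTED";
	case NLM4_DENIED:		return "NLM_DENIED";
	case NLM4_DENIED_NOLOCKS:	return "NLM_DENIED_NOLOCKS";
	case NLM4_BLOCKED:		return "NLM_BLOCKED";
	case NLM4_DENIED_GRACE_PERIOD:	return "NLM_DENIED_GRACE_PERIOD";
	case NLM4_DEADLCK:		return "NLM_DEADLCK";
	case NLM4_ROFS:			return "NLM_ROFS";
	case NLM4_STALE_FH:		return "NLM_STALE_FH";
	case NLM4_FBIG:			return "NLM_FBIG";
	case NLM4_FAILED:		return "NLM_FAILED";
	default:			return "unknown";
	}
}

/*
 * Hypothetical helper: for a request sent with block == Yes, these two
 * replies mean "not yet", so the caller would be expected to keep waiting.
 */
static int nlm4_should_keep_waiting(uint32_t stat)
{
	return stat == NLM4_BLOCKED || stat == NLM4_DENIED_GRACE_PERIOD;
}
```

Under that expectation, the reply in frame 530 (stat 3) is a "keep waiting" answer, which is why an -ENOLCK return at this point looks wrong.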


--
Frank

2011-08-04 16:49:14

by J. Bruce Fields

Subject: Re: [NLM] fcntl(F_SETLKW) yields -ENOLCK when grace period expires.

On Thu, Aug 04, 2011 at 06:43:13PM +0200, Frank van Maarseveen wrote:
> On Thu, Aug 04, 2011 at 12:34:52PM -0400, J. Bruce Fields wrote:
> > On Thu, Aug 04, 2011 at 12:30:19PM +0200, Frank van Maarseveen wrote:
> > > Both client- and server run 2.6.39.3, NFSv3 over UDP (without the
> > > relock_filesystem patch proposed earlier).
> > >
> > > A second client has an exclusive lock on a file on the server. The
> > > client under test calls fcntl(F_SETLKW) to wait for the same exclusive
> > > lock. Wireshark sees NLM V4 LOCK calls resulting in NLM_BLOCKED.
> > >
> > > Next the server is rebooted. The second client recovers the lock
> > > correctly. The client under test now receives NLM_DENIED_GRACE_PERIOD for
> > > every NLM V4 LOCK request resulting from the waiting fcntl(F_SETLKW). When
> > > this changes to NLM_BLOCKED after grace period expiration the fcntl
> > > returns -ENOLCK ("No locks available.") instead of continuing to wait.
> >
> > So that sounds like a client bug, and correct behavior from the server
> > (assuming the second client was still holding the lock throughout).
>
> yes.
>
> >
> > > server:/proc/locks shows two entries for the file after the -ENOLCK. When
> > > the second client gives up its lock because the program running there
> > > is killed, one entry in server:/proc/locks remains indefinitely: as a
> > > result no NFS client can lock the file anymore.
> >
> > But that sounds like a server bug--what do the two entries look like?
>
> I think the server assumes correct client behavior; the client under
> test resulted in a '->' prefixed entry. The fcntl at the client just
> shouldn't have returned yet.

Oh, right, so did you see a granted callback returned to the client?

--b.

>
> >
> > Also, what filesystem are you exporting?
>
> ext4
>
> --
> Frank

2011-08-05 13:28:25

by Frank van Maarseveen

Subject: Re: [NLM] fcntl(F_SETLKW) yields -ENOLCK when grace period expires.

On Thu, Aug 04, 2011 at 02:17:35PM -0400, Trond Myklebust wrote:
> On Thu, 2011-08-04 at 19:27 +0200, Frank van Maarseveen wrote:
> > On Thu, Aug 04, 2011 at 01:10:20PM -0400, Trond Myklebust wrote:
> > > On Thu, 2011-08-04 at 12:49 -0400, J. Bruce Fields wrote:
> > > > On Thu, Aug 04, 2011 at 06:43:13PM +0200, Frank van Maarseveen wrote:
> > > > > On Thu, Aug 04, 2011 at 12:34:52PM -0400, J. Bruce Fields wrote:
> > > > > > On Thu, Aug 04, 2011 at 12:30:19PM +0200, Frank van Maarseveen wrote:
> > > > > > > Both client- and server run 2.6.39.3, NFSv3 over UDP (without the
> > > > > > > relock_filesystem patch proposed earlier).
> > > > > > >
> > > > > > > A second client has an exclusive lock on a file on the server. The
> > > > > > > client under test calls fcntl(F_SETLKW) to wait for the same exclusive
> > > > > > > lock. Wireshark sees NLM V4 LOCK calls resulting in NLM_BLOCKED.
> > > > > > >
> > > > > > > Next the server is rebooted. The second client recovers the lock
> > > > > > > correctly. The client under test now receives NLM_DENIED_GRACE_PERIOD for
> > > > > > > every NLM V4 LOCK request resulting from the waiting fcntl(F_SETLKW). When
> > > > > > > this changes to NLM_BLOCKED after grace period expiration the fcntl
> > > > > > > returns -ENOLCK ("No locks available.") instead of continuing to wait.
> > > > > >
> > > > > > So that sounds like a client bug, and correct behavior from the server
> > > > > > (assuming the second client was still holding the lock throughout).
> > > > >
> > > > > yes.
> > >
> > > Is the client actually asking for a blocking lock after the grace period
> > > expires?
> >
> > yes, according to my interpretation of that of wireshark, see reply to Bruce.
> >
>
> OK... Does the following patch help?
>
> Cheers
> Trond
> ---
> diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
> index 8392cb8..40c0d88 100644
> --- a/fs/lockd/clntproc.c
> +++ b/fs/lockd/clntproc.c
> @@ -270,6 +270,9 @@ nlmclnt_call(struct rpc_cred *cred, struct nlm_rqst *req, u32 proc)
> return -ENOLCK;
> msg.rpc_proc = &clnt->cl_procinfo[proc];
>
> + /* Reset the reply status */
> + if (argp->block)
> + resp->status = nlm_lck_blocked;
> /* Perform the RPC call. If an error occurs, try again */
> if ((status = rpc_call_sync(clnt, &msg, 0)) < 0) {
> dprintk("lockd: rpc_call returned error %d\n", -status);
>

Negative. I've tried it on the client under test and I'm seeing three
types of behavior, one good, two bad. In all cases the secondary
client (unmodified) correctly regains the lock after the server has
rebooted. Client under test behavior depends on whether it had queued
the conflicting lock before or after the server reboot. When queued
afterwards it seems to work with the above modification (don't know if
that was the case before though).

When the client under test tries to lock before the server reboot then
the fcntl(F_SETLKW) returns either right after the NSM NOTIFY with
-ENOLCK without any NLM traffic, or it returns with -ENOLCK when the
NLM_DENIED_GRACE_PERIOD changes into NLM_BLOCKED (the original report).

--
Frank

2011-08-04 17:27:12

by Frank van Maarseveen

Subject: Re: [NLM] fcntl(F_SETLKW) yields -ENOLCK when grace period expires.

On Thu, Aug 04, 2011 at 01:10:20PM -0400, Trond Myklebust wrote:
> On Thu, 2011-08-04 at 12:49 -0400, J. Bruce Fields wrote:
> > On Thu, Aug 04, 2011 at 06:43:13PM +0200, Frank van Maarseveen wrote:
> > > On Thu, Aug 04, 2011 at 12:34:52PM -0400, J. Bruce Fields wrote:
> > > > On Thu, Aug 04, 2011 at 12:30:19PM +0200, Frank van Maarseveen wrote:
> > > > > Both client- and server run 2.6.39.3, NFSv3 over UDP (without the
> > > > > relock_filesystem patch proposed earlier).
> > > > >
> > > > > A second client has an exclusive lock on a file on the server. The
> > > > > client under test calls fcntl(F_SETLKW) to wait for the same exclusive
> > > > > lock. Wireshark sees NLM V4 LOCK calls resulting in NLM_BLOCKED.
> > > > >
> > > > > Next the server is rebooted. The second client recovers the lock
> > > > > correctly. The client under test now receives NLM_DENIED_GRACE_PERIOD for
> > > > > every NLM V4 LOCK request resulting from the waiting fcntl(F_SETLKW). When
> > > > > this changes to NLM_BLOCKED after grace period expiration the fcntl
> > > > > returns -ENOLCK ("No locks available.") instead of continuing to wait.
> > > >
> > > > So that sounds like a client bug, and correct behavior from the server
> > > > (assuming the second client was still holding the lock throughout).
> > >
> > > yes.
>
> Is the client actually asking for a blocking lock after the grace period
> expires?

yes, according to my interpretation of the wireshark output; see my reply to Bruce.

--
Frank

2011-08-04 17:10:21

by Myklebust, Trond

Subject: Re: [NLM] fcntl(F_SETLKW) yields -ENOLCK when grace period expires.

On Thu, 2011-08-04 at 12:49 -0400, J. Bruce Fields wrote:
> On Thu, Aug 04, 2011 at 06:43:13PM +0200, Frank van Maarseveen wrote:
> > On Thu, Aug 04, 2011 at 12:34:52PM -0400, J. Bruce Fields wrote:
> > > On Thu, Aug 04, 2011 at 12:30:19PM +0200, Frank van Maarseveen wrote:
> > > > Both client- and server run 2.6.39.3, NFSv3 over UDP (without the
> > > > relock_filesystem patch proposed earlier).
> > > >
> > > > A second client has an exclusive lock on a file on the server. The
> > > > client under test calls fcntl(F_SETLKW) to wait for the same exclusive
> > > > lock. Wireshark sees NLM V4 LOCK calls resulting in NLM_BLOCKED.
> > > >
> > > > Next the server is rebooted. The second client recovers the lock
> > > > correctly. The client under test now receives NLM_DENIED_GRACE_PERIOD for
> > > > every NLM V4 LOCK request resulting from the waiting fcntl(F_SETLKW). When
> > > > this changes to NLM_BLOCKED after grace period expiration the fcntl
> > > > returns -ENOLCK ("No locks available.") instead of continuing to wait.
> > >
> > > So that sounds like a client bug, and correct behavior from the server
> > > (assuming the second client was still holding the lock throughout).
> >
> > yes.

Is the client actually asking for a blocking lock after the grace period
expires?

> > >
> > > > server:/proc/locks shows two entries for the file after the -ENOLCK. When
> > > > the second client gives up its lock because the program running there
> > > > is killed, one entry in server:/proc/locks remains indefinitely: as a
> > > > result no NFS client can lock the file anymore.
> > >
> > > But that sounds like a server bug--what do the two entries look like?
> >
> > I think the server assumes correct client behavior; the client under
> > test resulted in a '->' prefixed entry. The fcntl at the client just
> > shouldn't have returned yet.
>
> Oh, right, so did you see a granted callback returned to the client?

The client will reject any unsolicited GRANTED callbacks with an
NLM_LCK_DENIED. As far as I can see, nlmsvc_grant_reply() then only
removes the block, it doesn't cancel the lock...

--
Trond Myklebust
Linux NFS client maintainer

NetApp
http://www.netapp.com


2011-08-04 18:17:55

by Myklebust, Trond

Subject: Re: [NLM] fcntl(F_SETLKW) yields -ENOLCK when grace period expires.

On Thu, 2011-08-04 at 19:27 +0200, Frank van Maarseveen wrote:
> On Thu, Aug 04, 2011 at 01:10:20PM -0400, Trond Myklebust wrote:
> > On Thu, 2011-08-04 at 12:49 -0400, J. Bruce Fields wrote:
> > > On Thu, Aug 04, 2011 at 06:43:13PM +0200, Frank van Maarseveen wrote:
> > > > On Thu, Aug 04, 2011 at 12:34:52PM -0400, J. Bruce Fields wrote:
> > > > > On Thu, Aug 04, 2011 at 12:30:19PM +0200, Frank van Maarseveen wrote:
> > > > > > Both client- and server run 2.6.39.3, NFSv3 over UDP (without the
> > > > > > relock_filesystem patch proposed earlier).
> > > > > >
> > > > > > A second client has an exclusive lock on a file on the server. The
> > > > > > client under test calls fcntl(F_SETLKW) to wait for the same exclusive
> > > > > > lock. Wireshark sees NLM V4 LOCK calls resulting in NLM_BLOCKED.
> > > > > >
> > > > > > Next the server is rebooted. The second client recovers the lock
> > > > > > correctly. The client under test now receives NLM_DENIED_GRACE_PERIOD for
> > > > > > every NLM V4 LOCK request resulting from the waiting fcntl(F_SETLKW). When
> > > > > > this changes to NLM_BLOCKED after grace period expiration the fcntl
> > > > > > returns -ENOLCK ("No locks available.") instead of continuing to wait.
> > > > >
> > > > > So that sounds like a client bug, and correct behavior from the server
> > > > > (assuming the second client was still holding the lock throughout).
> > > >
> > > > yes.
> >
> > Is the client actually asking for a blocking lock after the grace period
> > expires?
>
> yes, according to my interpretation of that of wireshark, see reply to Bruce.
>

OK... Does the following patch help?

Cheers
Trond
---
diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
index 8392cb8..40c0d88 100644
--- a/fs/lockd/clntproc.c
+++ b/fs/lockd/clntproc.c
@@ -270,6 +270,9 @@ nlmclnt_call(struct rpc_cred *cred, struct nlm_rqst *req, u32 proc)
return -ENOLCK;
msg.rpc_proc = &clnt->cl_procinfo[proc];

+ /* Reset the reply status */
+ if (argp->block)
+ resp->status = nlm_lck_blocked;
/* Perform the RPC call. If an error occurs, try again */
if ((status = rpc_call_sync(clnt, &msg, 0)) < 0) {
dprintk("lockd: rpc_call returned error %d\n", -status);

--
Trond Myklebust
Linux NFS client maintainer

NetApp
http://www.netapp.com


2012-03-16 10:54:18

by Ichiko Sakamoto

Subject: Re: [NLM] fcntl(F_SETLKW) yields -ENOLCK when grace period expires.

(2011/08/05 22:28), Frank van Maarseveen wrote:

> On Thu, Aug 04, 2011 at 02:17:35PM -0400, Trond Myklebust wrote:
>> On Thu, 2011-08-04 at 19:27 +0200, Frank van Maarseveen wrote:
>> > On Thu, Aug 04, 2011 at 01:10:20PM -0400, Trond Myklebust wrote:
>> > > On Thu, 2011-08-04 at 12:49 -0400, J. Bruce Fields wrote:
>> > > > On Thu, Aug 04, 2011 at 06:43:13PM +0200, Frank van Maarseveen wrote:
>> > > > > On Thu, Aug 04, 2011 at 12:34:52PM -0400, J. Bruce Fields wrote:
>> > > > > > On Thu, Aug 04, 2011 at 12:30:19PM +0200, Frank van Maarseveen wrote:
>> > > > > > > Both client- and server run 2.6.39.3, NFSv3 over UDP (without the
>> > > > > > > relock_filesystem patch proposed earlier).
>> > > > > > >
>> > > > > > > A second client has an exclusive lock on a file on the server. The
>> > > > > > > client under test calls fcntl(F_SETLKW) to wait for the same exclusive
>> > > > > > > lock. Wireshark sees NLM V4 LOCK calls resulting in NLM_BLOCKED.
>> > > > > > >
>> > > > > > > Next the server is rebooted. The second client recovers the lock
>> > > > > > > correctly. The client under test now receives NLM_DENIED_GRACE_PERIOD for
>> > > > > > > every NLM V4 LOCK request resulting from the waiting fcntl(F_SETLKW). When
>> > > > > > > this changes to NLM_BLOCKED after grace period expiration the fcntl
>> > > > > > > returns -ENOLCK ("No locks available.") instead of continuing to wait.
>> > > > > >
>> > > > > > So that sounds like a client bug, and correct behavior from the server
>> > > > > > (assuming the second client was still holding the lock throughout).
>> > > > >
>> > > > > yes.
>> > >
>> > > Is the client actually asking for a blocking lock after the grace period
>> > > expires?
>> >
>> > yes, according to my interpretation of that of wireshark, see reply to Bruce.
>> >
>>
>> OK... Does the following patch help?
>>
>> Cheers
>> Trond
>> ---
>> diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
>> index 8392cb8..40c0d88 100644
>> --- a/fs/lockd/clntproc.c
>> +++ b/fs/lockd/clntproc.c
>> @@ -270,6 +270,9 @@ nlmclnt_call(struct rpc_cred *cred, struct nlm_rqst *req, u32 proc)
>> return -ENOLCK;
>> msg.rpc_proc = &clnt->cl_procinfo[proc];
>>
>> + /* Reset the reply status */
>> + if (argp->block)
>> + resp->status = nlm_lck_blocked;
>> /* Perform the RPC call. If an error occurs, try again */
>> if ((status = rpc_call_sync(clnt, &msg, 0)) < 0) {
>> dprintk("lockd: rpc_call returned error %d\n", -status);
>>
>
> Negative. I've tried it on the client under test and I'm seeing three
> types of behavior, one good, two bad. In all cases the secondary
> client (unmodified) correctly regains the lock after the server has
> rebooted. Client under test behavior depends on whether it had queued
> the conflicting lock before or after the server reboot. Afterwards it
> seems to work with the above modification (don't know if that was the
> case before though).
>
> When the client under test tries to lock before the server reboot then
> the fcntl(F_SETLKW) returns either right after the NSM NOTIFY with
> -ENOLCK without any NLM traffic or it returns with -ENOLCK when the
> NLM_DENIED_GRACE_PERIOD changes into NLM_BLOCKED (the original report).
>




Hi all,

Was this fixed?
I have the same issue in 3.2.9-2.fc16.

When the client receives NSM NOTIFY, the reclaimer() thread updates
block->b_status to nlm_lck_denied_grace_period.

fs/lockd/clntlock.c
        /* Now, wake up all processes that sleep on a blocked lock */
        spin_lock(&nlm_blocked_lock);
        list_for_each_entry(block, &nlm_blocked, b_list) {
                if (block->b_host == host) {
  *                     block->b_status = nlm_lck_denied_grace_period;
                        wake_up(&block->b_wait);
                }
        }
        spin_unlock(&nlm_blocked_lock);

The blocked process loops inside nlmclnt_call() during the grace period,
and receives NLM_BLOCKED again.
Then nlmclnt_block() copies block->b_status (== nlm_lck_denied_grace_period)
to req->a_res.status.

fs/lockd/clntlock.c
        ret = wait_event_interruptible_timeout(block->b_wait,
                        block->b_status != nlm_lck_blocked,
                        timeout);
        if (ret < 0)
                return -ERESTARTSYS;
  *     req->a_res.status = block->b_status;
        return 0;

... and nlmclnt_lock() breaks out of the retry loop and returns -ENOLCK.

fs/lockd/clntproc.c
                /* Wait on an NLM blocking lock */
                status = nlmclnt_block(block, req, NLMCLNT_POLL_TIMEOUT);
                if (status < 0)
                        break;
  *             if (resp->status != nlm_lck_blocked)
  *                     break;
        }
        ...
        if (resp->status == nlm_lck_denied && (fl_flags & FL_SLEEP))
                status = -ENOLCK;
        else
  *             status = nlm_stat_to_errno(resp->status);
out_unblock:
        nlmclnt_finish_block(block);
out:
        nlmclnt_release_call(req);
  *     return status;
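
The sequence described above can be modelled in user space roughly as follows. This is a deliberately simplified sketch, not kernel code: every model_* name and the ST_* enum are invented for the illustration, the FL_SLEEP special case is left out, and the errno mapping only assumes that grace-period and blocked statuses end up as -ENOLCK, as the analysis above implies.

```c
#include <errno.h>

/* Simplified stand-ins for the NLM statuses involved. */
enum model_status { ST_GRANTED, ST_DENIED, ST_BLOCKED, ST_GRACE };

/* Stand-in for struct nlm_wait: only the field that matters here. */
struct model_wait {
	enum model_status b_status;
};

/* What reclaimer() does to each waiter after an NSM NOTIFY. */
static void model_reclaimer(struct model_wait *block)
{
	block->b_status = ST_GRACE;
}

/* Assumed mapping: anything but granted/denied becomes -ENOLCK. */
static int model_stat_to_errno(enum model_status s)
{
	switch (s) {
	case ST_GRANTED:	return 0;
	case ST_DENIED:		return -EAGAIN;
	default:		return -ENOLCK;
	}
}

/*
 * Model of the retry loop. `replies` are the server's answers to
 * successive LOCK calls. Returns what fcntl() would return, or 1 if
 * the caller would still be waiting after the last reply.
 */
static int model_lock(const struct model_wait *block,
		      const enum model_status *replies, int n)
{
	enum model_status resp;
	int i;

	for (i = 0; i < n; i++) {
		resp = replies[i];
		if (resp == ST_GRACE)
			continue;	/* call is retried during grace */
		if (resp != ST_BLOCKED)
			return model_stat_to_errno(resp);
		/* "nlmclnt_block": the wait ends, b_status is copied out */
		resp = block->b_status;
		if (resp != ST_BLOCKED)
			return model_stat_to_errno(resp);
	}
	return 1;
}
```

In this model a waiter whose b_status is still ST_BLOCKED keeps waiting across any number of BLOCKED replies, but once model_reclaimer() has run, the first BLOCKED reply after the grace period picks up the stale ST_GRACE value and the call ends with -ENOLCK.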



The following patch works fine on my fc16 system.

--- a/fs/lockd/clntlock.c 2012-01-04 23:55:44.000000000 +0000
+++ b/fs/lockd/clntlock.c 2012-03-16 08:08:03.793687409 +0000
@@ -121,6 +121,7 @@
int nlmclnt_block(struct nlm_wait *block, struct nlm_rqst *req, long timeout)
{
long ret;
+ u32 nsmstate;

/* A borken server might ask us to block even if we didn't
* request it. Just say no!
@@ -136,8 +137,10 @@
* a 1 minute timeout would do. See the comment before
* nlmclnt_lock for an explanation.
*/
+ nsmstate = block->b_host->h_nsmstate;
ret = wait_event_interruptible_timeout(block->b_wait,
- block->b_status != nlm_lck_blocked,
+ block->b_status != nlm_lck_blocked ||
+ block->b_host->h_nsmstate != nsmstate,
timeout);
if (ret < 0)
return -ERESTARTSYS;
@@ -266,7 +269,6 @@
spin_lock(&nlm_blocked_lock);
list_for_each_entry(block, &nlm_blocked, b_list) {
if (block->b_host == host) {
- block->b_status = nlm_lck_denied_grace_period;
wake_up(&block->b_wait);
}
}
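
To make the changed wake-up condition easier to see in isolation, here is a small user-space sketch of the two predicates. The sketch_* structs and the SK_* constants are hypothetical stand-ins for the kernel structures, and the NSM state numbers used when trying it out are arbitrary; it illustrates the idea of the patch and is not a substitute for it.

```c
#include <stdint.h>

enum { SK_LCK_GRANTED = 0, SK_LCK_BLOCKED = 3 };

/* Stand-ins for struct nlm_host and struct nlm_wait. */
struct sketch_host {
	uint32_t h_nsmstate;	/* changes when the server reboots */
};

struct sketch_wait {
	struct sketch_host *b_host;
	uint32_t b_status;
};

/* Wake-up condition before the patch: only a status change wakes us. */
static int old_should_wake(const struct sketch_wait *b)
{
	return b->b_status != SK_LCK_BLOCKED;
}

/*
 * Wake-up condition with the patch: `nsmstate` is the value sampled
 * just before sleeping, so a server reboot ends the wait without
 * b_status having to be overwritten.
 */
static int new_should_wake(const struct sketch_wait *b, uint32_t nsmstate)
{
	return b->b_status != SK_LCK_BLOCKED ||
	       b->b_host->h_nsmstate != nsmstate;
}
```

The point of the design, as I read the patch, is that the waiter is still woken on a reboot, but the status later copied into req->a_res.status remains nlm_lck_blocked, so the retry loop in nlmclnt_lock() continues instead of bailing out.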


Thanks,
Ichiko


