2008-11-14 20:34:59

by Mike Snitzer

Subject: help understanding the current (and future) state of NFSv4 locking?

Hello,

I'd like to understand the state of Linux's NFSv4 server regarding the
NFSv4 spec's _optional_ ordered blocking lock list implementation.
Without something like the following patch, isn't there still a
concern that NFSv4 clients will be starved and never granted a
contested lock (local POSIX or lockd waiters would race to get it
first)?

A "fair queuing" patch for Linux's fs/locks.c was developed but was
never merged upstream:
http://www.citi.umich.edu/projects/nfsv4/linux/kernel-patches/2.6.13-1/linux-2.6.13-032-locks-posix-fair-queue.dif

http://wiki.linux-nfs.org/wiki/index.php/Cluster_Coherent_NFS_and_Byte_Range_Locking
http://www.eisler.com/2008-05-09/draft-ietf-nfsv4-minorversion1-23.html#blocking_locks


I'd also like to understand what Linux NFSv4.1 support is planned for
the _optional_ CB_NOTIFY_LOCK operation:

20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible lock availability:
http://www.eisler.com/2008-05-09/draft-ietf-nfsv4-minorversion1-23.html#OP_CB_NOTIFY_LOCK

Any insight would be appreciated.

Thanks,
Mike


2008-11-18 00:18:34

by Mike Snitzer

Subject: Re: help understanding the current (and future) state of NFSv4 locking?

On Sun, Nov 16, 2008 at 2:48 PM, J. Bruce Fields <[email protected]> wrote:
> On Fri, Nov 14, 2008 at 03:34:56PM -0500, Mike Snitzer wrote:
>> Hello,
>>
>> I'd like to understand the state of Linux's NFSv4 server regarding the
>> NFSv4 spec's _optional_ ordered blocking lock list implementation.
>> Without something like the following patch, isn't there still a
>> concern that NFSv4 clients will be starved and never granted a
>> contested lock (local POSIX or lockd waiters would race to get it
>> first)?
>
> Yes. I have patches that approach the problem by:
>
>     - Defining a new lock type, the "provisional" lock, which is
>       just like a posix lock, *except* that it doesn't merge with
>       other locks and hence can still be cancelled safely.
>     - Modifying the process of waking up waiters for a just-released
>       lock to make it a two-step process:
>         1. Apply a "provisional" lock if there are no conflicts,
>            and wake whoever was waiting for it. (If there are
>            still conflicts, put the lock on the new list without
>            waking anyone.)
>         2. Allow the waiter to upgrade the provisional lock to a
>            real posix lock (or, alternatively, to cancel it).
>     - Taking advantage of the above to implement fair queuing for
>       v4, by stretching the gap between steps 1 and 2 out to a
>       lease period, thus allowing a lock that is available, but
>       that a client has yet to poll for, to be temporarily
>       represented by a provisional lock.
>
> The thought was that we'd also solve a couple other problems along the
> way, by:
>
>     - Preventing thundering herd problems on posix locks with lots
>       of waiters.
>     - Increasing fairness of posix locking (even among local
>       lockers).
>
> But we weren't able to actually show any improvement for posix locks
> with local waiters, and it's unclear whether anyone cares much about
> posix lock fairness.
>
> So it's unclear whether it's worth doing the 2-step process above for
> all posix lockers. So maybe the patches should be written to instead
> implement provisional locks as an optional extra for use of the v4
> server.

Thanks for the overview. I think that, given how easy it is to starve
a v4 client (see below), something needs to give.

> A real-world test case (showing starvation of v4 clients) would be
> interesting if anyone had one.

I'm not sure what your definition of "real-world test case" is (so
maybe the following is moot), but the attached program (written by a
co-worker) can be used to easily illustrate the starvation of v4
clients. The program measures how long it takes to lock/unlock a
file 1000 times. Run locally on the nfs-server against an exported
ext3 FS, I get a "step time" of ~11ms. Run from a v3 client: ~390ms.
Run from a v4 client: ~430ms.

Run simultaneously on the nfs-server and the v3 client: local=~30ms, v3=~440ms
Run simultaneously on two v3 clients: both v3=~580ms
Run simultaneously on the nfs-server and two v3 clients: both
v3=~580ms, but local ranges from ~1500ms to ~9300ms
Run simultaneously on two v4 clients: v4=~430ms, but with frequent
interleaved outliers ranging from ~1500ms to ~75000ms
Run simultaneously on the nfs-server and a v4 client: local=~11ms, v4=STARVED
Run simultaneously on a v3 client and a v4 client: v3=~390ms, v4=STARVED

FYI, "STARVED" above doesn't mean the v4 client _never_ acquires the
lock. It eventually acquires it, albeit extremely rarely (e.g. after
5 minutes), because v4 client polling is predisposed to lose the race
with either the hyperactive v3 client or the local locker.

Mike


Attachments:
lock_tst.c (982.00 B)

2008-11-16 19:48:24

by J. Bruce Fields

Subject: Re: help understanding the current (and future) state of NFSv4 locking?

On Fri, Nov 14, 2008 at 03:34:56PM -0500, Mike Snitzer wrote:
> Hello,
>
> I'd like to understand the state of Linux's NFSv4 server regarding the
> NFSv4 spec's _optional_ ordered blocking lock list implementation.
> Without something like the following patch, isn't there still a
> concern that NFSv4 clients will be starved and never granted a
> contested lock (local POSIX or lockd waiters would race to get it
> first)?

Yes. I have patches that approach the problem by:

    - Defining a new lock type, the "provisional" lock, which is
      just like a posix lock, *except* that it doesn't merge with
      other locks and hence can still be cancelled safely.
    - Modifying the process of waking up waiters for a just-released
      lock to make it a two-step process:
        1. Apply a "provisional" lock if there are no conflicts,
           and wake whoever was waiting for it. (If there are
           still conflicts, put the lock on the new list without
           waking anyone.)
        2. Allow the waiter to upgrade the provisional lock to a
           real posix lock (or, alternatively, to cancel it).
    - Taking advantage of the above to implement fair queuing for
      v4, by stretching the gap between steps 1 and 2 out to a
      lease period, thus allowing a lock that is available, but
      that a client has yet to poll for, to be temporarily
      represented by a provisional lock.

The thought was that we'd also solve a couple other problems along the
way, by:

    - Preventing thundering herd problems on posix locks with lots
      of waiters.
    - Increasing fairness of posix locking (even among local
      lockers).

But we weren't able to actually show any improvement for posix locks
with local waiters, and it's unclear whether anyone cares much about
posix lock fairness.

So it's unclear whether it's worth doing the 2-step process above for
all posix lockers. So maybe the patches should be written to instead
implement provisional locks as an optional extra for use of the v4
server.

A real-world test case (showing starvation of v4 clients) would be
interesting if anyone had one.

> A "fair queuing" patch for Linux's fs/locks.c was developed but was
> never merged upstream:
> http://www.citi.umich.edu/projects/nfsv4/linux/kernel-patches/2.6.13-1/linux-2.6.13-032-locks-posix-fair-queue.dif
>
> http://wiki.linux-nfs.org/wiki/index.php/Cluster_Coherent_NFS_and_Byte_Range_Locking
> http://www.eisler.com/2008-05-09/draft-ietf-nfsv4-minorversion1-23.html#blocking_locks
>
>
> I'd also like to understand what Linux NFSv4.1 support is planned for
> the _optional_ CB_NOTIFY_LOCK operation:

None currently. It shouldn't be hard to do.

--b.

> 20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible lock availability:
> http://www.eisler.com/2008-05-09/draft-ietf-nfsv4-minorversion1-23.html#OP_CB_NOTIFY_LOCK