2015-07-20 14:10:11

by Frank Steiner

Subject: Parallel file locking quite slow with vers=4.0 compared to vers=3

Hi,

we ran into this when starting about a dozen "xterm ...&" from an ~/.xinitrc on
a diskless client with /var mounted over NFS. The xterms pop up one after
another, and sometimes it takes up to 7 or 8 seconds until they are all
available.

This worked without any noticeable delay as long as /var was exported/mounted
with NFSv3. After switching to NFSv4 the delay showed up. It is caused
by the utempter processes called by xterm, which all try to write to
(and thus lock) /var/log/wtmp in parallel.

A small test with a C program that does nothing but
"fcntl (fd, F_SETLKW, &lck);" with "lck.l_type = F_WRLCK" showed
that ten instances of this program on an NFSv3 share finish
within one second, while on an NFSv4 share about 5 or 6 return immediately
and the remaining ones finish after up to 3 seconds. The more instances we
start, the more of them are delayed and the longer the delays get.
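
The test program itself was not posted. A minimal sketch of what such a
test could look like, assuming a whole-file write lock that is released
again right away (the function name timed_write_lock and the timing code
are illustrative, not taken from the original program):

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Take a blocking write lock on the whole file, then release it.
 * Returns the seconds spent waiting in fcntl(), or -1.0 on error.
 * Hypothetical helper: name and return convention are assumptions. */
double timed_write_lock(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return -1.0;
    }

    struct flock lck;
    memset(&lck, 0, sizeof(lck));
    lck.l_type = F_WRLCK;     /* exclusive lock, as in the test described above */
    lck.l_whence = SEEK_SET;
    lck.l_start = 0;
    lck.l_len = 0;            /* 0 means "up to end of file" */

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (fcntl(fd, F_SETLKW, &lck) < 0) {   /* blocks until the lock is granted */
        perror("fcntl(F_SETLKW)");
        close(fd);
        return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    lck.l_type = F_UNLCK;     /* release the lock again */
    fcntl(fd, F_SETLK, &lck);
    close(fd);

    return (double)(t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Called from a small main() with a path on the NFS mount and started ten or
twenty times in parallel from a shell loop, the returned wait times should
make it visible which instances get the lock at once and which are delayed.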

With NFSv4 shares this always happens.
If the server exports with NFSv3, it matters which value for vers=
the client uses. We mounted the same NFSv3 export once with "vers=3"
and once with "vers=4.0" on the same client. The test on the vers=3
mount shows no delays even with 20 parallel runs, while the vers=4.0
mount shows the delays.

I guess this is some kind of flooding protection for parallel operations,
but can it be tweaked on the server or the client side somehow? Apart
from the xinitrc setup we have some other (weird :-)) scenarios where
these delays are ugly, and we would be happy to get rid of them (while still
using NFSv4).

cu,
Frank

--
Dipl.-Inform. Frank Steiner Web: http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. Bioinformatik Mail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17 Phone: +49 89 2180-4049
80333 Muenchen, Germany Fax: +49 89 2180-99-4049
* Rekursion kann man erst verstehen, wenn man Rekursion verstanden hat. *


2015-07-20 21:23:06

by J. Bruce Fields

Subject: Re: Parallel file locking quite slow with vers=4.0 compared to vers=3

On Mon, Jul 20, 2015 at 04:02:01PM +0200, Frank Steiner wrote:
> we ran into this when opening about a dozen "xterm ...&" in an ~/.xinitrc on
> a diskless client with /var mounted with NFS. The xterms pop up one after
> another and sometimes it takes up to 7 or 8 seconds until they are all
> available.
>
> This works without any noticeable delay as long as /var was exported/mounted
> with NFSv3. After switching to NFSv4 the delay showed up. It's caused
> by the utempter processes called by xterm that are trying to write
> (and thus lock) to /var/log/wtmp in parallel.
>
> Making some little test with a c program doing nothing but
> "fcntl (fd, F_SETLKW, &lck);" with "lck.l_type = F_WRLCK" it turned out
> that executing ten instances of this program on a NFSv3 share work
> within one second, while on a NFSv4 share about 5 or 6 return immediately,
> the remaining finish after up to 3 seconds. The more instances we
> exec the more are delayed and the delays are getting bigger.
>
> With NFSv4 shares this always happens.
> If the server exports with NFSv3, it matters which value for vers=
> the client uses. We mounted the same nfs3-export once with "vers=3"
> and once with "vers=4.0" on the same client. The test on the vers=3
> mount shows no delays even with 20 parallel runs, the vers=4.0 mount
> has the delays.
>
> I guess this is some kind of flooding protection for parallel operations,
> but can it be tweaked on the server or the client side somehow? Apart
> from the xinitrc setting we have some other (weird :-)) scenarios where
> these delays are ugly and we would be happy to get rid of them (while still
> using NFSv4).

NFSv4 clients poll for conflicting locks; NFSv3 clients instead wait for
server callbacks. That might explain the difference.

One way to confirm that might be to run client kernels modified to make
the polling behavior (controlled by
fs/nfs/nfs4proc.c:nfs4_set_lock_task_retry()) more aggressive, and see
if that removes the delays in your case.
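
To illustrate the polling idea, here is a userspace sketch, not the kernel
code: it retries a non-blocking lock request and sleeps between attempts.
It assumes the retry interval doubles after each failed attempt between a
minimum of 1 second and a maximum of 30 seconds (the NFS4_LOCK_MINTIMEOUT
and NFS4_LOCK_MAXTIMEOUT values); the names POLL_MIN_MS, POLL_MAX_MS,
sleep_and_back_off and poll_for_write_lock are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define POLL_MIN_MS 1000     /* assumption: mirrors NFS4_LOCK_MINTIMEOUT (1 * HZ) */
#define POLL_MAX_MS 30000    /* assumption: mirrors NFS4_LOCK_MAXTIMEOUT (30 * HZ) */

/* Sleep for the current interval, then return the next one (doubled, capped). */
static long sleep_and_back_off(long timeout_ms)
{
    struct timespec ts = { timeout_ms / 1000, (timeout_ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
    timeout_ms *= 2;
    return timeout_ms > POLL_MAX_MS ? POLL_MAX_MS : timeout_ms;
}

/* Poll for a whole-file write lock instead of blocking in the kernel.
 * Returns the number of retries that were needed, or -1 on a real error. */
int poll_for_write_lock(int fd)
{
    struct flock lck;
    long timeout_ms = POLL_MIN_MS;
    int retries = 0;

    memset(&lck, 0, sizeof(lck));
    lck.l_type = F_WRLCK;
    lck.l_whence = SEEK_SET;   /* l_start = 0, l_len = 0: the whole file */

    while (fcntl(fd, F_SETLK, &lck) < 0) {      /* F_SETLK never blocks */
        if (errno != EAGAIN && errno != EACCES)
            return -1;                           /* not a lock conflict */
        timeout_ms = sleep_and_back_off(timeout_ms);
        retries++;
    }
    return retries;
}
```

The point of the sketch is that a waiter only notices a released lock at its
next poll, so a lock that becomes free right after a failed attempt still
costs the full sleep interval. A callback-based scheme avoids that wait.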

If that works, then it might also be worth considering implementing
CB_NOTIFY_LOCK:

https://tools.ietf.org/html/rfc5661#page-593

--b.

2015-07-21 06:44:46

by Frank Steiner

Subject: Re: Parallel file locking quite slow with vers=4.0 compared to vers=3

J. Bruce Fields wrote

> One way to confirm that might be to run client kernels modified to make
> the polling behavior (controlled by
> fs/nfs/nfs4proc.c:nfs4_set_lock_task_retry()) more aggressive, and see
> if that removes the delays in your case.
>
> If that works, then it might also be worth considering implementing
> CB_NOTIFY_LOCK:
>
> https://tools.ietf.org/html/rfc5661#page-593

I wouldn't mind testing any kind of patch if someone finds the time
to provide one. Writing them myself is beyond my skills, I'm afraid :-)

cu,
Frank

2015-07-21 15:51:45

by J. Bruce Fields

Subject: Re: Parallel file locking quite slow with vers=4.0 compared to vers=3

On Tue, Jul 21, 2015 at 08:44:44AM +0200, Frank Steiner wrote:
> J. Bruce Fields wrote
>
> > One way to confirm that might be to run client kernels modified to make
> > the polling behavior (controlled by
> > fs/nfs/nfs4proc.c:nfs4_set_lock_task_retry()) more aggressive, and see
> > if that removes the delays in your case.
> >
> > If that works, then it might also be worth considering implementing
> > CB_NOTIFY_LOCK:
> >
> > https://tools.ietf.org/html/rfc5661#page-593
>
> I wouldn't mind to test any kind of patch if someone would find the time
> to provide some. Writing them myself is beyond my skills I'm afraid :-)

Well, I was just thinking of something like this.

(*Not* saying this is what we really want to do, but it might help
confirm where your delay is coming from.)

--b.

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 6f228b5..c119ba7 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5367,7 +5367,7 @@ int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4
 	return err;
 }
 
-#define NFS4_LOCK_MINTIMEOUT (1 * HZ)
+#define NFS4_LOCK_MINTIMEOUT (1 * HZ / 10)
 #define NFS4_LOCK_MAXTIMEOUT (30 * HZ)
 
 /*
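
A rough way to see what the changed minimum could mean for a waiting
process: the sketch below sums the sleep intervals after a number of failed
lock attempts. It assumes the interval starts at the minimum and doubles
after every attempt up to the maximum; the helper total_wait_ms is
hypothetical and works in milliseconds rather than jiffies.

```c
/* Cumulative sleep time after `retries` failed lock attempts, assuming the
 * interval starts at min_ms and doubles each time, capped at max_ms.
 * Hypothetical helper for illustration only. */
long total_wait_ms(long min_ms, long max_ms, int retries)
{
    long total = 0;
    long interval = min_ms;

    for (int i = 0; i < retries; i++) {
        total += interval;        /* sleep that follows failed attempt i */
        interval *= 2;            /* back off for the next attempt */
        if (interval > max_ms)
            interval = max_ms;
    }
    return total;
}
```

Under these assumptions, three failed attempts add up to 1000 + 2000 + 4000
= 7000 ms with the original minimum of 1 * HZ, and to 100 + 200 + 400 =
700 ms with the patched minimum of HZ / 10.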

2015-07-22 14:49:06

by Frank Steiner

Subject: Re: Parallel file locking quite slow with vers=4.0 compared to vers=3

J. Bruce Fields wrote
> > to provide some. Writing them myself is beyond my skills I'm afraid :-)
>
> Well, I was just thinking of something like this.
>
> (*Not* saying this is what we really want to do, but it might help
> confirm where your delay is coming from.)

Thanks! I tried your patch and it goes way faster. The last 2 or 3
of 20 xterms that I start in parallel still take about 1-3 seconds,
but the rest pop up immediately. NFSv3 is still better, as it
shows all xterms (almost) immediately, but the patch definitely is
an improvement.

cu,
Frank


2015-07-22 14:53:14

by Frank Steiner

Subject: Re: Parallel file locking quite slow with vers=4.0 compared to vers=3

Frank Steiner wrote

> Thanks! I tried your patch and it goes way faster. The last 2 or 3
> of 20 xterms that I start in parallel still take about 1-3 seconds,
> but the rest is popping up immediately. NFS3 is still better as it

The difference is still quite large when killing the xterms again.
"killall xterm" on NFSv3 takes half a second; on NFSv4 about the last
5 xterms are delayed and vanish one after another, the last one
about 5-6 seconds after the killall was started.

cu,
Frank



2015-07-22 15:40:32

by J. Bruce Fields

Subject: Re: Parallel file locking quite slow with vers=4.0 compared to vers=3

On Wed, Jul 22, 2015 at 04:53:11PM +0200, Frank Steiner wrote:
> Frank Steiner wrote
>
> > Thanks! I tried your patch and it goes way faster. The last 2 or 3
> > of 20 xterms that I start in parallel still take about 1-3 seconds,
> > but the rest is popping up immediately. NFS3 is still better as it
>
> The difference is still quite large when killing the xterms again.
> "killall xterm" on NFSv3 takes half a second, on NFSv4 the last
> about 5 xterms are delayed and vanish one after another, the last one
> about 5-6 seconds after the killall was started.

No idea what's going on there.

--b.