2019-09-22 19:15:18

by Alkis Georgopoulos

Subject: Re: rsize,wsize=1M causes severe lags in 10/100 Mbps

On 9/21/19 10:52 AM, Alkis Georgopoulos wrote:
> I think it's caused by the kernel readahead, not glibc readahead.
> TL;DR: This solves the problem:
> echo 4 > /sys/devices/virtual/bdi/0:58/read_ahead_kb
>
> Question: how to configure NFS/kernel to automatically set that?
>
> Long version:
> Doing step (4) below results in tremendous speedup:
>
> 1) mount -t nfs -o tcp,timeo=600,rsize=1048576,wsize=1048576
> 10.161.254.11:/srv/ltsp /mnt
>
> 2) cat /proc/fs/nfsfs/volumes
> We see the DEV number from there, e.g. 0:58
>
> 3) cat /sys/devices/virtual/bdi/0:58/read_ahead_kb
> 15360
> I assume that this means the kernel will try to read ahead up to 15 MB
> for each accessed file. *THIS IS THE PROBLEM*. For non-NFS devices, this
> value is 128 (KB).
>
> 4) echo 4 > /sys/devices/virtual/bdi/0:58/read_ahead_kb
>
> 5) Test. Traffic now should be a *lot* less, and speed a *lot* more.
> E.g. my NFS booting tests:
>  - read_ahead_kb=15360 (the default) => 1160 MB traffic to boot
>  - read_ahead_kb=128 => 324MB traffic
>  - read_ahead_kb=4 => 223MB traffic
>
> So the question that remains, is how to properly configure either NFS or
> the kernel, to use small readahead values for NFS.
>
> I'm currently doing it with this workaround:
> for f in $(awk '/^v[0-9]/ { print $4 }' < /proc/fs/nfsfs/volumes); do
> echo 4 > /sys/devices/virtual/bdi/$f/read_ahead_kb; done
>
> Thanks,
> Alkis



Quoting https://lkml.org/lkml/2010/2/26/48
> nfs: use 2*rsize readahead size
> With default rsize=512k and NFS_MAX_READAHEAD=15, the current NFS
> readahead size 512k*15=7680k is too large than necessary for typical
> clients.

I.e. the problem is probably that NFS_MAX_READAHEAD=15 was chosen back
when rsize was 512k; now that rsize=1M, it results in 15 MB readaheads,
which cause all the traffic and lags.
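Back-of-the-envelope check (just a sketch, using NFS_MAX_READAHEAD=15 from
the commit message quoted above and the rsize=1M mount from step (1)); it
reproduces the 15360 KB value seen in step (3):

# readahead = rsize * NFS_MAX_READAHEAD = 1024 KB * 15
echo $(( (1048576 / 1024) * 15 ))   # -> 15360, matching read_ahead_kb above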


2019-09-22 19:15:56

by Alkis Georgopoulos

Subject: Re: rsize,wsize=1M causes severe lags in 10/100 Mbps

On 9/21/19 10:59 AM, Alkis Georgopoulos wrote:
> I.e. the problem is probably that NFS_MAX_READAHEAD=15 was chosen back
> when rsize was 512k; now that rsize=1M, it results in 15 MB readaheads,
> which cause all the traffic and lags.


I filed a bug report for this:
https://bugzilla.kernel.org/show_bug.cgi?id=204939

A quick workaround is to run the following on the clients, after the NFS mounts are in place:

# Column 4 of /proc/fs/nfsfs/volumes is the DEV number of each NFS mount (e.g. 0:58)
for f in $(awk '/^v[0-9]/ { print $4 }' < /proc/fs/nfsfs/volumes); do
    echo 4 > /sys/devices/virtual/bdi/$f/read_ahead_kb
done
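
To double-check that the new value took effect on every NFS mount, the same
loop can simply print the current settings (just a quick sanity check along
the lines of the workaround above):

for f in $(awk '/^v[0-9]/ { print $4 }' < /proc/fs/nfsfs/volumes); do
    printf '%s: %s KB\n' "$f" "$(cat /sys/devices/virtual/bdi/$f/read_ahead_kb)"
done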

Btw, the mail subject is misleading: the workaround above makes the netboot
traffic drop from e.g. 1160 MB to 221 MB at any network speed;
the problem was just more noticeable at lower speeds.

Thank you very much,
Alkis Georgopoulos