2018-03-28 11:14:13

by Antti Tönkyrä

Subject: Regarding client fairness

I came across a rather annoying issue where a single NFS client caused
resource starvation on the NFS server. The server has several storage
pools in use; in this particular case a single client issued fairly large
read requests and effectively ate all nfsd threads on the server, and
while that was going on other clients were getting hardly any I/O through
to the other storage pool, which was completely idle.

I then put together a simple test case and noticed that reading a file
with a large block size causes the NFS server to service the read with
multiple threads, effectively consuming all nfsd threads on the server and
starving other clients regardless of the share/backing disk they were
accessing.

In my test case a single (admittedly ridiculous) dd was able to
effectively reserve the entire NFS server for itself:

# dd if=fgsfds bs=1000M count=10000 iflag=direct

Also, several similar dd runs with a block size of 100M caused the same
effect. During those dd runs the server responded only very slowly to any
other requests from other clients, including requests for other NFS shares
on different disks on the server.

My question is: are there any methods to ensure client fairness with Linux
NFS, and/or are there best practices for achieving something like that? I
think it would be pretty awesome if clients were subject to some kind of
limit/fairness scoped per {client, share-on-server}, so that a client
hammering a single share on a server with large read I/O requests could
not effectively cause a denial of service for the entire NFS server, but
only for the share it is accessing, while other clients accessing a
different (or the same) share would still get a fair amount of access to
the data.


2018-03-28 14:54:06

by J. Bruce Fields

Subject: Re: Regarding client fairness

On Wed, Mar 28, 2018 at 02:04:57PM +0300, [email protected] wrote:
> I came across a rather annoying issue where a single NFS client
> caused resource starvation for NFS server. The server has several
> storage pools which are used, in this particular case a single
> client did fairly large read requests and effectively ate all nfsd
> threads on the server and during that other clients were getting
> hardly any I/O through to the other storage pool which was
> completely idle.

What version of the kernel are you running on your server?

--b.


2018-03-28 14:59:31

by J. Bruce Fields

Subject: Re: Regarding client fairness

On Wed, Mar 28, 2018 at 10:54:06AM -0400, bfields wrote:
> On Wed, Mar 28, 2018 at 02:04:57PM +0300, [email protected] wrote:
> > I came across a rather annoying issue where a single NFS client
> > caused resource starvation for NFS server. The server has several
> > storage pools which are used, in this particular case a single
> > client did fairly large read requests and effectively ate all nfsd
> > threads on the server and during that other clients were getting
> > hardly any I/O through to the other storage pool which was
> > completely idle.
>
> What version of the kernel are you running on your server?

I'm thinking that if it includes upstream 637600f3ffbf "SUNRPC: Change
TCP socket space reservation" (merged upstream in 4.8), then you may want
to experiment with setting the sunrpc.svc_rpc_per_connection_limit module
parameter added by ff3ac5c3dc23 "SUNRPC: Add a server side per-connection
limit".

You probably want to experiment with values greater than 0 (the default,
which means no limit) and less than the number of server threads.
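
For example, something along these lines (16 is only an arbitrary starting
point, and this assumes the parameter is exposed writable under
/sys/module/sunrpc/parameters/):

# cat /proc/fs/nfsd/threads
# echo 16 > /sys/module/sunrpc/parameters/svc_rpc_per_connection_limit

The first command just shows the current nfsd thread count for reference.
To make the setting persistent you'd set it at module load time instead
("options sunrpc svc_rpc_per_connection_limit=16" in modprobe.d, or
sunrpc.svc_rpc_per_connection_limit=16 on the kernel command line if
sunrpc is built in).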

--b.


2018-03-28 15:35:58

by Antti Tönkyrä

Subject: Re: Regarding client fairness

On 2018-03-28 17:59, J. Bruce Fields wrote:
> On Wed, Mar 28, 2018 at 10:54:06AM -0400, bfields wrote:
>> On Wed, Mar 28, 2018 at 02:04:57PM +0300, [email protected] wrote:
>>> I came across a rather annoying issue where a single NFS client
>>> caused resource starvation for NFS server. The server has several
>>> storage pools which are used, in this particular case a single
>>> client did fairly large read requests and effectively ate all nfsd
>>> threads on the server and during that other clients were getting
>>> hardly any I/O through to the other storage pool which was
>>> completely idle.
>> What version of the kernel are you running on your server?
4.15.10 on the system I am testing on.
> I'm thinking that if it includes upstream 637600f3ffbf "SUNRPC: Change
> TCP socket space reservation" (in upstream 4.8), then you may want to
> experiment setting the sunrpc.svc_rpc_per_connection_limit module
> parameter added in ff3ac5c3dc23 "SUNRPC: Add a server side
> per-connection limit".
>
> You probably want to experiment with values greater than 0 (the default,
> no limit) and the number of server threads.
That helps with a single client slowing down the whole server, thanks for
the tip! Of course it doesn't help in the case where a client accesses two
different shares on the same server, but that is something I can work around.


2018-03-28 15:47:54

by J. Bruce Fields

Subject: Re: Regarding client fairness

On Wed, Mar 28, 2018 at 06:35:53PM +0300, Antti Tönkyrä wrote:
> On 2018-03-28 17:59, J. Bruce Fields wrote:
> >On Wed, Mar 28, 2018 at 10:54:06AM -0400, bfields wrote:
> >>On Wed, Mar 28, 2018 at 02:04:57PM +0300, [email protected] wrote:
> >>>I came across a rather annoying issue where a single NFS client
> >>>caused resource starvation for NFS server. The server has several
> >>>storage pools which are used, in this particular case a single
> >>>client did fairly large read requests and effectively ate all nfsd
> >>>threads on the server and during that other clients were getting
> >>>hardly any I/O through to the other storage pool which was
> >>>completely idle.
> >>What version of the kernel are you running on your server?
> 4.15.10 on the system I am testing on.
> >I'm thinking that if it includes upstream 637600f3ffbf "SUNRPC: Change
> >TCP socket space reservation" (in upstream 4.8), then you may want to
> >experiment setting the sunrpc.svc_rpc_per_connection_limit module
> >parameter added in ff3ac5c3dc23 "SUNRPC: Add a server side
> >per-connection limit".
> >
> >You probably want to experiment with values greater than 0 (the default,
> >no limit) and the number of server threads.
> That helps for the client slowing down the whole server, thanks for
> the tip! Of course this doesn't help with the case of client
> accessing 2 different shares on the same server but that is
> something I can work around.

I thought the Linux client shared a single connection in that case, but
I could be wrong.
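
A quick way to check from the client side (2049 being the standard NFS
port) would be to count the established TCP connections to the server
while both mounts are active, e.g.:

# ss -tn | grep ':2049'

If only one connection shows up there, the per-connection limit covers the
client's traffic to both shares as a whole rather than per share.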

--b.

2018-03-28 15:58:02

by J. Bruce Fields

Subject: Re: Regarding client fairness

On Wed, Mar 28, 2018 at 06:35:53PM +0300, Antti Tönkyrä wrote:
> On 2018-03-28 17:59, J. Bruce Fields wrote:
> >On Wed, Mar 28, 2018 at 10:54:06AM -0400, bfields wrote:
> >>On Wed, Mar 28, 2018 at 02:04:57PM +0300, [email protected] wrote:
> >>>I came across a rather annoying issue where a single NFS client
> >>>caused resource starvation for NFS server. The server has several
> >>>storage pools which are used, in this particular case a single
> >>>client did fairly large read requests and effectively ate all nfsd
> >>>threads on the server and during that other clients were getting
> >>>hardly any I/O through to the other storage pool which was
> >>>completely idle.
> >>What version of the kernel are you running on your server?
> 4.15.10 on the system I am testing on.
> >I'm thinking that if it includes upstream 637600f3ffbf "SUNRPC: Change
> >TCP socket space reservation" (in upstream 4.8), then you may want to
> >experiment setting the sunrpc.svc_rpc_per_connection_limit module
> >parameter added in ff3ac5c3dc23 "SUNRPC: Add a server side
> >per-connection limit".
> >
> >You probably want to experiment with values greater than 0 (the default,
> >no limit) and the number of server threads.
> That helps for the client slowing down the whole server, thanks for
> the tip!

We should probably revisit 637600f3ffbf "SUNRPC: Change TCP socket space
reservation". There's got to be some way to keep high bandwidth pipes
filled with read data without introducing this problem where a single
client can tie up every server thread.

Just out of curiosity, do you know (approximately) the network and disk
bandwidth in this case?
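
For reference, one rough way to get those numbers (the path and host name
below are placeholders): on the server, read a large file from the backing
pool locally after dropping the page cache so the read actually hits disk,

# echo 3 > /proc/sys/vm/drop_caches
# dd if=/pool/somebigfile of=/dev/null bs=1M count=4096

and measure the network path with something like iperf3, run on the server
and then on a client:

# iperf3 -s
# iperf3 -c nfs-server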

--b.

2018-03-28 16:49:07

by Antti Tönkyrä

Subject: Re: Regarding client fairness

On 2018-03-28 18:58, J. Bruce Fields wrote:
> On Wed, Mar 28, 2018 at 06:35:53PM +0300, Antti Tönkyrä wrote:
>> On 2018-03-28 17:59, J. Bruce Fields wrote:
>>> On Wed, Mar 28, 2018 at 10:54:06AM -0400, bfields wrote:
>>>> On Wed, Mar 28, 2018 at 02:04:57PM +0300, [email protected] wrote:
>>>>> I came across a rather annoying issue where a single NFS client
>>>>> caused resource starvation for NFS server. The server has several
>>>>> storage pools which are used, in this particular case a single
>>>>> client did fairly large read requests and effectively ate all nfsd
>>>>> threads on the server and during that other clients were getting
>>>>> hardly any I/O through to the other storage pool which was
>>>>> completely idle.
>>>> What version of the kernel are you running on your server?
>> 4.15.10 on the system I am testing on.
>>> I'm thinking that if it includes upstream 637600f3ffbf "SUNRPC: Change
>>> TCP socket space reservation" (in upstream 4.8), then you may want to
>>> experiment setting the sunrpc.svc_rpc_per_connection_limit module
>>> parameter added in ff3ac5c3dc23 "SUNRPC: Add a server side
>>> per-connection limit".
>>>
>>> You probably want to experiment with values greater than 0 (the default,
>>> no limit) and the number of server threads.
>> That helps for the client slowing down the whole server, thanks for
>> the tip!
> We should probably revisit 637600f3ffbf "SUNRPC: Change TCP socket space
> reservation". There's got to be some way to keep high bandwidth pipes
> filled with read data without introducing this problem where a single
> client can tie up every server thread.
>
> Just out of curiosity, do you know (approximately) the network and disk
> bandwidth in this case?
>
> --b.

Locally I can read at about 150-200 MB/s (two spindles in a mirror).
Additionally, I have another mount backed by an NVMe drive which does
>500 MB/s; I used it to verify that I was not bottlenecked on the backing
storage when running the dd read test that tied up the server threads.
Network bandwidth is 10 Gbps.
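
For a rough comparison: 10 Gbit/s is

    10000 / 8 = 1250 MB/s

on the wire, versus roughly 150-200 MB/s from the mirrored pair, so the
network can accept read data several times faster than that pool can
deliver it, which looks like exactly the situation where a queue of large
reads keeps many nfsd threads blocked on the slow disks at once.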