2008-06-03 19:41:43

by Weathers, Norman R.

Subject: Problems with large number of clients and reads

Hello all,

We are having some issues with some of our high-throughput servers.

Here is the issue: we are using a vanilla 2.6.22.14 kernel on a node
with two dual-core Intel CPUs (3 GHz) and 16 GB of RAM. The files being
served are around 2 GB each, and there are usually 3 to 5 of them being
read, so once read they fit into memory nicely; when all is working
correctly, we have a perfectly filled cache with almost no disk
activity.

When we have heavy NFS activity (say, 600 to 1200 clients) connecting to
the server(s), the servers can get into a state where they are using up
all of memory but dropping the page cache: slabtop shows 13 GB of memory
being used by the size-4096 slab object. We have two ethernet channels
bonded, so we see in excess of 240 MB/s of data flowing out of the box,
and all of a sudden disk activity has risen to 185 MB/s. This happens
if we are using 8 or more nfsd threads. If we limit the threads to 6 or
fewer, this doesn't happen. Of course, we are then starving clients,
but at least the jobs that my customers are throwing out there are
progressing. The question becomes: what is causing the memory to be
used up by the size-4096 slab object? Why does this object grow from
100 MB to 13 GB when a bunch of clients suddenly ask for data? I have
set the memory settings to something that I thought was reasonable.
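
For reference, a minimal way to watch this as it happens (just a
sketch; it assumes the stock nfs-utils rpc.nfsd helper and a SLAB
kernel that exposes /proc/slabinfo):

rpc.nfsd 6                                    # cap the number of nfsd threads
watch -n 10 'grep size-4096 /proc/slabinfo'   # sample the slab while clients connect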

Here is some more of the particulars:

sysctl.conf tcp memory settings:

# NFS Tuning Parameters
sunrpc.udp_slot_table_entries = 128
sunrpc.tcp_slot_table_entries = 128
vm.overcommit_ratio = 80

net.core.rmem_max=524288
net.core.rmem_default=262144
net.core.wmem_max=524288
net.core.wmem_default=262144
net.ipv4.tcp_rmem = 8192 262144 524288
net.ipv4.tcp_wmem = 8192 262144 524288
net.ipv4.tcp_sack=0
net.ipv4.tcp_timestamps=0
vm.min_free_kbytes=50000
vm.overcommit_memory=1
net.ipv4.tcp_reordering=127

# Enable tcp_low_latency
net.ipv4.tcp_low_latency=1
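
These live in /etc/sysctl.conf and can be re-applied or spot-checked
with the standard sysctl tool, e.g. (just a sketch):

sysctl -p /etc/sysctl.conf                   # re-apply the whole file
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem   # verify current values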

Here is a current reading from a slabtop of a system where this error is
happening:

3007154 3007154 100% 4.00K 3007154 1 12028616K size-4096

Note the size of the object cache: usually it is 50 to 100 MB (I have
another box with 32 threads and the same settings which is bouncing
between 50 and 128 MB right now).
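
(For scale: 3,007,154 objects at 4 KB each is exactly the 12,028,616 KB
shown above, roughly 11.5 GiB, and the 100% figure means essentially
every one of those 4 KB allocations is live, not just slack in the
slab.)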

I have a lot of client boxes that need access to these servers, and
would really benefit from having more threads, but if I increase the
number of threads, it pushes everything out of cache, forcing re-reads,
and really slows down our jobs.

Any thoughts on this?


Thanks,

Norman Weathers


2008-06-06 00:06:23

by Dean

Subject: Re: Problems with large number of clients and reads

What is the file system? It is the one managing the cache on the server.
Dean

Norman Weathers wrote:
> [...]

2008-06-06 14:45:10

by Chuck Lever

Subject: Re: Problems with large number of clients and reads

Norman Weathers wrote:
> On Wed, 2008-06-04 at 09:13 -0500, Norman Weathers wrote:
>> On Wed, 2008-06-04 at 09:49 -0400, Chuck Lever wrote:
>>> Hi Norman-
>>>
>>> On Tue, Jun 3, 2008 at 2:50 PM, Norman Weathers
>>> <norman.r.weathers-496aOtIFJR1B+Kdf37RAV9BPR1lH4CV8@public.gmane.org> wrote:
>>>> [...]
>>>> # NFS Tuning Parameters
>>>> sunrpc.udp_slot_table_entries = 128
>>>> sunrpc.tcp_slot_table_entries = 128
>>> I don't have an answer to your size-4096 question, but I do want to
>>> note that setting the slot table entries sysctls has no effect on NFS
>>> servers. It's a client-only setting.
>>>
>>
>> Ok.
>>
>>> Have you tried this experiment on a server where there are no special
>>> memory tuning sysctls?
>> Unfortunately, no. I can try it today.
>>
>
>
> I tried the test with no special memory settings, and I still see the
> same issue. I also have noticed that even with only 3 threads running,
> I can still have times where 11 GB of memory is being used for buffer
> and not for disk cache. It just seems like memory is being used up if
> we have a lot of requests from a lot of clients at once...

I'm at a loss... but I have another question or two. Is it just the
memory utilization issue that you see on the server, or do noticeable
performance problems crop up when this happens?

Did you mention what your physical file system is on the server? Are
you running it on LVM, or on software or hardware RAID?



2008-06-06 16:09:23

by J. Bruce Fields

Subject: Re: Problems with large number of clients and reads

On Tue, Jun 03, 2008 at 01:50:01PM -0500, Norman Weathers wrote:
> [...]
> Here is a current reading from a slabtop of a system where this error is
> happening:
>
> 3007154 3007154 100% 4.00K 3007154 1 12028616K size-4096
>
> Note the size of the object cache, usually it is 50 - 100 MB (I have
> another box with 32 threads and the same settings which is bouncing
> between 50 and 128 MB right now).
>
> I have a lot of client boxes that need access to these servers, and
> would really benefit from having more threads, but if I increase the
> number of threads, it pushes everything out of cache, forcing re-reads,
> and really slows down our jobs.
>
> Any thoughts on this?

I'd've thought that suggests a leak of memory allocated by kmalloc().

Does the size-4096 cache decrease eventually, or does it stay that large
until you reboot?
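
A simple way to track that over time (just a sketch): sample the slab
line periodically, and note that dropping the reclaimable caches does
not free kmalloc'd memory, so if size-4096 stays pinned after the
clients go away, that would point at a leak or at allocations still
being held somewhere.

watch -n 60 'grep size-4096 /proc/slabinfo'   # sample the cache over time
echo 2 > /proc/sys/vm/drop_caches             # frees dentries/inodes only, not kmalloc'd memory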

--b.

2008-06-09 13:21:06

by Weathers, Norman R.

Subject: RE: Problems with large number of clients and reads

(I dislike Outlook.... Apologies if I end up messing up the formatting
of the message.)


The file system is XFS, about 250 GB per server, and yes, I would say
it is managing the cache on the servers in question. Those servers
have 16 GB of memory, and the files being served are 1.9 GB each,
about five per server.
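
(That puts the working set at roughly 5 x 1.9 GB, about 9.5 GB,
comfortably under the 16 GB of RAM, which is consistent with the whole
set fitting in the page cache when things behave.)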


-----Original Message-----
From: Dean Hildebrand [mailto:[email protected]]
Sent: Thursday, June 05, 2008 7:06 PM
To: Weathers, Norman R.
Cc: [email protected]
Subject: Re: Problems with large number of clients and reads

> What is the file system? It is the one managing the cache on the
> server.
> Dean





Norman Weathers wrote:
> [...]

2008-06-04 13:49:21

by Chuck Lever

Subject: Re: Problems with large number of clients and reads

Hi Norman-

On Tue, Jun 3, 2008 at 2:50 PM, Norman Weathers
<norman.r.weathers-496aOtIFJR1B+Kdf37RAV9BPR1lH4CV8@public.gmane.org> wrote:
> [...]
> # NFS Tuning Parameters
> sunrpc.udp_slot_table_entries = 128
> sunrpc.tcp_slot_table_entries = 128

I don't have an answer to your size-4096 question, but I do want to
note that setting the slot table entries sysctls has no effect on NFS
servers. It's a client-only setting.

Have you tried this experiment on a server where there are no special
memory tuning sysctls?

Can you describe the characteristics of your I/O workload (whether it
is random or sequential, the size of the I/O requests, the burstiness,
etc.)?

What mount options are you using on the clients, and what are your
export options on the server? (Which NFS version are you using?)

And finally, the output of uname -a on the server would be good to include.
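
For completeness, the usual way to collect all of that on the server
side is something like the following (a sketch; it assumes the nfsd
filesystem is mounted under /proc/fs/nfsd, as it normally is on 2.6
kernels):

uname -a                      # kernel version and build
exportfs -v                   # exports with their effective options
nfsstat -s                    # server-side RPC/NFS operation counts
cat /proc/fs/nfsd/threads     # current number of nfsd threads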

> [...]

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

2008-06-04 14:13:30

by Weathers, Norman R.

Subject: Re: Problems with large number of clients and reads

On Wed, 2008-06-04 at 09:49 -0400, Chuck Lever wrote:
> Hi Norman-
>
> On Tue, Jun 3, 2008 at 2:50 PM, Norman Weathers
> <norman.r.weathers-496aOtIFJR1B+Kdf37RAV9BPR1lH4CV8@public.gmane.org> wrote:
> > [...]
> > # NFS Tuning Parameters
> > sunrpc.udp_slot_table_entries = 128
> > sunrpc.tcp_slot_table_entries = 128
>
> I don't have an answer to your size-4096 question, but I do want to
> note that setting the slot table entries sysctls has no effect on NFS
> servers. It's a client-only setting.
>


Ok.

> Have you tried this experiment on a server where there are no special
> memory tuning sysctls?

Unfortunately, no. I can try it today.

>
> Can you describe the characteristics of your I/O workload (the
> random/sequentialness of it, the size of the I/O requests, the
> burstiness, etc)?

The I/O pattern is somewhat random, but when things are functioning
properly the files are small enough to fit into cache. The size per
record is ~10 KB (it can be up to 64 KB).

>
> What mount options are you using on the clients, and what are your
> export options on the server? (Which NFS version are you using)?

NFSv3. Client mount options are:
rw,vers=3,rsize=1048576,wsize=1048576,acregmin=1,acregmax=15,acdirmin=0,acdirmax=0,hard,intr,proto=tcp,timeo=600,retrans=2,addr=hoeptt01


>
> And finally, the output of uname -a on the server would be good to include.
>

Linux hoeptt06 2.6.22.14.SLAB #5 SMP Wed Jan 23 15:45:40 CST 2008 x86_64
x86_64 x86_64 GNU/Linux


> > [...]

2008-06-05 18:54:38

by Weathers, Norman R.

Subject: Re: Problems with large number of clients and reads

On Wed, 2008-06-04 at 09:13 -0500, Norman Weathers wrote:
> On Wed, 2008-06-04 at 09:49 -0400, Chuck Lever wrote:
> > Hi Norman-
> >
> > On Tue, Jun 3, 2008 at 2:50 PM, Norman Weathers
> > <norman.r.weathers-496aOtIFJR1B+Kdf37RAV9BPR1lH4CV8@public.gmane.org> wrote:
> > > [...]
> > > # NFS Tuning Parameters
> > > sunrpc.udp_slot_table_entries = 128
> > > sunrpc.tcp_slot_table_entries = 128
> >
> > I don't have an answer to your size-4096 question, but I do want to
> > note that setting the slot table entries sysctls has no effect on NFS
> > servers. It's a client-only setting.
> >
>
>
> Ok.
>
> > Have you tried this experiment on a server where there are no special
> > memory tuning sysctls?
>
> Unfortunately, no. I can try it today.
>


I tried the test with no special memory settings, and I still see the
same issue. I have also noticed that even with only 3 threads running,
there can still be times when 11 GB of memory is being used for
buffers and not for disk cache. It just seems like memory gets used up
whenever we have a lot of requests from a lot of clients at once...
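
One thing that might help narrow it down (purely a sketch, not
something we have confirmed here): comparing kernel TCP socket memory
against the slab growth while the clients are hammering the box would
show whether the 4 KB allocations track the number of connections
rather than the number of nfsd threads.

cat /proc/net/sockstat          # the TCP "mem" figure is in 4 KB pages
grep size-4096 /proc/slabinfo   # compare against the slab growth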

> [...]