2007-12-26 11:51:28

by saeed bishara

Subject: read-ahead in NFS server

Hi,
My NFS server does not seem to use read-ahead. The exported directory
is on an ext3 filesystem on a SATA disk. The SATA controller can issue
commands of up to 1MB, and I also raised the read-ahead setting under
sysfs to 1MB. But when the client reads in 32KB chunks (rsize), I see
on the server side that all I/Os are ~32KB. My kernel version is
2.6.22.7.
According to the nfsd code, the NFS server should use read-ahead, so
what should I do to make it work?


saeed
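
For readers who want to check the setting described above, here is a minimal Python sketch. It assumes the tunable the poster changed is the block layer's `/sys/block/<device>/queue/read_ahead_kb` file; the message only says "under the sys", so that path is an assumption, as are the helper names and the `sysfs_root` parameter (added only so the functions can be tried against a scratch directory).

```python
from pathlib import Path


def read_ahead_path(device: str, sysfs_root: str = "/sys") -> Path:
    """Path of the per-device read-ahead tunable (assumed location)."""
    return Path(sysfs_root) / "block" / device / "queue" / "read_ahead_kb"


def get_read_ahead_kb(device: str, sysfs_root: str = "/sys") -> int:
    """Return the current read-ahead window for `device`, in KB."""
    return int(read_ahead_path(device, sysfs_root).read_text().strip())


def set_read_ahead_kb(device: str, kb: int, sysfs_root: str = "/sys") -> None:
    """Set the read-ahead window for `device`, in KB.

    On a real system this write needs root privileges.
    """
    read_ahead_path(device, sysfs_root).write_text(f"{kb}\n")
```

On the server, `set_read_ahead_kb("sda", 1024)` would correspond to the 1MB value mentioned above, where `sda` is a placeholder for whatever disk backs the export.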


2007-12-27 02:34:38

by Jeff Garzik

Subject: Re: read-ahead in NFS server

saeed bishara wrote:
> Hi,
> my NFS server seems not to utilize the read-ahead feature, my exported
> dir is located on ext3 fs over sata disk. the sata controller can
> issue commands up to 1MB, also I modified the read ahead under the sys
> to 1MB. but when the client do reads in 32KB chunks (rsize), I can see
> in the server side that all IOs are ~32KB. my kernel version is
> 2.6.22.7.
> according to the nfsd code, the NFS server should utilize the
> read-ahead feature, but what should I do
> to make it work?

(linux-nfs added to cc)

I cannot speak for the NFS server code specifically, but 32kb sounds
like a network read (or write) data size limit.

Are you using TCP? Are you using NFSv4, or an older version?

Jeff

2007-12-27 08:50:23

by saeed bishara

Subject: Re: read-ahead in NFS server

> (linux-nfs added to cc)
>
> I cannot speak for the NFS server code specifically, but 32kb sounds
> like a network read (or write) data size limit.
yes
>
> Are you using TCP? Are you using NFSv4, or an older version?
I'm using NFSv3 over UDP.
I found that the actual request size was 16KB. After some hacks in the
server and client I managed to raise it to 60KB. Now I see better
performance, and the average disk request size is ~130KB, which means
that read-ahead is actually happening. But why is it only 130KB? How
can I make it larger?
When I run a local dd with bs=4K, the average I/O size is more than
300KB.
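
The thread does not say how the average I/O size was measured. One way to get a comparable number, sketched here under that caveat, is to sample `/proc/diskstats` before and after a test run and divide the bytes read by the number of completed reads. The function and type names are hypothetical; the field positions follow the whole-disk line format of `/proc/diskstats`, where sectors are counted in 512-byte units.

```python
from typing import NamedTuple

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


class ReadCounters(NamedTuple):
    reads_completed: int
    sectors_read: int


def parse_diskstats(text: str, device: str) -> ReadCounters:
    """Extract the read counters for a whole-disk `device` from diskstats text."""
    for line in text.splitlines():
        fields = line.split()
        # Whole-disk lines have major, minor, name, then at least 11 counters.
        # Shorter lines (old-style partition entries) are skipped.
        if len(fields) >= 14 and fields[2] == device:
            return ReadCounters(int(fields[3]), int(fields[5]))
    raise ValueError(f"device {device!r} not found in diskstats")


def average_read_kb(before: ReadCounters, after: ReadCounters) -> float:
    """Average size in KB of the reads completed between two samples."""
    reads = after.reads_completed - before.reads_completed
    sectors = after.sectors_read - before.sectors_read
    if reads <= 0:
        return 0.0
    return sectors * SECTOR_BYTES / reads / 1024
```

In use, one would read `/proc/diskstats` on the server, run the NFS read test (or the local dd), read the file again, and pass both samples through `parse_diskstats` and `average_read_kb`.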

2007-12-27 11:54:59

by Jeff Garzik

Subject: Re: read-ahead in NFS server

saeed bishara wrote:
>> (linux-nfs added to cc)
>>
>> I cannot speak for the NFS server code specifically, but 32kb sounds
>> like a network read (or write) data size limit.
> yes
>> Are you using TCP? Are you using NFSv4, or an older version?
> I'm using NFSv3/UDP.

IMO, you definitely want TCP and NFSv4. Much better network behavior,
without some of the silly UDP limits (plus greatly improved caching
behavior, due to v4 delegations).


> I found that the actual requests size was 16KB, after doing some hacks
> in server&client I managed to make it 60KB, now I see better
> performance, and I see that the average request size is ~130KB which
> means that there is actually read-ahead. but why it's only 130KB? how
> can I make it larger?
> when I run local dd with bs=4K, I can see that the average IO size is
> more than 300KB.

Read-ahead is easier in NFSv4, because the client probably has the file
delegated locally, and has far less need to constantly revalidate file
mapping(s).

Jeff
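
The message does not spell out which UDP limits are meant. One well-known candidate, offered here as an assumption rather than as Jeff's point, is that a single UDP datagram over IPv4 cannot carry more than 65507 bytes of payload and must be split into IP fragments once it exceeds the link MTU. The Python sketch below works through that arithmetic; it counts only the data payload and ignores the RPC/NFS headers that a real reply also carries, and the function names are hypothetical.

```python
IPV4_MAX_TOTAL = 65535  # largest IPv4 packet, from the 16-bit total-length field
IPV4_HEADER = 20        # IPv4 header without options
UDP_HEADER = 8


def max_udp_payload() -> int:
    """Largest payload that fits in one UDP datagram over IPv4."""
    return IPV4_MAX_TOTAL - IPV4_HEADER - UDP_HEADER


def ip_fragments(udp_payload: int, mtu: int = 1500) -> int:
    """Number of IP fragments needed to carry one UDP datagram."""
    if udp_payload > max_udp_payload():
        raise ValueError("payload does not fit in a single UDP datagram")
    # Fragment data must be a multiple of 8 bytes (except in the last fragment).
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    total = udp_payload + UDP_HEADER
    return -(-total // per_fragment)  # integer ceiling division
```

With a 1500-byte MTU, `ip_fragments(32 * 1024)` gives 23 and `ip_fragments(60 * 1024)` gives 42. Losing any one fragment means the whole datagram has to be resent, and the 65507-byte ceiling is consistent with the request size topping out near 60KB.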


2007-12-27 15:00:26

by saeed bishara

Subject: Re: read-ahead in NFS server

> >> Are you using TCP? Are you using NFSv4, or an older version?
> > I'm using NFSv3/UDP.
>
> IMO, you definitely want TCP and NFSv4. Much better network behavior,
> with some of the silly UDP limits (plus greatly improved caching
> behavior, due to v4 delegations).
The clients of my system are going to be embedded systems with
low-performance CPUs, and I need UDP because it uses less CPU power.

> > when I run local dd with bs=4K, I can see that the average IO size is
> > more than 300KB.
>
> Read-ahead is easier in NFSv4, because the client probably has the file
> delegated locally, and has far less need to constantly revalidate file
> mapping(s).
I'll check that.
But what about the server side? Why are the issued I/Os only twice
the size of the NFS requests?

2007-12-27 15:07:50

by Jeff Garzik

Subject: Re: read-ahead in NFS server

saeed bishara wrote:
>>>> Are you using TCP? Are you using NFSv4, or an older version?
>>> I'm using NFSv3/UDP.
>> IMO, you definitely want TCP and NFSv4. Much better network behavior,
>> with some of the silly UDP limits (plus greatly improved caching
>> behavior, due to v4 delegations).
> the clients of my system going to be embedded system with low
> performance cpus and I need UDP as it needs less cpu power.

I bet
TCP + fewer revalidations + greater local pagecache activity
uses less cpu power than
UDP + revalidations + rx/tx network activity


>>> when I run local dd with bs=4K, I can see that the average IO size is
>>> more than 300KB.
>> Read-ahead is easier in NFSv4, because the client probably has the file
>> delegated locally, and has far less need to constantly revalidate file
>> mapping(s).
> I'll check that.
> but what about the server side? why the issued IO's are only as twice
> as the size of the NFS requests?

No idea. I bet the source code can tell you :)

Jeff

2007-12-27 15:39:16

by saeed bishara

Subject: Re: read-ahead in NFS server

>
> I bet
> TCP + fewer revalidations + greater local pagecache activity
> uses less cpu power than
> UDP + revalidations + rx/tx network activity
What do you mean by revalidations?
The client's workload is going to be large sequential I/Os, so is
the local pagecache necessary in this case?

saeed

2007-12-28 02:33:34

by Wu Fengguang

Subject: Re: read-ahead in NFS server

On Thu, Dec 27, 2007 at 05:00:12PM +0200, saeed bishara wrote:
> > >> Are you using TCP? Are you using NFSv4, or an older version?
> > > I'm using NFSv3/UDP.
> >
> > IMO, you definitely want TCP and NFSv4. Much better network behavior,
> > with some of the silly UDP limits (plus greatly improved caching
> > behavior, due to v4 delegations).
> the clients of my system going to be embedded system with low
> performance cpus and I need UDP as it needs less cpu power.

You can try the attached adaptive readahead patch.
Apply it on your server and compile the kernel with CONFIG_ADAPTIVE_READAHEAD.
Use a large 1MB readahead on the server and a small readahead on the clients.

> > > when I run local dd with bs=4K, I can see that the average IO size is
> > > more than 300KB.
> >
> > Read-ahead is easier in NFSv4, because the client probably has the file
> > delegated locally, and has far less need to constantly revalidate file
> > mapping(s).
> I'll check that.
> but what about the server side? why the issued IO's are only as twice
> as the size of the NFS requests?

The readahead code is helpless in NFSv3 :-(
Using NFS over TCP with rsize=readahead=1MB on the client side could help.
But if you prefer UDP, the above patch may help you :-)

Fengguang