Return-Path:
Received: from mga01.intel.com ([192.55.52.88]:39130 "EHLO mga01.intel.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S932264Ab0BYMh7 (ORCPT );
        Thu, 25 Feb 2010 07:37:59 -0500
Date: Thu, 25 Feb 2010 20:37:55 +0800
From: Wu Fengguang
To: Akshat Aranya
Cc: Dave Chinner, Trond Myklebust,
        "linux-nfs@vger.kernel.org",
        "linux-fsdevel@vger.kernel.org",
        Linux Memory Management List, LKML
Subject: Re: [RFC] nfs: use 2*rsize readahead size
Message-ID: <20100225123755.GB9077@localhost>
References: <20100224024100.GA17048@localhost>
        <20100224032934.GF16175@discord.disaster>
        <20100224041822.GB27459@localhost>
        <20100224052215.GH16175@discord.disaster>
Content-Type: text/plain; charset=utf-8
In-Reply-To:
Sender: linux-nfs-owner@vger.kernel.org
List-ID:
MIME-Version: 1.0

On Wed, Feb 24, 2010 at 07:18:26PM +0800, Akshat Aranya wrote:
> On Wed, Feb 24, 2010 at 12:22 AM, Dave Chinner wrote:
>
> >
> >> It sounds silly to have
> >>
> >>         client_readahead_size > server_readahead_size
> >
> > I don't think it is - the client readahead has to take into account
> > the network latency as well as the server latency. e.g. a network
> > with a high bandwidth but high latency is going to need much more
> > client side readahead than a high bandwidth, low latency network to
> > get the same throughput. Hence it is not uncommon to see larger
> > readahead windows on network clients than for local disk access.
> >
> > Also, the NFS server may not even be able to detect sequential IO
> > patterns because of the combined access patterns from the clients,
> > and so the only effective readahead might be what the clients
> > issue....
> >
>
> In my experiments, I have observed that the server-side readahead
> shuts off rather quickly even with a single client because the client
> readahead causes multiple pending read RPCs on the server which are
> then serviced in random order and the pattern observed by the
> underlying file system is non-sequential.
> In our file system, we had
> to override what the VFS thought was a random workload and continue to
> do readahead anyway.

What's the server side kernel version, plus client/server side
readahead size? I'd expect the context readahead to handle it well.

With the patchset in , you can actually see the readahead details:

        # echo 1 > /debug/tracing/events/readahead/enable
        # cp test-file /dev/null
        # cat /debug/tracing/trace  # trimmed output
        readahead-initial(dev=0:15, ino=100177, req=0+2, ra=0+4-2, async=0) = 4
        readahead-subsequent(dev=0:15, ino=100177, req=2+2, ra=4+8-8, async=1) = 8
        readahead-subsequent(dev=0:15, ino=100177, req=4+2, ra=12+16-16, async=1) = 16
        readahead-subsequent(dev=0:15, ino=100177, req=12+2, ra=28+32-32, async=1) = 32
        readahead-subsequent(dev=0:15, ino=100177, req=28+2, ra=60+60-60, async=1) = 24
        readahead-subsequent(dev=0:15, ino=100177, req=60+2, ra=120+60-60, async=1) = 0

And I've actually verified the NFS case with the help of such traces
long ago.  When client_readahead_size <= server_readahead_size, the
readahead requests may look a bit random at first, and then will
quickly turn into a perfect series of sequential context readaheads.

Thanks,
Fengguang
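
P.S. For anyone wanting to check such a trace mechanically rather than
by eye, here is a rough sketch (not part of the patchset). It assumes
the field layout read off the trimmed lines above, i.e.
req=OFFSET+LENGTH, ra=START+SIZE-ASYNC_SIZE and a trailing "= N" for
the pages actually read; the names parse_trace_line and
windows_contiguous are made up for illustration. It only tests whether
each readahead window starts where the previous one ended.

```python
import re

# Assumed field layout, inferred from the trimmed trace lines only:
#   req=OFFSET+LENGTH, ra=START+SIZE-ASYNC_SIZE, "= N" pages actually read.
TRACE_RE = re.compile(
    r"readahead-(?P<kind>\w+)\(dev=(?P<dev>\d+:\d+), ino=(?P<ino>\d+), "
    r"req=(?P<req_off>\d+)\+(?P<req_len>\d+), "
    r"ra=(?P<ra_start>\d+)\+(?P<ra_size>\d+)-(?P<ra_async>\d+), "
    r"async=(?P<is_async>[01])\) = (?P<actual>\d+)"
)


def parse_trace_line(line):
    """Return a dict of fields for one trace line, or None if it does not match."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    # Keep the event kind and device as strings; everything else is numeric.
    return {k: (v if k in ("kind", "dev") else int(v))
            for k, v in m.groupdict().items()}


def windows_contiguous(records):
    """True if every readahead window starts where the previous one ended."""
    return all(cur["ra_start"] == prev["ra_start"] + prev["ra_size"]
               for prev, cur in zip(records, records[1:]))


SAMPLE = """
readahead-initial(dev=0:15, ino=100177, req=0+2, ra=0+4-2, async=0) = 4
readahead-subsequent(dev=0:15, ino=100177, req=2+2, ra=4+8-8, async=1) = 8
readahead-subsequent(dev=0:15, ino=100177, req=4+2, ra=12+16-16, async=1) = 16
readahead-subsequent(dev=0:15, ino=100177, req=12+2, ra=28+32-32, async=1) = 32
readahead-subsequent(dev=0:15, ino=100177, req=28+2, ra=60+60-60, async=1) = 24
readahead-subsequent(dev=0:15, ino=100177, req=60+2, ra=120+60-60, async=1) = 0
"""

# Skip blank or unrelated lines, keep only parsed readahead events.
records = [r for r in map(parse_trace_line, SAMPLE.splitlines()) if r]
print(windows_contiguous(records))
```

On the six lines above the windows run 0+4, 4+8, 12+16, 28+32, 60+60,
120+60, so the check passes. A server-side trace where the read RPCs
arrive out of order would be expected to fail it, at least at first.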