Date: Thu, 25 Feb 2010 20:37:55 +0800
From: Wu Fengguang
To: Akshat Aranya
Cc: Dave Chinner, Trond Myklebust, "linux-nfs@vger.kernel.org",
    "linux-fsdevel@vger.kernel.org", Linux Memory Management List, LKML
Subject: Re: [RFC] nfs: use 2*rsize readahead size
Message-ID: <20100225123755.GB9077@localhost>

On Wed, Feb 24, 2010 at 07:18:26PM +0800, Akshat Aranya wrote:
> On Wed, Feb 24, 2010 at 12:22 AM, Dave Chinner wrote:
> >
> >> It sounds silly to have
> >>
> >>         client_readahead_size > server_readahead_size
> >
> > I don't think it is - the client readahead has to take into account
> > the network latency as well as the server latency. e.g. a network
> > with a high bandwidth but high latency is going to need much more
> > client side readahead than a high bandwidth, low latency network to
> > get the same throughput. Hence it is not uncommon to see larger
> > readahead windows on network clients than for local disk access.
> >
> > Also, the NFS server may not even be able to detect sequential IO
> > patterns because of the combined access patterns from the clients,
> > and so the only effective readahead might be what the clients
> > issue....
> >
>
> In my experiments, I have observed that the server-side readahead
> shuts off rather quickly even with a single client because the client
> readahead causes multiple pending read RPCs on the server which are
> then serviced in random order and the pattern observed by the
> underlying file system is non-sequential. In our file system, we had
> to override what the VFS thought was a random workload and continue to
> do readahead anyway.

What's the server side kernel version, plus client/server side
readahead size? I'd expect the context readahead to handle it well.

With the patchset in , you can actually see the readahead details:

        # echo 1 > /debug/tracing/events/readahead/enable
        # cp test-file /dev/null
        # cat /debug/tracing/trace  # trimmed output
        readahead-initial(dev=0:15, ino=100177, req=0+2, ra=0+4-2, async=0) = 4
        readahead-subsequent(dev=0:15, ino=100177, req=2+2, ra=4+8-8, async=1) = 8
        readahead-subsequent(dev=0:15, ino=100177, req=4+2, ra=12+16-16, async=1) = 16
        readahead-subsequent(dev=0:15, ino=100177, req=12+2, ra=28+32-32, async=1) = 32
        readahead-subsequent(dev=0:15, ino=100177, req=28+2, ra=60+60-60, async=1) = 24
        readahead-subsequent(dev=0:15, ino=100177, req=60+2, ra=120+60-60, async=1) = 0

And I've actually verified the NFS case with the help of such traces
long ago. When client_readahead_size <= server_readahead_size, the
readahead requests may look a bit random at first, and then will
quickly turn into a perfect series of sequential context readaheads.

Thanks,
Fengguang
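
For checking traces like the one above offline, a small parser can help.
The following is only a sketch and not part of the patchset. The field
meanings (req=offset+size, ra=start+size-async_size, trailing number =
pages actually read) are assumptions inferred from the sample lines, and
the names parse_trace and windows_are_contiguous are hypothetical.

```python
import re

# Sample lines copied from the trimmed trace output in this message.
SAMPLE = """\
readahead-initial(dev=0:15, ino=100177, req=0+2, ra=0+4-2, async=0) = 4
readahead-subsequent(dev=0:15, ino=100177, req=2+2, ra=4+8-8, async=1) = 8
readahead-subsequent(dev=0:15, ino=100177, req=4+2, ra=12+16-16, async=1) = 16
readahead-subsequent(dev=0:15, ino=100177, req=12+2, ra=28+32-32, async=1) = 32
readahead-subsequent(dev=0:15, ino=100177, req=28+2, ra=60+60-60, async=1) = 24
readahead-subsequent(dev=0:15, ino=100177, req=60+2, ra=120+60-60, async=1) = 0
"""

# ASSUMPTION: layout inferred from the sample, not from the tracing patch:
#   req=<offset>+<size>, ra=<start>+<size>-<async_size>, "= N" is pages read.
TRACE_RE = re.compile(
    r"(?P<event>readahead-[\w-]+)\("
    r"dev=(?P<dev>\d+:\d+), ino=(?P<ino>\d+), "
    r"req=(?P<req_off>\d+)\+(?P<req_size>\d+), "
    r"ra=(?P<ra_start>\d+)\+(?P<ra_size>\d+)-(?P<ra_async>\d+), "
    r"async=(?P<is_async>[01])\) = (?P<actual>\d+)"
)

STRING_FIELDS = ("event", "dev")


def parse_trace(text):
    """Return one dict per trace record found in text."""
    records = []
    for match in TRACE_RE.finditer(text):
        record = {
            key: (value if key in STRING_FIELDS else int(value))
            for key, value in match.groupdict().items()
        }
        records.append(record)
    return records


def windows_are_contiguous(records):
    """True if every readahead window starts where the previous one ended."""
    return all(
        prev["ra_start"] + prev["ra_size"] == cur["ra_start"]
        for prev, cur in zip(records, records[1:])
    )


if __name__ == "__main__":
    recs = parse_trace(SAMPLE)
    for rec in recs:
        print(rec["event"], rec["ra_start"], rec["ra_size"], rec["actual"])
    print("contiguous:", windows_are_contiguous(recs))
```

On the six sample lines windows_are_contiguous() returns True, since
each ra window starts where the previous one ends (0+4, 4+8, 12+16, ...).
A server-side trace in which this check fails would be one way to spot
the out-of-order RPC servicing described in the quoted text.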