Date: Wed, 28 Jul 2010 17:30:31 +0800
From: Wu Fengguang
To: Mel Gorman
Cc: Andrew Morton, Minchan Kim, Andy Whitcroft, Rik van Riel,
    KOSAKI Motohiro, Christoph Hellwig, "linux-kernel@vger.kernel.org",
    "linux-fsdevel@vger.kernel.org", "linux-mm@kvack.org", Dave Chinner,
    Chris Mason, Nick Piggin, Johannes Weiner, KAMEZAWA Hiroyuki,
    Andrea Arcangeli, Andreas Mohr, Bill Davidsen, Ben Gamari
Subject: Re: [PATCH] vmscan: remove wait_on_page_writeback() from pageout()
Message-ID: <20100728093031.GA29551@localhost>
In-Reply-To: <20100728091032.GD5300@csn.ul.ie>

On Wed, Jul 28, 2010 at 05:10:33PM +0800, Mel Gorman wrote:
> On Wed, Jul 28, 2010 at 04:46:54PM +0800, Wu Fengguang wrote:
> > The wait_on_page_writeback() call inside pageout() is virtually dead code.
> >
> >         shrink_inactive_list()
> >           shrink_page_list(PAGEOUT_IO_ASYNC)
> >             pageout(PAGEOUT_IO_ASYNC)
> >           shrink_page_list(PAGEOUT_IO_SYNC)
> >             pageout(PAGEOUT_IO_SYNC)
> >
> > Because shrink_page_list/pageout(PAGEOUT_IO_SYNC) is always called after
> > a preceding shrink_page_list/pageout(PAGEOUT_IO_ASYNC), the first
> > pageout(ASYNC) converts dirty pages into writeback pages, the second
> > shrink_page_list(SYNC) waits on the clean of writeback pages before
> > calling pageout(SYNC). The second shrink_page_list(SYNC) can hardly run
> > into dirty pages for pageout(SYNC) unless in some race conditions.
> >
>
> It's possible for the second call to run into dirty pages as there is a
> congestion_wait() call between the first shrink_page_list() call and the
> second. That's a big window.

OK, there is a <=0.1s time window. But what about the size of the data
set? After the first shrink_page_list(ASYNC), hardly any pages are left
in page_list except those already under writeback and other
unreclaimable pages. So hitting the second pageout(SYNC) still requires
a race: some unreclaimable pages would have to become reclaimable and
dirty within that 0.1s window.

> > And the wait page-by-page behavior of pageout(SYNC) will lead to very
> > long stall time if running into some range of dirty pages.
>
> True, but this is also lumpy reclaim which is depending on a contiguous
> range of pages. It's better for it to wait on the selected range of pages
> which is known to contain at least one old page than excessively scan and
> reclaim newer pages.
>
> > So it's bad
> > idea anyway to call wait_on_page_writeback() inside pageout().
> >
>
> I recognise that you are probably thinking of the stall-due-to-fork problem
> but I'd expect the patch that raises the bar for <= PAGE_ALLOC_COSTLY_ORDER
> to be sufficient. If not, I think it still makes sense to call
> wait_on_page_writeback() for > PAGE_ALLOC_COSTLY_ORDER.

The main intention of this patch is to remove semi-dead code.
I'm less disturbed by the long stall time now with the previous patch ;)

Thanks,
Fengguang
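
For illustration only, the argument about the two passes can be played
through in a small userspace toy model. This is a sketch under loud
assumptions, not kernel code: the toy_* names, the page states and the
"raced" parameter are all hypothetical, and the model only mimics the
ASYNC-then-SYNC ordering described in this thread. It counts how often
the SYNC pass would enter pageout() at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page states for this toy model; not the kernel's page flags. */
enum toy_state { TOY_CLEAN, TOY_DIRTY, TOY_WRITEBACK, TOY_PINNED, TOY_FREED };

enum toy_io_mode { TOY_IO_ASYNC, TOY_IO_SYNC };

/*
 * One pass over the list, loosely modelled on shrink_page_list().
 * Returns how many times pageout() would be entered during this pass.
 */
static size_t toy_shrink_page_list(enum toy_state *pages, size_t n,
                                   enum toy_io_mode mode)
{
    size_t pageout_calls = 0;

    assert(pages != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (pages[i] == TOY_WRITEBACK) {
            if (mode == TOY_IO_ASYNC)
                continue;               /* async pass keeps the page and moves on */
            pages[i] = TOY_CLEAN;       /* sync pass waits until the I/O completes */
        }
        if (pages[i] == TOY_DIRTY) {
            pageout_calls++;            /* pageout(): writeback is started here */
            if (mode == TOY_IO_ASYNC) {
                pages[i] = TOY_WRITEBACK;
                continue;
            }
            pages[i] = TOY_CLEAN;       /* sync: page-by-page wait inside pageout() */
        }
        if (pages[i] == TOY_CLEAN)
            pages[i] = TOY_FREED;       /* clean pages are reclaimed */
        /* TOY_PINNED (unreclaimable) and TOY_FREED pages are left alone */
    }
    return pageout_calls;
}

/*
 * Runs the ASYNC pass, then the SYNC pass. Up to `raced` unreclaimable
 * pages turn reclaimable and dirty in the window between the two passes.
 * Returns the number of pageout() calls made by the SYNC pass.
 */
static size_t toy_sync_pageouts(enum toy_state *pages, size_t n, size_t raced)
{
    toy_shrink_page_list(pages, n, TOY_IO_ASYNC);

    /* stand-in for the congestion_wait() window between the passes */
    for (size_t i = 0; i < n && raced > 0; i++) {
        if (pages[i] == TOY_PINNED) {
            pages[i] = TOY_DIRTY;
            raced--;
        }
    }
    return toy_shrink_page_list(pages, n, TOY_IO_SYNC);
}
```

In this model, the SYNC pass enters pageout() only for pages that
changed state in the window between the passes. With raced == 0 it never
gets there, which is the "virtually dead code" case argued above.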