Date: Thu, 17 Feb 2011 08:26:19 +0900
Subject: Re: [PATCH] mm: vmscan: Stop reclaim/compaction earlier due to insufficient progress if !__GFP_REPEAT
From: Minchan Kim
To: Mel Gorman
Cc: Andrew Morton, Johannes Weiner, Andrea Arcangeli, Rik van Riel, Michal Hocko, Kent Overstreet, linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <20110216095048.GA4473@csn.ul.ie>

On Wed, Feb 16, 2011 at 6:50 PM, Mel Gorman wrote:
> should_continue_reclaim() for reclaim/compaction allows scanning to continue
> even if pages are not being reclaimed until the full list is scanned.
> In terms of allocation success, this makes sense but potentially it introduces
> unwanted latency for high-order allocations such as transparent hugepages
> and network jumbo frames that would prefer to fail the allocation attempt
> and fall back to order-0 pages. Worse, there is a potential that the full
> LRU scan will clear all the young bits, distort page aging information and
> potentially push pages into swap that would have otherwise remained resident.
>
> This patch will stop reclaim/compaction if no pages were reclaimed in the
> last SWAP_CLUSTER_MAX pages that were considered. For allocations such as
> hugetlbfs that use __GFP_REPEAT and have fewer fallback options, the full LRU
> list may still be scanned.
>
> To test this, a tool was developed based on ftrace that tracked the latency of
> high-order allocations while transparent hugepage support was enabled and three
> benchmarks were run. The "fix-infinite" figures are 2.6.38-rc4 with Johannes's
> patch "vmscan: fix zone shrinking exit when scan work is done" applied.
>
> STREAM Highorder Allocation Latency Statistics
>               fix-infinite     break-early
> 1 :: Count            10298           10229
> 1 :: Min             0.4560          0.4640
> 1 :: Mean            1.0589          1.0183
> 1 :: Max            14.5990         11.7510
> 1 :: Stddev          0.5208          0.4719
> 2 :: Count                2               1
> 2 :: Min             1.8610          3.7240
> 2 :: Mean            3.4325          3.7240
> 2 :: Max             5.0040          3.7240
> 2 :: Stddev          1.5715          0.0000
> 9 :: Count           111696          111694
> 9 :: Min             0.5230          0.4110
> 9 :: Mean           10.5831         10.5718
> 9 :: Max            38.4480         43.2900
> 9 :: Stddev          1.1147          1.1325
>
> Mean time for order-1 allocations is reduced. Order-2 looks increased
> but with so few allocations, it's not particularly significant.
> THP mean allocation latency is also reduced. That said, allocation time
> varies so significantly that the reductions are within noise.
>
> Max allocation time is reduced by a significant amount for low-order
> allocations but increased for THP allocations which presumably are now
> breaking before reclaim has done enough work.
>
> SysBench Highorder Allocation Latency Statistics
>               fix-infinite     break-early
> 1 :: Count            15745           15677
> 1 :: Min             0.4250          0.4550
> 1 :: Mean            1.1023          1.0810
> 1 :: Max            14.4590         10.8220
> 1 :: Stddev          0.5117          0.5100
> 2 :: Count                1               1
> 2 :: Min             3.0040          2.1530
> 2 :: Mean            3.0040          2.1530
> 2 :: Max             3.0040          2.1530
> 2 :: Stddev          0.0000          0.0000
> 9 :: Count             2017            1931
> 9 :: Min             0.4980          0.7480
> 9 :: Mean           10.4717         10.3840
> 9 :: Max            24.9460         26.2500
> 9 :: Stddev          1.1726          1.1966
>
> Again, mean time for order-1 allocations is reduced while order-2 allocations
> are too few to draw conclusions from. The mean time for THP allocations is
> also slightly reduced, albeit the reductions are within variance.
>
> Once again, our maximum allocation time is significantly reduced for
> low-order allocations and slightly increased for THP allocations.
>
> Anon stream mmap reference Highorder Allocation Latency Statistics
> 1 :: Count             1376            1790
> 1 :: Min             0.4940          0.5010
> 1 :: Mean            1.0289          0.9732
> 1 :: Max             6.2670          4.2540
> 1 :: Stddev          0.4142          0.2785
> 2 :: Count                1               -
> 2 :: Min             1.9060               -
> 2 :: Mean            1.9060               -
> 2 :: Max             1.9060               -
> 2 :: Stddev          0.0000               -
> 9 :: Count            11266           11257
> 9 :: Min             0.4990          0.4940
> 9 :: Mean        27250.4669      24256.1919
> 9 :: Max      11439211.0000    6008885.0000
> 9 :: Stddev     226427.4624     186298.1430
>
> This benchmark creates one thread per CPU which references an amount of
> anonymous memory 1.5 times the size of physical RAM. This pounds swap quite
> heavily and is intended to exercise THP a bit.
>
> Mean allocation time for order-1 is reduced as before. It's also reduced
> for THP allocations but the variations here are pretty massive due to swap.
> As before, maximum allocation times are significantly reduced.
>
> Overall, the patch reduces the mean and maximum allocation latencies for
> the smaller high-order allocations. This was with Slab configured so it
> would be expected to be more significant with Slub which uses these size
> allocations more aggressively.
>
> The mean allocation times for THP allocations are also slightly reduced.
> The maximum latency was slightly increased as predicted by the comments due
> to reclaim/compaction breaking early. However, workloads care more about the
> latency of lower-order allocations than THP so it's an acceptable trade-off.
> Please consider merging for 2.6.38.
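
For anyone following along, the stopping rule described in the changelog above can be sketched roughly as below. This is only a minimal userspace illustration, not the patch itself: sketch_should_continue() and SKETCH_GFP_REPEAT are hypothetical names, nr_scanned == 0 is assumed to mean that the full list has been scanned, and the RECLAIM_MODE_COMPACTION check plus anything beyond the quoted hunk are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's __GFP_REPEAT flag bit. */
#define SKETCH_GFP_REPEAT 0x1u

/*
 * Decide whether reclaim/compaction should keep going after one batch.
 *
 * nr_reclaimed: pages reclaimed from the last batch that was considered
 * nr_scanned:   pages scanned in that batch; 0 is assumed to mean the
 *               full list has already been scanned
 */
static bool sketch_should_continue(unsigned int gfp_mask,
                                   unsigned long nr_reclaimed,
                                   unsigned long nr_scanned)
{
    if (gfp_mask & SKETCH_GFP_REPEAT) {
        /*
         * Callers with fewer fallback options: give up only when
         * nothing was reclaimed and there is nothing left to scan.
         */
        if (!nr_reclaimed && !nr_scanned)
            return false;
    } else {
        /*
         * Callers that can fall back to order-0 pages: stop as soon
         * as a batch reclaims nothing, trading success rate for latency.
         */
        if (!nr_reclaimed)
            return false;
    }
    return true;
}
```

With these assumptions, a THP-style caller (flag clear) stops after the first batch that reclaims nothing, while a hugetlbfs-style caller (flag set) keeps going until the list is exhausted.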
>
> Signed-off-by: Mel Gorman
> ---
>  mm/vmscan.c |   32 ++++++++++++++++++++++----------
>  1 files changed, 22 insertions(+), 10 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 148c6e6..591b907 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1841,16 +1841,28 @@ static inline bool should_continue_reclaim(struct zone *zone,
>        if (!(sc->reclaim_mode & RECLAIM_MODE_COMPACTION))
>                return false;
>
> -       /*
> -        * If we failed to reclaim and have scanned the full list, stop.
> -        * NOTE: Checking just nr_reclaimed would exit reclaim/compaction far
> -        *       faster but obviously would be less likely to succeed
> -        *       allocation. If this is desirable, use GFP_REPEAT to decide

Typo. __GFP_REPEAT

Otherwise, looks good to me.
Reviewed-by: Minchan Kim

-- 
Kind regards,
Minchan Kim