Date: Thu, 20 Aug 2015 15:17:20 +0100
From: Mel Gorman
To: Vlastimil Babka
Cc: Linux-MM, Johannes Weiner, Rik van Riel, David Rientjes, Joonsoo Kim, Michal Hocko, LKML
Subject: Re: [PATCH 01/10] mm, page_alloc: Delete the zonelist_cache
Message-ID: <20150820141720.GE12432@techsingularity.net>
In-Reply-To: <55D5D68E.6040206@suse.cz>

On Thu, Aug 20, 2015 at 03:30:54PM +0200, Vlastimil Babka wrote:
> >Note the maximum stall latency, which was 6 seconds and becomes 67ms with
> >this patch applied. However, also note that it is not guaranteed this
> >benchmark always hits pathological cases and the mileage varies. There is
> >a secondary impact with more direct reclaim because zones are now being
> >considered instead of being skipped by zlc.
> >
> >                                      4.1.0        4.1.0
> >                                    vanilla   nozlc-v1r4
> >Swap Ins                                838          502
> >Swap Outs                           1149395      2622895
> >DMA32 allocs                       17839113     15863747
> >Normal allocs                     129045707    137847920
> >Direct pages scanned                4070089     29046893
> >Kswapd pages scanned               17147837     17140694
> >Kswapd pages reclaimed             17146691     17139601
> >Direct pages reclaimed              1888879      4886630
> >Kswapd efficiency                       99%          99%
> >Kswapd velocity                   17523.721    17518.928
> >Direct efficiency                       46%          16%
> >Direct velocity                    4159.306    29687.854
> >Percentage direct scans                 19%          62%
> >Page writes by reclaim          1149395.000  2622895.000
> >Page writes file                          0            0
> >Page writes anon                    1149395      2622895
>
> Interesting, kswapd has no decrease that would counter the increase in
> direct reclaim. So there's more reclaim overall. Does it mean that stutter
> doesn't like LRU and zlc was disrupting LRU?
>

The LRU is being heavily disrupted by both reclaim and compaction activity.
The test is not a reliable means of evaluating reclaim decisions because of
the compaction activity. The main purpose of stutter was as a proxy measure
of desktop interactivity during IO. As the test does THP allocations, it can
trigger the case where zlc disables a zone for no reason and the allocation
busy loops instead, which is just wrong (a rough sketch of the zlc
bookkeeping involved is below).

> >The direct page scan and reclaim rates are noticeable. It is possible
> >this will not be a universal win on all workloads but cycling through
> >zonelists waiting for zlc->last_full_zap to expire is not the right
> >decision.
> >
> >Signed-off-by: Mel Gorman
> >Acked-by: David Rientjes
>
> It doesn't seem that removal of zlc would increase overhead due to
> "expensive operations no longer being avoided". Making some corner-case
> benchmark(s) worse as a side-effect of different LRU approximation
> shouldn't be a show-stopper. Hence
>
> Acked-by: Vlastimil Babka
>

Thanks.
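For reference, this is roughly the bookkeeping the patch deletes. It is a
from-memory sketch of the old zlc code in include/linux/mmzone.h and
mm/page_alloc.c, not a verbatim copy, so treat field names and the exact
zap interval as approximate:

/* Per-zonelist cache of "this zone looked full recently" state. */
struct zonelist_cache {
	unsigned short z_to_n[MAX_ZONES_PER_ZONELIST];	    /* zone index -> node id */
	DECLARE_BITMAP(fullzones, MAX_ZONES_PER_ZONELIST); /* zone considered full */
	unsigned long last_full_zap;			    /* jiffies of last bitmap clear */
};

/* get_page_from_freelist() skipped any zone whose fullzones bit was set. */
static int zlc_zone_worth_trying(struct zonelist *zonelist, struct zoneref *z,
				 nodemask_t *allowednodes)
{
	struct zonelist_cache *zlc = zonelist->zlcache_ptr;
	int i, n;

	if (!zlc)
		return 1;

	i = z - zonelist->_zonerefs;
	n = zlc->z_to_n[i];

	/* Worth trying only if the node is allowed and the zone is not marked full */
	return node_isset(n, *allowednodes) && !test_bit(i, zlc->fullzones);
}

/* The only thing that cleared the bitmap was this periodic "zap". */
static nodemask_t *zlc_setup(struct zonelist *zonelist, int alloc_flags)
{
	struct zonelist_cache *zlc = zonelist->zlcache_ptr;

	if (!zlc)
		return NULL;

	/* The zlc->last_full_zap expiry, roughly once a second */
	if (time_after(jiffies, zlc->last_full_zap + HZ)) {
		bitmap_zero(zlc->fullzones, MAX_ZONES_PER_ZONELIST);
		zlc->last_full_zap = jiffies;
	}

	return (alloc_flags & ALLOC_CPUSET) && !in_interrupt() ?
		&cpuset_current_mems_allowed : &node_states[N_MEMORY];
}

The consequence the changelog points at: once a zone is marked full, nothing
retries it until the next zap, so a THP allocation can keep cycling the
zonelist and reclaiming elsewhere even if the skipped zone has long since
recovered.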
> just git grep found some lines that should be also deleted:
>
> include/linux/mmzone.h: * If zlcache_ptr is not NULL, then it is just the address of zlcache,
> include/linux/mmzone.h: * as explained above. If zlcache_ptr is NULL, there is no zlcache.
>

Thanks.

> And:
>
> >@@ -3157,7 +2967,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
> > 	gfp_t alloc_mask; /* The gfp_t that was actually used for allocation */
> > 	struct alloc_context ac = {
> > 		.high_zoneidx = gfp_zone(gfp_mask),
> >-		.nodemask = nodemask,
> >+		.nodemask = nodemask ? : &cpuset_current_mems_allowed,
> > 		.migratetype = gfpflags_to_migratetype(gfp_mask),
> > 	};
> >
> >@@ -3188,8 +2998,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
> > 	ac.zonelist = zonelist;
> > 	/* The preferred zone is used for statistics later */
> > 	preferred_zoneref = first_zones_zonelist(ac.zonelist, ac.high_zoneidx,
> >-				ac.nodemask ? : &cpuset_current_mems_allowed,
> >-				&ac.preferred_zone);
> >+				ac.nodemask, &ac.preferred_zone);
> > 	if (!ac.preferred_zone)
> > 		goto out;
> > 	ac.classzone_idx = zonelist_zone_idx(preferred_zoneref);
>
> These hunks appear unrelated to zonelist cache? Also they move the
> evaluation of cpuset_current_mems_allowed earlier, into the alloc_context
> initialisation.

They are rebase-related brain damage :(. I'll fix it and retest.

-- 
Mel Gorman
SUSE Labs