Date: Thu, 25 Jun 2015 19:41:35 +0100
From: Mel Gorman
To: Joonsoo Kim
Cc: Joonsoo Kim, Andrew Morton, LKML, Linux Memory Management List,
    Vlastimil Babka, Rik van Riel, David Rientjes, Minchan Kim
Subject: Re: [RFC PATCH 00/10] redesign compaction algorithm
Message-ID: <20150625184135.GB26927@suse.de>
References: <1435193121-25880-1-git-send-email-iamjoonsoo.kim@lge.com>
    <20150625110314.GJ11809@suse.de> <20150625172550.GA26927@suse.de>

On Fri, Jun 26, 2015 at 03:14:39AM +0900, Joonsoo Kim wrote:
> > It could though. Reclaim/compaction is entered for orders higher than
> > PAGE_ALLOC_COSTLY_ORDER and when scan priority is sufficiently high.
> > That could be adjusted if you have a viable case where orders <
> > PAGE_ALLOC_COSTLY_ORDER must succeed and currently requires excessive
> > reclaim instead of relying on compaction.
>
> Yes. I saw this problem in a real situation. On ARM, an order-2 allocation
> is requested in fork(), so it should succeed. But there are not enough
> order-2 freepages, so reclaim/compaction begins. Compaction fails
> repeatedly, although I didn't check the exact reason.

That should be identified and repaired prior to reimplementing compaction
because it's important.

> >> >> 3) Compaction capability depends heavily on the migratetype of memory,
> >> >> because the freepage scanner doesn't scan unmovable pageblocks.
> >> >>
> >> >
> >> > For a very good reason. Unmovable allocation requests that fall back to
> >> > other pageblocks are the worst in terms of fragmentation avoidance. The
> >> > more of these events there are, the more the system will decay. If there
> >> > are many of these events then a compaction benchmark may start with high
> >> > success rates but decay over time.
> >> >
> >> > Very broadly speaking, the more the mm_page_alloc_extfrag tracepoint
> >> > triggers with alloc_migratetype == MIGRATE_UNMOVABLE, the faster the
> >> > system is decaying. Having the freepage scanner select unmovable
> >> > pageblocks will trigger this event more frequently.
> >> >
> >> > The unfortunate impact is that selecting unmovable blocks from the free
> >> > scanner will improve compaction success rates for high-order kernel
> >> > allocations early in the lifetime of the system but later fail high-order
> >> > allocation requests as more pageblocks get converted to unmovable. It
> >> > might be ok for kernel allocations but THP will eventually have a 100%
> >> > failure rate.
> >>
> >> I wrote the rationale in the patch itself. We already use non-movable
> >> pageblocks for the migration scanner. It empties non-movable pageblocks,
> >> so the number of freepages on non-movable pageblocks will increase. Using
> >> non-movable pageblocks for the freepage scanner negates this effect, so
> >> the number of freepages on non-movable pageblocks will be balanced. Could
> >> you tell me in detail how having the freepage scanner select unmovable
> >> pageblocks will cause more fragmentation? Possibly I don't understand
> >> the effect of this patch correctly and need some investigation. :)
> >>
> >
> > The long-term success rate of fragmentation avoidance depends on
> > minimising the number of UNMOVABLE allocation requests that use a
> > pageblock belonging to another migratetype. Once such a fallback occurs,
> > that pageblock potentially can never be used for a THP allocation again.
> >
> > Let's say there is an unmovable pageblock with 500 free pages in it. If
> > the freepage scanner uses that pageblock and allocates all 500 free
> > pages then the next unmovable allocation request needs a new pageblock.
> > If one is not completely free then it will fall back to using a
> > RECLAIMABLE or MOVABLE pageblock, forever contaminating it.
>
> Yes, I can imagine that situation. But, as I said above, we already use
> non-movable pageblocks for the migration scanner. While an unmovable
> pageblock with 500 free pages fills, some other unmovable pageblock
> with some movable pages will be emptied. The number of freepages
> on non-movable pageblocks would be maintained, so fallback doesn't happen.
>
> Anyway, it is better to investigate this effect. I will do it and attach
> the result to the next submission.
>

Let's say we have X unmovable pageblocks and Y pageblocks overall. If the
migration scanner takes movable pages from X then there is more space for
unmovable allocations without having to increase X -- this is good. If
the free scanner uses the X pageblocks as targets then they can fill. The
next unmovable allocation then falls back to another pageblock and we
either have X+1 unmovable pageblocks (full steal) or a mixed pageblock
(partial steal) that cannot be used for THP. Do this enough times and
X == Y and all THP allocations fail.

-- 
Mel Gorman
SUSE Labs
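
For anyone who wants to play with the X/Y argument above, here is a rough
toy model of it. It is not kernel code and makes no claim about how
mm/compaction.c or the page allocator are implemented. Every name in it
(Pageblock, free_scanner_fill, alloc_unmovable, thp_capable, run) is
hypothetical, 512 pages per pageblock is an assumption, the scan order is
simply list order, and a movable pageblock is counted as THP-capable for as
long as it holds no unmovable pages.

```python
from dataclasses import dataclass

PAGES_PER_BLOCK = 512  # assumption: one pageblock holds 512 base pages


@dataclass
class Pageblock:
    unmovable: bool      # toy migratetype: unmovable vs. movable
    free: int            # free pages remaining in the block
    mixed: bool = False  # movable block polluted by a partial steal


def free_scanner_fill(blocks, npages, use_unmovable):
    """Place npages migrated movable pages into free pages, in list order."""
    for b in blocks:
        if npages == 0:
            break
        if b.unmovable and not use_unmovable:
            continue  # free scanner skips unmovable pageblocks
        taken = min(b.free, npages)
        b.free -= taken
        npages -= taken
    return npages  # pages that found no migration target


def alloc_unmovable(blocks, npages=1):
    """Serve an unmovable request, falling back when unmovable blocks are full."""
    for b in blocks:
        if b.unmovable and b.free >= npages:
            b.free -= npages
            return "no fallback"
    # Fallback, case 1: a completely free movable block is converted (X+1).
    for b in blocks:
        if not b.unmovable and b.free == PAGES_PER_BLOCK:
            b.unmovable = True
            b.free -= npages
            return "full steal"
    # Fallback, case 2: a partially used movable block becomes mixed.
    for b in blocks:
        if not b.unmovable and b.free >= npages:
            b.mixed = True
            b.free -= npages
            return "partial steal"
    return "failed"


def thp_capable(blocks):
    """Count movable blocks that hold no unmovable pages."""
    return sum(1 for b in blocks if not b.unmovable and not b.mixed)


def run(use_unmovable):
    # X = 2 unmovable pageblocks with 500 free pages each, Y = 8 overall.
    blocks = [Pageblock(True, 500) for _ in range(2)]
    blocks += [Pageblock(False, 256) for _ in range(6)]
    free_scanner_fill(blocks, 1000, use_unmovable)
    outcomes = [alloc_unmovable(blocks) for _ in range(3)]
    return outcomes, thp_capable(blocks)


if __name__ == "__main__":
    for use_unmovable in (False, True):
        print(use_unmovable, *run(use_unmovable))
```

With these made-up numbers, run(False) serves all three unmovable requests
without fallback and leaves 6 THP-capable pageblocks, while run(True) ends
in three partial steals and leaves 5. The model never frees or migrates
pages out again, so it only shows the direction of the effect, not its
size on a real system.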