Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754897AbcDKMrF (ORCPT ); Mon, 11 Apr 2016 08:47:05 -0400 Received: from mail-wm0-f65.google.com ([74.125.82.65]:32961 "EHLO mail-wm0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754209AbcDKMrC (ORCPT ); Mon, 11 Apr 2016 08:47:02 -0400 Date: Mon, 11 Apr 2016 14:46:53 +0200 From: Michal Hocko To: Vlastimil Babka Cc: Andrew Morton , Linus Torvalds , Johannes Weiner , Mel Gorman , David Rientjes , Tetsuo Handa , Joonsoo Kim , Hillf Danton , linux-mm@kvack.org, LKML Subject: Re: [PATCH 06/11] mm, compaction: distinguish between full and partial COMPACT_COMPLETE Message-ID: <20160411124653.GG23157@dhcp22.suse.cz> References: <1459855533-4600-1-git-send-email-mhocko@kernel.org> <1459855533-4600-7-git-send-email-mhocko@kernel.org> <570B9432.9090600@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <570B9432.9090600@suse.cz> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2645 Lines: 64 On Mon 11-04-16 14:10:26, Vlastimil Babka wrote: > On 04/05/2016 01:25 PM, Michal Hocko wrote: > >From: Michal Hocko > > > >COMPACT_COMPLETE now means that compaction and free scanner met. This is > >not very useful information if somebody just wants to use this feedback > >and make any decisions based on that. The current caller might be a poor > >guy who just happened to scan tiny portion of the zone and that could be > >the reason no suitable pages were compacted. Make sure we distinguish > >the full and partial zone walks. > > > >Consumers should treat COMPACT_PARTIAL_SKIPPED as a potential success > >and be optimistic in retrying. > > > >The existing users of COMPACT_COMPLETE are conservatively changed to > >use COMPACT_PARTIAL_SKIPPED as well but some of them should be probably > >reconsidered and only defer the compaction only for COMPACT_COMPLETE > >with the new semantic. > > > >This patch shouldn't introduce any functional changes. > > > >Signed-off-by: Michal Hocko > > Acked-by: Vlastimil Babka Thanks! > With some notes: > > >@@ -1463,6 +1466,10 @@ static enum compact_result compact_zone(struct zone *zone, struct compact_contro > > zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn; > > zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn; > > } > >+ > >+ if (cc->migrate_pfn == start_pfn) > >+ cc->whole_zone = true; > >+ > > This assumes that migrate scanner at initial position implies also free > scanner at the initial position. That should be true, because migration > scanner is the first to run. But getting the zone->compact_cached_*_pfn is > racy. Worse, zone->compact_cached_migrate_pfn is array distinguishing sync > and async compaction, so it's possible that async compaction has advanced > both its own migrate scanner cached position, and the shared free scanner > cached position, and then sync compaction starts migrate scanner at > start_pfn, but free scanner has already advanced. OK, I see. The whole thing smelled racy but I thought it wouldn't be such a big deal. Even if we raced then only a marginal part of the zone wouldn't be scanned, right? Or is it possible that free_pfn would appear in the middle of the zone because of the race? > So you might still see a false positive COMPACT_COMPLETE, just less > frequently and probably with much lower impact. > But if you need to be truly reliable, check also that cc->free_pfn == > round_down(end_pfn - 1, pageblock_nr_pages) I do not think we need the precise check if the race window (in the skipped zone range) is always small. Thanks! -- Michal Hocko SUSE Labs