Date: Tue, 18 Apr 2017 11:27:58 +0200
From: Michal Hocko
To: Vlastimil Babka
Cc: linux-mm@kvack.org, Andrew Morton, Mel Gorman, Andrea Arcangeli,
	Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu, qiuxishi@huawei.com,
	Kani Toshimitsu, slaoub@gmail.com, Joonsoo Kim, Andi Kleen,
	David Rientjes, Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML
Subject: Re: [PATCH 1/3] mm: consider zone which is not fully populated to have holes
Message-ID: <20170418092757.GM22360@dhcp22.suse.cz>
References: <20170410110351.12215-1-mhocko@kernel.org>
	<20170415121734.6692-1-mhocko@kernel.org>
	<20170415121734.6692-2-mhocko@kernel.org>
	<97a658cd-e656-6efa-7725-150063d276f1@suse.cz>
In-Reply-To: <97a658cd-e656-6efa-7725-150063d276f1@suse.cz>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 18-04-17 10:45:23, Vlastimil Babka wrote:
> On 04/15/2017 02:17 PM, Michal Hocko wrote:
> > From: Michal Hocko
> >
> > __pageblock_pfn_to_page has two users currently, set_zone_contiguous
> > which checks whether the given zone contains holes and
> > pageblock_pfn_to_page which then carefully returns a first valid
> > page from the given pfn range for the given zone. This doesn't handle
> > zones which are not fully populated though. Memory pageblocks can be
> > offlined or might not have been onlined yet. In such a case the zone
> > should be considered to have holes otherwise pfn walkers can touch
> > and play with offline pages.
> >
> > Current callers of pageblock_pfn_to_page in compaction seem to work
> > properly right now because they only isolate PageBuddy
> > (isolate_freepages_block) or PageLRU resp. __PageMovable
> > (isolate_migratepages_block) which will be always false for these pages.
> > It would be safer to skip these pages altogether, though. In order
> > to do that let's check PageReserved in __pageblock_pfn_to_page because
> > offline pages are reserved.
>
> My issue with this is that PageReserved can be also set for other
> reasons than offlined block, e.g. by a random driver. So there are two
> suboptimal scenarios:
>
> - PageReserved is set on some page in the middle of pageblock. It won't
> be detected by this patch. This violates the "it would be safer" argument.
> - PageReserved is set on just the first (few) page(s) and because of
> this patch, we skip it completely and won't compact the rest of it.

Why would that be a big problem? PageReserved is used only very seldom,
and a few skipped pageblocks would seem like a minor issue to me.

> So if we decide we really need to check PageReserved to ensure safety,
> then we have to check it on each page. But I hope the existing criteria
> in compaction scanners are sufficient. Unless the semantic is that if
> somebody sets PageReserved, he's free to repurpose the rest of flags at
> his will (IMHO that's not the case).

I am not aware of any such user. PageReserved has always been about "the
core mm should not touch these pages or modify their state" AFAIR. But I
believe that touching those holes is just asking for problems, so I would
rather have them covered.

> The pageblock-level check then becomes a performance optimization so
> when there's an "offline hole", compaction won't iterate it page by
> page. But the downside is the false positive resulting in skipping whole
> pageblock due to single page.
> I guess it's uncommon for long-lived offline holes to exist, so we
> could simply just drop this?

This is hard to tell, but I can imagine that some memory hotplug
ballooning drivers might want to offline a hole within an existing zone.

> > Signed-off-by: Michal Hocko
> > ---
> >  mm/page_alloc.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 0cacba69ab04..dcbbcfdda60e 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1351,6 +1351,8 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
> >  		return NULL;
> >
> >  	start_page = pfn_to_page(start_pfn);
> > +	if (PageReserved(start_page))
> > +		return NULL;
> >
> >  	if (page_zone(start_page) != zone)
> >  		return NULL;
> >

-- 
Michal Hocko
SUSE Labs
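
For illustration only, the two scenarios from the review (a reserved page
in the middle of a pageblock versus on its first page) can be modeled in a
small userspace sketch. This is not kernel code: struct toy_page,
TOY_PAGEBLOCK_NR_PAGES and the toy_* helpers are made-up names for this
example, and the "reserved" field merely stands in for PageReserved().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy size; real pageblocks are much larger. */
#define TOY_PAGEBLOCK_NR_PAGES 8

/* Toy stand-in for struct page; "reserved" models PageReserved(). */
struct toy_page {
	bool reserved;
};

/*
 * Models the check added by the patch: only the first page of the
 * block is inspected, and the whole block is rejected if it is
 * reserved.
 */
bool toy_first_page_check_rejects(const struct toy_page *block)
{
	return block[0].reserved;
}

/*
 * Models the per-page alternative raised in the review: walk every
 * page and report whether any of them is reserved.
 */
bool toy_block_has_reserved(const struct toy_page *block, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (block[i].reserved)
			return true;
	}
	return false;
}

/* Counts the pages in the block that are not reserved. */
size_t toy_count_usable(const struct toy_page *block, size_t nr)
{
	size_t usable = 0;

	for (size_t i = 0; i < nr; i++) {
		if (!block[i].reserved)
			usable++;
	}
	return usable;
}
```

With only page 3 of a block marked reserved, toy_first_page_check_rejects()
returns false while toy_block_has_reserved() returns true, which is the
"not detected" scenario. With only page 0 marked reserved, the first-page
check rejects the block even though toy_count_usable() still reports 7
usable pages, which is the false-positive scenario.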