Date: Thu, 22 Jun 2017 20:16:57 +0200
From: Michal Hocko
To: Wei Yang
Cc: Andrew Morton, linux-mm@kvack.org, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi@huawei.com, Kani Toshimitsu, slaoub@gmail.com, Joonsoo Kim,
	Andi Kleen, David Rientjes, Daniel Kiper, Igor Mammedov,
	Vitaly Kuznetsov, Heiko Carstens, LKML
Subject: Re: [PATCH 2/2] mm, memory_hotplug: do not assume ZONE_NORMAL is default kernel zone
Message-ID: <20170622181656.GB19563@dhcp22.suse.cz>
References: <20170601083746.4924-1-mhocko@kernel.org> <20170601083746.4924-3-mhocko@kernel.org> <20170622023243.GA1242@WeideMacBook-Pro.local>
In-Reply-To: <20170622023243.GA1242@WeideMacBook-Pro.local>

[Again, please try to trim your quoted response to the minimum]

On Thu 22-06-17 10:32:43, Wei Yang wrote:
> On Thu, Jun 01, 2017 at 10:37:46AM +0200, Michal Hocko wrote:
[...]
> >@@ -938,6 +938,27 @@ void __ref move_pfn_range_to_zone(struct zone *zone,
> > }
> > 
> > /*
> >+ * Returns a default kernel memory zone for the given pfn range.
> >+ * If no kernel zone covers this pfn range it will automatically go
> >+ * to the ZONE_NORMAL.
> >+ */
> >+struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn,
> >+		unsigned long nr_pages)
> >+{
> >+	struct pglist_data *pgdat = NODE_DATA(nid);
> >+	int zid;
> >+
> >+	for (zid = 0; zid <= ZONE_NORMAL; zid++) {
> >+		struct zone *zone = &pgdat->node_zones[zid];
> >+
> >+		if (zone_intersects(zone, start_pfn, nr_pages))
> >+			return zone;
> >+	}
> >+
> >+	return &pgdat->node_zones[ZONE_NORMAL];
> >+}
> 
> Hmm... a corner case jumped into my mind which may invalidate this
> calculation.
> 
> The case is:
> 
> 
> Zone:         | DMA   |    DMA32      |   NORMAL       |
>               v       v               v                v
> 
> Phy mem:      [           ]     [                      ]
> 
>               ^           ^     ^                      ^
> Node:         |   Node0   |     |      Node1           |
>               A           B     C                      D
> 
> 
> The key point is
> 1. There is a hole between Node0 and Node1
> 2. The hole sits in a non-normal zone
> 
> Let's mark the boundary as A, B, C, D. Then we would have
> node0->zone[dma32] = [A, B]
> node1->zone[dma32] = [C, D]
> 
> If we want to hotplug a range in [B, C] on node0, it looks not that bad. While
> if we want to hotplug a range in [B, C] on node1, it will introduce the
> overlapped zone. Because the range [B, C] intersects none of the existing
> zones on node1.
> 
> Do you think this is possible?

Yes, it is possible. I would be much more surprised if it was real
as well. Fixing that would require using
arch_zone_{lowest,highest}_possible_pfn, which is not available after
the init section disappears, and I am not even sure we should care. I
would rather wait for a real life example of such a configuration to
fix it.
-- 
Michal Hocko
SUSE Labs
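
The corner case discussed above can be reproduced with a minimal
userspace sketch. Nothing here is kernel code: the struct definitions
are cut-down stand-ins, the pfn values for A, B, C and D are invented
for illustration, node0 is simplified to a single DMA32 zone [A, B),
node1 is assumed to also have a populated ZONE_NORMAL above D, and
would_overlap() is a hypothetical helper. It assumes that onlining a
range grows the chosen zone's span to cover both the old span and the
new range.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, MAX_NR_ZONES };

struct zone {
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;	/* 0 means the zone is empty */
};

struct pglist_data {
	struct zone node_zones[MAX_NR_ZONES];
};

/* Hypothetical boundaries (in pfns) for the diagram: A < B < C < D. */
enum { PFN_A = 0x01000, PFN_B = 0x40000, PFN_C = 0x80000, PFN_D = 0x100000 };

/* Assumption: node0 has only DMA32 = [A, B). */
static const struct pglist_data node0 = {
	.node_zones = { [ZONE_DMA32] = { PFN_A, PFN_B - PFN_A } },
};

/* Assumption: node1 has DMA32 = [C, D) and a NORMAL zone starting at D. */
static const struct pglist_data node1 = {
	.node_zones = {
		[ZONE_DMA32]  = { PFN_C, PFN_D - PFN_C },
		[ZONE_NORMAL] = { PFN_D, 0x100000 },
	},
};

/* True if [start_pfn, start_pfn + nr_pages) overlaps the zone's span. */
static bool zone_intersects(const struct zone *zone, unsigned long start_pfn,
			    unsigned long nr_pages)
{
	if (zone->spanned_pages == 0)
		return false;
	if (start_pfn >= zone->zone_start_pfn + zone->spanned_pages ||
	    start_pfn + nr_pages <= zone->zone_start_pfn)
		return false;
	return true;
}

/* Same loop as the quoted hunk, returning the zone index instead. */
static enum zone_type default_zone_for_pfn(const struct pglist_data *pgdat,
					   unsigned long start_pfn,
					   unsigned long nr_pages)
{
	int zid;

	assert(nr_pages > 0);
	for (zid = 0; zid <= ZONE_NORMAL; zid++) {
		if (zone_intersects(&pgdat->node_zones[zid], start_pfn, nr_pages))
			return (enum zone_type)zid;
	}
	return ZONE_NORMAL;
}

/*
 * Hypothetical check: if zone zid grew to span the new range as well,
 * would it cover pfns that another populated zone of the node spans?
 */
static bool would_overlap(const struct pglist_data *pgdat, enum zone_type zid,
			  unsigned long start_pfn, unsigned long nr_pages)
{
	const struct zone *z = &pgdat->node_zones[zid];
	unsigned long lo = start_pfn, hi = start_pfn + nr_pages;
	int other;

	if (z->spanned_pages) {
		if (z->zone_start_pfn < lo)
			lo = z->zone_start_pfn;
		if (z->zone_start_pfn + z->spanned_pages > hi)
			hi = z->zone_start_pfn + z->spanned_pages;
	}
	for (other = 0; other < MAX_NR_ZONES; other++) {
		if (other != (int)zid &&
		    zone_intersects(&pgdat->node_zones[other], lo, hi - lo))
			return true;
	}
	return false;
}
```

For a range of 0x8000 pages at pfn 0x50000, which lies inside the hole
[B, C), default_zone_for_pfn() falls back to ZONE_NORMAL on both nodes.
would_overlap() then reports false for node0 and true for node1, since
on node1 the grown ZONE_NORMAL would span the existing DMA32 zone.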