Date: Wed, 5 Apr 2017 12:32:49 -0500
From: Reza Arbab
To: Michal Hocko
Cc: Mel Gorman, linux-mm@kvack.org, Andrew Morton, Vlastimil Babka,
	Andrea Arcangeli, Yasuaki Ishimatsu, Tang Chen, qiuxishi@huawei.com,
	Kani Toshimitsu, slaoub@gmail.com, Joonsoo Kim, Andi Kleen,
	Zhang Zhen, David Rientjes, Daniel Kiper, Igor Mammedov,
	Vitaly Kuznetsov, LKML, Chris Metcalf, Dan Williams,
	Heiko Carstens, Lai Jiangshan, Martin Schwidefsky
Subject: Re: [PATCH 0/6] mm: make movable onlining suck less
Message-Id: <20170405173248.4vtdgk2kolbzztya@arbab-laptop>
In-Reply-To: <20170405154258.GR6035@dhcp22.suse.cz>
References: <20170404082302.GE15132@dhcp22.suse.cz>
	<20170404160239.ftvuxklioo6zvuxl@arbab-laptop>
	<20170404164452.GQ15132@dhcp22.suse.cz>
	<20170404183012.a6biape5y7vu6cjm@arbab-laptop>
	<20170404194122.GS15132@dhcp22.suse.cz>
	<20170404214339.6o4c4uhwudyhzbbo@arbab-laptop>
	<20170405064239.GB6035@dhcp22.suse.cz>
	<20170405092427.GG6035@dhcp22.suse.cz>
	<20170405145304.wxzfavqxnyqtrlru@arbab-laptop>
	<20170405154258.GR6035@dhcp22.suse.cz>
Organization: IBM Linux Technology Center
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 05, 2017 at 05:42:59PM +0200, Michal Hocko wrote:
>But one thing that is really bugging me is how could you see low pfns
>in the previous oops. Please drop the last patch and sprinkle printks
>down the remove_memory path to see where this all go south. I believe
>that there is something in the initialization code lurking in my code.
>Please also scratch the pfn_valid check in online_pages diff. It will
>not help here.

Got it.

shrink_pgdat_span: start_pfn=0x10000, end_pfn=0x10100, pgdat_start_pfn=0x0, pgdat_end_pfn=0x20000

The problem is that pgdat_start_pfn here should be 0x10000. As you
suspected, it never got set. This fixes things for me.

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 623507f..37c1b63 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -884,7 +884,7 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
 {
 	unsigned long old_end_pfn = pgdat_end_pfn(pgdat);
 
-	if (start_pfn < pgdat->node_start_pfn)
+	if (!pgdat->node_spanned_pages || start_pfn < pgdat->node_start_pfn)
 		pgdat->node_start_pfn = start_pfn;
 
 	pgdat->node_spanned_pages = max(start_pfn + nr_pages, old_end_pfn) - pgdat->node_start_pfn;
---

Along these lines, maybe we should also do

-	if (start_pfn < zone->zone_start_pfn)
+	if (zone_is_empty(zone) || start_pfn < zone->zone_start_pfn)

in resize_zone_range()?

-- 
Reza Arbab
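
For anyone following along, here is a minimal userspace sketch of why the old check leaves the node start at 0. It is not kernel code: struct node_span, span_end_pfn(), resize_old() and resize_new() are hypothetical stand-ins that model only the two pgdat fields the patch touches, and the hot-added range (0x10000 pages starting at pfn 0x10000 on a previously empty node) is an assumption chosen to match the printk output above.

```c
#include <assert.h>

/* Hypothetical stand-in for struct pglist_data: only the two fields
 * that resize_pgdat_range() updates are modelled here. */
struct node_span {
	unsigned long node_start_pfn;
	unsigned long node_spanned_pages;
};

/* Mirrors what pgdat_end_pfn() computes: start plus spanned pages. */
unsigned long span_end_pfn(const struct node_span *n)
{
	return n->node_start_pfn + n->node_spanned_pages;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Old logic: an empty node has node_start_pfn == 0, and no start_pfn
 * is ever below 0, so the start is never moved up to the new range. */
void resize_old(struct node_span *n, unsigned long start_pfn,
		unsigned long nr_pages)
{
	unsigned long old_end_pfn = span_end_pfn(n);

	assert(nr_pages > 0);
	if (start_pfn < n->node_start_pfn)
		n->node_start_pfn = start_pfn;

	n->node_spanned_pages = max_ul(start_pfn + nr_pages, old_end_pfn) -
				n->node_start_pfn;
}

/* Patched logic: an empty node (no spanned pages) always takes the
 * start of the first range added to it. */
void resize_new(struct node_span *n, unsigned long start_pfn,
		unsigned long nr_pages)
{
	unsigned long old_end_pfn = span_end_pfn(n);

	assert(nr_pages > 0);
	if (!n->node_spanned_pages || start_pfn < n->node_start_pfn)
		n->node_start_pfn = start_pfn;

	n->node_spanned_pages = max_ul(start_pfn + nr_pages, old_end_pfn) -
				n->node_start_pfn;
}
```

Calling resize_old() on a zeroed struct node_span with start_pfn=0x10000 and nr_pages=0x10000 leaves the span at [0x0, 0x20000), which is what shrink_pgdat_span reported. The same call through resize_new() gives [0x10000, 0x20000).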