Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932708AbdDEJZx (ORCPT );
        Wed, 5 Apr 2017 05:25:53 -0400
Received: from mx2.suse.de ([195.135.220.15]:33657 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1754674AbdDEJYc (ORCPT );
        Wed, 5 Apr 2017 05:24:32 -0400
Date: Wed, 5 Apr 2017 11:24:27 +0200
From: Michal Hocko
To: Reza Arbab
Cc: Mel Gorman, linux-mm@kvack.org, Andrew Morton, Vlastimil Babka,
        Andrea Arcangeli, Yasuaki Ishimatsu, Tang Chen, qiuxishi@huawei.com,
        Kani Toshimitsu, slaoub@gmail.com, Joonsoo Kim, Andi Kleen,
        Zhang Zhen, David Rientjes, Daniel Kiper, Igor Mammedov,
        Vitaly Kuznetsov, LKML, Chris Metcalf, Dan Williams,
        Heiko Carstens, Lai Jiangshan, Martin Schwidefsky
Subject: Re: [PATCH 0/6] mm: make movable onlining suck less
Message-ID: <20170405092427.GG6035@dhcp22.suse.cz>
References: <20170403204213.rs7k2cvsnconel2z@arbab-laptop>
        <20170404072329.GA15132@dhcp22.suse.cz>
        <20170404073412.GC15132@dhcp22.suse.cz>
        <20170404082302.GE15132@dhcp22.suse.cz>
        <20170404160239.ftvuxklioo6zvuxl@arbab-laptop>
        <20170404164452.GQ15132@dhcp22.suse.cz>
        <20170404183012.a6biape5y7vu6cjm@arbab-laptop>
        <20170404194122.GS15132@dhcp22.suse.cz>
        <20170404214339.6o4c4uhwudyhzbbo@arbab-laptop>
        <20170405064239.GB6035@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170405064239.GB6035@dhcp22.suse.cz>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1274
Lines: 35

On Wed 05-04-17 08:42:39, Michal Hocko wrote:
> On Tue 04-04-17 16:43:39, Reza Arbab wrote:
> > On Tue, Apr 04, 2017 at 09:41:22PM +0200, Michal Hocko wrote:
> > >On Tue 04-04-17 13:30:13, Reza Arbab wrote:
> > >>I think I found another edge case. You
> > >>get an oops when removing all of a node's memory:
> > >>
> > >>__nr_to_section
> > >>__pfn_to_section
> > >>find_biggest_section_pfn
> > >>shrink_pgdat_span
> > >>__remove_zone
> > >>__remove_section
> > >>__remove_pages
> > >>arch_remove_memory
> > >>remove_memory
> > >
> > >Is this something new or an old issue? I believe the state after the
> > >online should be the same as before. So if you onlined the full node
> > >then there shouldn't be any difference. Let me have a look...
> >
> > It's new. Without this patchset, I can repeatedly
> > add_memory()->online_movable->offline->remove_memory() all of a node's
> > memory.
>
> This is quite unexpected because the code obviously cannot handle the
> first memory section. Could you paste /proc/zoneinfo and
> grep . -r /sys/devices/system/memory/auto_online_blocks/memory*, after
> onlining for both patched and unpatched kernels?

Btw. how do you test this? I am really surprised you managed to
hotremove such a low pfn range.
-- 
Michal Hocko
SUSE Labs