Date: Tue, 4 Apr 2017 10:23:02 +0200
From: Michal Hocko
To: Reza Arbab, Mel Gorman
Cc: linux-mm@kvack.org, Andrew Morton, Vlastimil Babka, Andrea Arcangeli,
	Yasuaki Ishimatsu, Tang Chen, qiuxishi@huawei.com, Kani Toshimitsu,
	slaoub@gmail.com, Joonsoo Kim, Andi Kleen, Zhang Zhen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML, Chris Metcalf,
	Dan Williams, Heiko Carstens, Lai Jiangshan, Martin Schwidefsky
Subject: Re: [PATCH 0/6] mm: make movable onlining suck less
Message-ID: <20170404082302.GE15132@dhcp22.suse.cz>
References: <20170330115454.32154-1-mhocko@kernel.org>
	<20170403115545.GK24661@dhcp22.suse.cz>
	<20170403195830.64libncet5l6vuvb@arbab-laptop>
	<20170403202337.GA12482@dhcp22.suse.cz>
	<20170403204213.rs7k2cvsnconel2z@arbab-laptop>
	<20170404072329.GA15132@dhcp22.suse.cz>
	<20170404073412.GC15132@dhcp22.suse.cz>
In-Reply-To: <20170404073412.GC15132@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 04-04-17 09:34:12, Michal Hocko wrote:
> On Tue 04-04-17 09:23:29, Michal Hocko wrote:
> > [Let's add Gary who has introduced this code c04fc586c1a48]
>
> OK, so Gary's email doesn't exist anymore. Can anybody comment on
> this? I suspect this code is just-in-case... Mel?
>
> > On Mon 03-04-17 15:42:13, Reza Arbab wrote:
> [...]
> > > Almost there. I'm seeing the memory in the correct node now, but the
> > > /sys/devices/system/node/nodeX/memoryY links are not being created.
> > >
> > > I think it's tripping up here, in register_mem_sect_under_node():
> > >
> > > 		page_nid = get_nid_for_pfn(pfn);
> > > 		if (page_nid < 0)
> > > 			continue;
> >
> > Huh, this code is confusing. How can we have a memblock spanning
> > multiple nodes? If we cannot, then the loop over all sections in the
> > memblock seems pointless as well. Also why do we require
> > page_initialized() in get_nid_for_pfn? The changelog doesn't explain
> > that and there are no comments that would help either.

OK, so I've been thinking about that and I believe that the
page_initialized check in get_nid_for_pfn is just bogus. Nothing here
relies on page::lru being already initialized. So I will go with the
following as a separate preparatory patch.

I believe the whole code should be revisited and I have put that on my
ever growing todo list because I suspect that it is more complex than
necessary. I suspect that memblocks do not span multiple nodes and all
this is just-in-case code (e.g. the onlining code assumes a single zone
aka node). But let's do that later.
---
From fd2e3b6eca1cf7766527203d23db6aca5957a3f1 Mon Sep 17 00:00:00 2001
From: Michal Hocko
Date: Tue, 4 Apr 2017 10:05:06 +0200
Subject: [PATCH] mm: drop page_initialized check from get_nid_for_pfn

c04fc586c1a4 ("mm: show node to memory section relationship with
symlinks in sysfs") has added a means to export the memblock<->node
association into sysfs. It has also introduced get_nid_for_pfn, a
rather confusing counterpart of pfn_to_nid which also checks whether
the pfn page is already initialized (page_initialized). This is done by
checking page::lru != NULL, which doesn't make any sense at all. Nothing
in this path really relies on the lru list being used or initialized.
Just remove it.

Signed-off-by: Michal Hocko
---
 drivers/base/node.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 5548f9686016..ee080a35e869 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -368,8 +368,6 @@ int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
-#define page_initialized(page)	(page->lru.next)
-
 static int __ref get_nid_for_pfn(unsigned long pfn)
 {
 	struct page *page;
@@ -380,9 +378,6 @@ static int __ref get_nid_for_pfn(unsigned long pfn)
 	if (system_state == SYSTEM_BOOTING)
 		return early_pfn_to_nid(pfn);
 #endif
-	page = pfn_to_page(pfn);
-	if (!page_initialized(page))
-		return -1;
 	return pfn_to_nid(pfn);
 }
 
-- 
2.11.0

-- 
Michal Hocko
SUSE Labs
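
For anyone following along, here is a rough userspace model of why the
lru check gets in the way. It is only a sketch and not kernel code:
struct page_model, model_mem_map, model_setup_hotadded and
model_would_link are hypothetical stand-ins made up for illustration,
and it assumes a hot-added range whose pages already have a valid node
but whose lru has never been touched. The loop only mimics the two
"continue" checks of register_mem_sect_under_node().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; NOT the real definitions. */
struct list_head {
	struct list_head *next, *prev;
};

struct page_model {
	int nid;		/* the real kernel derives the node differently */
	struct list_head lru;	/* only meaningful once the page sits on a list */
};

#define MODEL_NR_PAGES 4

/* Static storage, so every lru.next starts out as NULL. */
static struct page_model model_mem_map[MODEL_NR_PAGES];

static struct page_model *model_pfn_to_page(unsigned long pfn)
{
	return &model_mem_map[pfn];
}

static int model_pfn_to_nid(unsigned long pfn)
{
	return model_pfn_to_page(pfn)->nid;
}

/* Behaviour before the patch: an untouched lru is treated as "no node". */
static int get_nid_for_pfn_old(unsigned long pfn)
{
	struct page_model *page = model_pfn_to_page(pfn);

	if (!page->lru.next)
		return -1;
	return model_pfn_to_nid(pfn);
}

/* Behaviour after the patch: just report the node. */
static int get_nid_for_pfn_new(unsigned long pfn)
{
	return model_pfn_to_nid(pfn);
}

/* Assumed hot-add state: node is known, lru has never been initialized. */
static void model_setup_hotadded(int nid)
{
	unsigned long pfn;

	for (pfn = 0; pfn < MODEL_NR_PAGES; pfn++) {
		model_mem_map[pfn].nid = nid;
		model_mem_map[pfn].lru.next = NULL;
		model_mem_map[pfn].lru.prev = NULL;
	}
}

/*
 * Returns 1 if the node<->memory link would be created for nid,
 * 0 if every pfn is skipped.
 */
static int model_would_link(int (*get_nid)(unsigned long), int nid)
{
	unsigned long pfn;

	for (pfn = 0; pfn < MODEL_NR_PAGES; pfn++) {
		int page_nid = get_nid(pfn);

		if (page_nid < 0)
			continue;
		if (page_nid != nid)
			continue;
		return 1;
	}
	return 0;
}
```

Under the assumed state, the old helper skips every pfn so no link is
ever created, while the new helper finds the node on the first pfn. To
try it, call model_setup_hotadded() and then model_would_link() with
each helper from a small main().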