Message-ID: <54588A0D.1060207@jp.fujitsu.com>
Date: Tue, 4 Nov 2014 17:10:53 +0900
From: Yasuaki Ishimatsu
To: Tang Chen
CC: Gu Zheng
Subject: Re: [PATCH 2/2] mem-hotplug: Fix wrong check for zone->pageset initialization in online_pages().
References: <1414748812-22610-1-git-send-email-tangchen@cn.fujitsu.com> <1414748812-22610-3-git-send-email-tangchen@cn.fujitsu.com>
In-Reply-To: <1414748812-22610-3-git-send-email-tangchen@cn.fujitsu.com>
X-Mailing-List: linux-kernel@vger.kernel.org

(2014/10/31 18:46), Tang Chen wrote:
> When we are doing memory hot-add, the following functions are called:
>
> add_memory()
> |--> hotadd_new_pgdat()
>      |--> free_area_init_node()
>           |--> free_area_init_core()
>                |--> zone->present_pages = realsize;           /* 1. zone is populated */
>                     |--> zone_pcp_init()
>                          |--> zone->pageset = &boot_pageset;  /* 2. zone->pageset is set to boot_pageset */
>
> There are two problems here:
> 1. Zones could be populated before any memory is onlined.
> 2. All the zones on a newly added node have the same pageset pointing to boot_pageset.
>
> The above two problems will result in the following problem:
> When we online memory on one node, e.g. node2, the following code is executed:
>
> online_pages()
> {
>         ......
>         if (!populated_zone(zone)) {
>                 need_zonelists_rebuild = 1;
>                 build_all_zonelists(NULL, zone);
>         }
>         ......
> }
>
> Because of problem 1, the zone has been populated, and build_all_zonelists()
> will never be called. zone->pageset won't be updated.
> Because of problem 2, all the zones on a newly added node have the same pageset
> pointing to boot_pageset.
> And as a result, when we online memory on node2, node3's meminfo will be corrupted.
> Pages on node2 may be freed to node3.
>
> # for ((i = 2048; i < 2064; i++)); do echo online_movable > /sys/devices/system/node/node2/memory$i/state; done
> # cat /sys/devices/system/node/node2/meminfo
> Node 2 MemTotal:       33554432 kB
> Node 2 MemFree:        33549092 kB
> Node 2 MemUsed:            5340 kB
> ......
> # cat /sys/devices/system/node/node3/meminfo
> Node 3 MemTotal:              0 kB
> Node 3 MemFree:             248 kB    /* corrupted */
> Node 3 MemUsed:               0 kB
> ......
>
> We have to populate some zones before onlining memory, otherwise no memory could be onlined.
> So when onlining pages, we should also check if zone->pageset is pointing to boot_pageset.
>
> Signed-off-by: Gu Zheng
> Signed-off-by: Tang Chen
> ---
>  include/linux/mm.h  | 1 +
>  mm/memory_hotplug.c | 6 +++++-
>  mm/page_alloc.c     | 5 +++++
>  3 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 02d11ee..83e6505 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1732,6 +1732,7 @@ void warn_alloc_failed(gfp_t gfp_mask, int order, const char *fmt, ...);
>
>  extern void setup_per_cpu_pageset(void);
>
> +extern bool zone_pcp_initialized(struct zone *zone);
>  extern void zone_pcp_update(struct zone *zone);
>  extern void zone_pcp_reset(struct zone *zone);
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 3ab01b2..bc0de0f 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1013,9 +1013,13 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
>  	 * If this zone is not populated, then it is not in zonelist.
>  	 * This means the page allocator ignores this zone.
>  	 * So, zonelist must be updated after online.
> +	 *
> +	 * If this zone is populated, zone->pageset could be initialized
> +	 * to boot_pageset for the first time a node is added. If so,
> +	 * zone->pageset should be allocated.
>  	 */
>  	mutex_lock(&zonelists_mutex);
> -	if (!populated_zone(zone)) {
> +	if (!populated_zone(zone) || !zone_pcp_initialized(zone)) {
>  		need_zonelists_rebuild = 1;
>  		build_all_zonelists(NULL, zone);
>  	}

Why does zone->present_pages of the hot-added memory have a valid value?
In my understanding, present_pages is incremented/decremented by memory
online/offline. So when hot-adding memory, the zone->present_pages of the
memory should be 0.

Thanks,
Yasuaki Ishimatsu

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 736d8e1..4ff1540 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6456,6 +6456,11 @@ void __meminit zone_pcp_update(struct zone *zone)
>  }
>  #endif
>
> +bool zone_pcp_initialized(struct zone *zone)
> +{
> +	return (zone->pageset != &boot_pageset);
> +}
> +
>  void zone_pcp_reset(struct zone *zone)
>  {
>  	unsigned long flags;
>
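
For readers following the thread, the condition under discussion can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions, not kernel code: every toy_* name is hypothetical, the structures are reduced to the two fields the patch description mentions, and "rebuilding zonelists" is simplified to handing the zone its own pageset. It shows why a zone that is already populated at hot-add time keeps the shared boot pageset under the old check, and how the extra zone_pcp_initialized()-style test changes that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT the kernel's struct zone / per-cpu pageset. */
struct toy_pageset {
	int count;
};

struct toy_zone {
	unsigned long present_pages;
	struct toy_pageset *pageset;
};

/* Shared placeholder playing the role of boot_pageset (assumption). */
static struct toy_pageset toy_boot_pageset;

/*
 * Hot-add as described in the quoted call chain: the zone is marked
 * populated and its pageset points at the shared placeholder.
 */
static void toy_hot_add(struct toy_zone *z, unsigned long realsize)
{
	assert(z != NULL);
	z->present_pages = realsize;
	z->pageset = &toy_boot_pageset;
}

static bool toy_populated_zone(const struct toy_zone *z)
{
	return z->present_pages != 0;
}

/* Same comparison as the proposed zone_pcp_initialized(). */
static bool toy_zone_pcp_initialized(const struct toy_zone *z)
{
	return z->pageset != &toy_boot_pageset;
}

/*
 * Onlining: with patched == false only the populated check is used;
 * with patched == true the pageset check is ORed in, as in the diff.
 * A "rebuild" is modeled as giving the zone its own pageset.
 */
static void toy_online(struct toy_zone *z, struct toy_pageset *own,
		       bool patched)
{
	bool rebuild;

	assert(z != NULL && own != NULL);
	rebuild = !toy_populated_zone(z);
	if (patched)
		rebuild = rebuild || !toy_zone_pcp_initialized(z);
	if (rebuild)
		z->pageset = own;
}
```

Calling toy_hot_add() on two zones and then toy_online() on one of them with patched set to false leaves both zones on the shared pageset, which is the situation the patch description blames for the node3 meminfo corruption. The model takes the description's premise (present_pages already non-zero after hot-add) as given; whether that premise holds is exactly the question raised above.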