Date: Thu, 27 Jan 2022 15:50:14 +0100
From: Michal Hocko
To: David Hildenbrand
Cc: Andrew Morton, linux-mm@kvack.org, LKML, Alexey Makhalov, Dennis Zhou, Eric
    Dumazet, Oscar Salvador, Tejun Heo, Christoph Lameter, Nico Pache,
    Wei Yang, Rafael Aquini
Subject: Re: [PATCH 2/6] mm: handle uninitialized numa nodes gracefully
References: <20220127085305.20890-1-mhocko@kernel.org> <20220127085305.20890-3-mhocko@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 27-01-22 13:41:16, David Hildenbrand wrote:
> On 27.01.22 09:53, Michal Hocko wrote:
[...]
> > diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
> > index 8dc8a554f774..dd0cf4834eaa 100644
> > --- a/arch/ia64/mm/discontig.c
> > +++ b/arch/ia64/mm/discontig.c
> > @@ -608,11 +608,11 @@ void __init paging_init(void)
> >  	zero_page_memmap_ptr = virt_to_page(ia64_imva(empty_zero_page));
> >  }
> >
> > -pg_data_t *arch_alloc_nodedata(int nid)
> > +pg_data_t * __init arch_alloc_nodedata(int nid)
> >  {
> >  	unsigned long size = compute_pernodesize(nid);
> >
> > -	return kzalloc(size, GFP_KERNEL);
> > +	return memblock_alloc(size, SMP_CACHE_BYTES);
>
> I feel like we should have
>
> long arch_pgdat_size(void) instead and have a generic allocation function.
>
> But we can clean that up in the future.

I can have a look later (unless somebody beats me to it). I am not even
sure this whole ia64 weirdness is useful these days.

[...]
> > @@ -1445,9 +1445,6 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
> >
> >  	return ret;
> >  error:
> > -	/* rollback pgdat allocation and others */
> > -	if (new_node)
> > -		rollback_node_hotadd(nid);
>
> As static rollback_node_hotadd() is unused in this patch, doesn't this
> trigger a warning? IOW, maybe merge at least the rollback_node_hotadd()
> removal into this patch. The arch_free_nodedata() removal can stay separate.

It is my slight preference to keep this patch smaller so that it is easier
to review, and therefore I have removed all those parts in a follow-up.
If the warning is a real problem at this stage, then I can change that of
course.

>
> >  	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
> >  		memblock_remove(start, size);
> >  error_mem_hotplug_end:
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 3589febc6d31..1a05669044d3 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -6402,7 +6402,11 @@ static void __build_all_zonelists(void *data)
> >  	if (self && !node_online(self->node_id)) {
> >  		build_zonelists(self);
> >  	} else {
> > -		for_each_online_node(nid) {
> > +		/*
> > +		 * All possible nodes have pgdat preallocated
>
> ... in free_area_init() ?

Will fix it up.

> > +		 * free_area_init
> > +		 */
> > +		for_each_node(nid) {
> >  			pg_data_t *pgdat = NODE_DATA(nid);
> >
> >  			build_zonelists(pgdat);
> > @@ -8096,8 +8100,32 @@ void __init free_area_init(unsigned long *max_zone_pfn)
> >  	/* Initialise every node */
> >  	mminit_verify_pageflags_layout();
> >  	setup_nr_node_ids();
> > -	for_each_online_node(nid) {
> > -		pg_data_t *pgdat = NODE_DATA(nid);
> > +	for_each_node(nid) {
> > +		pg_data_t *pgdat;
> > +
> > +		if (!node_online(nid)) {
> > +			pr_warn("Node %d uninitialized by the platform. Please report with boot dmesg.\n", nid);
> > +
> > +			/* Allocator not initialized yet */
> > +			pgdat = arch_alloc_nodedata(nid);
> > +			if (!pgdat) {
> > +				pr_err("Cannot allocate %zuB for node %d.\n",
> > +						sizeof(*pgdat), nid);
> > +				continue;
> > +			}
> > +			arch_refresh_nodedata(nid, pgdat);
>
> We could get rid of arch_refresh_nodedata() now and simply merge that
> into arch_alloc_nodedata(). But depends on how we want to proceed with
> arch_alloc_nodedata() eventually.

Yeah, I will postpone that to a later follow-up.

> Acked-by: David Hildenbrand

Thanks!

Btw, now I have realized that I have lost the following hunk somewhere:

diff --git a/mm/internal.h b/mm/internal.h
index d80300392a19..43b8ccf56b7f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -718,4 +718,6 @@ void vunmap_range_noflush(unsigned long start, unsigned long end);
 int numa_migrate_prep(struct page *page, struct vm_area_struct *vma,
 		      unsigned long addr, int page_nid, int *flags);
 
+DECLARE_PER_CPU(struct per_cpu_nodestat, boot_nodestats);
+
 #endif /* __MM_INTERNAL_H */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1a05669044d3..4120cc855673 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6380,7 +6380,7 @@ static void per_cpu_pages_init(struct per_cpu_pages *pcp, struct per_cpu_zonesta
 #define BOOT_PAGESET_BATCH	1
 static DEFINE_PER_CPU(struct per_cpu_pages, boot_pageset);
 static DEFINE_PER_CPU(struct per_cpu_zonestat, boot_zonestats);
-static DEFINE_PER_CPU(struct per_cpu_nodestat, boot_nodestats);
+DEFINE_PER_CPU(struct per_cpu_nodestat, boot_nodestats);
 
 static void __build_all_zonelists(void *data)
 {
-- 
Michal Hocko
SUSE Labs