Date: Tue, 2 Nov 2021 15:35:32 +0100
From: Michal Hocko
To: Oscar Salvador
Cc: David Hildenbrand, Alexey Makhalov, "linux-mm@kvack.org",
    Andrew Morton, "linux-kernel@vger.kernel.org",
    "stable@vger.kernel.org", Oscar Salvador
Subject: Re: [PATCH] mm: fix panic in __alloc_pages
References: <42abfba6-b27e-ca8b-8cdf-883a9398b506@redhat.com>
    <20211102135201.GA4348@linux>
In-Reply-To: <20211102135201.GA4348@linux>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 02-11-21 14:52:01, Oscar Salvador wrote:
> On Tue, Nov 02, 2021 at 02:25:03PM +0100, Michal Hocko wrote:
> > I think we want to learn how exactly Alexey brought that cpu up. Because
> > his initial thought on add_cpu resp cpu_up doesn't seem to be correct.
> > Or I am just not following the code properly. Once we know all those
> > details we can get in touch with cpu hotplug maintainers and see what
> > can we do.
>
> I am not really familiar with CPU hot-onlining, but I have been taking a look.
> As with memory, there are two different stages, hot-adding and onlining (and the
> counterparts).
>
> Part of the hot-adding being:
>
> acpi_processor_get_info
>  acpi_processor_hotadd_init
>   arch_register_cpu
>    register_cpu
>
> One of the things that register_cpu() does is to set cpu->dev.bus pointing to
> &cpu_subsys, which is:
>
> struct bus_type cpu_subsys = {
> 	.name = "cpu",
> 	.dev_name = "cpu",
> 	.match = cpu_subsys_match,
> #ifdef CONFIG_HOTPLUG_CPU
> 	.online = cpu_subsys_online,
> 	.offline = cpu_subsys_offline,
> #endif
> };
>
> Then, the onlining part (in case of a udev rule or someone onlining the device)
> would be:
>
> online_store
>  device_online
>   cpu_subsys_online
>    cpu_device_up
>     cpu_up
>      ...
>       online node
>
> Since Alexey disabled the udev rule and no one onlined the CPU, online_store()->
> device_online() wasn't really called.
>
> The following only applies to x86_64:
> I think we got confused because cpu_device_up() is also called from add_cpu(),
> but that is an exported function and x86 does not call add_cpu() unless for
> debugging purposes (check kernel/torture.c and arch/x86/kernel/topology.c).
> It does the onlining through online_store()...
> So we can take add_cpu() off the equation here.

Yes, so the real problem is the following (thanks for pointing me to the
acpi code): the cpu->node association is done in acpi_map_cpu2node, and I
suspect this expects that the node is already present, as it gets the
information from the SRAT/PXM tables, which are parsed during boot. But I
might just be confused, or maybe VMware injects new entries here somehow.

Another interesting thing is that acpi_map_cpu2node skips the association
if no node is found in SRAT, but that should only mean it would use the
default initialization, which should hopefully be 0.

Anyway, I have found in my notes
https://www.spinics.net/lists/kernel/msg3010886.html
which is a slightly different problem, but it has some notes about how
the initialization mess works (that one was boot time though, and hotplug
might actually be different).

I have run out of time for this today, so hopefully somebody can re-learn
that from there...
-- 
Michal Hocko
SUSE Labs
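
For reference, the "skip the association when no node is found" behaviour
discussed above can be modelled in a few lines of userspace C. This is only
a toy sketch under stated assumptions, not the kernel's acpi_map_cpu2node:
toy_pxm_lookup, toy_map_cpu2node, cpu_to_node_map and the table contents are
made-up names and data for illustration. NUMA_NO_NODE is defined locally
with the value -1, matching the kernel's definition.

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)
#define NR_CPUS 4

/* Toy stand-in for the per-CPU node table; zero-initialized, so the
 * "default initialization" here is node 0 for every CPU. */
static int cpu_to_node_map[NR_CPUS];

/* Hypothetical lookup standing in for the SRAT/PXM query. */
static int toy_pxm_lookup(int cpu)
{
	/* Pretend firmware described only CPUs 0 and 1 at boot. */
	static const int pxm_node[NR_CPUS] = {
		0, 1, NUMA_NO_NODE, NUMA_NO_NODE
	};

	assert(cpu >= 0 && cpu < NR_CPUS);
	return pxm_node[cpu];
}

/* Associate a CPU with a node only if the lookup found one. */
static void toy_map_cpu2node(int cpu)
{
	int nid = toy_pxm_lookup(cpu);

	if (nid != NUMA_NO_NODE)
		cpu_to_node_map[cpu] = nid;
	/* Otherwise the entry keeps its default value (0 in this model). */
}
```

In this model, calling toy_map_cpu2node(1) records node 1, while
toy_map_cpu2node(2) leaves the entry at its default of 0. Whether the real
default ends up being a usable node is exactly the open question above.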