Date: Thu, 7 Jun 2018 14:21:52 +0200
From: Michal Hocko
To: Hanjun Guo
Cc: Bjorn Helgaas, Will Deacon, xiexiuqi@huawei.com, Catalin Marinas, Greg Kroah-Hartman, "Rafael J. Wysocki", Jarkko Sakkinen, linux-arm, Linux Kernel Mailing List, wanghuiqiang@huawei.com, tnowicki@caviumnetworks.com, linux-pci@vger.kernel.org, Andrew Morton, linux-mm@kvack.org
Subject: Re: [PATCH 1/2] arm64: avoid alloc memory on offline node
Message-ID: <20180607122152.GP32433@dhcp22.suse.cz>
References: <1527768879-88161-1-git-send-email-xiexiuqi@huawei.com> <1527768879-88161-2-git-send-email-xiexiuqi@huawei.com> <20180606154516.GL6631@arm.com> <20180607105514.GA13139@dhcp22.suse.cz> <5ed798a0-6c9c-086e-e5e8-906f593ca33e@huawei.com>
In-Reply-To: <5ed798a0-6c9c-086e-e5e8-906f593ca33e@huawei.com>
User-Agent: Mutt/1.9.5 (2018-04-13)
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 07-06-18 19:55:53, Hanjun Guo wrote:
> On 2018/6/7 18:55, Michal Hocko wrote:
[...]
> > I am not sure I have the full context but pci_acpi_scan_root calls
> > kzalloc_node(sizeof(*info), GFP_KERNEL, node)
> > and that should fall back to whatever node that is online. Offline node
> > shouldn't keep any pages behind. So there must be something else going
> > on here and the patch is not the right way to handle it. What does
> > faddr2line __alloc_pages_nodemask+0xf0 tells on this kernel?
>
> The whole context is:
>
> The system is booted with a NUMA node has no memory attaching to it
> (memory-less NUMA node), also with NR_CPUS less than CPUs presented
> in MADT, so CPUs on this memory-less node are not brought up, and
> this NUMA node will not be online (but SRAT presents this NUMA node);
>
> Devices attaching to this NUMA node such as PCI host bridge still
> return the valid NUMA node via _PXM, but actually that valid NUMA node
> is not online which lead to this issue.

But we should have other NUMA nodes on the zonelists, so the allocator
should fall back to another node. If the zonelist is not initialized
properly, though, then this can indeed show up as a problem. Knowing
exactly which place has blown up would help to get a better picture...

-- 
Michal Hocko
SUSE Labs
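
To make the fallback argument above concrete, here is a rough userspace
sketch. It is only a toy model and not kernel code: struct toy_node,
toy_build_zonelist() and toy_alloc_page() are hypothetical names made up
for illustration, and the real zonelist handling in the page allocator is
considerably more involved. It assumes that each node carries an ordered
list of nodes to try, and that an offline node whose list was never built
has nothing to fall back to.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_NODES 4
#define TOY_NO_NODE   (-1)

/* Hypothetical stand-in for per-node data; not the kernel's pg_data_t. */
struct toy_node {
	bool online;
	size_t free_pages;
	int nr_fallback;             /* number of valid zonelist entries */
	int zonelist[TOY_MAX_NODES]; /* node ids to try, in order */
};

/*
 * Build the fallback order for node nid: the node itself first if it is
 * online, then every other online node. An offline node still ends up
 * with a usable list as long as this function has been run for it.
 */
static void toy_build_zonelist(struct toy_node *nodes, int nid)
{
	int n = 0;

	if (nodes[nid].online)
		nodes[nid].zonelist[n++] = nid;
	for (int i = 0; i < TOY_MAX_NODES; i++)
		if (i != nid && nodes[i].online)
			nodes[nid].zonelist[n++] = i;
	nodes[nid].nr_fallback = n;
}

/*
 * Walk the preferred node's list and take one page from the first node
 * that has any free. Returns the node id used, or TOY_NO_NODE when the
 * list is empty or every listed node is exhausted.
 */
static int toy_alloc_page(struct toy_node *nodes, int preferred)
{
	for (int i = 0; i < nodes[preferred].nr_fallback; i++) {
		struct toy_node *node = &nodes[nodes[preferred].zonelist[i]];

		if (node->free_pages > 0) {
			node->free_pages--;
			return nodes[preferred].zonelist[i];
		}
	}
	return TOY_NO_NODE;
}
```

In this model, a request that prefers an offline node is served from an
online one once toy_build_zonelist() has been called for the offline
node. If the list was never built, the same request fails cleanly, which
stands in for the "zonelist is not initialized properly" case; the real
kernel may fault there instead of failing cleanly.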