From: Pingfan Liu
Date: Tue, 4 Dec 2018 16:20:32 +0800
Subject: Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline
To: mhocko@kernel.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton, Vlastimil Babka, Mike Rapoport, Bjorn Helgaas, Jonathan Cameron
In-Reply-To: <20181204072251.GT31738@dhcp22.suse.cz>
References: <1543892757-4323-1-git-send-email-kernelfans@gmail.com> <20181204072251.GT31738@dhcp22.suse.cz>

On Tue, Dec 4, 2018 at 3:22 PM Michal Hocko wrote:
>
> On Tue 04-12-18 11:05:57, Pingfan Liu wrote:
> > During my test on some AMD machine, with kexec -l nr_cpus=x option, the
> > kernel failed to bootup, because some node's data struct can not be allocated,
> > e.g, on x86, initialized by init_cpu_to_node()->init_memory_less_node(). But
> > device->numa_node info is used as preferred_nid param for
> > __alloc_pages_nodemask(), which causes NULL reference
> >   ac->zonelist = node_zonelist(preferred_nid, gfp_mask);
> > This patch tries to fix the issue by falling back to the first online node,
> > when encountering such corner case.
>
> We have seen similar issues already and the bug was usually that the
> zonelists were not initialized yet or the node is completely bogus.
> Zonelists should be initialized by build_all_zonelists quite early so I
> am wondering whether the later is the case. What is the actual node
> number the device is associated with?
>

The device's node number is 2. In my case I booted with the nr_cpus
parameter. Because init_cpu_to_node() initializes all the possible nodes,
it is hard for me to see how, without this parameter, the zonelists could
be accessed before the page allocator works.

> Your patch is not correct btw, because we want to fallback into the node in
> the distance order rather into the first online node.
> --

What about this:

+extern int find_next_best_node(int node, nodemask_t *used_node_mask);
+
 /*
  * We get the zone list from the current node and the gfp_mask.
  * This zone list contains a maximum of MAXNODES*MAX_NR_ZONES zones.
@@ -453,6 +455,11 @@ static inline int gfp_zonelist(gfp_t flags)
  */
 static inline struct zonelist *node_zonelist(int nid, gfp_t flags)
 {
+	if (unlikely(!node_online(nid))) {
+		nodemask_t used_mask;
+		nodes_complement(used_mask, node_online_map);
+		nid = find_next_best_node(nid, &used_mask);
+	}
 	return NODE_DATA(nid)->node_zonelists + gfp_zonelist(flags);
 }

I have just finished compiling it but have not tested it yet, since the
machine is not on hand. It will take some time to get it again.

Thanks,
Pingfan
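To illustrate the difference between "first online node" and "nearest online node in distance order" discussed above, here is a minimal userspace sketch. It is not kernel code and does not use the kernel's nodemask or zonelist APIs. The distance table, the online map, and every sketch_* name are hypothetical and exist only for this example. Node 2 is marked offline to mirror the device node number mentioned in this thread.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_MAX_NODES 4	/* hypothetical node count for this sketch */

/* Hypothetical distance table: sketch_distance[a][b] is the cost from node a to node b. */
static const int sketch_distance[SKETCH_MAX_NODES][SKETCH_MAX_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 40, 30 },
	{ 30, 40, 10, 20 },
	{ 40, 30, 20, 10 },
};

/* Hypothetical online map: nodes 0, 1 and 3 are online, node 2 is offline. */
static const bool sketch_online[SKETCH_MAX_NODES] = { true, true, false, true };

/* Fallback used by the original patch idea: lowest-numbered online node, or -1 if none. */
static int sketch_first_online_node(void)
{
	for (int n = 0; n < SKETCH_MAX_NODES; n++)
		if (sketch_online[n])
			return n;
	return -1;
}

/*
 * Distance-ordered fallback: return nid itself when it is online,
 * otherwise the online node with the smallest distance from nid.
 * Ties go to the lower node number. Returns -1 if no node is online.
 */
static int sketch_nearest_online_node(int nid)
{
	int best = -1;
	int best_dist = INT_MAX;

	assert(nid >= 0 && nid < SKETCH_MAX_NODES);

	if (sketch_online[nid])
		return nid;

	for (int n = 0; n < SKETCH_MAX_NODES; n++) {
		if (!sketch_online[n])
			continue;
		if (sketch_distance[nid][n] < best_dist) {
			best_dist = sketch_distance[nid][n];
			best = n;
		}
	}
	return best;
}
```

With this made-up table, a request for offline node 2 falls back to node 3 under the distance-ordered rule, while the first-online rule would pick node 0, which is farther away from node 2. That gap is the objection raised against the first version of the patch.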