From: Pingfan Liu
Date: Tue, 4 Dec 2018 16:56:29 +0800
Subject: Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline
To: richard.weiyang@gmail.com
Cc: mhocko@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Andrew Morton, Vlastimil Babka, Mike Rapoport, Bjorn Helgaas,
    Jonathan Cameron
References: <1543892757-4323-1-git-send-email-kernelfans@gmail.com>
    <20181204072251.GT31738@dhcp22.suse.cz>
    <20181204084052.gpwwlnp6n2zehjy5@master>
In-Reply-To: <20181204084052.gpwwlnp6n2zehjy5@master>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 4, 2018 at 4:40 PM Wei Yang wrote:
>
> On Tue, Dec 04, 2018 at 04:20:32PM +0800, Pingfan Liu wrote:
> >On Tue, Dec 4, 2018 at 3:22 PM Michal Hocko wrote:
> >>
> >> On Tue 04-12-18 11:05:57, Pingfan Liu wrote:
> >> > During my test on some AMD machine, with the kexec -l nr_cpus=x option, the
> >> > kernel failed to boot up because some nodes' data structs cannot be allocated;
> >> > e.g., on x86 they are initialized by init_cpu_to_node()->init_memory_less_node().
> >> > But device->numa_node info is used as the preferred_nid param for
> >> > __alloc_pages_nodemask(), which causes a NULL dereference at
> >> >   ac->zonelist = node_zonelist(preferred_nid, gfp_mask);
> >> > This patch tries to fix the issue by falling back to the first online node
> >> > when encountering such a corner case.
> >>
> >> We have seen similar issues already and the bug was usually that the
> >> zonelists were not initialized yet or the node is completely bogus.
> >> Zonelists should be initialized by build_all_zonelists quite early so I
> >> am wondering whether the latter is the case. What is the actual node
> >> number the device is associated with?
> >>
> >The device's node num is 2. And in my case, I used the nr_cpus param.
> >Because init_cpu_to_node() initializes all the possible nodes, it is hard
> >for me to figure out how, without this param, the zonelists could be
> >accessed before the page allocator works.
>
> If my understanding is correct, we can't do page alloc before the zonelist
> is initialized.
>
> I guess Michal's point is to figure out the reason.
>
Yeah, I know. I just want to emphasize that I hit this bug using nr_cpus,
which people may rarely use. Hence it may be a different bug from the ones
Michal has seen. Sorry for my bad English if it causes confusion.

Thanks
[...]
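
To make the failure mode discussed in this thread concrete, here is a minimal user-space sketch in C. It is not the patch itself and does not use kernel code: every model_* name is a hypothetical stand-in (for NODE_DATA(), node_online(), first_online_node and node_zonelist()), and the four-node layout is an assumption chosen so that node 2, the device's node in the report, has no data struct. It only illustrates the idea of falling back to the first online node instead of dereferencing a NULL node pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_NODES 4     /* hypothetical: tiny stand-in for MAX_NUMNODES */
#define MODEL_NO_NODE   (-1)  /* hypothetical: "no usable node" marker */

/* Hypothetical stand-in for a node's data struct; only the zonelist matters. */
struct model_node {
    const char *zonelist;     /* placeholder for the real zonelist array */
};

/* Assumption: nodes 0 and 1 were set up at boot, nodes 2 and 3 never were. */
static struct model_node model_node0 = { "zonelist-of-node-0" };
static struct model_node model_node1 = { "zonelist-of-node-1" };
static struct model_node *model_node_data[MODEL_MAX_NODES] = {
    &model_node0, &model_node1, NULL, NULL,
};

/* A node counts as online in this model only if its data struct exists. */
static bool model_node_online(int nid)
{
    return nid >= 0 && nid < MODEL_MAX_NODES && model_node_data[nid] != NULL;
}

/* Lowest-numbered online node, or MODEL_NO_NODE if there is none. */
static int model_first_online_node(void)
{
    for (int nid = 0; nid < MODEL_MAX_NODES; nid++)
        if (model_node_online(nid))
            return nid;
    return MODEL_NO_NODE;
}

/* The fallback idea: keep the preferred node if usable, else first online. */
static int model_pick_node(int preferred_nid)
{
    if (model_node_online(preferred_nid))
        return preferred_nid;

    int nid = model_first_online_node();
    assert(nid != MODEL_NO_NODE);   /* the model needs one online node */
    return nid;
}

/* Without model_pick_node(), preferred_nid == 2 would dereference NULL here. */
static const char *model_node_zonelist(int preferred_nid)
{
    return model_node_data[model_pick_node(preferred_nid)]->zonelist;
}
```

In this model, a device reporting node 2 gets the zonelist of node 0, while a device on node 1 keeps its own. Whether such a fallback is the right fix, or only hides a node that should have been initialized earlier, is exactly the question raised in the replies above.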