Date: Fri, 22 Jun 2018 11:11:53 +0200
From: Michal Hocko
To: Hanjun Guo
Cc: Punit Agrawal, Xie XiuQi, Lorenzo Pieralisi, Bjorn Helgaas,
    tnowicki@caviumnetworks.com, linux-pci@vger.kernel.org,
    Catalin Marinas, "Rafael J. Wysocki", Will Deacon,
    Linux Kernel Mailing List, Jarkko Sakkinen, linux-mm@kvack.org,
    wanghuiqiang@huawei.com, Greg Kroah-Hartman, Bjorn Helgaas,
    Andrew Morton, zhongjiang, linux-arm
Subject: Re: [PATCH 1/2] arm64: avoid alloc memory on offline node
Message-ID: <20180622091153.GU10465@dhcp22.suse.cz>
References: <20180619120714.GE13685@dhcp22.suse.cz>
    <874lhz3pmn.fsf@e105922-lin.cambridge.arm.com>
    <20180619140818.GA16927@e107981-ln.cambridge.arm.com>
    <87wouu3jz1.fsf@e105922-lin.cambridge.arm.com>
    <20180619151425.GH13685@dhcp22.suse.cz>
    <87r2l23i2b.fsf@e105922-lin.cambridge.arm.com>
    <20180619163256.GA18952@e107981-ln.cambridge.arm.com>
    <814205eb-ae86-a519-bed0-f09b8e2d3a02@huawei.com>
    <87602d3ccl.fsf@e105922-lin.cambridge.arm.com>
    <5c083c9c-473f-f504-848b-48506d0fd380@huawei.com>
In-Reply-To: <5c083c9c-473f-f504-848b-48506d0fd380@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri 22-06-18 16:58:05, Hanjun Guo wrote:
> On 2018/6/20 19:51, Punit Agrawal wrote:
> > Xie XiuQi writes:
> >
> >> Hi Lorenzo, Punit,
> >>
> >>
> >> On 2018/6/20 0:32, Lorenzo Pieralisi wrote:
> >>> On Tue, Jun 19, 2018 at 04:35:40PM +0100, Punit Agrawal wrote:
> >>>> Michal Hocko writes:
> >>>>
> >>>>> On Tue 19-06-18 15:54:26, Punit Agrawal wrote:
> >>>>> [...]
> >>>>>> In terms of $SUBJECT, I wonder if it's worth taking the original patch
> >>>>>> as a temporary fix (it'll also be easier to backport) while we work on
> >>>>>> fixing these other issues and enabling memoryless nodes.
> >>>>>
> >>>>> Well, x86 already does that but copying this antipatern is not really
> >>>>> nice. So it is good as a quick fix but it would be definitely much
> >>>>> better to have a robust fix. Who knows how many other places might hit
> >>>>> this. You certainly do not want to add a hack like this all over...
> >>>>
> >>>> Completely agree! I was only suggesting it as a temporary measure,
> >>>> especially as it looked like a proper fix might be invasive.
> >>>>
> >>>> Another fix might be to change the node specific allocation to node
> >>>> agnostic allocations. It isn't clear why the allocation is being
> >>>> requested from a specific node. I think Lorenzo suggested this in one of
> >>>> the threads.
> >>>
> >>> I think that code was just copypasted but it is better to fix the
> >>> underlying issue.
> >>>
> >>>> I've started putting together a set fixing the issues identified in this
> >>>> thread. It should give a better idea on the best course of action.
> >>>
> >>> On ACPI ARM64, this diff should do if I read the code correctly, it
> >>> should be (famous last words) just a matter of mapping PXMs to nodes for
> >>> every SRAT GICC entry, feel free to pick it up if it works.
> >>>
> >>> Yes, we can take the original patch just because it is safer for an -rc
> >>> cycle even though if the patch below would do delaying the fix for a
> >>> couple of -rc (to get it tested across ACPI ARM64 NUMA platforms) is
> >>> not a disaster.
> >>
> >> I tested this patch on my arm board, it works.
> >
> > I am assuming you tried the patch without enabling support for
> > memory-less nodes.
> >
> > The patch de-couples the onlining of numa nodes (as parsed from SRAT)
> > from NR_CPUS restriction. When it comes to building zonelists, the node
> > referenced by the PCI controller also has zonelists initialised.
> >
> > So it looks like a fallback node is setup even if we don't have
> > memory-less nodes enabled. I need to stare some more at the code to see
> > why we need memory-less nodes at all then ...
>
> Yes, please. From my limited MM knowledge, zonelists should not be
> initialised if no CPU and no memory on this node, correct me if I'm
> wrong.

Well, as long as there is code which can explicitly ask for a specific
node, then it is safer to have zonelists configured. Otherwise you just
force callers to add hacks and figure out the proper placement there.
Zonelists should be cheap to configure for all possible nodes. It's not
like we are talking about a huge amount of resources.
-- 
Michal Hocko
SUSE Labs
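
The point about configuring zonelists for every possible node can be illustrated with a toy model. What follows is only a sketch in Python, not kernel code: `Node`, `build_zonelists`, `alloc_on_node` and the distance table are hypothetical names made up for illustration, and the real allocator is considerably more involved. It assumes each node's zonelist is simply the list of all nodes ordered by distance, nearest first.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a NUMA node (hypothetical, not the kernel's pg_data_t)."""
    nid: int
    free_pages: int = 0  # 0 models a node with no memory
    zonelist: list = field(default_factory=list)  # fallback order of node ids


def build_zonelists(nodes, distance):
    """Give every possible node a fallback list, nearest node first."""
    for node in nodes.values():
        node.zonelist = sorted(
            nodes, key=lambda other: (distance[node.nid][other], other)
        )


def alloc_on_node(nodes, nid):
    """Ask for a page on nid; walk nid's zonelist until some node has one."""
    for candidate in nodes[nid].zonelist:
        if nodes[candidate].free_pages > 0:
            nodes[candidate].free_pages -= 1
            return candidate
    # Nothing free anywhere, or the zonelist was never built for this node.
    return None


if __name__ == "__main__":
    # Node 1 has no memory, e.g. a node referenced only by a PCI controller.
    nodes = {0: Node(0, free_pages=2), 1: Node(1), 2: Node(2, free_pages=1)}
    distance = {
        0: {0: 10, 1: 20, 2: 30},
        1: {0: 20, 1: 10, 2: 15},
        2: {0: 30, 1: 15, 2: 10},
    }
    build_zonelists(nodes, distance)
    print("request on node 1 served by node", alloc_on_node(nodes, 1))
```

In this model a caller that names node 1 needs no special case: the request falls through to the nearest node that has memory. If the zonelist for node 1 were never built, the same request would fail, and every caller would have to pick a placement by hand.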