Date: Tue, 8 Oct 2019 12:17:29 +0100
From: Jonathan Cameron
To: Ingo Molnar
CC: Keith Busch, "Rafael J. Wysocki", "Andrew Morton", Dan Williams
Subject: Re: [PATCH V5 3/4] x86: Support Generic Initiator only proximity domains
Message-ID: <20191008121729.00005ee9@huawei.com>
In-Reply-To: <20191007145505.GB88143@gmail.com>
References: <20191004114330.104746-1-Jonathan.Cameron@huawei.com>
 <20191004114330.104746-4-Jonathan.Cameron@huawei.com>
 <20191007145505.GB88143@gmail.com>
Organization: Huawei
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 7 Oct 2019 16:55:05 +0200
Ingo Molnar wrote:

> * Jonathan Cameron wrote:
>
> > Done in a somewhat different fashion to arm64.
> > Here the infrastructure for memoryless domains was already
> > in place. That infrastructure applies just as well to
> > domains that also don't have a CPU, hence it works for
> > Generic Initiator Domains.
> >
> > In common with memoryless domains we only register GI domains
> > if the proximity node is not online. If a domain is already
> > a memory-containing domain, or a memoryless domain, there is
> > nothing to do just because it also contains a Generic Initiator.
> >
> > Signed-off-by: Jonathan Cameron
> > ---
> >  arch/x86/include/asm/numa.h |  2 ++
> >  arch/x86/kernel/setup.c     |  1 +
> >  arch/x86/mm/numa.c          | 14 ++++++++++++++
> >  3 files changed, 17 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
> > index bbfde3d2662f..f631467272a3 100644
> > --- a/arch/x86/include/asm/numa.h
> > +++ b/arch/x86/include/asm/numa.h
> > @@ -62,12 +62,14 @@ extern void numa_clear_node(int cpu);
> >  extern void __init init_cpu_to_node(void);
> >  extern void numa_add_cpu(int cpu);
> >  extern void numa_remove_cpu(int cpu);
> > +extern void init_gi_nodes(void);
> >  #else	/* CONFIG_NUMA */
> >  static inline void numa_set_node(int cpu, int node)	{ }
> >  static inline void numa_clear_node(int cpu)		{ }
> >  static inline void init_cpu_to_node(void)		{ }
> >  static inline void numa_add_cpu(int cpu)		{ }
> >  static inline void numa_remove_cpu(int cpu)		{ }
> > +static inline void init_gi_nodes(void)			{ }
> >  #endif	/* CONFIG_NUMA */
> >  
> >  #ifdef CONFIG_DEBUG_PER_CPU_MAPS
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index cfb533d42371..b6c977907ea5 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -1264,6 +1264,7 @@ void __init setup_arch(char **cmdline_p)
> >  	prefill_possible_map();
> >  
> >  	init_cpu_to_node();
> > +	init_gi_nodes();
> >  
> >  	io_apic_init_mappings();
> >  
> > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > index 4123100e0eaf..50bf724a425e 100644
> > --- a/arch/x86/mm/numa.c
> > +++ b/arch/x86/mm/numa.c
> > @@ -733,6 +733,20 @@ static void __init init_memory_less_node(int nid)
> >  	 */
> >  }
> >  
> > +/*
> > + * Generic Initiator Nodes may have neither CPU nor Memory.
> > + * At this stage if either of the others were present we would
> > + * already be online.
> > + */
> > +void __init init_gi_nodes(void)
> > +{
> > +	int nid;
> > +
> > +	for_each_node_state(nid, N_GENERIC_INITIATOR)
> > +		if (!node_online(nid))
> > +			init_memory_less_node(nid);
> > +}
>
> Nit: missing curly braces.

Good point.

>
> How do these work in practice, will a system that only had nodes 0-1
> today grow a third node '2' that won't have any CPUs or memory on it?

Yes. Exactly that. The result is that fallback lists etc. work when
_PXM is used to assign a device into that new node. The interesting
bit comes when a driver goes further and queries the NUMA distances
from SLIT. At that point the driver can elect to do load balancing
across multiple nodes at similar distances.

In theory you can also specify a device you wish to put into the node
via the SRAT entry (IIRC using segment + BDF for PCI devices), but for
now I haven't implemented that method.

>
> Thanks,
>
> 	Ingo

Thanks,

Jonathan
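
To illustrate the load-balancing point above, here is a minimal
userspace sketch of how a driver might pick the nodes that sit at
similar SLIT distances from a GI-only node. It is not kernel code and
not part of this series: the 3-node table, the 'slack' tolerance, and
the names model_node_distance() and nodes_at_similar_distance() are all
hypothetical. In the kernel the distance lookup is node_distance().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: nodes 0-1 have CPUs and memory, node 2 is GI only. */
#define MAX_NODES 3

/*
 * Hypothetical SLIT-style distance table. SLIT encodes the distance
 * from a node to itself as 10; the remote values are made up.
 */
static const int slit[MAX_NODES][MAX_NODES] = {
	{ 10, 20, 15 },
	{ 20, 10, 16 },
	{ 15, 16, 10 },
};

/* Userspace stand-in for the kernel's node_distance() lookup. */
static int model_node_distance(int from, int to)
{
	assert(from >= 0 && from < MAX_NODES);
	assert(to >= 0 && to < MAX_NODES);
	return slit[from][to];
}

/*
 * Collect the remote nodes whose distance from 'from' is within
 * 'slack' of the nearest remote node. Writes the node ids to 'out'
 * in ascending order and returns how many were written.
 */
static size_t nodes_at_similar_distance(int from, int slack,
					int out[MAX_NODES])
{
	int best = -1;
	size_t count = 0;
	int nid;

	/* First pass: find the smallest distance to any other node. */
	for (nid = 0; nid < MAX_NODES; nid++) {
		int d;

		if (nid == from)
			continue;
		d = model_node_distance(from, nid);
		if (best < 0 || d < best)
			best = d;
	}

	/* Second pass: keep every node close enough to that minimum. */
	for (nid = 0; nid < MAX_NODES; nid++) {
		if (nid == from)
			continue;
		if (model_node_distance(from, nid) - best <= slack)
			out[count++] = nid;
	}

	return count;
}
```

With this table, a device in node 2 sees node 0 at distance 15 and
node 1 at distance 16, so a slack of 1 treats both as candidates for
spreading allocations, while a slack of 0 keeps only node 0. How much
slack counts as "similar" is a policy choice left to the driver.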