Received: by 10.223.176.46 with SMTP id f43csp2305849wra; Thu, 25 Jan 2018 07:57:26 -0800 (PST) X-Google-Smtp-Source: AH8x226eUoqSU0HmT04nk5IriVPbkIJioV9lLLX3prWDCcebTUiL4ChroLH36sm/5hEAvIe3agyF X-Received: by 2002:a17:902:ab85:: with SMTP id f5-v6mr11882950plr.199.1516895846157; Thu, 25 Jan 2018 07:57:26 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1516895846; cv=none; d=google.com; s=arc-20160816; b=j4yTQAiisoG+U2dMpsDOSWyUDaogM1dOHKmTAfU/tVz6bHObRqGXhGxV8jEzAyMxq2 t8vWnnJ6MSHN/YHc7PcjlZeKuaapWzslNx+WkRIPLahsZvL6vOCMgjB2MCnWxps1lh2u lE4sWRSfCN9M6wzo01t1vgdQvKLPFRKcMPZ9dTiMJ3ZSNwjZT3Lu0lg6UN8OmYii0CuS XNkagaJXx6eAdAAGijjTfeeZ9NRG37BLWL6+Ws4aSn+07Gdd8DKqj7JbEoJBXrlx4RsT woA5yYVdRSqp00EvgQskIX1aGB4xgofOiQhu3nWFeVHZ+n3k6Dy8UZ94WBsFG2D791HM vLFA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=Pg69rJR7nsBODu0CLw4lptUQdPw1dknbRSZremHlfWg=; b=zy4t6tfUeeAD0WOAT6lD9l56DPiBD0k1zx+Xe4Yr1VDXyz+894E7w2Yik9WO1kMtDh n/5hIBwClUCzdT1Yc0MjyD0yVXd2/Y5YW7K1RPmdrf1L8BoCDNlCg58ffWqiO7dc6dyj tc0ca2qMb6glCIRLqaCP7TWeny72Cgt6V0MUb9L4VzN1B+iL1lgeUDi8vhb9wWIFE5co jABNRh5GASMuPpZX10mD8zNfFgFmPDRK6nPEWR8QY+5CEYQyh5XpzylxZAQQhWh+6K9N 7BC1W6Pcvvo3+9tGs/b77zp0/v2E6kkhPowNXBcAi6cpz0vmprAffv4e+mHkb8p9o2l7 62aA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 63-v6si2179497plf.645.2018.01.25.07.57.11; Thu, 25 Jan 2018 07:57:26 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751351AbeAYP4e (ORCPT + 99 others); Thu, 25 Jan 2018 10:56:34 -0500 Received: from foss.arm.com ([217.140.101.70]:37176 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750851AbeAYP4c (ORCPT ); Thu, 25 Jan 2018 10:56:32 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 386111435; Thu, 25 Jan 2018 07:56:32 -0800 (PST) Received: from [192.168.100.241] (usa-sjc-mx-foss1.foss.arm.com [217.140.101.70]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id EE9323F53D; Thu, 25 Jan 2018 07:56:30 -0800 (PST) Subject: Re: [PATCH v6 11/12] arm64: topology: enable ACPI/PPTT based CPU topology To: Xiongfeng Wang , linux-acpi@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org, sudeep.holla@arm.com, hanjun.guo@linaro.org, lorenzo.pieralisi@arm.com, rjw@rjwysocki.net, will.deacon@arm.com, catalin.marinas@arm.com, gregkh@linuxfoundation.org, viresh.kumar@linaro.org, mark.rutland@arm.com, linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, jhugo@codeaurora.org, Jonathan.Zhang@cavium.com, ahs3@redhat.com, Jayachandran.Nair@cavium.com, austinwc@codeaurora.org, lenb@kernel.org, vkilari@codeaurora.org, morten.rasmussen@arm.com, Juri Lelli References: <20180113005920.28658-1-jeremy.linton@arm.com> <20180113005920.28658-12-jeremy.linton@arm.com> <928cb0c9-1d7a-3d0c-0538-a8bb0e2a86b1@huawei.com> From: Jeremy Linton Message-ID: Date: Thu, 25 Jan 2018 09:56:30 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2 MIME-Version: 1.0 In-Reply-To: <928cb0c9-1d7a-3d0c-0538-a8bb0e2a86b1@huawei.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On 01/25/2018 06:15 AM, Xiongfeng Wang wrote: > Hi Jeremy, > > I have tested the patch with the newest UEFI. It prints the below error: > > [ 4.017371] BUG: arch topology borken > [ 4.021069] BUG: arch topology borken > [ 4.024764] BUG: arch topology borken > [ 4.028460] BUG: arch topology borken > [ 4.032153] BUG: arch topology borken > [ 4.035849] BUG: arch topology borken > [ 4.039543] BUG: arch topology borken > [ 4.043239] BUG: arch topology borken > [ 4.046932] BUG: arch topology borken > [ 4.050629] BUG: arch topology borken > [ 4.054322] BUG: arch topology borken > > I checked the code and found that the newest UEFI set PPTT physical_package_flag on a physical package node and > the NUMA domain (SRAT domains) starts from the layer of DIE. (The topology of our board is core->cluster->die->package). I commented about that on the EDK2 mailing list. While the current spec doesn't explicitly ban having the flag set multiple times between the leaf and the root I consider it a "bug" and there is an effort to clarify the spec and the use of that flag. > > When the kernel starts to build sched_domain, the multi-core sched_domain contains all the cores within a package, > and the lowest NUMA sched_domain contains all the cores within a die. But the kernel requires that the multi-core > sched_domain should be a subset of the lowest NUMA sched_domain, so the BUG info is printed. Right. I've mentioned this problem a couple of times. At at the moment, the spec isn't clear about how the proximity domain is detected/located within the PPTT topology (a node with a 1:1 correspondence isn't even required). As you can see from this patch set, we are making the general assumption that the proximity domains are at the same level as the physical socket. This isn't ideal for NUMA topologies, like the D05, that don't align with the physical socket. There are efforts underway to clarify and expand upon the specification to deal with this general problem. The simple solution is another flag (say PPTT_PROXIMITY_DOMAIN which would map to the D05 die) which could be used to find nodes with 1:1 correspondence. At that point we could add a fairly trivial patch to correct just the scheduler topology without affecting the rest of the system topology code. > > If we modify the UEFI to make NUMA sched_domain start from the layer of package, then all the topology information > within the package will be discarded. I think we need to build the multi-core sched_domain using the cores within > the cluster instead of the cores within the package. I think that's what 'multi-core' means. Multi cores form a cluster. I guess. > If we build the multi-core sched_domain using the cores within a cluster, I think we need to add fields in struct cpu_topology > to record which cores are in each cluster. The problem is that there isn't a generic way to identify which level of cache sharing is the "correct" top layer MC domain. For one system cluster might be appropriate, for another it might be the highest caching level within a socket, for another is might be a something in between or a group of clusters or LLCs.. Hence the effort to standardize/guarantee a PPTT node that exactly matches a SRAT domain. With that, each SOC/system provider has clearly defined method for communicating where they want the proximity domain information to begin. Thanks, > > > Thanks, > Xiongfeng > > On 2018/1/13 8:59, Jeremy Linton wrote: >> Propagate the topology information from the PPTT tree to the >> cpu_topology array. We can get the thread id, core_id and >> cluster_id by assuming certain levels of the PPTT tree correspond >> to those concepts. The package_id is flagged in the tree and can be >> found by calling find_acpi_cpu_topology_package() which terminates >> its search when it finds an ACPI node flagged as the physical >> package. If the tree doesn't contain enough levels to represent >> all of the requested levels then the root node will be returned >> for all subsequent levels. >> >> Cc: Juri Lelli >> Signed-off-by: Jeremy Linton >> --- >> arch/arm64/kernel/topology.c | 46 +++++++++++++++++++++++++++++++++++++++++++- >> 1 file changed, 45 insertions(+), 1 deletion(-) >> >> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c >> index 7b06e263fdd1..ce8ec7fd6b32 100644 >> --- a/arch/arm64/kernel/topology.c >> +++ b/arch/arm64/kernel/topology.c >> @@ -11,6 +11,7 @@ >> * for more details. >> */ >> >> +#include >> #include >> #include >> #include >> @@ -22,6 +23,7 @@ >> #include >> #include >> #include >> +#include >> #include >> >> #include >> @@ -300,6 +302,46 @@ static void __init reset_cpu_topology(void) >> } >> } >> >> +#ifdef CONFIG_ACPI >> +/* >> + * Propagate the topology information of the processor_topology_node tree to the >> + * cpu_topology array. >> + */ >> +static int __init parse_acpi_topology(void) >> +{ >> + bool is_threaded; >> + int cpu, topology_id; >> + >> + is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK; >> + >> + for_each_possible_cpu(cpu) { >> + topology_id = find_acpi_cpu_topology(cpu, 0); >> + if (topology_id < 0) >> + return topology_id; >> + >> + if (is_threaded) { >> + cpu_topology[cpu].thread_id = topology_id; >> + topology_id = find_acpi_cpu_topology(cpu, 1); >> + cpu_topology[cpu].core_id = topology_id; >> + topology_id = find_acpi_cpu_topology_package(cpu); >> + cpu_topology[cpu].package_id = topology_id; >> + } else { >> + cpu_topology[cpu].thread_id = -1; >> + cpu_topology[cpu].core_id = topology_id; >> + topology_id = find_acpi_cpu_topology_package(cpu); >> + cpu_topology[cpu].package_id = topology_id; >> + } >> + } >> + >> + return 0; >> +} >> + >> +#else >> +static inline int __init parse_acpi_topology(void) >> +{ >> + return -EINVAL; >> +} >> +#endif >> >> void __init init_cpu_topology(void) >> { >> @@ -309,6 +351,8 @@ void __init init_cpu_topology(void) >> * Discard anything that was parsed if we hit an error so we >> * don't use partial information. >> */ >> - if (of_have_populated_dt() && parse_dt_topology()) >> + if ((!acpi_disabled) && parse_acpi_topology()) >> + reset_cpu_topology(); >> + else if (of_have_populated_dt() && parse_dt_topology()) >> reset_cpu_topology(); >> } >> >