LinuxLists.cc - [PATCH v8 0/7] Unify CPU topology across ARM & RISC-V

2019-06-27 19:56:51

Subject: [PATCH v8 0/7] Unify CPU topology across ARM & RISC-V

The cpu-map DT entry in ARM can describe the CPU topology in much better
way compared to other existing approaches. RISC-V can easily adopt this
binding to represent its own CPU topology. Thus, both cpu-map DT
binding and topology parsing code can be moved to a common location so
that RISC-V or any other architecture can leverage that.

The relevant discussion regarding unifying cpu topology can be found in
[1].

arch_topology seems to be a perfect place to move the common code. I
have not introduced any significant functional changes in the moved code.
The only downside in this approach is that the capacity code will be
executed for RISC-V as well. But, it will exit immediately after not
able to find the appropriate DT node. If the overhead is considered too
much, we can always compile out capacity related functions under a
different config for the architectures that do not support them.

There was an opportunity to unify topology data structure for ARM32 done
by patch 3/4. But, I refrained from making any other changes as I am not
very well versed with original intention for some functions that
are present in arch_topology.c. I hope this patch series can be served
as a baseline for such changes in the future.

The patches have been tested for RISC-V, ARM64, ARM32 & compile tested for
x86.

From Jeremy,

"I applied these to 5.2rc2, along with my PPTT/MT change and verified the
system & scheduler topology/etc on DAWN and ThunderX2 using ACPI on arm64.
They appear to be working correctly.

so for the series,
Tested-by: Jeremy Linton <[email protected]>"

The socket change[2] is also now part of this series.

[1] https://lkml.org/lkml/2018/11/6/19
[2] https://lkml.org/lkml/2018/11/7/918

This patch series can also be found at
https://github.com/atishp04/linux/tree/5.2-rc6_topology

QEMU changes for RISC-V topology are available here
https://lists.nongnu.org/archive/html/qemu-devel/2019-06/msg05974.html

HiFive Unleashed DT with topology node is available here.
https://patchwork.kernel.org/patch/11014313/

Changes from v7->v8
1. Regenerated the patch series without -b option in git format-patch.
Without that, git apply from email won't work because ignored space
changes.

Changes from v6->v7
1. Added socket to HiFive Unleashed topology example.
2. Added Acked-by & Reviewed-by.

Changes from v5->v6
1. Added two more patches from Sudeep about maintainership of arch_topology.c
and Kconfig update.
2. Added Tested-by & Reviewed-by
3. Fixed a nit (reordering of variables)

Changes from v4-v5
1. Removed the arch_topology.h header inclusion from topology.c and arch_topology.c
file. Added it in linux/topology.h.
2. core_id is set to -1 upon reset. Otherwise, ARM topology store function does not
work.

Changes from v3->v4
1. Get rid of ARM32 specific information in topology structure.
2. Remove redundant functions from ARM32 and use common code instead.

Changes from v2->v3
1. Cover letter update with experiment DT for topology changes.
2. Added the patch for [2].

Changes from v1->v2
1. ARM32 can now use the common code as well.

Atish Patra (4):
dt-binding: cpu-topology: Move cpu-map to a common binding.
cpu-topology: Move cpu topology code to common code.
arm: Use common cpu_topology structure and functions.
RISC-V: Parse cpu topology during boot.

Sudeep Holla (3):
Documentation: DT: arm: add support for sockets defining package
boundaries
base: arch_topology: update Kconfig help description
MAINTAINERS: Add an entry for generic architecture topology

.../topology.txt => cpu/cpu-topology.txt} | 256 ++++++++++-----
MAINTAINERS | 7 +
arch/arm/include/asm/topology.h | 20 --
arch/arm/kernel/topology.c | 60 +---
arch/arm64/include/asm/topology.h | 23 --
arch/arm64/kernel/topology.c | 303 +-----------------
arch/riscv/Kconfig | 1 +
arch/riscv/kernel/smpboot.c | 3 +
drivers/base/Kconfig | 2 +-
drivers/base/arch_topology.c | 298 +++++++++++++++++
include/linux/arch_topology.h | 26 ++
include/linux/topology.h | 1 +
12 files changed, 514 insertions(+), 486 deletions(-)
rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (66%)

--
2.21.0

2019-06-27 19:56:54

by Atish Patra

[permalink] [raw]

Subject: [PATCH v8 1/7] Documentation: DT: arm: add support for sockets defining package boundaries

From: Sudeep Holla <[email protected]>

The current ARM DT topology description provides the operating system
with a topological view of the system that is based on leaf nodes
representing either cores or threads (in an SMT system) and a
hierarchical set of cluster nodes that creates a hierarchical topology
view of how those cores and threads are grouped.

However this hierarchical representation of clusters does not allow to
describe what topology level actually represents the physical package or
the socket boundary, which is a key piece of information to be used by
an operating system to optimize resource allocation and scheduling.

Lets add a new "socket" node type in the cpu-map node to describe the
same.

Signed-off-by: Sudeep Holla <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
---
.../devicetree/bindings/arm/topology.txt | 172 ++++++++++--------
1 file changed, 99 insertions(+), 73 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/topology.txt b/Documentation/devicetree/bindings/arm/topology.txt
index b0d80c0fb265..3b8febb46dad 100644
--- a/Documentation/devicetree/bindings/arm/topology.txt
+++ b/Documentation/devicetree/bindings/arm/topology.txt
@@ -9,6 +9,7 @@ ARM topology binding description
In an ARM system, the hierarchy of CPUs is defined through three entities that
are used to describe the layout of physical CPUs in the system:

+- socket
- cluster
- core
- thread
@@ -63,21 +64,23 @@ nodes are listed.

The cpu-map node's child nodes can be:

- - one or more cluster nodes
+ - one or more cluster nodes or
+ - one or more socket nodes in a multi-socket system

Any other configuration is considered invalid.

-The cpu-map node can only contain three types of child nodes:
+The cpu-map node can only contain 4 types of child nodes:

+- socket node
- cluster node
- core node
- thread node

whose bindings are described in paragraph 3.

-The nodes describing the CPU topology (cluster/core/thread) can only
-be defined within the cpu-map node and every core/thread in the system
-must be defined within the topology. Any other configuration is
+The nodes describing the CPU topology (socket/cluster/core/thread) can
+only be defined within the cpu-map node and every core/thread in the
+system must be defined within the topology. Any other configuration is
invalid and therefore must be ignored.

===========================================
@@ -85,26 +88,44 @@ invalid and therefore must be ignored.
===========================================

cpu-map child nodes must follow a naming convention where the node name
-must be "clusterN", "coreN", "threadN" depending on the node type (ie
-cluster/core/thread) (where N = {0, 1, ...} is the node number; nodes which
-are siblings within a single common parent node must be given a unique and
+must be "socketN", "clusterN", "coreN", "threadN" depending on the node type
+(ie socket/cluster/core/thread) (where N = {0, 1, ...} is the node number; nodes
+which are siblings within a single common parent node must be given a unique and
sequential N value, starting from 0).
cpu-map child nodes which do not share a common parent node can have the same
name (ie same number N as other cpu-map child nodes at different device tree
levels) since name uniqueness will be guaranteed by the device tree hierarchy.

===========================================
-3 - cluster/core/thread node bindings
+3 - socket/cluster/core/thread node bindings
===========================================

-Bindings for cluster/cpu/thread nodes are defined as follows:
+Bindings for socket/cluster/cpu/thread nodes are defined as follows:
+
+- socket node
+
+ Description: must be declared within a cpu-map node, one node
+ per physical socket in the system. A system can
+ contain single or multiple physical socket.
+ The association of sockets and NUMA nodes is beyond
+ the scope of this bindings, please refer [2] for
+ NUMA bindings.
+
+ This node is optional for a single socket system.
+
+ The socket node name must be "socketN" as described in 2.1 above.
+ A socket node can not be a leaf node.
+
+ A socket node's child nodes must be one or more cluster nodes.
+
+ Any other configuration is considered invalid.

- cluster node

Description: must be declared within a cpu-map node, one node
per cluster. A system can contain several layers of
- clustering and cluster nodes can be contained in parent
- cluster nodes.
+ clustering within a single physical socket and cluster
+ nodes can be contained in parent cluster nodes.

The cluster node name must be "clusterN" as described in 2.1 above.
A cluster node can not be a leaf node.
@@ -164,90 +185,93 @@ Bindings for cluster/cpu/thread nodes are defined as follows:
4 - Example dts
===========================================

-Example 1 (ARM 64-bit, 16-cpu system, two clusters of clusters):
+Example 1 (ARM 64-bit, 16-cpu system, two clusters of clusters in a single
+physical socket):

cpus {
#size-cells = <0>;
#address-cells = <2>;

cpu-map {
- cluster0 {
+ socket0 {
cluster0 {
- core0 {
- thread0 {
- cpu = <&CPU0>;
- };
- thread1 {
- cpu = <&CPU1>;
+ cluster0 {
+ core0 {
+ thread0 {
+ cpu = <&CPU0>;
+ };
+ thread1 {
+ cpu = <&CPU1>;
+ };
};
- };

- core1 {
- thread0 {
- cpu = <&CPU2>;
- };
- thread1 {
- cpu = <&CPU3>;
+ core1 {
+ thread0 {
+ cpu = <&CPU2>;
+ };
+ thread1 {
+ cpu = <&CPU3>;
+ };
};
};
- };

- cluster1 {
- core0 {
- thread0 {
- cpu = <&CPU4>;
- };
- thread1 {
- cpu = <&CPU5>;
+ cluster1 {
+ core0 {
+ thread0 {
+ cpu = <&CPU4>;
+ };
+ thread1 {
+ cpu = <&CPU5>;
+ };
};
- };
-
- core1 {
- thread0 {
- cpu = <&CPU6>;
- };
- thread1 {
- cpu = <&CPU7>;
- };
- };
- };
- };

- cluster1 {
- cluster0 {
- core0 {
- thread0 {
- cpu = <&CPU8>;
- };
- thread1 {
- cpu = <&CPU9>;
- };
- };
- core1 {
- thread0 {
- cpu = <&CPU10>;
- };
- thread1 {
- cpu = <&CPU11>;
+ core1 {
+ thread0 {
+ cpu = <&CPU6>;
+ };
+ thread1 {
+ cpu = <&CPU7>;
+ };
};
};
};

cluster1 {
- core0 {
- thread0 {
- cpu = <&CPU12>;
+ cluster0 {
+ core0 {
+ thread0 {
+ cpu = <&CPU8>;
+ };
+ thread1 {
+ cpu = <&CPU9>;
+ };
};
- thread1 {
- cpu = <&CPU13>;
+ core1 {
+ thread0 {
+ cpu = <&CPU10>;
+ };
+ thread1 {
+ cpu = <&CPU11>;
+ };
};
};
- core1 {
- thread0 {
- cpu = <&CPU14>;
+
+ cluster1 {
+ core0 {
+ thread0 {
+ cpu = <&CPU12>;
+ };
+ thread1 {
+ cpu = <&CPU13>;
+ };
};
- thread1 {
- cpu = <&CPU15>;
+ core1 {
+ thread0 {
+ cpu = <&CPU14>;
+ };
+ thread1 {
+ cpu = <&CPU15>;
+ };
};
};
};
@@ -473,3 +497,5 @@ cpus {
===============================================================================
[1] ARM Linux kernel documentation
Documentation/devicetree/bindings/arm/cpus.yaml
+[2] Devicetree NUMA binding description
+ Documentation/devicetree/bindings/numa.txt
--
2.21.0

2019-06-27 19:57:00

by Atish Patra

[permalink] [raw]

Subject: [PATCH v8 3/7] cpu-topology: Move cpu topology code to common code.

Both RISC-V & ARM64 are using cpu-map device tree to describe
their cpu topology. It's better to move the relevant code to
a common place instead of duplicate code.

To: Will Deacon <[email protected]>
To: Catalin Marinas <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
[Tested on QDF2400]
Tested-by: Jeffrey Hugo <[email protected]>
[Tested on Juno and other embedded platforms.]
Tested-by: Sudeep Holla <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
Acked-by: Will Deacon <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm64/include/asm/topology.h | 23 ---
arch/arm64/kernel/topology.c | 303 +-----------------------------
drivers/base/arch_topology.c | 296 +++++++++++++++++++++++++++++
include/linux/arch_topology.h | 28 +++
include/linux/topology.h | 1 +
5 files changed, 329 insertions(+), 322 deletions(-)

diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index 0524f2438649..a4d945db95a2 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -4,29 +4,6 @@

#include <linux/cpumask.h>

-struct cpu_topology {
- int thread_id;
- int core_id;
- int package_id;
- int llc_id;
- cpumask_t thread_sibling;
- cpumask_t core_sibling;
- cpumask_t llc_sibling;
-};
-
-extern struct cpu_topology cpu_topology[NR_CPUS];
-
-#define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id)
-#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
-#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
-#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
-#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
-
-void init_cpu_topology(void);
-void store_cpu_topology(unsigned int cpuid);
-void remove_cpu_topology(unsigned int cpuid);
-const struct cpumask *cpu_coregroup_mask(int cpu);
-
#ifdef CONFIG_NUMA

struct pci_bus;
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 0825c4a856e3..6b95c91e7d67 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -14,250 +14,13 @@
#include <linux/acpi.h>
#include <linux/arch_topology.h>
#include <linux/cacheinfo.h>
-#include <linux/cpu.h>
-#include <linux/cpumask.h>
#include <linux/init.h>
#include <linux/percpu.h>
-#include <linux/node.h>
-#include <linux/nodemask.h>
-#include <linux/of.h>
-#include <linux/sched.h>
-#include <linux/sched/topology.h>
-#include <linux/slab.h>
-#include <linux/smp.h>
-#include <linux/string.h>

#include <asm/cpu.h>
#include <asm/cputype.h>
#include <asm/topology.h>

-static int __init get_cpu_for_node(struct device_node *node)
-{
- struct device_node *cpu_node;
- int cpu;
-
- cpu_node = of_parse_phandle(node, "cpu", 0);
- if (!cpu_node)
- return -1;
-
- cpu = of_cpu_node_to_id(cpu_node);
- if (cpu >= 0)
- topology_parse_cpu_capacity(cpu_node, cpu);
- else
- pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
-
- of_node_put(cpu_node);
- return cpu;
-}
-
-static int __init parse_core(struct device_node *core, int package_id,
- int core_id)
-{
- char name[10];
- bool leaf = true;
- int i = 0;
- int cpu;
- struct device_node *t;
-
- do {
- snprintf(name, sizeof(name), "thread%d", i);
- t = of_get_child_by_name(core, name);
- if (t) {
- leaf = false;
- cpu = get_cpu_for_node(t);
- if (cpu >= 0) {
- cpu_topology[cpu].package_id = package_id;
- cpu_topology[cpu].core_id = core_id;
- cpu_topology[cpu].thread_id = i;
- } else {
- pr_err("%pOF: Can't get CPU for thread\n",
- t);
- of_node_put(t);
- return -EINVAL;
- }
- of_node_put(t);
- }
- i++;
- } while (t);
-
- cpu = get_cpu_for_node(core);
- if (cpu >= 0) {
- if (!leaf) {
- pr_err("%pOF: Core has both threads and CPU\n",
- core);
- return -EINVAL;
- }
-
- cpu_topology[cpu].package_id = package_id;
- cpu_topology[cpu].core_id = core_id;
- } else if (leaf) {
- pr_err("%pOF: Can't get CPU for leaf core\n", core);
- return -EINVAL;
- }
-
- return 0;
-}
-
-static int __init parse_cluster(struct device_node *cluster, int depth)
-{
- char name[10];
- bool leaf = true;
- bool has_cores = false;
- struct device_node *c;
- static int package_id __initdata;
- int core_id = 0;
- int i, ret;
-
- /*
- * First check for child clusters; we currently ignore any
- * information about the nesting of clusters and present the
- * scheduler with a flat list of them.
- */
- i = 0;
- do {
- snprintf(name, sizeof(name), "cluster%d", i);
- c = of_get_child_by_name(cluster, name);
- if (c) {
- leaf = false;
- ret = parse_cluster(c, depth + 1);
- of_node_put(c);
- if (ret != 0)
- return ret;
- }
- i++;
- } while (c);
-
- /* Now check for cores */
- i = 0;
- do {
- snprintf(name, sizeof(name), "core%d", i);
- c = of_get_child_by_name(cluster, name);
- if (c) {
- has_cores = true;
-
- if (depth == 0) {
- pr_err("%pOF: cpu-map children should be clusters\n",
- c);
- of_node_put(c);
- return -EINVAL;
- }
-
- if (leaf) {
- ret = parse_core(c, package_id, core_id++);
- } else {
- pr_err("%pOF: Non-leaf cluster with core %s\n",
- cluster, name);
- ret = -EINVAL;
- }
-
- of_node_put(c);
- if (ret != 0)
- return ret;
- }
- i++;
- } while (c);
-
- if (leaf && !has_cores)
- pr_warn("%pOF: empty cluster\n", cluster);
-
- if (leaf)
- package_id++;
-
- return 0;
-}
-
-static int __init parse_dt_topology(void)
-{
- struct device_node *cn, *map;
- int ret = 0;
- int cpu;
-
- cn = of_find_node_by_path("/cpus");
- if (!cn) {
- pr_err("No CPU information found in DT\n");
- return 0;
- }
-
- /*
- * When topology is provided cpu-map is essentially a root
- * cluster with restricted subnodes.
- */
- map = of_get_child_by_name(cn, "cpu-map");
- if (!map)
- goto out;
-
- ret = parse_cluster(map, 0);
- if (ret != 0)
- goto out_map;
-
- topology_normalize_cpu_scale();
-
- /*
- * Check that all cores are in the topology; the SMP code will
- * only mark cores described in the DT as possible.
- */
- for_each_possible_cpu(cpu)
- if (cpu_topology[cpu].package_id == -1)
- ret = -EINVAL;
-
-out_map:
- of_node_put(map);
-out:
- of_node_put(cn);
- return ret;
-}
-
-/*
- * cpu topology table
- */
-struct cpu_topology cpu_topology[NR_CPUS];
-EXPORT_SYMBOL_GPL(cpu_topology);
-
-const struct cpumask *cpu_coregroup_mask(int cpu)
-{
- const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
-
- /* Find the smaller of NUMA, core or LLC siblings */
- if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
- /* not numa in package, lets use the package siblings */
- core_mask = &cpu_topology[cpu].core_sibling;
- }
- if (cpu_topology[cpu].llc_id != -1) {
- if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
- core_mask = &cpu_topology[cpu].llc_sibling;
- }
-
- return core_mask;
-}
-
-static void update_siblings_masks(unsigned int cpuid)
-{
- struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
- int cpu;
-
- /* update core and thread sibling masks */
- for_each_online_cpu(cpu) {
- cpu_topo = &cpu_topology[cpu];
-
- if (cpuid_topo->llc_id == cpu_topo->llc_id) {
- cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
- cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
- }
-
- if (cpuid_topo->package_id != cpu_topo->package_id)
- continue;
-
- cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
- cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
-
- if (cpuid_topo->core_id != cpu_topo->core_id)
- continue;
-
- cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
- cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
- }
-}
-
void store_cpu_topology(unsigned int cpuid)
{
struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
@@ -296,59 +59,19 @@ void store_cpu_topology(unsigned int cpuid)
update_siblings_masks(cpuid);
}

-static void clear_cpu_topology(int cpu)
-{
- struct cpu_topology *cpu_topo = &cpu_topology[cpu];
-
- cpumask_clear(&cpu_topo->llc_sibling);
- cpumask_set_cpu(cpu, &cpu_topo->llc_sibling);
-
- cpumask_clear(&cpu_topo->core_sibling);
- cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
- cpumask_clear(&cpu_topo->thread_sibling);
- cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
-}
-
-static void __init reset_cpu_topology(void)
-{
- unsigned int cpu;
-
- for_each_possible_cpu(cpu) {
- struct cpu_topology *cpu_topo = &cpu_topology[cpu];
-
- cpu_topo->thread_id = -1;
- cpu_topo->core_id = 0;
- cpu_topo->package_id = -1;
- cpu_topo->llc_id = -1;
-
- clear_cpu_topology(cpu);
- }
-}
-
-void remove_cpu_topology(unsigned int cpu)
-{
- int sibling;
-
- for_each_cpu(sibling, topology_core_cpumask(cpu))
- cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
- for_each_cpu(sibling, topology_sibling_cpumask(cpu))
- cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
- for_each_cpu(sibling, topology_llc_cpumask(cpu))
- cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling));
-
- clear_cpu_topology(cpu);
-}
-
#ifdef CONFIG_ACPI
/*
* Propagate the topology information of the processor_topology_node tree to the
* cpu_topology array.
*/
-static int __init parse_acpi_topology(void)
+int __init parse_acpi_topology(void)
{
bool is_threaded;
int cpu, topology_id;

+ if (acpi_disabled)
+ return 0;
+
is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;

for_each_possible_cpu(cpu) {
@@ -384,24 +107,6 @@ static int __init parse_acpi_topology(void)

return 0;
}
-
-#else
-static inline int __init parse_acpi_topology(void)
-{
- return -EINVAL;
-}
#endif

-void __init init_cpu_topology(void)
-{
- reset_cpu_topology();

- /*
- * Discard anything that was parsed if we hit an error so we
- * don't use partial information.
- */
- if (!acpi_disabled && parse_acpi_topology())
- reset_cpu_topology();
- else if (of_have_populated_dt() && parse_dt_topology())
- reset_cpu_topology();
-}
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1739d7e1952a..5781bb4c457c 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -15,6 +15,11 @@
#include <linux/string.h>
#include <linux/sched/topology.h>
#include <linux/cpuset.h>
+#include <linux/cpumask.h>
+#include <linux/init.h>
+#include <linux/percpu.h>
+#include <linux/sched.h>
+#include <linux/smp.h>

DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;

@@ -244,3 +249,294 @@ static void parsing_done_workfn(struct work_struct *work)
#else
core_initcall(free_raw_capacity);
#endif
+
+#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
+static int __init get_cpu_for_node(struct device_node *node)
+{
+ struct device_node *cpu_node;
+ int cpu;
+
+ cpu_node = of_parse_phandle(node, "cpu", 0);
+ if (!cpu_node)
+ return -1;
+
+ cpu = of_cpu_node_to_id(cpu_node);
+ if (cpu >= 0)
+ topology_parse_cpu_capacity(cpu_node, cpu);
+ else
+ pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
+
+ of_node_put(cpu_node);
+ return cpu;
+}
+
+static int __init parse_core(struct device_node *core, int package_id,
+ int core_id)
+{
+ char name[10];
+ bool leaf = true;
+ int i = 0;
+ int cpu;
+ struct device_node *t;
+
+ do {
+ snprintf(name, sizeof(name), "thread%d", i);
+ t = of_get_child_by_name(core, name);
+ if (t) {
+ leaf = false;
+ cpu = get_cpu_for_node(t);
+ if (cpu >= 0) {
+ cpu_topology[cpu].package_id = package_id;
+ cpu_topology[cpu].core_id = core_id;
+ cpu_topology[cpu].thread_id = i;
+ } else {
+ pr_err("%pOF: Can't get CPU for thread\n",
+ t);
+ of_node_put(t);
+ return -EINVAL;
+ }
+ of_node_put(t);
+ }
+ i++;
+ } while (t);
+
+ cpu = get_cpu_for_node(core);
+ if (cpu >= 0) {
+ if (!leaf) {
+ pr_err("%pOF: Core has both threads and CPU\n",
+ core);
+ return -EINVAL;
+ }
+
+ cpu_topology[cpu].package_id = package_id;
+ cpu_topology[cpu].core_id = core_id;
+ } else if (leaf) {
+ pr_err("%pOF: Can't get CPU for leaf core\n", core);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int __init parse_cluster(struct device_node *cluster, int depth)
+{
+ char name[10];
+ bool leaf = true;
+ bool has_cores = false;
+ struct device_node *c;
+ static int package_id __initdata;
+ int core_id = 0;
+ int i, ret;
+
+ /*
+ * First check for child clusters; we currently ignore any
+ * information about the nesting of clusters and present the
+ * scheduler with a flat list of them.
+ */
+ i = 0;
+ do {
+ snprintf(name, sizeof(name), "cluster%d", i);
+ c = of_get_child_by_name(cluster, name);
+ if (c) {
+ leaf = false;
+ ret = parse_cluster(c, depth + 1);
+ of_node_put(c);
+ if (ret != 0)
+ return ret;
+ }
+ i++;
+ } while (c);
+
+ /* Now check for cores */
+ i = 0;
+ do {
+ snprintf(name, sizeof(name), "core%d", i);
+ c = of_get_child_by_name(cluster, name);
+ if (c) {
+ has_cores = true;
+
+ if (depth == 0) {
+ pr_err("%pOF: cpu-map children should be clusters\n",
+ c);
+ of_node_put(c);
+ return -EINVAL;
+ }
+
+ if (leaf) {
+ ret = parse_core(c, package_id, core_id++);
+ } else {
+ pr_err("%pOF: Non-leaf cluster with core %s\n",
+ cluster, name);
+ ret = -EINVAL;
+ }
+
+ of_node_put(c);
+ if (ret != 0)
+ return ret;
+ }
+ i++;
+ } while (c);
+
+ if (leaf && !has_cores)
+ pr_warn("%pOF: empty cluster\n", cluster);
+
+ if (leaf)
+ package_id++;
+
+ return 0;
+}
+
+static int __init parse_dt_topology(void)
+{
+ struct device_node *cn, *map;
+ int ret = 0;
+ int cpu;
+
+ cn = of_find_node_by_path("/cpus");
+ if (!cn) {
+ pr_err("No CPU information found in DT\n");
+ return 0;
+ }
+
+ /*
+ * When topology is provided cpu-map is essentially a root
+ * cluster with restricted subnodes.
+ */
+ map = of_get_child_by_name(cn, "cpu-map");
+ if (!map)
+ goto out;
+
+ ret = parse_cluster(map, 0);
+ if (ret != 0)
+ goto out_map;
+
+ topology_normalize_cpu_scale();
+
+ /*
+ * Check that all cores are in the topology; the SMP code will
+ * only mark cores described in the DT as possible.
+ */
+ for_each_possible_cpu(cpu)
+ if (cpu_topology[cpu].package_id == -1)
+ ret = -EINVAL;
+
+out_map:
+ of_node_put(map);
+out:
+ of_node_put(cn);
+ return ret;
+}
+
+/*
+ * cpu topology table
+ */
+struct cpu_topology cpu_topology[NR_CPUS];
+EXPORT_SYMBOL_GPL(cpu_topology);
+
+const struct cpumask *cpu_coregroup_mask(int cpu)
+{
+ const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
+
+ /* Find the smaller of NUMA, core or LLC siblings */
+ if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
+ /* not numa in package, lets use the package siblings */
+ core_mask = &cpu_topology[cpu].core_sibling;
+ }
+ if (cpu_topology[cpu].llc_id != -1) {
+ if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
+ core_mask = &cpu_topology[cpu].llc_sibling;
+ }
+
+ return core_mask;
+}
+
+void update_siblings_masks(unsigned int cpuid)
+{
+ struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
+ int cpu;
+
+ /* update core and thread sibling masks */
+ for_each_online_cpu(cpu) {
+ cpu_topo = &cpu_topology[cpu];
+
+ if (cpuid_topo->llc_id == cpu_topo->llc_id) {
+ cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
+ cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
+ }
+
+ if (cpuid_topo->package_id != cpu_topo->package_id)
+ continue;
+
+ cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
+ cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
+
+ if (cpuid_topo->core_id != cpu_topo->core_id)
+ continue;
+
+ cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
+ cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
+ }
+}
+
+static void clear_cpu_topology(int cpu)
+{
+ struct cpu_topology *cpu_topo = &cpu_topology[cpu];
+
+ cpumask_clear(&cpu_topo->llc_sibling);
+ cpumask_set_cpu(cpu, &cpu_topo->llc_sibling);
+
+ cpumask_clear(&cpu_topo->core_sibling);
+ cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
+ cpumask_clear(&cpu_topo->thread_sibling);
+ cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
+}
+
+static void __init reset_cpu_topology(void)
+{
+ unsigned int cpu;
+
+ for_each_possible_cpu(cpu) {
+ struct cpu_topology *cpu_topo = &cpu_topology[cpu];
+
+ cpu_topo->thread_id = -1;
+ cpu_topo->core_id = -1;
+ cpu_topo->package_id = -1;
+ cpu_topo->llc_id = -1;
+
+ clear_cpu_topology(cpu);
+ }
+}
+
+void remove_cpu_topology(unsigned int cpu)
+{
+ int sibling;
+
+ for_each_cpu(sibling, topology_core_cpumask(cpu))
+ cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
+ for_each_cpu(sibling, topology_sibling_cpumask(cpu))
+ cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
+ for_each_cpu(sibling, topology_llc_cpumask(cpu))
+ cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling));
+
+ clear_cpu_topology(cpu);
+}
+
+__weak int __init parse_acpi_topology(void)
+{
+ return 0;
+}
+
+void __init init_cpu_topology(void)
+{
+ reset_cpu_topology();
+
+ /*
+ * Discard anything that was parsed if we hit an error so we
+ * don't use partial information.
+ */
+ if (parse_acpi_topology())
+ reset_cpu_topology();
+ else if (of_have_populated_dt() && parse_dt_topology())
+ reset_cpu_topology();
+}
+#endif
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index d9bdc1a7f4e7..d4e76e0a283f 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -33,4 +33,32 @@ unsigned long topology_get_freq_scale(int cpu)
return per_cpu(freq_scale, cpu);
}

+struct cpu_topology {
+ int thread_id;
+ int core_id;
+ int package_id;
+ int llc_id;
+ cpumask_t thread_sibling;
+ cpumask_t core_sibling;
+ cpumask_t llc_sibling;
+};
+
+#ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
+extern struct cpu_topology cpu_topology[NR_CPUS];
+
+#define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id)
+#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
+#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
+#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
+#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
+void init_cpu_topology(void);
+void store_cpu_topology(unsigned int cpuid);
+const struct cpumask *cpu_coregroup_mask(int cpu);
+#endif
+
+#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
+void update_siblings_masks(unsigned int cpu);
+#endif
+void remove_cpu_topology(unsigned int cpuid);
+
#endif /* _LINUX_ARCH_TOPOLOGY_H_ */
diff --git a/include/linux/topology.h b/include/linux/topology.h
index cb0775e1ee4b..4b3755d65812 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -27,6 +27,7 @@
#ifndef _LINUX_TOPOLOGY_H
#define _LINUX_TOPOLOGY_H

+#include <linux/arch_topology.h>
#include <linux/cpumask.h>
#include <linux/bitops.h>
#include <linux/mmzone.h>
--
2.21.0

2019-06-27 19:57:48

by Atish Patra

[permalink] [raw]

Subject: [PATCH v8 4/7] arm: Use common cpu_topology structure and functions.

Currently, ARM32 and ARM64 uses different data structures to represent
their cpu topologies. Since, we are moving the ARM64 topology to common
code to be used by other architectures, we can reuse that for ARM32 as
well.

Take this opprtunity to remove the redundant functions from ARM32 and
reuse the common code instead.

To: Russell King <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
Tested-by: Sudeep Holla <[email protected]> (on TC2)
Reviewed-by : Sudeep Holla <[email protected]>
---
arch/arm/include/asm/topology.h | 20 -----------
arch/arm/kernel/topology.c | 60 ++++-----------------------------
drivers/base/arch_topology.c | 4 ++-
include/linux/arch_topology.h | 6 ++--
4 files changed, 11 insertions(+), 79 deletions(-)

diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
index 2a786f54d8b8..8a0fae94d45e 100644
--- a/arch/arm/include/asm/topology.h
+++ b/arch/arm/include/asm/topology.h
@@ -5,26 +5,6 @@
#ifdef CONFIG_ARM_CPU_TOPOLOGY

#include <linux/cpumask.h>
-
-struct cputopo_arm {
- int thread_id;
- int core_id;
- int socket_id;
- cpumask_t thread_sibling;
- cpumask_t core_sibling;
-};
-
-extern struct cputopo_arm cpu_topology[NR_CPUS];
-
-#define topology_physical_package_id(cpu) (cpu_topology[cpu].socket_id)
-#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
-#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
-#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
-
-void init_cpu_topology(void);
-void store_cpu_topology(unsigned int cpuid);
-const struct cpumask *cpu_coregroup_mask(int cpu);
-
#include <linux/arch_topology.h>

/* Replace task scheduler's default frequency-invariant accounting */
diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 60e375ce1ab2..238f1da0219c 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -177,17 +177,6 @@ static inline void parse_dt_topology(void) {}
static inline void update_cpu_capacity(unsigned int cpuid) {}
#endif

- /*
- * cpu topology table
- */
-struct cputopo_arm cpu_topology[NR_CPUS];
-EXPORT_SYMBOL_GPL(cpu_topology);
-
-const struct cpumask *cpu_coregroup_mask(int cpu)
-{
- return &cpu_topology[cpu].core_sibling;
-}
-
/*
* The current assumption is that we can power gate each core independently.
* This will be superseded by DT binding once available.
@@ -197,32 +186,6 @@ const struct cpumask *cpu_corepower_mask(int cpu)
return &cpu_topology[cpu].thread_sibling;
}

-static void update_siblings_masks(unsigned int cpuid)
-{
- struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
- int cpu;
-
- /* update core and thread sibling masks */
- for_each_possible_cpu(cpu) {
- cpu_topo = &cpu_topology[cpu];
-
- if (cpuid_topo->socket_id != cpu_topo->socket_id)
- continue;
-
- cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
- if (cpu != cpuid)
- cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
-
- if (cpuid_topo->core_id != cpu_topo->core_id)
- continue;
-
- cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
- if (cpu != cpuid)
- cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
- }
- smp_wmb();
-}
-
/*
* store_cpu_topology is called at boot when only one cpu is running
* and with the mutex cpu_hotplug.lock locked, when several cpus have booted,
@@ -230,7 +193,7 @@ static void update_siblings_masks(unsigned int cpuid)
*/
void store_cpu_topology(unsigned int cpuid)
{
- struct cputopo_arm *cpuid_topo = &cpu_topology[cpuid];
+ struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
unsigned int mpidr;

/* If the cpu topology has been already set, just return */
@@ -250,12 +213,12 @@ void store_cpu_topology(unsigned int cpuid)
/* core performance interdependency */
cpuid_topo->thread_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
- cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
+ cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
} else {
/* largely independent cores */
cpuid_topo->thread_id = -1;
cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
- cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+ cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
}
} else {
/*
@@ -265,7 +228,7 @@ void store_cpu_topology(unsigned int cpuid)
*/
cpuid_topo->thread_id = -1;
cpuid_topo->core_id = 0;
- cpuid_topo->socket_id = -1;
+ cpuid_topo->package_id = -1;
}

update_siblings_masks(cpuid);
@@ -275,7 +238,7 @@ void store_cpu_topology(unsigned int cpuid)
pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
cpuid, cpu_topology[cpuid].thread_id,
cpu_topology[cpuid].core_id,
- cpu_topology[cpuid].socket_id, mpidr);
+ cpu_topology[cpuid].package_id, mpidr);
}

static inline int cpu_corepower_flags(void)
@@ -298,18 +261,7 @@ static struct sched_domain_topology_level arm_topology[] = {
*/
void __init init_cpu_topology(void)
{
- unsigned int cpu;
-
- /* init core mask and capacity */
- for_each_possible_cpu(cpu) {
- struct cputopo_arm *cpu_topo = &(cpu_topology[cpu]);
-
- cpu_topo->thread_id = -1;
- cpu_topo->core_id = -1;
- cpu_topo->socket_id = -1;
- cpumask_clear(&cpu_topo->core_sibling);
- cpumask_clear(&cpu_topo->thread_sibling);
- }
+ reset_cpu_topology();
smp_wmb();

parse_dt_topology();
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 5781bb4c457c..797e3cd71bea 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -426,6 +426,7 @@ static int __init parse_dt_topology(void)
of_node_put(cn);
return ret;
}
+#endif

/*
* cpu topology table
@@ -491,7 +492,7 @@ static void clear_cpu_topology(int cpu)
cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
}

-static void __init reset_cpu_topology(void)
+void __init reset_cpu_topology(void)
{
unsigned int cpu;

@@ -526,6 +527,7 @@ __weak int __init parse_acpi_topology(void)
return 0;
}

+#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
void __init init_cpu_topology(void)
{
reset_cpu_topology();
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index d4e76e0a283f..d4311127970d 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -54,11 +54,9 @@ extern struct cpu_topology cpu_topology[NR_CPUS];
void init_cpu_topology(void);
void store_cpu_topology(unsigned int cpuid);
const struct cpumask *cpu_coregroup_mask(int cpu);
-#endif
-
-#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
void update_siblings_masks(unsigned int cpu);
-#endif
void remove_cpu_topology(unsigned int cpuid);
+void reset_cpu_topology(void);
+#endif

#endif /* _LINUX_ARCH_TOPOLOGY_H_ */
--
2.21.0

2019-06-29 03:49:21

by Hanjun Guo

[permalink] [raw]

Subject: Re: [PATCH v8 3/7] cpu-topology: Move cpu topology code to common code.

On 2019/6/28 3:52, Atish Patra wrote:
> Both RISC-V & ARM64 are using cpu-map device tree to describe
> their cpu topology. It's better to move the relevant code to
> a common place instead of duplicate code.
>
> To: Will Deacon <[email protected]>
> To: Catalin Marinas <[email protected]>

Using Cc: is better.

> Signed-off-by: Atish Patra <[email protected]>
> [Tested on QDF2400]
> Tested-by: Jeffrey Hugo <[email protected]>
> [Tested on Juno and other embedded platforms.]
> Tested-by: Sudeep Holla <[email protected]>
> Reviewed-by: Sudeep Holla <[email protected]>
> Acked-by: Will Deacon <[email protected]>
> Acked-by: Greg Kroah-Hartman <[email protected]>
> ---
> arch/arm64/include/asm/topology.h | 23 ---
> arch/arm64/kernel/topology.c | 303 +-----------------------------
> drivers/base/arch_topology.c | 296 +++++++++++++++++++++++++++++
> include/linux/arch_topology.h | 28 +++
> include/linux/topology.h | 1 +
> 5 files changed, 329 insertions(+), 322 deletions(-)

Tested on Kunpeng920 ARM64 server, works good,

# lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 96
On-line CPU(s) list: 0-95
Thread(s) per core: 1
Core(s) per socket: 48
Socket(s): 2
NUMA node(s): 4
Vendor ID: 0x48
Model: 0
Stepping: 0x1
CPU max MHz: 2600.0000
CPU min MHz: 260.0000
BogoMIPS: 200.00
L1d cache: 64K
L1i cache: 64K
L2 cache: 512K
L3 cache: 32768K
NUMA node0 CPU(s): 0-23
NUMA node1 CPU(s): 24-47
NUMA node2 CPU(s): 48-71
NUMA node3 CPU(s): 72-95
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm

Tested-by: Hanjun Guo <[email protected]>

For the ACPI code,

Acked-by: Hanjun Guo <[email protected]>

Thanks
Hanjun

2019-07-01 18:45:05

by Paul Walmsley

[permalink] [raw]

Subject: Re: [PATCH v8 0/7] Unify CPU topology across ARM & RISC-V

Hi Atish

Looks like patches 1, 6, and 7 are missing your Signed-off-by:. Can I add
those?

- Paul

2019-07-01 18:52:29

by Atish Patra

[permalink] [raw]

Subject: Re: [PATCH v8 0/7] Unify CPU topology across ARM & RISC-V

On Mon, 2019-07-01 at 11:44 -0700, Paul Walmsley wrote:
> Hi Atish
>
> Looks like patches 1, 6, and 7 are missing your Signed-off-by:. Can
> I add
> those?
>
Sure.

Is it a common practice to add "Signed-off-by:" the sender even if the
sender has not touched the patch at all?

Regards,
Atish
>
> - Paul

2019-07-01 18:56:43

by Paul Walmsley

[permalink] [raw]

Subject: Re: [PATCH v8 0/7] Unify CPU topology across ARM & RISC-V

On Mon, 1 Jul 2019, Atish Patra wrote:

> On Mon, 2019-07-01 at 11:44 -0700, Paul Walmsley wrote:
> >
> > Looks like patches 1, 6, and 7 are missing your Signed-off-by:. Can I
> > add those?
> >
> Sure.
>
> Is it a common practice to add "Signed-off-by:" the sender even if the
> sender has not touched the patch at all?

Yes, see section 11(c) here:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n418

The main factor here is that you collected and resent the patches - thus
you're in the patch submission chain.

- Paul

2019-07-01 20:19:33

by Atish Patra

[permalink] [raw]

Subject: Re: [PATCH v8 0/7] Unify CPU topology across ARM & RISC-V

On Mon, 2019-07-01 at 11:55 -0700, Paul Walmsley wrote:
> On Mon, 1 Jul 2019, Atish Patra wrote:
>
> > On Mon, 2019-07-01 at 11:44 -0700, Paul Walmsley wrote:
> > > Looks like patches 1, 6, and 7 are missing your Signed-off-
> > > by:. Can I
> > > add those?
> > >
> > Sure.
> >
> > Is it a common practice to add "Signed-off-by:" the sender even if
> > the
> > sender has not touched the patch at all?
>
> Yes, see section 11(c) here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n418
>
> The main factor here is that you collected and resent the patches -
> thus
> you're in the patch submission chain.
>

Ahh okay. Thanks for the link. I will keep this in mind in future.

Regards,
Atish
>
> - Paul

2019-07-12 17:17:49

by Paul Walmsley

[permalink] [raw]

Subject: Re: [PATCH v8 0/7] Unify CPU topology across ARM & RISC-V

Folks,

On Thu, 27 Jun 2019, Atish Patra wrote:

> The cpu-map DT entry in ARM can describe the CPU topology in much better
> way compared to other existing approaches. RISC-V can easily adopt this
> binding to represent its own CPU topology. Thus, both cpu-map DT
> binding and topology parsing code can be moved to a common location so
> that RISC-V or any other architecture can leverage that.
>
> The relevant discussion regarding unifying cpu topology can be found in
> [1].
>
> arch_topology seems to be a perfect place to move the common code. I
> have not introduced any significant functional changes in the moved code.
> The only downside in this approach is that the capacity code will be
> executed for RISC-V as well. But, it will exit immediately after not
> able to find the appropriate DT node. If the overhead is considered too
> much, we can always compile out capacity related functions under a
> different config for the architectures that do not support them.
>
> There was an opportunity to unify topology data structure for ARM32 done
> by patch 3/4. But, I refrained from making any other changes as I am not
> very well versed with original intention for some functions that
> are present in arch_topology.c. I hope this patch series can be served
> as a baseline for such changes in the future.
>
> The patches have been tested for RISC-V, ARM64, ARM32 & compile tested for
> x86.

Since these patches touch files across several different architectures,
and thus really should sit in -next for a while; and because it's late in
the merge window, I'm planning to postpone sending these patches upstream
until after v5.3-rc1 is released.

Once v5.3-rc1 is released, let's plan to get these patches rebased and
reposted and into linux-next as soon as possible.

Sorry for the delay here,

- Paul

2019-07-23 02:38:02

by Paul Walmsley

[permalink] [raw]

Subject: Re: [PATCH v8 0/7] Unify CPU topology across ARM & RISC-V

On Fri, 12 Jul 2019, Paul Walmsley wrote:

> On Thu, 27 Jun 2019, Atish Patra wrote:
>
> > The cpu-map DT entry in ARM can describe the CPU topology in much better
> > way compared to other existing approaches. RISC-V can easily adopt this
> > binding to represent its own CPU topology. Thus, both cpu-map DT
> > binding and topology parsing code can be moved to a common location so
> > that RISC-V or any other architecture can leverage that.
> > different config for the architectures that do not support them.
>
> Once v5.3-rc1 is released, let's plan to get these patches rebased and
> reposted and into linux-next as soon as possible.

These CPU topology patches are now queued for v5.4-rc1. They should enter
linux-next shortly.

- Paul

2019-07-25 09:59:17

by Atish Patra

[permalink] [raw]

Subject: Re: [PATCH v8 0/7] Unify CPU topology across ARM & RISC-V

On 7/22/19 12:25 PM, Paul Walmsley wrote:
> On Fri, 12 Jul 2019, Paul Walmsley wrote:
>
>> On Thu, 27 Jun 2019, Atish Patra wrote:
>>
>>> The cpu-map DT entry in ARM can describe the CPU topology in much better
>>> way compared to other existing approaches. RISC-V can easily adopt this
>>> binding to represent its own CPU topology. Thus, both cpu-map DT
>>> binding and topology parsing code can be moved to a common location so
>>> that RISC-V or any other architecture can leverage that.
>>> different config for the architectures that do not support them.
>>
>> Once v5.3-rc1 is released, let's plan to get these patches rebased and
>> reposted and into linux-next as soon as possible.
>
> These CPU topology patches are now queued for v5.4-rc1. They should enter
> linux-next shortly.
>
>
> - Paul
>

Thanks!!

--
Regards,
Atish