2019-03-20 23:50:02

by Atish Patra

[permalink] [raw]
Subject: [RFT/RFC PATCH v3 0/5] Unify CPU topology across ARM & RISC-V

The cpu-map DT entry in ARM can describe the CPU topology in much better
way compared to other existing approaches. RISC-V can easily adopt this
binding to represent its own CPU topology. Thus, both cpu-map DT
binding and topology parsing code can be moved to a common location so
that RISC-V or any other architecture can leverage that.

The relevant discussion regarding unifying cpu topology can be found in
[1].

arch_topology seems to be a perfect place to move the common code. I
have not introduced any significant functional changes in the moved code.
The only downside in this approach is that the capacity code will be
executed for RISC-V as well. But, it will exit immediately after not
able to find the appropriate DT node. If the overhead is considered too
much, we can always compile out capacity related functions under a
different config for the architectures that do not support them.

There was an opportunity to unify topology data structure for ARM32 done
by patch 3/4. But, I refrained from making any other changes as I am not
very well versed with original intention for some functions that
are present in arch_topology.c. I hope this patch series can be served
as a baseline for such changes in the future.

The patches have been tested for RISC-V and compile tested for ARM64,
ARM32 & x86.

The socket change[2] is also now part of this series.

[1] https://lkml.org/lkml/2018/11/6/19
[2] https://lkml.org/lkml/2018/11/7/918

QEMU changes for RISC-V topology are available at

https://github.com/atishp04/qemu/tree/riscv_topology_dt

HiFive Unleashed DT with topology node is available here.
https://github.com/atishp04/opensbi/tree/HiFive_unleashed_topology

It can be verified with OpenSBI with following additional compile time
option.

FW_PAYLOAD_FDT="unleashed_topology.dtb"

Changes from v2->v3
1. Cover letter update with experiment DT for topology changes.
2. Added the patch for [2].

Changes from v1->v2
1. ARM32 can now use the common code as well.

Atish Patra (4):
dt-binding: cpu-topology: Move cpu-map to a common binding.
cpu-topology: Move cpu topology code to common code.
arm: Use common cpu_topology
RISC-V: Parse cpu topology during boot.

Sudeep Holla (1):
Documentation: DT: arm: add support for sockets defining package
boundaries

.../topology.txt => cpu/cpu-topology.txt} | 134 ++++++--
arch/arm/include/asm/topology.h | 22 +-
arch/arm/kernel/topology.c | 10 +-
arch/arm64/include/asm/topology.h | 23 --
arch/arm64/kernel/topology.c | 303 +-----------------
arch/riscv/Kconfig | 1 +
arch/riscv/kernel/smpboot.c | 3 +
drivers/base/arch_topology.c | 298 ++++++++++++++++-
drivers/base/topology.c | 1 +
include/linux/arch_topology.h | 36 +++
10 files changed, 453 insertions(+), 378 deletions(-)
rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (66%)

--
2.21.0



2019-03-20 23:49:10

by Atish Patra

[permalink] [raw]
Subject: [RFT/RFC PATCH v3 2/5] dt-binding: cpu-topology: Move cpu-map to a common binding.

cpu-map binding can be used to described cpu topology for both
RISC-V & ARM. It makes more sense to move the binding to document
to a common place.

The relevant discussion can be found here.
https://lkml.org/lkml/2018/11/6/19

Signed-off-by: Atish Patra <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
---
.../topology.txt => cpu/cpu-topology.txt} | 82 +++++++++++++++----
1 file changed, 66 insertions(+), 16 deletions(-)
rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (86%)

diff --git a/Documentation/devicetree/bindings/arm/topology.txt b/Documentation/devicetree/bindings/cpu/cpu-topology.txt
similarity index 86%
rename from Documentation/devicetree/bindings/arm/topology.txt
rename to Documentation/devicetree/bindings/cpu/cpu-topology.txt
index 3b8febb4..069addcc 100644
--- a/Documentation/devicetree/bindings/arm/topology.txt
+++ b/Documentation/devicetree/bindings/cpu/cpu-topology.txt
@@ -1,12 +1,12 @@
===========================================
-ARM topology binding description
+CPU topology binding description
===========================================

===========================================
1 - Introduction
===========================================

-In an ARM system, the hierarchy of CPUs is defined through three entities that
+In a SMP system, the hierarchy of CPUs is defined through three entities that
are used to describe the layout of physical CPUs in the system:

- socket
@@ -14,9 +14,6 @@ are used to describe the layout of physical CPUs in the system:
- core
- thread

-The cpu nodes (bindings defined in [1]) represent the devices that
-correspond to physical CPUs and are to be mapped to the hierarchy levels.
-
The bottom hierarchy level sits at core or thread level depending on whether
symmetric multi-threading (SMT) is supported or not.

@@ -25,33 +22,31 @@ threads existing in the system and map to the hierarchy level "thread" above.
In systems where SMT is not supported "cpu" nodes represent all cores present
in the system and map to the hierarchy level "core" above.

-ARM topology bindings allow one to associate cpu nodes with hierarchical groups
+CPU topology bindings allow one to associate cpu nodes with hierarchical groups
corresponding to the system hierarchy; syntactically they are defined as device
tree nodes.

-The remainder of this document provides the topology bindings for ARM, based
-on the Devicetree Specification, available from:
+Currently, only ARM/RISC-V intend to use this cpu topology binding but it may be
+used for any other architecture as well.

-https://www.devicetree.org/specifications/
+The cpu nodes, as per bindings defined in [4], represent the devices that
+correspond to physical CPUs and are to be mapped to the hierarchy levels.

-If not stated otherwise, whenever a reference to a cpu node phandle is made its
-value must point to a cpu node compliant with the cpu node bindings as
-documented in [1].
A topology description containing phandles to cpu nodes that are not compliant
-with bindings standardized in [1] is therefore considered invalid.
+with bindings standardized in [4] is therefore considered invalid.

===========================================
2 - cpu-map node
===========================================

-The ARM CPU topology is defined within the cpu-map node, which is a direct
+The ARM/RISC-V CPU topology is defined within the cpu-map node, which is a direct
child of the cpus node and provides a container where the actual topology
nodes are listed.

- cpu-map node

- Usage: Optional - On ARM SMP systems provide CPUs topology to the OS.
- ARM uniprocessor systems do not require a topology
+ Usage: Optional - On SMP systems provide CPUs topology to the OS.
+ Uniprocessor systems do not require a topology
description and therefore should not define a
cpu-map node.

@@ -494,8 +489,63 @@ cpus {
};
};

+Example 3: HiFive Unleashed (RISC-V 64 bit, 4 core system)
+
+{
+ #address-cells = <2>;
+ #size-cells = <2>;
+ compatible = "sifive,fu540g", "sifive,fu500";
+ model = "sifive,hifive-unleashed-a00";
+
+ ...
+ cpus {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ cpu-map {
+ cluster0 {
+ core0 {
+ cpu = <&CPU1>;
+ };
+ core1 {
+ cpu = <&CPU2>;
+ };
+ core2 {
+ cpu0 = <&CPU2>;
+ };
+ core3 {
+ cpu0 = <&CPU3>;
+ };
+ };
+ };
+
+ CPU1: cpu@1 {
+ device_type = "cpu";
+ compatible = "sifive,rocket0", "riscv";
+ reg = <0x1>;
+ }
+
+ CPU2: cpu@2 {
+ device_type = "cpu";
+ compatible = "sifive,rocket0", "riscv";
+ reg = <0x2>;
+ }
+ CPU3: cpu@3 {
+ device_type = "cpu";
+ compatible = "sifive,rocket0", "riscv";
+ reg = <0x3>;
+ }
+ CPU4: cpu@4 {
+ device_type = "cpu";
+ compatible = "sifive,rocket0", "riscv";
+ reg = <0x4>;
+ }
+ }
+};
===============================================================================
[1] ARM Linux kernel documentation
Documentation/devicetree/bindings/arm/cpus.yaml
[2] Devicetree NUMA binding description
Documentation/devicetree/bindings/numa.txt
+[3] RISC-V Linux kernel documentation
+ Documentation/devicetree/bindings/riscv/cpus.txt
+[4] https://www.devicetree.org/specifications/
--
2.21.0


2019-03-20 23:49:14

by Atish Patra

[permalink] [raw]
Subject: [RFT/RFC PATCH v3 3/5] cpu-topology: Move cpu topology code to common code.

Both RISC-V & ARM64 are using cpu-map device tree to describe
their cpu topology. It's better to move the relevant code to
a common place instead of duplicate code.

Signed-off-by: Atish Patra <[email protected]>
Tested-by: Jeffrey Hugo <[email protected]>
---
arch/arm64/include/asm/topology.h | 23 ---
arch/arm64/kernel/topology.c | 303 +-----------------------------
drivers/base/arch_topology.c | 298 ++++++++++++++++++++++++++++-
drivers/base/topology.c | 1 +
include/linux/arch_topology.h | 28 +++
5 files changed, 330 insertions(+), 323 deletions(-)

diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index 0524f243..a4d945db 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -4,29 +4,6 @@

#include <linux/cpumask.h>

-struct cpu_topology {
- int thread_id;
- int core_id;
- int package_id;
- int llc_id;
- cpumask_t thread_sibling;
- cpumask_t core_sibling;
- cpumask_t llc_sibling;
-};
-
-extern struct cpu_topology cpu_topology[NR_CPUS];
-
-#define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id)
-#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
-#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
-#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
-#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
-
-void init_cpu_topology(void);
-void store_cpu_topology(unsigned int cpuid);
-void remove_cpu_topology(unsigned int cpuid);
-const struct cpumask *cpu_coregroup_mask(int cpu);
-
#ifdef CONFIG_NUMA

struct pci_bus;
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 0825c4a8..6b95c91e 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -14,250 +14,13 @@
#include <linux/acpi.h>
#include <linux/arch_topology.h>
#include <linux/cacheinfo.h>
-#include <linux/cpu.h>
-#include <linux/cpumask.h>
#include <linux/init.h>
#include <linux/percpu.h>
-#include <linux/node.h>
-#include <linux/nodemask.h>
-#include <linux/of.h>
-#include <linux/sched.h>
-#include <linux/sched/topology.h>
-#include <linux/slab.h>
-#include <linux/smp.h>
-#include <linux/string.h>

#include <asm/cpu.h>
#include <asm/cputype.h>
#include <asm/topology.h>

-static int __init get_cpu_for_node(struct device_node *node)
-{
- struct device_node *cpu_node;
- int cpu;
-
- cpu_node = of_parse_phandle(node, "cpu", 0);
- if (!cpu_node)
- return -1;
-
- cpu = of_cpu_node_to_id(cpu_node);
- if (cpu >= 0)
- topology_parse_cpu_capacity(cpu_node, cpu);
- else
- pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
-
- of_node_put(cpu_node);
- return cpu;
-}
-
-static int __init parse_core(struct device_node *core, int package_id,
- int core_id)
-{
- char name[10];
- bool leaf = true;
- int i = 0;
- int cpu;
- struct device_node *t;
-
- do {
- snprintf(name, sizeof(name), "thread%d", i);
- t = of_get_child_by_name(core, name);
- if (t) {
- leaf = false;
- cpu = get_cpu_for_node(t);
- if (cpu >= 0) {
- cpu_topology[cpu].package_id = package_id;
- cpu_topology[cpu].core_id = core_id;
- cpu_topology[cpu].thread_id = i;
- } else {
- pr_err("%pOF: Can't get CPU for thread\n",
- t);
- of_node_put(t);
- return -EINVAL;
- }
- of_node_put(t);
- }
- i++;
- } while (t);
-
- cpu = get_cpu_for_node(core);
- if (cpu >= 0) {
- if (!leaf) {
- pr_err("%pOF: Core has both threads and CPU\n",
- core);
- return -EINVAL;
- }
-
- cpu_topology[cpu].package_id = package_id;
- cpu_topology[cpu].core_id = core_id;
- } else if (leaf) {
- pr_err("%pOF: Can't get CPU for leaf core\n", core);
- return -EINVAL;
- }
-
- return 0;
-}
-
-static int __init parse_cluster(struct device_node *cluster, int depth)
-{
- char name[10];
- bool leaf = true;
- bool has_cores = false;
- struct device_node *c;
- static int package_id __initdata;
- int core_id = 0;
- int i, ret;
-
- /*
- * First check for child clusters; we currently ignore any
- * information about the nesting of clusters and present the
- * scheduler with a flat list of them.
- */
- i = 0;
- do {
- snprintf(name, sizeof(name), "cluster%d", i);
- c = of_get_child_by_name(cluster, name);
- if (c) {
- leaf = false;
- ret = parse_cluster(c, depth + 1);
- of_node_put(c);
- if (ret != 0)
- return ret;
- }
- i++;
- } while (c);
-
- /* Now check for cores */
- i = 0;
- do {
- snprintf(name, sizeof(name), "core%d", i);
- c = of_get_child_by_name(cluster, name);
- if (c) {
- has_cores = true;
-
- if (depth == 0) {
- pr_err("%pOF: cpu-map children should be clusters\n",
- c);
- of_node_put(c);
- return -EINVAL;
- }
-
- if (leaf) {
- ret = parse_core(c, package_id, core_id++);
- } else {
- pr_err("%pOF: Non-leaf cluster with core %s\n",
- cluster, name);
- ret = -EINVAL;
- }
-
- of_node_put(c);
- if (ret != 0)
- return ret;
- }
- i++;
- } while (c);
-
- if (leaf && !has_cores)
- pr_warn("%pOF: empty cluster\n", cluster);
-
- if (leaf)
- package_id++;
-
- return 0;
-}
-
-static int __init parse_dt_topology(void)
-{
- struct device_node *cn, *map;
- int ret = 0;
- int cpu;
-
- cn = of_find_node_by_path("/cpus");
- if (!cn) {
- pr_err("No CPU information found in DT\n");
- return 0;
- }
-
- /*
- * When topology is provided cpu-map is essentially a root
- * cluster with restricted subnodes.
- */
- map = of_get_child_by_name(cn, "cpu-map");
- if (!map)
- goto out;
-
- ret = parse_cluster(map, 0);
- if (ret != 0)
- goto out_map;
-
- topology_normalize_cpu_scale();
-
- /*
- * Check that all cores are in the topology; the SMP code will
- * only mark cores described in the DT as possible.
- */
- for_each_possible_cpu(cpu)
- if (cpu_topology[cpu].package_id == -1)
- ret = -EINVAL;
-
-out_map:
- of_node_put(map);
-out:
- of_node_put(cn);
- return ret;
-}
-
-/*
- * cpu topology table
- */
-struct cpu_topology cpu_topology[NR_CPUS];
-EXPORT_SYMBOL_GPL(cpu_topology);
-
-const struct cpumask *cpu_coregroup_mask(int cpu)
-{
- const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
-
- /* Find the smaller of NUMA, core or LLC siblings */
- if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
- /* not numa in package, lets use the package siblings */
- core_mask = &cpu_topology[cpu].core_sibling;
- }
- if (cpu_topology[cpu].llc_id != -1) {
- if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
- core_mask = &cpu_topology[cpu].llc_sibling;
- }
-
- return core_mask;
-}
-
-static void update_siblings_masks(unsigned int cpuid)
-{
- struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
- int cpu;
-
- /* update core and thread sibling masks */
- for_each_online_cpu(cpu) {
- cpu_topo = &cpu_topology[cpu];
-
- if (cpuid_topo->llc_id == cpu_topo->llc_id) {
- cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
- cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
- }
-
- if (cpuid_topo->package_id != cpu_topo->package_id)
- continue;
-
- cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
- cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
-
- if (cpuid_topo->core_id != cpu_topo->core_id)
- continue;
-
- cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
- cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
- }
-}
-
void store_cpu_topology(unsigned int cpuid)
{
struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
@@ -296,59 +59,19 @@ void store_cpu_topology(unsigned int cpuid)
update_siblings_masks(cpuid);
}

-static void clear_cpu_topology(int cpu)
-{
- struct cpu_topology *cpu_topo = &cpu_topology[cpu];
-
- cpumask_clear(&cpu_topo->llc_sibling);
- cpumask_set_cpu(cpu, &cpu_topo->llc_sibling);
-
- cpumask_clear(&cpu_topo->core_sibling);
- cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
- cpumask_clear(&cpu_topo->thread_sibling);
- cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
-}
-
-static void __init reset_cpu_topology(void)
-{
- unsigned int cpu;
-
- for_each_possible_cpu(cpu) {
- struct cpu_topology *cpu_topo = &cpu_topology[cpu];
-
- cpu_topo->thread_id = -1;
- cpu_topo->core_id = 0;
- cpu_topo->package_id = -1;
- cpu_topo->llc_id = -1;
-
- clear_cpu_topology(cpu);
- }
-}
-
-void remove_cpu_topology(unsigned int cpu)
-{
- int sibling;
-
- for_each_cpu(sibling, topology_core_cpumask(cpu))
- cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
- for_each_cpu(sibling, topology_sibling_cpumask(cpu))
- cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
- for_each_cpu(sibling, topology_llc_cpumask(cpu))
- cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling));
-
- clear_cpu_topology(cpu);
-}
-
#ifdef CONFIG_ACPI
/*
* Propagate the topology information of the processor_topology_node tree to the
* cpu_topology array.
*/
-static int __init parse_acpi_topology(void)
+int __init parse_acpi_topology(void)
{
bool is_threaded;
int cpu, topology_id;

+ if (acpi_disabled)
+ return 0;
+
is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;

for_each_possible_cpu(cpu) {
@@ -384,24 +107,6 @@ static int __init parse_acpi_topology(void)

return 0;
}
-
-#else
-static inline int __init parse_acpi_topology(void)
-{
- return -EINVAL;
-}
#endif

-void __init init_cpu_topology(void)
-{
- reset_cpu_topology();

- /*
- * Discard anything that was parsed if we hit an error so we
- * don't use partial information.
- */
- if (!acpi_disabled && parse_acpi_topology())
- reset_cpu_topology();
- else if (of_have_populated_dt() && parse_dt_topology())
- reset_cpu_topology();
-}
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index edfcf8d9..6cc6a860 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -6,8 +6,8 @@
* Written by: Juri Lelli, ARM Ltd.
*/

-#include <linux/acpi.h>
#include <linux/arch_topology.h>
+#include <linux/acpi.h>
#include <linux/cpu.h>
#include <linux/cpufreq.h>
#include <linux/device.h>
@@ -16,6 +16,11 @@
#include <linux/string.h>
#include <linux/sched/topology.h>
#include <linux/cpuset.h>
+#include <linux/cpumask.h>
+#include <linux/init.h>
+#include <linux/percpu.h>
+#include <linux/sched.h>
+#include <linux/smp.h>

DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;

@@ -278,3 +283,294 @@ static void parsing_done_workfn(struct work_struct *work)
#else
core_initcall(free_raw_capacity);
#endif
+
+#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
+static int __init get_cpu_for_node(struct device_node *node)
+{
+ struct device_node *cpu_node;
+ int cpu;
+
+ cpu_node = of_parse_phandle(node, "cpu", 0);
+ if (!cpu_node)
+ return -1;
+
+ cpu = of_cpu_node_to_id(cpu_node);
+ if (cpu >= 0)
+ topology_parse_cpu_capacity(cpu_node, cpu);
+ else
+ pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
+
+ of_node_put(cpu_node);
+ return cpu;
+}
+
+static int __init parse_core(struct device_node *core, int package_id,
+ int core_id)
+{
+ char name[10];
+ bool leaf = true;
+ int i = 0;
+ int cpu;
+ struct device_node *t;
+
+ do {
+ snprintf(name, sizeof(name), "thread%d", i);
+ t = of_get_child_by_name(core, name);
+ if (t) {
+ leaf = false;
+ cpu = get_cpu_for_node(t);
+ if (cpu >= 0) {
+ cpu_topology[cpu].package_id = package_id;
+ cpu_topology[cpu].core_id = core_id;
+ cpu_topology[cpu].thread_id = i;
+ } else {
+ pr_err("%pOF: Can't get CPU for thread\n",
+ t);
+ of_node_put(t);
+ return -EINVAL;
+ }
+ of_node_put(t);
+ }
+ i++;
+ } while (t);
+
+ cpu = get_cpu_for_node(core);
+ if (cpu >= 0) {
+ if (!leaf) {
+ pr_err("%pOF: Core has both threads and CPU\n",
+ core);
+ return -EINVAL;
+ }
+
+ cpu_topology[cpu].package_id = package_id;
+ cpu_topology[cpu].core_id = core_id;
+ } else if (leaf) {
+ pr_err("%pOF: Can't get CPU for leaf core\n", core);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int __init parse_cluster(struct device_node *cluster, int depth)
+{
+ char name[10];
+ bool leaf = true;
+ bool has_cores = false;
+ struct device_node *c;
+ static int package_id __initdata;
+ int core_id = 0;
+ int i, ret;
+
+ /*
+ * First check for child clusters; we currently ignore any
+ * information about the nesting of clusters and present the
+ * scheduler with a flat list of them.
+ */
+ i = 0;
+ do {
+ snprintf(name, sizeof(name), "cluster%d", i);
+ c = of_get_child_by_name(cluster, name);
+ if (c) {
+ leaf = false;
+ ret = parse_cluster(c, depth + 1);
+ of_node_put(c);
+ if (ret != 0)
+ return ret;
+ }
+ i++;
+ } while (c);
+
+ /* Now check for cores */
+ i = 0;
+ do {
+ snprintf(name, sizeof(name), "core%d", i);
+ c = of_get_child_by_name(cluster, name);
+ if (c) {
+ has_cores = true;
+
+ if (depth == 0) {
+ pr_err("%pOF: cpu-map children should be clusters\n",
+ c);
+ of_node_put(c);
+ return -EINVAL;
+ }
+
+ if (leaf) {
+ ret = parse_core(c, package_id, core_id++);
+ } else {
+ pr_err("%pOF: Non-leaf cluster with core %s\n",
+ cluster, name);
+ ret = -EINVAL;
+ }
+
+ of_node_put(c);
+ if (ret != 0)
+ return ret;
+ }
+ i++;
+ } while (c);
+
+ if (leaf && !has_cores)
+ pr_warn("%pOF: empty cluster\n", cluster);
+
+ if (leaf)
+ package_id++;
+
+ return 0;
+}
+
+static int __init parse_dt_topology(void)
+{
+ struct device_node *cn, *map;
+ int ret = 0;
+ int cpu;
+
+ cn = of_find_node_by_path("/cpus");
+ if (!cn) {
+ pr_err("No CPU information found in DT\n");
+ return 0;
+ }
+
+ /*
+ * When topology is provided cpu-map is essentially a root
+ * cluster with restricted subnodes.
+ */
+ map = of_get_child_by_name(cn, "cpu-map");
+ if (!map)
+ goto out;
+
+ ret = parse_cluster(map, 0);
+ if (ret != 0)
+ goto out_map;
+
+ topology_normalize_cpu_scale();
+
+ /*
+ * Check that all cores are in the topology; the SMP code will
+ * only mark cores described in the DT as possible.
+ */
+ for_each_possible_cpu(cpu)
+ if (cpu_topology[cpu].package_id == -1)
+ ret = -EINVAL;
+
+out_map:
+ of_node_put(map);
+out:
+ of_node_put(cn);
+ return ret;
+}
+
+/*
+ * cpu topology table
+ */
+struct cpu_topology cpu_topology[NR_CPUS];
+EXPORT_SYMBOL_GPL(cpu_topology);
+
+const struct cpumask *cpu_coregroup_mask(int cpu)
+{
+ const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
+
+ /* Find the smaller of NUMA, core or LLC siblings */
+ if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
+ /* not numa in package, lets use the package siblings */
+ core_mask = &cpu_topology[cpu].core_sibling;
+ }
+ if (cpu_topology[cpu].llc_id != -1) {
+ if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
+ core_mask = &cpu_topology[cpu].llc_sibling;
+ }
+
+ return core_mask;
+}
+
+void update_siblings_masks(unsigned int cpuid)
+{
+ struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
+ int cpu;
+
+ /* update core and thread sibling masks */
+ for_each_online_cpu(cpu) {
+ cpu_topo = &cpu_topology[cpu];
+
+ if (cpuid_topo->llc_id == cpu_topo->llc_id) {
+ cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
+ cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
+ }
+
+ if (cpuid_topo->package_id != cpu_topo->package_id)
+ continue;
+
+ cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
+ cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
+
+ if (cpuid_topo->core_id != cpu_topo->core_id)
+ continue;
+
+ cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
+ cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
+ }
+}
+
+static void clear_cpu_topology(int cpu)
+{
+ struct cpu_topology *cpu_topo = &cpu_topology[cpu];
+
+ cpumask_clear(&cpu_topo->llc_sibling);
+ cpumask_set_cpu(cpu, &cpu_topo->llc_sibling);
+
+ cpumask_clear(&cpu_topo->core_sibling);
+ cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
+ cpumask_clear(&cpu_topo->thread_sibling);
+ cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
+}
+
+static void __init reset_cpu_topology(void)
+{
+ unsigned int cpu;
+
+ for_each_possible_cpu(cpu) {
+ struct cpu_topology *cpu_topo = &cpu_topology[cpu];
+
+ cpu_topo->thread_id = -1;
+ cpu_topo->core_id = 0;
+ cpu_topo->package_id = -1;
+ cpu_topo->llc_id = -1;
+
+ clear_cpu_topology(cpu);
+ }
+}
+
+void remove_cpu_topology(unsigned int cpu)
+{
+ int sibling;
+
+ for_each_cpu(sibling, topology_core_cpumask(cpu))
+ cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
+ for_each_cpu(sibling, topology_sibling_cpumask(cpu))
+ cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
+ for_each_cpu(sibling, topology_llc_cpumask(cpu))
+ cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling));
+
+ clear_cpu_topology(cpu);
+}
+
+__weak int __init parse_acpi_topology(void)
+{
+ return 0;
+}
+
+void __init init_cpu_topology(void)
+{
+ reset_cpu_topology();
+
+ /*
+ * Discard anything that was parsed if we hit an error so we
+ * don't use partial information.
+ */
+ if (parse_acpi_topology())
+ reset_cpu_topology();
+ else if (of_have_populated_dt() && parse_dt_topology())
+ reset_cpu_topology();
+}
+#endif
diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index 5fd9f167..c17d5434 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -8,6 +8,7 @@
*
* All rights reserved.
*/
+#include <linux/arch_topology.h>
#include <linux/mm.h>
#include <linux/cpu.h>
#include <linux/module.h>
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index d9bdc1a7..d4e76e0a 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -33,4 +33,32 @@ unsigned long topology_get_freq_scale(int cpu)
return per_cpu(freq_scale, cpu);
}

+struct cpu_topology {
+ int thread_id;
+ int core_id;
+ int package_id;
+ int llc_id;
+ cpumask_t thread_sibling;
+ cpumask_t core_sibling;
+ cpumask_t llc_sibling;
+};
+
+#ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
+extern struct cpu_topology cpu_topology[NR_CPUS];
+
+#define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id)
+#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
+#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
+#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
+#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
+void init_cpu_topology(void);
+void store_cpu_topology(unsigned int cpuid);
+const struct cpumask *cpu_coregroup_mask(int cpu);
+#endif
+
+#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
+void update_siblings_masks(unsigned int cpu);
+#endif
+void remove_cpu_topology(unsigned int cpuid);
+
#endif /* _LINUX_ARCH_TOPOLOGY_H_ */
--
2.21.0


2019-03-20 23:49:34

by Atish Patra

[permalink] [raw]
Subject: [RFT/RFC PATCH v3 5/5] RISC-V: Parse cpu topology during boot.

Currently, there are no topology defined for RISC-V.
Parse the cpu-map node from device tree and setup the
cpu topology.

CPU topology after applying the patch.
$cat /sys/devices/system/cpu/cpu2/topology/core_siblings_list
0-3
$cat /sys/devices/system/cpu/cpu3/topology/core_siblings_list
0-3
$cat /sys/devices/system/cpu/cpu3/topology/physical_package_id
0
$cat /sys/devices/system/cpu/cpu3/topology/core_id
3

Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/Kconfig | 1 +
arch/riscv/kernel/smpboot.c | 3 +++
2 files changed, 4 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index eb56c82d..ac87a0ec 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -47,6 +47,7 @@ config RISCV
select PCI_MSI if PCI
select RISCV_TIMER
select GENERIC_IRQ_MULTI_HANDLER
+ select GENERIC_ARCH_TOPOLOGY if SMP
select ARCH_HAS_PTE_SPECIAL
select HAVE_EBPF_JIT if 64BIT

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index f41eb193..a8fe590c 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -16,6 +16,7 @@
* GNU General Public License for more details.
*/

+#include <linux/arch_topology.h>
#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>
@@ -43,6 +44,7 @@ static DECLARE_COMPLETION(cpu_running);

void __init smp_prepare_boot_cpu(void)
{
+ init_cpu_topology();
}

void __init smp_prepare_cpus(unsigned int max_cpus)
@@ -146,6 +148,7 @@ asmlinkage void __init smp_callin(void)

trap_init();
notify_cpu_starting(smp_processor_id());
+ update_siblings_masks(smp_processor_id());
set_cpu_online(smp_processor_id(), 1);
/*
* Remote TLB flushes are ignored while the CPU is offline, so emit
--
2.21.0


2019-03-20 23:49:44

by Atish Patra

[permalink] [raw]
Subject: [RFT/RFC PATCH v3 4/5] arm: Use common cpu_topology

Currently, ARM32 and ARM64 uses different data structures to
represent their cpu toplogies. Since, we are moving the ARM64
topology to common code to be used by other architectures, we
can reuse that for ARM32 as well.

Signed-off-by: Atish Patra <[email protected]>
---
arch/arm/include/asm/topology.h | 22 +---------------------
arch/arm/kernel/topology.c | 10 +++++-----
include/linux/arch_topology.h | 10 +++++++++-
3 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
index 2a786f54..52f99ec8 100644
--- a/arch/arm/include/asm/topology.h
+++ b/arch/arm/include/asm/topology.h
@@ -5,26 +5,6 @@
#ifdef CONFIG_ARM_CPU_TOPOLOGY

#include <linux/cpumask.h>
-
-struct cputopo_arm {
- int thread_id;
- int core_id;
- int socket_id;
- cpumask_t thread_sibling;
- cpumask_t core_sibling;
-};
-
-extern struct cputopo_arm cpu_topology[NR_CPUS];
-
-#define topology_physical_package_id(cpu) (cpu_topology[cpu].socket_id)
-#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
-#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
-#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
-
-void init_cpu_topology(void);
-void store_cpu_topology(unsigned int cpuid);
-const struct cpumask *cpu_coregroup_mask(int cpu);
-
#include <linux/arch_topology.h>

/* Replace task scheduler's default frequency-invariant accounting */
@@ -38,7 +18,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu);

#else

-static inline void init_cpu_topology(void) { }
+static inline void init_cpu_topology(void) {}
static inline void store_cpu_topology(unsigned int cpuid) { }

#endif
diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 60e375ce..0ddb24c7 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -180,7 +180,7 @@ static inline void update_cpu_capacity(unsigned int cpuid) {}
/*
* cpu topology table
*/
-struct cputopo_arm cpu_topology[NR_CPUS];
+struct cpu_topology cpu_topology[NR_CPUS];
EXPORT_SYMBOL_GPL(cpu_topology);

const struct cpumask *cpu_coregroup_mask(int cpu)
@@ -197,9 +197,9 @@ const struct cpumask *cpu_corepower_mask(int cpu)
return &cpu_topology[cpu].thread_sibling;
}

-static void update_siblings_masks(unsigned int cpuid)
+void update_siblings_masks(unsigned int cpuid)
{
- struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
+ struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
int cpu;

/* update core and thread sibling masks */
@@ -230,7 +230,7 @@ static void update_siblings_masks(unsigned int cpuid)
*/
void store_cpu_topology(unsigned int cpuid)
{
- struct cputopo_arm *cpuid_topo = &cpu_topology[cpuid];
+ struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
unsigned int mpidr;

/* If the cpu topology has been already set, just return */
@@ -302,7 +302,7 @@ void __init init_cpu_topology(void)

/* init core mask and capacity */
for_each_possible_cpu(cpu) {
- struct cputopo_arm *cpu_topo = &(cpu_topology[cpu]);
+ struct cpu_topology *cpu_topo = &(cpu_topology[cpu]);

cpu_topo->thread_id = -1;
cpu_topo->core_id = -1;
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index d4e76e0a..7c850611 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -36,17 +36,25 @@ unsigned long topology_get_freq_scale(int cpu)
struct cpu_topology {
int thread_id;
int core_id;
+#ifdef CONFIG_ARM_CPU_TOPOLOGY
+ int socket_id;
+#else
int package_id;
int llc_id;
+ cpumask_t llc_sibling;
+#endif
cpumask_t thread_sibling;
cpumask_t core_sibling;
- cpumask_t llc_sibling;
};

#ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
extern struct cpu_topology cpu_topology[NR_CPUS];

+#ifdef CONFIG_ARM_CPU_TOPOLOGY
+#define topology_physical_package_id(cpu) (cpu_topology[cpu].socket_id)
+#else
#define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id)
+#endif
#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
--
2.21.0


2019-03-20 23:50:24

by Atish Patra

[permalink] [raw]
Subject: [RFT/RFC PATCH v3 1/5] Documentation: DT: arm: add support for sockets defining package boundaries

From: Sudeep Holla <[email protected]>

The current ARM DT topology description provides the operating system
with a topological view of the system that is based on leaf nodes
representing either cores or threads (in an SMT system) and a
hierarchical set of cluster nodes that creates a hierarchical topology
view of how those cores and threads are grouped.

However this hierarchical representation of clusters does not allow to
describe what topology level actually represents the physical package or
the socket boundary, which is a key piece of information to be used by
an operating system to optimize resource allocation and scheduling.

Lets add a new "socket" node type in the cpu-map node to describe the
same.

Signed-off-by: Sudeep Holla <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
---
.../devicetree/bindings/arm/topology.txt | 52 ++++++++++++++-----
1 file changed, 39 insertions(+), 13 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/topology.txt b/Documentation/devicetree/bindings/arm/topology.txt
index b0d80c0f..3b8febb4 100644
--- a/Documentation/devicetree/bindings/arm/topology.txt
+++ b/Documentation/devicetree/bindings/arm/topology.txt
@@ -9,6 +9,7 @@ ARM topology binding description
In an ARM system, the hierarchy of CPUs is defined through three entities that
are used to describe the layout of physical CPUs in the system:

+- socket
- cluster
- core
- thread
@@ -63,21 +64,23 @@ nodes are listed.

The cpu-map node's child nodes can be:

- - one or more cluster nodes
+ - one or more cluster nodes or
+ - one or more socket nodes in a multi-socket system

Any other configuration is considered invalid.

-The cpu-map node can only contain three types of child nodes:
+The cpu-map node can only contain 4 types of child nodes:

+- socket node
- cluster node
- core node
- thread node

whose bindings are described in paragraph 3.

-The nodes describing the CPU topology (cluster/core/thread) can only
-be defined within the cpu-map node and every core/thread in the system
-must be defined within the topology. Any other configuration is
+The nodes describing the CPU topology (socket/cluster/core/thread) can
+only be defined within the cpu-map node and every core/thread in the
+system must be defined within the topology. Any other configuration is
invalid and therefore must be ignored.

===========================================
@@ -85,26 +88,44 @@ invalid and therefore must be ignored.
===========================================

cpu-map child nodes must follow a naming convention where the node name
-must be "clusterN", "coreN", "threadN" depending on the node type (ie
-cluster/core/thread) (where N = {0, 1, ...} is the node number; nodes which
-are siblings within a single common parent node must be given a unique and
+must be "socketN", "clusterN", "coreN", "threadN" depending on the node type
+(ie socket/cluster/core/thread) (where N = {0, 1, ...} is the node number; nodes
+which are siblings within a single common parent node must be given a unique and
sequential N value, starting from 0).
cpu-map child nodes which do not share a common parent node can have the same
name (ie same number N as other cpu-map child nodes at different device tree
levels) since name uniqueness will be guaranteed by the device tree hierarchy.

===========================================
-3 - cluster/core/thread node bindings
+3 - socket/cluster/core/thread node bindings
===========================================

-Bindings for cluster/cpu/thread nodes are defined as follows:
+Bindings for socket/cluster/cpu/thread nodes are defined as follows:
+
+- socket node
+
+ Description: must be declared within a cpu-map node, one node
+ per physical socket in the system. A system can
+ contain single or multiple physical socket.
+ The association of sockets and NUMA nodes is beyond
+ the scope of this bindings, please refer [2] for
+ NUMA bindings.
+
+ This node is optional for a single socket system.
+
+ The socket node name must be "socketN" as described in 2.1 above.
+ A socket node can not be a leaf node.
+
+ A socket node's child nodes must be one or more cluster nodes.
+
+ Any other configuration is considered invalid.

- cluster node

Description: must be declared within a cpu-map node, one node
per cluster. A system can contain several layers of
- clustering and cluster nodes can be contained in parent
- cluster nodes.
+ clustering within a single physical socket and cluster
+ nodes can be contained in parent cluster nodes.

The cluster node name must be "clusterN" as described in 2.1 above.
A cluster node can not be a leaf node.
@@ -164,13 +185,15 @@ Bindings for cluster/cpu/thread nodes are defined as follows:
4 - Example dts
===========================================

-Example 1 (ARM 64-bit, 16-cpu system, two clusters of clusters):
+Example 1 (ARM 64-bit, 16-cpu system, two clusters of clusters in a single
+physical socket):

cpus {
#size-cells = <0>;
#address-cells = <2>;

cpu-map {
+ socket0 {
cluster0 {
cluster0 {
core0 {
@@ -253,6 +276,7 @@ cpus {
};
};
};
+ };

CPU0: cpu@0 {
device_type = "cpu";
@@ -473,3 +497,5 @@ cpus {
===============================================================================
[1] ARM Linux kernel documentation
Documentation/devicetree/bindings/arm/cpus.yaml
+[2] Devicetree NUMA binding description
+ Documentation/devicetree/bindings/numa.txt
--
2.21.0


2019-03-24 21:17:37

by Rob Herring

[permalink] [raw]
Subject: Re: [RFT/RFC PATCH v3 2/5] dt-binding: cpu-topology: Move cpu-map to a common binding.

On Wed, Mar 20, 2019 at 6:48 PM Atish Patra <[email protected]> wrote:
>
> cpu-map binding can be used to described cpu topology for both
> RISC-V & ARM. It makes more sense to move the binding to document
> to a common place.
>
> The relevant discussion can be found here.
> https://lkml.org/lkml/2018/11/6/19
>
> Signed-off-by: Atish Patra <[email protected]>
> Reviewed-by: Sudeep Holla <[email protected]>
> ---
> .../topology.txt => cpu/cpu-topology.txt} | 82 +++++++++++++++----
> 1 file changed, 66 insertions(+), 16 deletions(-)
> rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (86%)

Reviewed-by: Rob Herring <[email protected]>

2019-04-10 22:51:02

by Atish Patra

[permalink] [raw]
Subject: Re: [RFT/RFC PATCH v3 0/5] Unify CPU topology across ARM & RISC-V

On 3/20/19 4:48 PM, Atish Patra wrote:
> The cpu-map DT entry in ARM can describe the CPU topology in much better
> way compared to other existing approaches. RISC-V can easily adopt this
> binding to represent its own CPU topology. Thus, both cpu-map DT
> binding and topology parsing code can be moved to a common location so
> that RISC-V or any other architecture can leverage that.
>
> The relevant discussion regarding unifying cpu topology can be found in
> [1].
>
> arch_topology seems to be a perfect place to move the common code. I
> have not introduced any significant functional changes in the moved code.
> The only downside in this approach is that the capacity code will be
> executed for RISC-V as well. But, it will exit immediately after not
> able to find the appropriate DT node. If the overhead is considered too
> much, we can always compile out capacity related functions under a
> different config for the architectures that do not support them.
>
> There was an opportunity to unify topology data structure for ARM32 done
> by patch 3/4. But, I refrained from making any other changes as I am not
> very well versed with original intention for some functions that
> are present in arch_topology.c. I hope this patch series can be served
> as a baseline for such changes in the future.
>
> The patches have been tested for RISC-V and compile tested for ARM64,
> ARM32 & x86.
>
> The socket change[2] is also now part of this series.
>
> [1] https://lkml.org/lkml/2018/11/6/19
> [2] https://lkml.org/lkml/2018/11/7/918
>
> QEMU changes for RISC-V topology are available at
>
> https://github.com/atishp04/qemu/tree/riscv_topology_dt
>
> HiFive Unleashed DT with topology node is available here.
> https://github.com/atishp04/opensbi/tree/HiFive_unleashed_topology
>
> It can be verified with OpenSBI with following additional compile time
> option.
>
> FW_PAYLOAD_FDT="unleashed_topology.dtb"
>
> Changes from v2->v3
> 1. Cover letter update with experiment DT for topology changes.
> 2. Added the patch for [2].
>
> Changes from v1->v2
> 1. ARM32 can now use the common code as well.
>
> Atish Patra (4):
> dt-binding: cpu-topology: Move cpu-map to a common binding.
> cpu-topology: Move cpu topology code to common code.
> arm: Use common cpu_topology
> RISC-V: Parse cpu topology during boot.
>
> Sudeep Holla (1):
> Documentation: DT: arm: add support for sockets defining package
> boundaries
>
> .../topology.txt => cpu/cpu-topology.txt} | 134 ++++++--
> arch/arm/include/asm/topology.h | 22 +-
> arch/arm/kernel/topology.c | 10 +-
> arch/arm64/include/asm/topology.h | 23 --
> arch/arm64/kernel/topology.c | 303 +-----------------
> arch/riscv/Kconfig | 1 +
> arch/riscv/kernel/smpboot.c | 3 +
> drivers/base/arch_topology.c | 298 ++++++++++++++++-
> drivers/base/topology.c | 1 +
> include/linux/arch_topology.h | 36 +++
> 10 files changed, 453 insertions(+), 378 deletions(-)
> rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (66%)
>
> --
> 2.21.0
>
>

Ping?

Specifically, patch 3 & 4 affects ARM & ARM64. Any tests on real
hardware would be great.

Regards,
Atish

2019-04-12 17:29:10

by Sudeep Holla

[permalink] [raw]
Subject: Re: [RFT/RFC PATCH v3 0/5] Unify CPU topology across ARM & RISC-V

On Wed, Apr 10, 2019 at 03:49:54PM -0700, Atish Patra wrote:

[...]

>
> Ping?
>

Sorry for the delay(was off for a while), will take a look next week.

> Specifically, patch 3 & 4 affects ARM & ARM64. Any tests on real hardware
> would be great.
>

Sure.

--
Regards,
Sudeep

2019-04-15 15:30:06

by Sudeep Holla

[permalink] [raw]
Subject: Re: [RFT/RFC PATCH v3 3/5] cpu-topology: Move cpu topology code to common code.

Hi Atish,

Thanks again for doing this. Overall changes look good except a couple
of minor nit, see below.

On Wed, Mar 20, 2019 at 04:48:04PM -0700, Atish Patra wrote:
> Both RISC-V & ARM64 are using cpu-map device tree to describe
> their cpu topology. It's better to move the relevant code to
> a common place instead of duplicate code.
>
> Signed-off-by: Atish Patra <[email protected]>
> Tested-by: Jeffrey Hugo <[email protected]>
> ---
> arch/arm64/include/asm/topology.h | 23 ---
> arch/arm64/kernel/topology.c | 303 +-----------------------------
> drivers/base/arch_topology.c | 298 ++++++++++++++++++++++++++++-
> drivers/base/topology.c | 1 +
> include/linux/arch_topology.h | 28 +++
> 5 files changed, 330 insertions(+), 323 deletions(-)
>

[...]

> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index edfcf8d9..6cc6a860 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -6,8 +6,8 @@
> * Written by: Juri Lelli, ARM Ltd.
> */
>
> -#include <linux/acpi.h>
> #include <linux/arch_topology.h>
> +#include <linux/acpi.h>
> #include <linux/cpu.h>
> #include <linux/cpufreq.h>
> #include <linux/device.h>
> @@ -16,6 +16,11 @@
> #include <linux/string.h>
> #include <linux/sched/topology.h>
> #include <linux/cpuset.h>
> +#include <linux/cpumask.h>
> +#include <linux/init.h>
> +#include <linux/percpu.h>
> +#include <linux/sched.h>
> +#include <linux/smp.h>
>
> DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
>
> @@ -278,3 +283,294 @@ static void parsing_done_workfn(struct work_struct *work)
> #else
> core_initcall(free_raw_capacity);
> #endif
> +
> +#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)

Why can't the above one be just GENERIC_ARCH_TOPOLOGY ?
I may be missing to find it myself, but would like to know.

> +
> +#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)

Ditto.

--
Regards,
Sudeep

2019-04-15 15:32:55

by Sudeep Holla

[permalink] [raw]
Subject: Re: [RFT/RFC PATCH v3 4/5] arm: Use common cpu_topology

On Wed, Mar 20, 2019 at 04:48:05PM -0700, Atish Patra wrote:
> Currently, ARM32 and ARM64 uses different data structures to
> represent their cpu toplogies. Since, we are moving the ARM64
> topology to common code to be used by other architectures, we
> can reuse that for ARM32 as well.
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/arm/include/asm/topology.h | 22 +---------------------
> arch/arm/kernel/topology.c | 10 +++++-----
> include/linux/arch_topology.h | 10 +++++++++-
> 3 files changed, 15 insertions(+), 27 deletions(-)
>

[...]

> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index d4e76e0a..7c850611 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -36,17 +36,25 @@ unsigned long topology_get_freq_scale(int cpu)
> struct cpu_topology {
> int thread_id;
> int core_id;
> +#ifdef CONFIG_ARM_CPU_TOPOLOGY
> + int socket_id;

Sorry, but I can't find any reason why we need to do this ifdef dance
here, especially for socket_id vs package_id ? Other's I can understand
as there are new, but I am sure we can find a way and get away with
#ifdefery here completely.

> +#else
> int package_id;
> int llc_id;
> + cpumask_t llc_sibling;
> +#endif
> cpumask_t thread_sibling;
> cpumask_t core_sibling;
> - cpumask_t llc_sibling;
> };
>
> #ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
> extern struct cpu_topology cpu_topology[NR_CPUS];
>
> +#ifdef CONFIG_ARM_CPU_TOPOLOGY
> +#define topology_physical_package_id(cpu) (cpu_topology[cpu].socket_id)
> +#else
> #define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id)
> +#endif

Since all callsites must use topology_physical_package_id, we should be
able to rename socket_id to package_id easily.

--
Regards,
Sudeep

2019-04-15 21:19:01

by Atish Patra

[permalink] [raw]
Subject: Re: [RFT/RFC PATCH v3 4/5] arm: Use common cpu_topology

On 4/15/19 8:31 AM, Sudeep Holla wrote:
> On Wed, Mar 20, 2019 at 04:48:05PM -0700, Atish Patra wrote:
>> Currently, ARM32 and ARM64 uses different data structures to
>> represent their cpu toplogies. Since, we are moving the ARM64
>> topology to common code to be used by other architectures, we
>> can reuse that for ARM32 as well.
>>
>> Signed-off-by: Atish Patra <[email protected]>
>> ---
>> arch/arm/include/asm/topology.h | 22 +---------------------
>> arch/arm/kernel/topology.c | 10 +++++-----
>> include/linux/arch_topology.h | 10 +++++++++-
>> 3 files changed, 15 insertions(+), 27 deletions(-)
>>
>
> [...]
>
>> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
>> index d4e76e0a..7c850611 100644
>> --- a/include/linux/arch_topology.h
>> +++ b/include/linux/arch_topology.h
>> @@ -36,17 +36,25 @@ unsigned long topology_get_freq_scale(int cpu)
>> struct cpu_topology {
>> int thread_id;
>> int core_id;
>> +#ifdef CONFIG_ARM_CPU_TOPOLOGY
>> + int socket_id;
>
> Sorry, but I can't find any reason why we need to do this ifdef dance
> here, especially for socket_id vs package_id ?

I was not sure if we can rename socket_id to package_id from a semantic
point of view. If you are okay with it, I will change it to package_id
and send a v4.

Other's I can understand
> as there are new, but I am sure we can find a way and get away with
> #ifdefery here completely.
>
That would be good. Any suggestions on how to do that?

>> +#else
>> int package_id;
>> int llc_id;
>> + cpumask_t llc_sibling;
>> +#endif
>> cpumask_t thread_sibling;
>> cpumask_t core_sibling;
>> - cpumask_t llc_sibling;
>> };
>>
>> #ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
>> extern struct cpu_topology cpu_topology[NR_CPUS];
>>
>> +#ifdef CONFIG_ARM_CPU_TOPOLOGY
>> +#define topology_physical_package_id(cpu) (cpu_topology[cpu].socket_id)
>> +#else
>> #define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id)
>> +#endif
>
> Since all callsites must use topology_physical_package_id, we should be
> able to rename socket_id to package_id easily.
>
Sure.

Regards,
Atish
> --
> Regards,
> Sudeep
>

2019-04-15 22:10:18

by Atish Patra

[permalink] [raw]
Subject: Re: [RFT/RFC PATCH v3 3/5] cpu-topology: Move cpu topology code to common code.

On 4/15/19 8:27 AM, Sudeep Holla wrote:
> Hi Atish,
>
> Thanks again for doing this. Overall changes look good except a couple
> of minor nit, see below.
>
> On Wed, Mar 20, 2019 at 04:48:04PM -0700, Atish Patra wrote:
>> Both RISC-V & ARM64 are using cpu-map device tree to describe
>> their cpu topology. It's better to move the relevant code to
>> a common place instead of duplicate code.
>>
>> Signed-off-by: Atish Patra <[email protected]>
>> Tested-by: Jeffrey Hugo <[email protected]>
>> ---
>> arch/arm64/include/asm/topology.h | 23 ---
>> arch/arm64/kernel/topology.c | 303 +-----------------------------
>> drivers/base/arch_topology.c | 298 ++++++++++++++++++++++++++++-
>> drivers/base/topology.c | 1 +
>> include/linux/arch_topology.h | 28 +++
>> 5 files changed, 330 insertions(+), 323 deletions(-)
>>
>
> [...]
>
>> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
>> index edfcf8d9..6cc6a860 100644
>> --- a/drivers/base/arch_topology.c
>> +++ b/drivers/base/arch_topology.c
>> @@ -6,8 +6,8 @@
>> * Written by: Juri Lelli, ARM Ltd.
>> */
>>
>> -#include <linux/acpi.h>
>> #include <linux/arch_topology.h>
>> +#include <linux/acpi.h>
>> #include <linux/cpu.h>
>> #include <linux/cpufreq.h>
>> #include <linux/device.h>
>> @@ -16,6 +16,11 @@
>> #include <linux/string.h>
>> #include <linux/sched/topology.h>
>> #include <linux/cpuset.h>
>> +#include <linux/cpumask.h>
>> +#include <linux/init.h>
>> +#include <linux/percpu.h>
>> +#include <linux/sched.h>
>> +#include <linux/smp.h>
>>
>> DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
>>
>> @@ -278,3 +283,294 @@ static void parsing_done_workfn(struct work_struct *work)
>> #else
>> core_initcall(free_raw_capacity);
>> #endif
>> +
>> +#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
>
> Why can't the above one be just GENERIC_ARCH_TOPOLOGY ?
> I may be missing to find it myself, but would like to know.
>
GENERIC_ARCH_TOPOLOGY is now used for both RISCV, ARM & ARM64.
The below functions under this #ifdef have different implementation for
ARM and ARM64.

parse_dt_topology
cpu_coregroup_mask
update_siblings_masks

While we can combine the later two functions and move them to common
code as well, parse_dt_topology is significantly different.

That's why we need some kind of #ifdef or renaming of parse_dt_topology
for ARM32 code.

Thanks for the review!!

Regards,
Atish
>> +
>> +#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
>
> Ditto.
>
> --
> Regards,
> Sudeep
>

2019-04-16 13:10:57

by Sudeep Holla

[permalink] [raw]
Subject: Re: [RFT/RFC PATCH v3 4/5] arm: Use common cpu_topology

On Mon, Apr 15, 2019 at 02:16:43PM -0700, Atish Patra wrote:
> On 4/15/19 8:31 AM, Sudeep Holla wrote:
> > On Wed, Mar 20, 2019 at 04:48:05PM -0700, Atish Patra wrote:
> > > Currently, ARM32 and ARM64 uses different data structures to
> > > represent their cpu toplogies. Since, we are moving the ARM64
> > > topology to common code to be used by other architectures, we
> > > can reuse that for ARM32 as well.
> > >
> > > Signed-off-by: Atish Patra <[email protected]>
> > > ---
> > > arch/arm/include/asm/topology.h | 22 +---------------------
> > > arch/arm/kernel/topology.c | 10 +++++-----
> > > include/linux/arch_topology.h | 10 +++++++++-
> > > 3 files changed, 15 insertions(+), 27 deletions(-)
> > >
> >
> > [...]
> >
> > > diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> > > index d4e76e0a..7c850611 100644
> > > --- a/include/linux/arch_topology.h
> > > +++ b/include/linux/arch_topology.h
> > > @@ -36,17 +36,25 @@ unsigned long topology_get_freq_scale(int cpu)
> > > struct cpu_topology {
> > > int thread_id;
> > > int core_id;
> > > +#ifdef CONFIG_ARM_CPU_TOPOLOGY
> > > + int socket_id;
> >
> > Sorry, but I can't find any reason why we need to do this ifdef dance
> > here, especially for socket_id vs package_id ?
>
> I was not sure if we can rename socket_id to package_id from a semantic
> point of view. If you are okay with it, I will change it to package_id and
> send a v4.
>

Thanks, all make sure to cc [email protected],
just noticed that's missing and you are asking for testing on ARM
platforms :)

> Other's I can understand
> > as there are new, but I am sure we can find a way and get away with
> > #ifdefery here completely.
> >
> That would be good. Any suggestions on how to do that?
>

Do you see any issues having extra structure members for ARM ?
Something like below seem to compile + boot fine on my 32-bit TC2 with
proper topology info on top of your series. Of course, more testing is
better, but I don't see any issue keeping llc_{id,sibling} around for
ARM eliminating the need for #ifdefs

Let me know if I am missing something.

-->8

diff --git i/arch/arm/kernel/topology.c w/arch/arm/kernel/topology.c
index 0ddb24c76c17..f2aa942e0cfa 100644
--- i/arch/arm/kernel/topology.c
+++ w/arch/arm/kernel/topology.c
@@ -206,7 +206,7 @@ void update_siblings_masks(unsigned int cpuid)
for_each_possible_cpu(cpu) {
cpu_topo = &cpu_topology[cpu];

- if (cpuid_topo->socket_id != cpu_topo->socket_id)
+ if (cpuid_topo->package_id != cpu_topo->package_id)
continue;

cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
@@ -250,12 +250,12 @@ void store_cpu_topology(unsigned int cpuid)
/* core performance interdependency */
cpuid_topo->thread_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
- cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
+ cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
} else {
/* largely independent cores */
cpuid_topo->thread_id = -1;
cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
- cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+ cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
}
} else {
/*
@@ -265,7 +265,7 @@ void store_cpu_topology(unsigned int cpuid)
*/
cpuid_topo->thread_id = -1;
cpuid_topo->core_id = 0;
- cpuid_topo->socket_id = -1;
+ cpuid_topo->package_id = -1;
}

update_siblings_masks(cpuid);
@@ -275,7 +275,7 @@ void store_cpu_topology(unsigned int cpuid)
pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
cpuid, cpu_topology[cpuid].thread_id,
cpu_topology[cpuid].core_id,
- cpu_topology[cpuid].socket_id, mpidr);
+ cpu_topology[cpuid].package_id, mpidr);
}

static inline int cpu_corepower_flags(void)
@@ -306,7 +306,7 @@ void __init init_cpu_topology(void)

cpu_topo->thread_id = -1;
cpu_topo->core_id = -1;
- cpu_topo->socket_id = -1;
+ cpu_topo->package_id = -1;
cpumask_clear(&cpu_topo->core_sibling);
cpumask_clear(&cpu_topo->thread_sibling);
}
diff --git i/include/linux/arch_topology.h w/include/linux/arch_topology.h
index 7c850611986d..8e82389c2bed 100644
--- i/include/linux/arch_topology.h
+++ w/include/linux/arch_topology.h
@@ -36,13 +36,9 @@ unsigned long topology_get_freq_scale(int cpu)
struct cpu_topology {
int thread_id;
int core_id;
-#ifdef CONFIG_ARM_CPU_TOPOLOGY
- int socket_id;
-#else
int package_id;
int llc_id;
cpumask_t llc_sibling;
-#endif
cpumask_t thread_sibling;
cpumask_t core_sibling;
};
@@ -50,11 +46,7 @@ struct cpu_topology {
#ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
extern struct cpu_topology cpu_topology[NR_CPUS];

-#ifdef CONFIG_ARM_CPU_TOPOLOGY
-#define topology_physical_package_id(cpu) (cpu_topology[cpu].socket_id)
-#else
#define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id)
-#endif
#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)

2019-04-16 13:25:02

by Sudeep Holla

[permalink] [raw]
Subject: Re: [RFT/RFC PATCH v3 3/5] cpu-topology: Move cpu topology code to common code.

On Mon, Apr 15, 2019 at 03:08:45PM -0700, Atish Patra wrote:
> On 4/15/19 8:27 AM, Sudeep Holla wrote:
> > Hi Atish,
> >
> > Thanks again for doing this. Overall changes look good except a couple
> > of minor nit, see below.
> >
> > On Wed, Mar 20, 2019 at 04:48:04PM -0700, Atish Patra wrote:
> > > Both RISC-V & ARM64 are using cpu-map device tree to describe
> > > their cpu topology. It's better to move the relevant code to
> > > a common place instead of duplicate code.
> > >
> > > Signed-off-by: Atish Patra <[email protected]>
> > > Tested-by: Jeffrey Hugo <[email protected]>
> > > ---
> > > arch/arm64/include/asm/topology.h | 23 ---
> > > arch/arm64/kernel/topology.c | 303 +-----------------------------
> > > drivers/base/arch_topology.c | 298 ++++++++++++++++++++++++++++-
> > > drivers/base/topology.c | 1 +
> > > include/linux/arch_topology.h | 28 +++
> > > 5 files changed, 330 insertions(+), 323 deletions(-)
> > >
> >
> > [...]
> >
> > > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> > > index edfcf8d9..6cc6a860 100644
> > > --- a/drivers/base/arch_topology.c
> > > +++ b/drivers/base/arch_topology.c
> > > @@ -6,8 +6,8 @@
> > > * Written by: Juri Lelli, ARM Ltd.
> > > */
> > > -#include <linux/acpi.h>
> > > #include <linux/arch_topology.h>
> > > +#include <linux/acpi.h>
> > > #include <linux/cpu.h>
> > > #include <linux/cpufreq.h>
> > > #include <linux/device.h>
> > > @@ -16,6 +16,11 @@
> > > #include <linux/string.h>
> > > #include <linux/sched/topology.h>
> > > #include <linux/cpuset.h>
> > > +#include <linux/cpumask.h>
> > > +#include <linux/init.h>
> > > +#include <linux/percpu.h>
> > > +#include <linux/sched.h>
> > > +#include <linux/smp.h>
> > > DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
> > > @@ -278,3 +283,294 @@ static void parsing_done_workfn(struct work_struct *work)
> > > #else
> > > core_initcall(free_raw_capacity);
> > > #endif
> > > +
> > > +#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
> >
> > Why can't the above one be just GENERIC_ARCH_TOPOLOGY ?
> > I may be missing to find it myself, but would like to know.
> >
> GENERIC_ARCH_TOPOLOGY is now used for both RISCV, ARM & ARM64.
> The below functions under this #ifdef have different implementation for ARM
> and ARM64.
>
> parse_dt_topology
> cpu_coregroup_mask
> update_siblings_masks
>
> While we can combine the later two functions and move them to common code as
> well, parse_dt_topology is significantly different.
>

Sure, had a quick glance and indeed they may look different, but won't
it defeat the purpose of this binding consolidation ?

> That's why we need some kind of #ifdef or renaming of parse_dt_topology for
> ARM32 code.
>

I am fine if we want to take this up later to keep the impact minimum.
But cpu_coregroup_mask and update_siblings_masks can and must be unified.
In fact the existing generic version must work on ARM32 too.

> Thanks for the review!!
>

You are welcome.

--
Regards,
Sudeep

2019-04-16 18:55:44

by Atish Patra

[permalink] [raw]
Subject: Re: [RFT/RFC PATCH v3 3/5] cpu-topology: Move cpu topology code to common code.

On 4/16/19 6:23 AM, Sudeep Holla wrote:
> On Mon, Apr 15, 2019 at 03:08:45PM -0700, Atish Patra wrote:
>> On 4/15/19 8:27 AM, Sudeep Holla wrote:
>>> Hi Atish,
>>>
>>> Thanks again for doing this. Overall changes look good except a couple
>>> of minor nit, see below.
>>>
>>> On Wed, Mar 20, 2019 at 04:48:04PM -0700, Atish Patra wrote:
>>>> Both RISC-V & ARM64 are using cpu-map device tree to describe
>>>> their cpu topology. It's better to move the relevant code to
>>>> a common place instead of duplicate code.
>>>>
>>>> Signed-off-by: Atish Patra <[email protected]>
>>>> Tested-by: Jeffrey Hugo <[email protected]>
>>>> ---
>>>> arch/arm64/include/asm/topology.h | 23 ---
>>>> arch/arm64/kernel/topology.c | 303 +-----------------------------
>>>> drivers/base/arch_topology.c | 298 ++++++++++++++++++++++++++++-
>>>> drivers/base/topology.c | 1 +
>>>> include/linux/arch_topology.h | 28 +++
>>>> 5 files changed, 330 insertions(+), 323 deletions(-)
>>>>
>>>
>>> [...]
>>>
>>>> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
>>>> index edfcf8d9..6cc6a860 100644
>>>> --- a/drivers/base/arch_topology.c
>>>> +++ b/drivers/base/arch_topology.c
>>>> @@ -6,8 +6,8 @@
>>>> * Written by: Juri Lelli, ARM Ltd.
>>>> */
>>>> -#include <linux/acpi.h>
>>>> #include <linux/arch_topology.h>
>>>> +#include <linux/acpi.h>
>>>> #include <linux/cpu.h>
>>>> #include <linux/cpufreq.h>
>>>> #include <linux/device.h>
>>>> @@ -16,6 +16,11 @@
>>>> #include <linux/string.h>
>>>> #include <linux/sched/topology.h>
>>>> #include <linux/cpuset.h>
>>>> +#include <linux/cpumask.h>
>>>> +#include <linux/init.h>
>>>> +#include <linux/percpu.h>
>>>> +#include <linux/sched.h>
>>>> +#include <linux/smp.h>
>>>> DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
>>>> @@ -278,3 +283,294 @@ static void parsing_done_workfn(struct work_struct *work)
>>>> #else
>>>> core_initcall(free_raw_capacity);
>>>> #endif
>>>> +
>>>> +#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
>>>
>>> Why can't the above one be just GENERIC_ARCH_TOPOLOGY ?
>>> I may be missing to find it myself, but would like to know.
>>>
>> GENERIC_ARCH_TOPOLOGY is now used for both RISCV, ARM & ARM64.
>> The below functions under this #ifdef have different implementation for ARM
>> and ARM64.
>>
>> parse_dt_topology
>> cpu_coregroup_mask
>> update_siblings_masks
>>
>> While we can combine the later two functions and move them to common code as
>> well, parse_dt_topology is significantly different.
>>
>
> Sure, had a quick glance and indeed they may look different, but won't
> it defeat the purpose of this binding consolidation ?
>
I didn't want change too much at first go.

>> That's why we need some kind of #ifdef or renaming of parse_dt_topology for
>> ARM32 code.
>>
>
> I am fine if we want to take this up later to keep the impact minimum.
> But cpu_coregroup_mask and update_siblings_masks can and must be unified.

Sure. I will just leave parse_dt_topology as it is for now and unify
other two functions.

I think we should unify parse_dt_topology in separate series.

Regards,
Atish
> In fact the existing generic version must work on ARM32 too.
>
>> Thanks for the review!!
>>
>
> You are welcome.
>
> --
> Regards,
> Sudeep
>

2019-04-16 19:05:10

by Atish Patra

[permalink] [raw]
Subject: Re: [RFT/RFC PATCH v3 4/5] arm: Use common cpu_topology

On 4/16/19 6:09 AM, Sudeep Holla wrote:
> On Mon, Apr 15, 2019 at 02:16:43PM -0700, Atish Patra wrote:
>> On 4/15/19 8:31 AM, Sudeep Holla wrote:
>>> On Wed, Mar 20, 2019 at 04:48:05PM -0700, Atish Patra wrote:
>>>> Currently, ARM32 and ARM64 uses different data structures to
>>>> represent their cpu toplogies. Since, we are moving the ARM64
>>>> topology to common code to be used by other architectures, we
>>>> can reuse that for ARM32 as well.
>>>>
>>>> Signed-off-by: Atish Patra <[email protected]>
>>>> ---
>>>> arch/arm/include/asm/topology.h | 22 +---------------------
>>>> arch/arm/kernel/topology.c | 10 +++++-----
>>>> include/linux/arch_topology.h | 10 +++++++++-
>>>> 3 files changed, 15 insertions(+), 27 deletions(-)
>>>>
>>>
>>> [...]
>>>
>>>> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
>>>> index d4e76e0a..7c850611 100644
>>>> --- a/include/linux/arch_topology.h
>>>> +++ b/include/linux/arch_topology.h
>>>> @@ -36,17 +36,25 @@ unsigned long topology_get_freq_scale(int cpu)
>>>> struct cpu_topology {
>>>> int thread_id;
>>>> int core_id;
>>>> +#ifdef CONFIG_ARM_CPU_TOPOLOGY
>>>> + int socket_id;
>>>
>>> Sorry, but I can't find any reason why we need to do this ifdef dance
>>> here, especially for socket_id vs package_id ?
>>
>> I was not sure if we can rename socket_id to package_id from a semantic
>> point of view. If you are okay with it, I will change it to package_id and
>> send a v4.
>>
>
> Thanks, all make sure to cc [email protected],
> just noticed that's missing and you are asking for testing on ARM
> platforms :)
>

My bad!! I didn't realize that linux-arm-kernel is not included. Thanks
for pointing that out.

>> Other's I can understand
>>> as there are new, but I am sure we can find a way and get away with
>>> #ifdefery here completely.
>>>
>> That would be good. Any suggestions on how to do that?
>>
>
> Do you see any issues having extra structure members for ARM ?
> Something like below seem to compile + boot fine on my 32-bit TC2 with
> proper topology info on top of your series. Of course, more testing is
> better, but I don't see any issue keeping llc_{id,sibling} around for
> ARM eliminating the need for #ifdefs
>

I thought adding unused members for ARM32 might be unacceptable :).
I will update my v4 with this.

Regards,
Atish
> Let me know if I am missing something.
>
> -->8
>
> diff --git i/arch/arm/kernel/topology.c w/arch/arm/kernel/topology.c
> index 0ddb24c76c17..f2aa942e0cfa 100644
> --- i/arch/arm/kernel/topology.c
> +++ w/arch/arm/kernel/topology.c
> @@ -206,7 +206,7 @@ void update_siblings_masks(unsigned int cpuid)
> for_each_possible_cpu(cpu) {
> cpu_topo = &cpu_topology[cpu];
>
> - if (cpuid_topo->socket_id != cpu_topo->socket_id)
> + if (cpuid_topo->package_id != cpu_topo->package_id)
> continue;
>
> cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
> @@ -250,12 +250,12 @@ void store_cpu_topology(unsigned int cpuid)
> /* core performance interdependency */
> cpuid_topo->thread_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> - cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
> + cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
> } else {
> /* largely independent cores */
> cpuid_topo->thread_id = -1;
> cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> - cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> + cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> }
> } else {
> /*
> @@ -265,7 +265,7 @@ void store_cpu_topology(unsigned int cpuid)
> */
> cpuid_topo->thread_id = -1;
> cpuid_topo->core_id = 0;
> - cpuid_topo->socket_id = -1;
> + cpuid_topo->package_id = -1;
> }
>
> update_siblings_masks(cpuid);
> @@ -275,7 +275,7 @@ void store_cpu_topology(unsigned int cpuid)
> pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
> cpuid, cpu_topology[cpuid].thread_id,
> cpu_topology[cpuid].core_id,
> - cpu_topology[cpuid].socket_id, mpidr);
> + cpu_topology[cpuid].package_id, mpidr);
> }
>
> static inline int cpu_corepower_flags(void)
> @@ -306,7 +306,7 @@ void __init init_cpu_topology(void)
>
> cpu_topo->thread_id = -1;
> cpu_topo->core_id = -1;
> - cpu_topo->socket_id = -1;
> + cpu_topo->package_id = -1;
> cpumask_clear(&cpu_topo->core_sibling);
> cpumask_clear(&cpu_topo->thread_sibling);
> }
> diff --git i/include/linux/arch_topology.h w/include/linux/arch_topology.h
> index 7c850611986d..8e82389c2bed 100644
> --- i/include/linux/arch_topology.h
> +++ w/include/linux/arch_topology.h
> @@ -36,13 +36,9 @@ unsigned long topology_get_freq_scale(int cpu)
> struct cpu_topology {
> int thread_id;
> int core_id;
> -#ifdef CONFIG_ARM_CPU_TOPOLOGY
> - int socket_id;
> -#else
> int package_id;
> int llc_id;
> cpumask_t llc_sibling;
> -#endif
> cpumask_t thread_sibling;
> cpumask_t core_sibling;
> };
> @@ -50,11 +46,7 @@ struct cpu_topology {
> #ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
> extern struct cpu_topology cpu_topology[NR_CPUS];
>
> -#ifdef CONFIG_ARM_CPU_TOPOLOGY
> -#define topology_physical_package_id(cpu) (cpu_topology[cpu].socket_id)
> -#else
> #define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id)
> -#endif
> #define topology_core_id(cpu) (cpu_topology[cpu].core_id)
> #define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
> #define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
>