From: Wang Qing <[email protected]>
Use nested cluster structures in DT to support describing multi-level
cluster topologies and increase the parsing of nested cluster.
Notice: the clusters describing in DT currently are not physical
boundaries, since changing "cluster" to "socket" is too involved and error
prone, this patch will not have any effect on one-level cluster topo, but
can support the mutil-level cluster topo to support CLUSTER_SCHED.
Signed-off-by: Wang Qing <[email protected]>
---
drivers/base/arch_topology.c | 26 ++++++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1d6636ebaac5..5a407cff0ab1 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -491,7 +491,7 @@ static int __init get_cpu_for_node(struct device_node *node)
}
static int __init parse_core(struct device_node *core, int package_id,
- int core_id)
+ int cluster_id, int core_id)
{
char name[20];
bool leaf = true;
@@ -507,6 +507,7 @@ static int __init parse_core(struct device_node *core, int package_id,
cpu = get_cpu_for_node(t);
if (cpu >= 0) {
cpu_topology[cpu].package_id = package_id;
+ cpu_topology[cpu].cluster_id = cluster_id;
cpu_topology[cpu].core_id = core_id;
cpu_topology[cpu].thread_id = i;
} else if (cpu != -ENODEV) {
@@ -528,6 +529,7 @@ static int __init parse_core(struct device_node *core, int package_id,
}
cpu_topology[cpu].package_id = package_id;
+ cpu_topology[cpu].cluster_id = cluster_id;
cpu_topology[cpu].core_id = core_id;
} else if (leaf && cpu != -ENODEV) {
pr_err("%pOF: Can't get CPU for leaf core\n", core);
@@ -544,13 +546,15 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
bool has_cores = false;
struct device_node *c;
static int package_id __initdata;
+ static int cluster_id __initdata;
int core_id = 0;
int i, ret;
/*
- * First check for child clusters; we currently ignore any
- * information about the nesting of clusters and present the
- * scheduler with a flat list of them.
+ * nesting of clusters :
+ * level 1: package_id
+ * level 2: cluster_id
+ * level 3+: ignore
*/
i = 0;
do {
@@ -559,6 +563,14 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
if (c) {
leaf = false;
ret = parse_cluster(c, depth + 1);
+ if (depth == 0) {
+ package_id++;
+ cluster_id = 0;
+ } else if (depth == 1)
+ cluster_id++;
+ else
+ pr_err("Ignore nested clusters with more than two levels!\n");
+
of_node_put(c);
if (ret != 0)
return ret;
@@ -582,7 +594,8 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
}
if (leaf) {
- ret = parse_core(c, package_id, core_id++);
+ ret = parse_core(c, package_id, (depth == 2)?cluster_id : -1,
+ core_id++);
} else {
pr_err("%pOF: Non-leaf cluster with core %s\n",
cluster, name);
@@ -599,9 +612,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
if (leaf && !has_cores)
pr_warn("%pOF: empty cluster\n", cluster);
- if (leaf)
- package_id++;
-
return 0;
}
--
2.7.4
On 11/05/2022 11:52, Qing Wang wrote:
> From: Wang Qing <[email protected]>
[...]
> @@ -582,7 +594,8 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
> }
>
> if (leaf) {
> - ret = parse_core(c, package_id, core_id++);
> + ret = parse_core(c, package_id, (depth == 2)?cluster_id : -1,
> + core_id++);
> } else {
> pr_err("%pOF: Non-leaf cluster with core %s\n",
> cluster, name);
> @@ -599,9 +612,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
> if (leaf && !has_cores)
> pr_warn("%pOF: empty cluster\n", cluster);
>
> - if (leaf)
> - package_id++;
> -
> return 0;
> }
The issue I mentioned under
https://lkml.kernel.org/r/[email protected]
still exists.
Btw, I recommend the following test strategy.
(A) Create a set of dts files which represent today's topologies in DT:
(1) 8 CPUs flat (Arm DynamIQ single DSU)
(2) 2 groups of 4 CPUs (e.g. hikey 960) (which covers Phantom* domain)
(3) your QC SM8450 Armv9 tri-gear (4-3-1) DynamIQ single DSU w/ shared
L2 btwn CPU0-1 and CPU2-3.
...
* used in Android
(B) Compile dtb's
dtc -I dts -O dtb -o foo.dtb foo.dts
(C) Run them under qemu w/ and w/o CONFIG_SCHED_CLUSTER and check:
sudo qemu-system-aarch64 ... -dtb foo.dtb
(1) sched domains:
cat /sys/kernel/debug/sched/domains/cpu*/domain*/name
(2) sched flags:
cat /sys/kernel/debug/sched/domains/cpu*/domain*/flags
(3) cpumasks:
cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd]
You can even mention the test results in your patch so that people see
that you already covered them. This will speed up the review-process
enormously.
On Wed, May 11, 2022 at 02:52:56AM -0700, Qing Wang wrote:
> From: Wang Qing <[email protected]>
>
> Use nested cluster structures in DT to support describing multi-level
> cluster topologies and increase the parsing of nested cluster.
>
> Notice: the clusters describing in DT currently are not physical
> boundaries, since changing "cluster" to "socket" is too involved and error
> prone, this patch will not have any effect on one-level cluster topo, but
> can support the mutil-level cluster topo to support CLUSTER_SCHED.
Sorry the socket/package_id is broken. If we are playing with cluster_id
which is now wrongly presented as package_id, you are forced to fix that
too. We don't want to break that in a different way or leave that as is
since the cluster_id and package ids now show up as same now. Earlier the
cluster_id was -1.
I had a look when I started reviewing your patch. Assuming we don't need
nested cluster support yet, I have some patches(not built or tested
unfortunately yet). Let me know your thoughts. If you think you still
need support for some kind of nested cluster, build that on top of this.
Also I haven't bothered about sched domains as this purely relates to
topology and how this is mapped to sched domain is orthogonal.
If anything is broken, that needs to be fixed separately there. I see the
idea here is correct and would like to push the patches once I build/test
and get some review/more testing.
Regards,
Sudeep
---->8
From 73de6524249287159a5c9fab9493d84bc5efc6e6 Mon Sep 17 00:00:00 2001
From: Sudeep Holla <[email protected]>
Date: Thu, 12 May 2022 14:12:20 +0100
Subject: [PATCH 1/3] arch_topology: Don't set cluster identifier as physical
package identifier
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index f73b836047cf..44f733b365cc 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -543,7 +543,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
bool leaf = true;
bool has_cores = false;
struct device_node *c;
- static int package_id __initdata;
int core_id = 0;
int i, ret;
@@ -582,7 +581,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
}
if (leaf) {
- ret = parse_core(c, package_id, core_id++);
+ ret = parse_core(c, 0, core_id++);
} else {
pr_err("%pOF: Non-leaf cluster with core %s\n",
cluster, name);
@@ -599,9 +598,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
if (leaf && !has_cores)
pr_warn("%pOF: empty cluster\n", cluster);
- if (leaf)
- package_id++;
-
return 0;
}
--
2.36.1
From 33a5184fbb3020a59f27347051fde1af6356b559 Mon Sep 17 00:00:00 2001
From: Sudeep Holla <[email protected]>
Date: Thu, 12 May 2022 14:13:43 +0100
Subject: [PATCH 2/3] arch_topology: Set cluster identifier in each core/thread
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 44f733b365cc..87150b90ede4 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -491,7 +491,7 @@ static int __init get_cpu_for_node(struct device_node *node)
}
static int __init parse_core(struct device_node *core, int package_id,
- int core_id)
+ int cluster_id, int core_id)
{
char name[20];
bool leaf = true;
@@ -507,6 +507,7 @@ static int __init parse_core(struct device_node *core, int package_id,
cpu = get_cpu_for_node(t);
if (cpu >= 0) {
cpu_topology[cpu].package_id = package_id;
+ cpu_topology[cpu].cluster_id = cluster_id;
cpu_topology[cpu].core_id = core_id;
cpu_topology[cpu].thread_id = i;
} else if (cpu != -ENODEV) {
@@ -528,6 +529,7 @@ static int __init parse_core(struct device_node *core, int package_id,
}
cpu_topology[cpu].package_id = package_id;
+ cpu_topology[cpu].cluster_id = cluster_id;
cpu_topology[cpu].core_id = core_id;
} else if (leaf && cpu != -ENODEV) {
pr_err("%pOF: Can't get CPU for leaf core\n", core);
@@ -537,7 +539,8 @@ static int __init parse_core(struct device_node *core, int package_id,
return 0;
}
-static int __init parse_cluster(struct device_node *cluster, int depth)
+static int __init
+parse_cluster(struct device_node *cluster, int cluster_id, int depth)
{
char name[20];
bool leaf = true;
@@ -557,7 +560,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
c = of_get_child_by_name(cluster, name);
if (c) {
leaf = false;
- ret = parse_cluster(c, depth + 1);
+ ret = parse_cluster(c, i, depth + 1);
of_node_put(c);
if (ret != 0)
return ret;
@@ -581,7 +584,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
}
if (leaf) {
- ret = parse_core(c, 0, core_id++);
+ ret = parse_core(c, 0, cluster_id, core_id++);
} else {
pr_err("%pOF: Non-leaf cluster with core %s\n",
cluster, name);
@@ -621,7 +624,7 @@ static int __init parse_dt_topology(void)
if (!map)
goto out;
- ret = parse_cluster(map, 0);
+ ret = parse_cluster(map, -1, 0);
if (ret != 0)
goto out_map;
--
2.36.1
From 82def1dbe2ffd0d03c3b5d995dfa163b312c4b6b Mon Sep 17 00:00:00 2001
From: Sudeep Holla <[email protected]>
Date: Thu, 12 May 2022 14:33:05 +0100
Subject: [PATCH 3/3] arch_topology: Add support for parsing sockets in
/cpu-map
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 37 +++++++++++++++++++++++++++++++-----
1 file changed, 32 insertions(+), 5 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 87150b90ede4..0ec461bb5d63 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -539,8 +539,8 @@ static int __init parse_core(struct device_node *core, int package_id,
return 0;
}
-static int __init
-parse_cluster(struct device_node *cluster, int cluster_id, int depth)
+static int __init parse_cluster(struct device_node *cluster, int package_id,
+ int cluster_id, int depth)
{
char name[20];
bool leaf = true;
@@ -560,7 +560,7 @@ parse_cluster(struct device_node *cluster, int cluster_id, int depth)
c = of_get_child_by_name(cluster, name);
if (c) {
leaf = false;
- ret = parse_cluster(c, i, depth + 1);
+ ret = parse_cluster(c, package_id, i, depth + 1);
of_node_put(c);
if (ret != 0)
return ret;
@@ -584,7 +584,8 @@ parse_cluster(struct device_node *cluster, int cluster_id, int depth)
}
if (leaf) {
- ret = parse_core(c, 0, cluster_id, core_id++);
+ ret = parse_core(c, package_id, cluster_id,
+ core_id++);
} else {
pr_err("%pOF: Non-leaf cluster with core %s\n",
cluster, name);
@@ -604,6 +605,32 @@ parse_cluster(struct device_node *cluster, int cluster_id, int depth)
return 0;
}
+static int __init parse_socket(struct device_node *socket)
+{
+ char name[20];
+ struct device_node *c;
+ bool has_socket = false;
+ int package_id = 0, ret;
+
+ do {
+ snprintf(name, sizeof(name), "socket%d", package_id);
+ c = of_get_child_by_name(socket, name);
+ if (c) {
+ has_socket = true;
+ ret = parse_cluster(c, package_id, -1, 0);
+ of_node_put(c);
+ if (ret != 0)
+ return ret;
+ }
+ package_id++;
+ } while(c);
+
+ if (!has_socket)
+ ret = parse_cluster(socket, 0, -1, 0);
+
+ return ret;
+}
+
static int __init parse_dt_topology(void)
{
struct device_node *cn, *map;
@@ -624,7 +651,7 @@ static int __init parse_dt_topology(void)
if (!map)
goto out;
- ret = parse_cluster(map, -1, 0);
+ ret = parse_socket(map);
if (ret != 0)
goto out_map;
--
2.36.1
On Fri, May 13, 2022 at 08:30:10AM +0000, 王擎 wrote:
>
> >> From: Wang Qing <[email protected]>
> >>
> >> Use nested cluster structures in DT to support describing multi-level
> >> cluster topologies and increase the parsing of nested cluster.
> >>
> >> Notice: the clusters describing in DT currently are not physical
> >> boundaries, since changing "cluster" to "socket" is too involved and error
> >> prone, this patch will not have any effect on one-level cluster topo, but
> >> can support the mutil-level cluster topo to support CLUSTER_SCHED.
> >
> >Sorry the socket/package_id is broken. If we are playing with cluster_id
> >which is now wrongly presented as package_id, you are forced to fix that
> >too. We don't want to break that in a different way or leave that as is
> >since the cluster_id and package ids now show up as same now. Earlier the
> >cluster_id was -1.
> >
> >I had a look when I started reviewing your patch. Assuming we don't need
> >nested cluster support yet, I have some patches(not built or tested
> >unfortunately yet). Let me know your thoughts. If you think you still
> >need support for some kind of nested cluster, build that on top of this.
> >Also I haven't bothered about sched domains as this purely relates to
> >topology and how this is mapped to sched domain is orthogonal.
> >
> >If anything is broken, that needs to be fixed separately there. I see the
> >idea here is correct and would like to push the patches once I build/test
> >and get some review/more testing.
> >
> >Regards,
> >Sudeep
>
> You have to modify all DTS(rename "cluster" to "socket"), otherwise,
> new package_id = -1 and new cluster_id = old package_id.
>
Nope. I am handling absence of socket nodes and that is a must for backward
compatibility with existing DT.
> This will affect MC and CLS useage, do you have any plans about this?
>
I don't have much knowledge on scheduler domains and I will defer that to
the experts. I am just trying to get the topology read from DT correct and
to align with PPTT. Though LLC is not yet considered but that is not part
of cpu-map. I am trying to get only the /cpu-map part of topology correct
at first. We(you, me or anyone interested) can get the LLC part updated
after that.
--
Regards,
Sudeep
On 12/05/2022 16:17, Sudeep Holla wrote:
> On Wed, May 11, 2022 at 02:52:56AM -0700, Qing Wang wrote:
>> From: Wang Qing <[email protected]>
>>
>> Use nested cluster structures in DT to support describing multi-level
>> cluster topologies and increase the parsing of nested cluster.
>>
>> Notice: the clusters describing in DT currently are not physical
>> boundaries, since changing "cluster" to "socket" is too involved and error
>> prone, this patch will not have any effect on one-level cluster topo, but
>> can support the mutil-level cluster topo to support CLUSTER_SCHED.
>
> Sorry the socket/package_id is broken. If we are playing with cluster_id
> which is now wrongly presented as package_id, you are forced to fix that
> too. We don't want to break that in a different way or leave that as is
> since the cluster_id and package ids now show up as same now. Earlier the
> cluster_id was -1.
We can leave package_id=0 (and maybe add socket parsing later) and use
llc_id instead. Like some Arm server do via ACPI. This will leave
cluster_id for Armv9 L2 sharing. cluster_id is also used in servers for
2. level "clustering", e.g. Kunpeng920 L3-tags or Ampere Altra's SCU
boundaries.
This way we can achieve both. (1) not use package_id for cluster and (2)
have cluster_id available for 2. level cluster.
I just send out a lightly tested RFC:
https://lkml.kernel.org/r/[email protected]
>
> I had a look when I started reviewing your patch. Assuming we don't need
> nested cluster support yet, I have some patches(not built or tested
> unfortunately yet). Let me know your thoughts. If you think you still
> need support for some kind of nested cluster, build that on top of this.
> Also I haven't bothered about sched domains as this purely relates to
> topology and how this is mapped to sched domain is orthogonal.
>
> If anything is broken, that needs to be fixed separately there. I see the
> idea here is correct and would like to push the patches once I build/test
> and get some review/more testing.
On Fri, May 13, 2022 at 10:36:26AM +0200, Dietmar Eggemann wrote:
> On 12/05/2022 16:17, Sudeep Holla wrote:
> > On Wed, May 11, 2022 at 02:52:56AM -0700, Qing Wang wrote:
> >> From: Wang Qing <[email protected]>
> >>
> >> Use nested cluster structures in DT to support describing multi-level
> >> cluster topologies and increase the parsing of nested cluster.
> >>
> >> Notice: the clusters describing in DT currently are not physical
> >> boundaries, since changing "cluster" to "socket" is too involved and error
> >> prone, this patch will not have any effect on one-level cluster topo, but
> >> can support the mutil-level cluster topo to support CLUSTER_SCHED.
> >
> > Sorry the socket/package_id is broken. If we are playing with cluster_id
> > which is now wrongly presented as package_id, you are forced to fix that
> > too. We don't want to break that in a different way or leave that as is
> > since the cluster_id and package ids now show up as same now. Earlier the
> > cluster_id was -1.
>
> We can leave package_id=0 (and maybe add socket parsing later) and use
> llc_id instead. Like some Arm server do via ACPI. This will leave
> cluster_id for Armv9 L2 sharing. cluster_id is also used in servers for
> 2. level "clustering", e.g. Kunpeng920 L3-tags or Ampere Altra's SCU
> boundaries.
>
OK I need to brush up my knowledge there. IIUC, the cluster id and llc_id are
different and I don't believe you can mix them. There are platforms with
system-wide(meaning including all the clusters) last level cache. This
may break on those platforms.
Also IIRC ACPI PPTT has both find_cpu_cluster and find_last_level_cache
(names may differ as I haven't looked at the code) which are entirely
different. They may be same on some platforms but the information source
is definitely different.
> This way we can achieve both. (1) not use package_id for cluster and (2)
> have cluster_id available for 2. level cluster.
>
> I just send out a lightly tested RFC:
>
> https://lkml.kernel.org/r/[email protected]
>
OK, I will take a look, but llc_id and cluster_id are fundamentally different.
Let me see what you have done in the patch exactly and comment there.
--
Regards,
Sudeep
>> From: Wang Qing <[email protected]>
>>
>> Use nested cluster structures in DT to support describing multi-level
>> cluster topologies and increase the parsing of nested cluster.
>>
>> Notice: the clusters describing in DT currently are not physical
>> boundaries, since changing "cluster" to "socket" is too involved and error
>> prone, this patch will not have any effect on one-level cluster topo, but
>> can support the mutil-level cluster topo to support CLUSTER_SCHED.
>
>Sorry the socket/package_id is broken. If we are playing with cluster_id
>which is now wrongly presented as package_id, you are forced to fix that
>too. We don't want to break that in a different way or leave that as is
>since the cluster_id and package ids now show up as same now. Earlier the
>cluster_id was -1.
>
>I had a look when I started reviewing your patch. Assuming we don't need
>nested cluster support yet, I have some patches(not built or tested
>unfortunately yet). Let me know your thoughts. If you think you still
>need support for some kind of nested cluster, build that on top of this.
>Also I haven't bothered about sched domains as this purely relates to
>topology and how this is mapped to sched domain is orthogonal.
>
>If anything is broken, that needs to be fixed separately there. I see the
>idea here is correct and would like to push the patches once I build/test
>and get some review/more testing.
>
>Regards,
>Sudeep
You have to modify all DTS(rename "cluster" to "socket"), otherwise,
new package_id = -1 and new cluster_id = old package_id.
This will affect MC and CLS useage, do you have any plans about this?
Thanks,
Qing
>
>...
>[...]
>
>> @@ -582,7 +594,8 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
>> }
>>
>> if (leaf) {
>> - ret = parse_core(c, package_id, core_id++);
>> + ret = parse_core(c, package_id, (depth == 2)?cluster_id : -1,
>> + core_id++);
>> } else {
>> pr_err("%pOF: Non-leaf cluster with core %s\n",
>> cluster, name);
>> @@ -599,9 +612,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
>> if (leaf && !has_cores)
>> pr_warn("%pOF: empty cluster\n", cluster);
>>
>> - if (leaf)
>> - package_id++;
>> -
>> return 0;
>> }
>
>The issue I mentioned under
>https://lkml.kernel.org/r/[email protected]
>still exists.
Yes, I get it, will be fixed in V3.
>
>Btw, I recommend the following test strategy.
>
>(A) Create a set of dts files which represent today's topologies in DT:
>
> (1) 8 CPUs flat (Arm DynamIQ single DSU)
>
> (2) 2 groups of 4 CPUs (e.g. hikey 960) (which covers Phantom* domain)
>
> (3) your QC SM8450 Armv9 tri-gear (4-3-1) DynamIQ single DSU w/ shared
> L2 btwn CPU0-1 and CPU2-3.
> ...
>
> * used in Android
>
>(B) Compile dtb's
>
> dtc -I dts -O dtb -o foo.dtb foo.dts
>
>
>(C) Run them under qemu w/ and w/o CONFIG_SCHED_CLUSTER and check:
>
> sudo qemu-system-aarch64 ... -dtb foo.dtb
>
> (1) sched domains:
>
> cat /sys/kernel/debug/sched/domains/cpu*/domain*/name
>
> (2) sched flags:
>
> cat /sys/kernel/debug/sched/domains/cpu*/domain*/flags
>
> (3) cpumasks:
>
> cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd]
>
>You can even mention the test results in your patch so that people see
>that you already covered them. This will speed up the review-process
>enormously.
Yes, because I'm not sure if this idea is correct. I will add it in the
next version, thank you again.
Qing