From: wangqing <[email protected]>
When ACPI is not enabled, we can parse the cache topology from DT:
* cpu0: cpu@000 {
* next-level-cache = <&L2_1>;
* L2_1: l2-cache {
* compatible = "cache";
* next-level-cache = <&L3_1>;
* };
* L3_1: l3-cache {
* compatible = "cache";
* };
* };
*
* cpu1: cpu@001 {
* next-level-cache = <&L2_1>;
* };
* cpu2: cpu@002 {
* L2_2: l2-cache {
* compatible = "cache";
* next-level-cache = <&L3_1>;
* };
* };
*
* cpu3: cpu@003 {
* next-level-cache = <&L2_2>;
* };
cache_topology holds the pointers described by "next-level-cache";
it can describe the cache topology at every level.
Expand the use of llc_sibling when ACPI is not enabled.
Signed-off-by: Wang Qing <[email protected]>
---
drivers/base/arch_topology.c | 63 +++++++++++++++++++++++++++++++++--
include/linux/arch_topology.h | 3 ++
2 files changed, 64 insertions(+), 2 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1d6636ebaac5..d633184429f2 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -647,6 +647,64 @@ static int __init parse_dt_topology(void)
}
#endif
+/*
+ * cpu cache topology table
+ */
+#define MAX_CACHE_LEVEL 7
+struct device_node *cache_topology[NR_CPUS][MAX_CACHE_LEVEL];
+
+void init_cpu_cache_topology(void)
+{
+ struct device_node *node_cpu, *node_cache;
+ int cpu;
+ int level = 0;
+
+ for_each_possible_cpu(cpu) {
+ node_cpu = of_get_cpu_node(cpu, NULL);
+ if (!node_cpu)
+ continue;
+
+ level = 0;
+ node_cache = node_cpu;
+ while (level < MAX_CACHE_LEVEL) {
+ node_cache = of_parse_phandle(node_cache, "next-level-cache", 0);
+ if (!node_cache)
+ break;
+
+ cache_topology[cpu][level++] = node_cache;
+ }
+ of_node_put(node_cpu);
+ }
+}
+
+bool cpu_share_llc(int cpu1, int cpu2)
+{
+ int cache_level;
+
+ for (cache_level = MAX_CACHE_LEVEL - 1; cache_level > 0; cache_level--) {
+ if (!cache_topology[cpu1][cache_level])
+ continue;
+
+ if (cache_topology[cpu1][cache_level] == cache_topology[cpu2][cache_level])
+ return true;
+
+ return false;
+ }
+
+ return false;
+}
+
+bool cpu_share_l2c(int cpu1, int cpu2)
+{
+ if (!cache_topology[cpu1][0])
+ return false;
+
+ if (cache_topology[cpu1][0] == cache_topology[cpu2][0])
+ return true;
+
+ return false;
+}
+
/*
* cpu topology table
*/
@@ -662,7 +720,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
/* not numa in package, lets use the package siblings */
core_mask = &cpu_topology[cpu].core_sibling;
}
- if (cpu_topology[cpu].llc_id != -1) {
+ if (cpu_topology[cpu].llc_id != -1 || cache_topology[cpu][0]) {
if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
core_mask = &cpu_topology[cpu].llc_sibling;
}
@@ -684,7 +742,8 @@ void update_siblings_masks(unsigned int cpuid)
for_each_online_cpu(cpu) {
cpu_topo = &cpu_topology[cpu];
- if (cpuid_topo->llc_id == cpu_topo->llc_id) {
+ if ((cpuid_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id)
+ || (cpuid_topo->llc_id == -1 && cpu_share_llc(cpu, cpuid))) {
cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
}
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 58cbe18d825c..df670d5fc03b 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -86,6 +86,7 @@ extern struct cpu_topology cpu_topology[NR_CPUS];
#define topology_cluster_cpumask(cpu) (&cpu_topology[cpu].cluster_sibling)
#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
void init_cpu_topology(void);
+void init_cpu_cache_topology(void);
void store_cpu_topology(unsigned int cpuid);
const struct cpumask *cpu_coregroup_mask(int cpu);
const struct cpumask *cpu_clustergroup_mask(int cpu);
@@ -93,6 +94,8 @@ void update_siblings_masks(unsigned int cpu);
void remove_cpu_topology(unsigned int cpuid);
void reset_cpu_topology(void);
int parse_acpi_topology(void);
+bool cpu_share_llc(int cpu1, int cpu2);
+bool cpu_share_l2c(int cpu1, int cpu2);
#endif
#endif /* _LINUX_ARCH_TOPOLOGY_H_ */
--
2.27.0.windows.1
On Wed, Apr 06, 2022 at 02:18:00AM -0700, Qing Wang wrote:
> From: wangqing <[email protected]>
>
> When ACPI is not enabled, we can parse cache topolopy from DT:
> * cpu0: cpu@000 {
> * next-level-cache = <&L2_1>;
> * L2_1: l2-cache {
> * compatible = "cache";
> * next-level-cache = <&L3_1>;
> * };
> * L3_1: l3-cache {
> * compatible = "cache";
> * };
> * };
> *
> * cpu1: cpu@001 {
> * next-level-cache = <&L2_1>;
> * };
> * cpu2: cpu@002 {
> * L2_2: l2-cache {
> * compatible = "cache";
> * next-level-cache = <&L3_1>;
> * };
> * };
> *
> * cpu3: cpu@003 {
> * next-level-cache = <&L2_2>;
> * };
> cache_topology hold the pointer describing "next-level-cache",
> it can describe the cache topology of every level.
>
> Expand the use of llc_sibling when ACPI is not enabled.
>
You seem to have posted this patch as part of the series first. One patch
was rejected and then you posted this without any history. It is confusing
if you don't provide all the background/history.
Having said that, NACK for this patch as it stands. We have
drivers/base/cacheinfo.c, which has all the parsing of cache information.
IIRC we already consider LLC, but please highlight if anything is particularly
missing. I am unable to follow/understand your commit message.
--
Regards,
Sudeep
>> From: wangqing <[email protected]>
>>
>> When ACPI is not enabled, we can parse cache topolopy from DT:
>> * cpu0: cpu@000 {
>> * next-level-cache = <&L2_1>;
>> * L2_1: l2-cache {
>> * compatible = "cache";
>> * next-level-cache = <&L3_1>;
>> * };
>> * L3_1: l3-cache {
>> * compatible = "cache";
>> * };
>> * };
>> *
>> * cpu1: cpu@001 {
>> * next-level-cache = <&L2_1>;
>> * };
>> * cpu2: cpu@002 {
>> * L2_2: l2-cache {
>> * compatible = "cache";
>> * next-level-cache = <&L3_1>;
>> * };
>> * };
>> *
>> * cpu3: cpu@003 {
>> * next-level-cache = <&L2_2>;
>> * };
>> cache_topology hold the pointer describing "next-level-cache",
>> it can describe the cache topology of every level.
>>
>> Expand the use of llc_sibling when ACPI is not enabled.
>>
>
>You seem to have posted this patch as part of the series first. One patch
>was rejected and then you post this without any history. It confuses if you
>don't provide all the background/history.
Yes, the series contains several parts; the sched_domain part was rejected
temporarily. But it has nothing to do with this patch, which is why I split it out.
The background doesn't matter, let's focus on this patch itself.
>
>Having said that, NACK for this patch as it stands. We have
>drivers/base/cacheinfo.c which has all the parsing of cache information.
>IIRC we already consider LLC but highlight if anything is particularly
>missing. I am unable to follow/understand with you commit message.
cacheinfo.c just describes the properties of the cache; it can't describe
the cache topology, somewhat like cpuinfo vs. cpu topology.
llc_sibling is not used at all if ACPI is not enabled, because llc_id is
always -1, and cpu_coregroup_mask() always returns core_sibling.
Why not get the cache topology from DT if the arch supports GENERIC_ARCH_TOPOLOGY?
Thanks,
Wang
>
>--
>Regards,
>Sudeep
On Thu, Apr 07, 2022 at 03:11:51AM +0000, 王擎 wrote:
>
> >> From: wangqing <[email protected]>
> >>
> >> When ACPI is not enabled, we can parse cache topolopy from DT:
> >> * cpu0: cpu@000 {
> >> * next-level-cache = <&L2_1>;
> >> * L2_1: l2-cache {
> >> * compatible = "cache";
> >> * next-level-cache = <&L3_1>;
> >> * };
> >> * L3_1: l3-cache {
> >> * compatible = "cache";
> >> * };
> >> * };
> >> *
> >> * cpu1: cpu@001 {
> >> * next-level-cache = <&L2_1>;
> >> * };
> >> * cpu2: cpu@002 {
> >> * L2_2: l2-cache {
> >> * compatible = "cache";
> >> * next-level-cache = <&L3_1>;
> >> * };
> >> * };
> >> *
> >> * cpu3: cpu@003 {
> >> * next-level-cache = <&L2_2>;
> >> * };
> >> cache_topology hold the pointer describing "next-level-cache",
> >> it can describe the cache topology of every level.
> >>
> >> Expand the use of llc_sibling when ACPI is not enabled.
> >>
> >
> >You seem to have posted this patch as part of the series first. One patch
> >was rejected and then you post this without any history. It confuses if you
> >don't provide all the background/history.
>
> Yes, the series contains several parts, the sched_domain part was rejected
> temporary. But it has nothing to do with this patch, that's why I took it apart.
That's not correct if you plan to use it there. Currently there are no users,
so there is no need to add it.
> The background doesn't matter, let's focus on this patch itself.
>
It depends; some people might find it useful, so it is better to provide it.
One can ignore it if it is not needed or if they are already aware.
> >
> >Having said that, NACK for this patch as it stands. We have
> >drivers/base/cacheinfo.c which has all the parsing of cache information.
> >IIRC we already consider LLC but highlight if anything is particularly
> >missing. I am unable to follow/understand with you commit message.
>
> cacheinfo.c just describes the properties of the cache, It can't describe
> the cache topology, some like cpuinfo and cpu topology.
>
Not 100% correct. We do have info about sharing there.
> llc_sibling is not used at all if ACPI is not enabled, because llc_id
> always be -1, and cpu_coregroup_mask() always return the core_sibling.
>
You can use of_find_last_level_cache or something similar and remove loads
of the duplicated code you have in this patch.
> Why not get the cache topology from DT if the arch support GENERIC_ARCH_TOPOLOGY.
>
Sure, but if the intended use is for the scheduler, please include the relevant
people, as there have been quite a few threads around this topic recently, and
disintegrating and throwing patches at random at different sets of people is not
going to help make progress. If this is intended for Arm64 platforms, I suggest
keeping these two in the loop, as they are following a few other threads and it
gives them the full picture of the intended use-case.
Vincent Guittot <[email protected]>
Dietmar Eggemann <[email protected]>
--
Regards,
Sudeep
On 06/04/2022 11:18, Qing Wang wrote:
> From: wangqing <[email protected]>
[...]
> +void init_cpu_cache_topology(void)
> +{
> + struct device_node *node_cpu, *node_cache;
> + int cpu;
> + int level = 0;
> +
> + for_each_possible_cpu(cpu) {
> + node_cpu = of_get_cpu_node(cpu, NULL);
> + if (!node_cpu)
> + continue;
> +
> + level = 0;
> + node_cache = node_cpu;
> + while (level < MAX_CACHE_LEVEL) {
> + node_cache = of_parse_phandle(node_cache, "next-level-cache", 0);
> + if (!node_cache)
> + break;
> +
> + cache_topology[cpu][level++] = node_cache;
> + }
> + of_node_put(node_cpu);
> + }
> +}
From where is init_cpu_cache_topology() called?
> +bool cpu_share_llc(int cpu1, int cpu2)
> +{
> + int cache_level;
> +
> + for (cache_level = MAX_CACHE_LEVEL - 1; cache_level > 0; cache_level--) {
> + if (!cache_topology[cpu1][cache_level])
> + continue;
> +
> + if (cache_topology[cpu1][cache_level] == cache_topology[cpu2][cache_level])
> + return true;
> +
> + return false;
> + }
> +
> + return false;
> +}
Like I mentioned in:
https://lkml.kernel.org/r/[email protected]
the correct setting in DT's cpu-map node (only core nodes in your case
(One DynamIQ cluster) will give you the correct LLC (highest
SD_SHARE_PKG_RESOURCES) setting.
https://www.kernel.org/doc/Documentation/devicetree/bindings/arm/topology.txt
> +
> +bool cpu_share_l2c(int cpu1, int cpu2)
> +{
> + if (!cache_topology[cpu1][0])
> + return false;
> +
> + if (cache_topology[cpu1][0] == cache_topology[cpu2][0])
> + return true;
> +
> + return false;
> +}
> +
> /*
> * cpu topology table
> */
> @@ -662,7 +720,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
> /* not numa in package, lets use the package siblings */
> core_mask = &cpu_topology[cpu].core_sibling;
> }
> - if (cpu_topology[cpu].llc_id != -1) {
> + if (cpu_topology[cpu].llc_id != -1 || cache_topology[cpu][0]) {
> if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
> core_mask = &cpu_topology[cpu].llc_sibling;
> }
> @@ -684,7 +742,8 @@ void update_siblings_masks(unsigned int cpuid)
> for_each_online_cpu(cpu) {
> cpu_topo = &cpu_topology[cpu];
>
> - if (cpuid_topo->llc_id == cpu_topo->llc_id) {
> + if ((cpuid_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id)
> + || (cpuid_topo->llc_id == -1 && cpu_share_llc(cpu, cpuid))) {
Assuming a:
.---------------.
CPU |0 1 2 3 4 5 6 7|
+---------------+
uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
+---------------+
L2 | | | | | | |
+---------------+
L3 |<-- -->|
+---------------+
|<-- cluster -->|
+---------------+
|<-- DSU -->|
'---------------'
system, I guess you would get (w/ Phantom SD and L2/L3 cache info in DT):
CPU0 .. 3:
MC SD_SHARE_PKG_RESOURCES
DIE no SD_SHARE_PKG_RESOURCES
CPU 4...7:
DIE no SD_SHARE_PKG_RESOURCES
I can't see how this would make any sense ...
Reason is cpu_share_llc(). You don't check cache_level=0 and w/
CPU0 .. 3:
cache_topology[CPUX][0] == L2
cache_topology[CPUX][1] == L3
CPU4...7:
cache_topology[CPUX][0] == L3
there is, except for CPU0-1 and CPU2-3, no LLC match.
[...]
>
>[...]
>
>> +void init_cpu_cache_topology(void)
>> +{
>> + struct device_node *node_cpu, *node_cache;
>> + int cpu;
>> + int level = 0;
>> +
>> + for_each_possible_cpu(cpu) {
>> + node_cpu = of_get_cpu_node(cpu, NULL);
>> + if (!node_cpu)
>> + continue;
>> +
>> + level = 0;
>> + node_cache = node_cpu;
>> + while (level < MAX_CACHE_LEVEL) {
>> + node_cache = of_parse_phandle(node_cache, "next-level-cache", 0);
>> + if (!node_cache)
>> + break;
>> +
>> + cache_topology[cpu][level++] = node_cache;
>> + }
>> + of_node_put(node_cpu);
>> + }
>> +}
>
>From where is init_cpu_cache_topology() called?
It can be called from any arch that supports GENERIC_ARCH_TOPOLOGY;
I will take arm64 as an example in V2.
>
>> +bool cpu_share_llc(int cpu1, int cpu2)
>> +{
>> + int cache_level;
>> +
>> + for (cache_level = MAX_CACHE_LEVEL - 1; cache_level > 0; cache_level--) {
>> + if (!cache_topology[cpu1][cache_level])
>> + continue;
>> +
>> + if (cache_topology[cpu1][cache_level] == cache_topology[cpu2][cache_level])
>> + return true;
>> +
>> + return false;
>> + }
>> +
>> + return false;
>> +}
>
>Like I mentioned in:
>
>https://lkml.kernel.org/r/[email protected]
>
>the correct setting in DT's cpu-map node (only core nodes in your case
>(One DynamIQ cluster) will give you the correct LLC (highest
>SD_SHARE_PKG_RESOURCES) setting.
Thanks, I have seen your reply, but I don't 100% agree.
1. A cluster considers not only the cache, but also a set of CPUs with the same
arch, capability, and policy, so we can't simply change a CPU's cluster
(little, medium, big) even with a DSU.
2. The cpu-map node can't give the correct LLC; it just gives the correct
cluster/core group.
>
>https://www.kernel.org/doc/Documentation/devicetree/bindings/arm/topology.txt
>
>> +
>> +bool cpu_share_l2c(int cpu1, int cpu2)
>> +{
>> + if (!cache_topology[cpu1][0])
>> + return false;
>> +
>> + if (cache_topology[cpu1][0] == cache_topology[cpu2][0])
>> + return true;
>> +
>> + return false;
>> +}
>> +
>> /*
>> * cpu topology table
>> */
>> @@ -662,7 +720,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
>> /* not numa in package, lets use the package siblings */
>> core_mask = &cpu_topology[cpu].core_sibling;
>> }
>> - if (cpu_topology[cpu].llc_id != -1) {
>> + if (cpu_topology[cpu].llc_id != -1 || cache_topology[cpu][0]) {
>> if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
>> core_mask = &cpu_topology[cpu].llc_sibling;
>> }
>> @@ -684,7 +742,8 @@ void update_siblings_masks(unsigned int cpuid)
>> for_each_online_cpu(cpu) {
>> cpu_topo = &cpu_topology[cpu];
>>
>> - if (cpuid_topo->llc_id == cpu_topo->llc_id) {
>> + if ((cpuid_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id)
>> + || (cpuid_topo->llc_id == -1 && cpu_share_llc(cpu, cpuid))) {
>
>Assuming a:
>
> .---------------.
>CPU |0 1 2 3 4 5 6 7|
> +---------------+
>uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
> +---------------+
> L2 | | | | | | |
We only discuss the 4-core system here, but it should be:
L2 | | | | | | | |
> +---------------+
> L3 |<-- -->|
> +---------------+
> |<-- cluster -->|
> +---------------+
> |<-- DSU -->|
> '---------------'
>
>system, I guess you would get (w/ Phantom SD and L2/L3 cache info in DT):
>
>CPU0 .. 3:
>
>MC SD_SHARE_PKG_RESOURCES
>DIE no SD_SHARE_PKG_RESOURCES
>
>CPU 4...7:
>
>DIE no SD_SHARE_PKG_RESOURCES
>
>I can't see how this would make any sense ...
>
>Reason is cpu_share_llc(). You don't check cache_level=0 and w/
>
>CPU0 .. 3:
>cache_topology[CPUX][0] == L2
>cache_topology[CPUX][1] == L3
>
>CPU4...7:
>cache_topology[CPUX][0] == L3
No, I didn't describe the cache topology of cpu[4,7] in the commit log,
because that's not the point; I used 4 cores (3 cache levels) as an example.
CPU4...7, if needed:
cache_topology[CPUX][0] == L2
cache_topology[CPUX][1] == L3
In cache_topology[CPUX][level], level strictly corresponds to the cache level.
>
>there is, except for CPU0-1 and CPU2-3, no LLC match.
All 4 CPUs match the LLC in this case.
Thanks,
Wang
>
>[...]
>>
>> >> From: wangqing <[email protected]>
>> >>
>> >> When ACPI is not enabled, we can parse cache topolopy from DT:
>> >> * cpu0: cpu@000 {
>> >> * next-level-cache = <&L2_1>;
>> >> * L2_1: l2-cache {
>> >> * compatible = "cache";
>> >> * next-level-cache = <&L3_1>;
>> >> * };
>> >> * L3_1: l3-cache {
>> >> * compatible = "cache";
>> >> * };
>> >> * };
>> >> *
>> >> * cpu1: cpu@001 {
>> >> * next-level-cache = <&L2_1>;
>> >> * };
>> >> * cpu2: cpu@002 {
>> >> * L2_2: l2-cache {
>> >> * compatible = "cache";
>> >> * next-level-cache = <&L3_1>;
>> >> * };
>> >> * };
>> >> *
>> >> * cpu3: cpu@003 {
>> >> * next-level-cache = <&L2_2>;
>> >> * };
>> >> cache_topology hold the pointer describing "next-level-cache",
>> >> it can describe the cache topology of every level.
>> >>
>> >> Expand the use of llc_sibling when ACPI is not enabled.
>> >>
>> >
>> >You seem to have posted this patch as part of the series first. One patch
>> >was rejected and then you post this without any history. It confuses if you
>> >don't provide all the background/history.
>>
>> Yes, the series contains several parts, the sched_domain part was rejected
>> temporary. But it has nothing to do with this patch, that's why I took it apart.
>
>That's not correct if you plan to use it there. Currently no users so no need
>to add.
>
>> The background doesn't matter, let's focus on this patch itself.
>>
>
>It depends, some people might find it useful, so better to provide it.
>One can ignore if that is not needed or if they are already aware.
>
I see.
>> >
>> >Having said that, NACK for this patch as it stands. We have
>> >drivers/base/cacheinfo.c which has all the parsing of cache information.
>> >IIRC we already consider LLC but highlight if anything is particularly
>> >missing. I am unable to follow/understand with you commit message.
>>
>> cacheinfo.c just describes the properties of the cache, It can't describe
>> the cache topology, some like cpuinfo and cpu topology.
>>
>
>Not 100% correct. We do have info about sharing there.
>
>> llc_sibling is not used at all if ACPI is not enabled, because llc_id
>> always be -1, and cpu_coregroup_mask() always return the core_sibling.
>>
>
>You can use of_find_last_level_cache or something similar and remove load
>of duplicated code you have in this patch.
First, to correct my previous mistake: we could have obtained the cache
topology through get_cpu_cacheinfo(cpu)->info_list + index, but it is
initialized after the SMP cpumask and LLC configuration.
Can we move detect_cache_attributes() before smp prepare, or put it in
parse_dt_topology()?
>
>> Why not get the cache topology from DT if the arch support GENERIC_ARCH_TOPOLOGY.
>>
>
>Sure but if the intended use is for scheduler, please include relevant people
>as there are quite a few threads around the topic recently and disintegrating
>and throwing patches at random with different set of people is not going to
>help make progress. If this is intended for Arm64 platforms, I suggest to
>keep these 2 in the loop as they are following few other threads and gives
>them full picture of intended use-case.
>
>Vincent Guittot <[email protected]>
>Dietmar Eggemann <[email protected]>
>
I will do it if necessary.
Thanks,
Wang
>--
>Regards,
>Sudeep