LinuxLists.cc - [PATCH v2 0/1] arm: topology: parse the topology from the dt

[permalink] [raw]

Subject: [PATCH v2 1/1] arm: topology: parse the topology from the dt

From: Ruifeng Zhang <[email protected]>

The arm topology still parse from the MPIDR, but it is incomplete. When
the armv8.2 or above cpu runs kernel in EL1 with aarch32 mode, it will
parse out the wrong topology.

Changed:
1) Arm using the same parse_dt_topology function as arm64.
2) For compatibility to keep the function of get capacity from default
cputype, renamed arm parse_dt_topology to get_efficiency_capacity and
delete related logic of parse from dt.
3) The update_cpu_capacity function should not depend on the topology, so
it is moved to the beginning of store_cpu_topology.

The arm device boot step is to look for the efficiency_capacity firstly.
Then parse the topology and capacity from dt to replace efficiency value.

Signed-off-by: Ruifeng Zhang <[email protected]>
Signed-off-by: Nianfu Bai <[email protected]>
---
arch/arm/kernel/topology.c | 22 ++++++----------------
drivers/base/arch_topology.c | 4 ++--
include/linux/arch_topology.h | 1 +
3 files changed, 9 insertions(+), 18 deletions(-)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index ef0058de432b..93d875320cc4 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -72,7 +72,6 @@ static unsigned long *__cpu_capacity;
#define cpu_capacity(cpu) __cpu_capacity[cpu]

static unsigned long middle_capacity = 1;
-static bool cap_from_dt = true;

/*
* Iterate all CPUs' descriptor in DT and compute the efficiency
@@ -82,7 +81,7 @@ static bool cap_from_dt = true;
* 'average' CPU is of middle capacity. Also see the comments near
* table_efficiency[] and update_cpu_capacity().
*/
-static void __init parse_dt_topology(void)
+static void __init get_coretype_capacity(void)
{
const struct cpu_efficiency *cpu_eff;
struct device_node *cn = NULL;
@@ -105,13 +104,6 @@ static void __init parse_dt_topology(void)
continue;
}

- if (topology_parse_cpu_capacity(cn, cpu)) {
- of_node_put(cn);
- continue;
- }
-
- cap_from_dt = false;
-
for (cpu_eff = table_efficiency; cpu_eff->compatible; cpu_eff++)
if (of_device_is_compatible(cn, cpu_eff->compatible))
break;
@@ -151,9 +143,6 @@ static void __init parse_dt_topology(void)
else
middle_capacity = ((max_capacity / 3)
>> (SCHED_CAPACITY_SHIFT-1)) + 1;
-
- if (cap_from_dt)
- topology_normalize_cpu_scale();
}

/*
@@ -163,7 +152,7 @@ static void __init parse_dt_topology(void)
*/
static void update_cpu_capacity(unsigned int cpu)
{
- if (!cpu_capacity(cpu) || cap_from_dt)
+ if (!cpu_capacity(cpu))
return;

topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
@@ -173,7 +162,7 @@ static void update_cpu_capacity(unsigned int cpu)
}

#else
-static inline void parse_dt_topology(void) {}
+static inline void get_cputype_capacity(void) {}
static inline void update_cpu_capacity(unsigned int cpuid) {}
#endif

@@ -187,6 +176,8 @@ void store_cpu_topology(unsigned int cpuid)
struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
unsigned int mpidr;

+ update_cpu_capacity(cpuid);
+
if (cpuid_topo->package_id != -1)
goto topology_populated;

@@ -221,8 +212,6 @@ void store_cpu_topology(unsigned int cpuid)
cpuid_topo->package_id = -1;
}

- update_cpu_capacity(cpuid);
-
pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
cpuid, cpu_topology[cpuid].thread_id,
cpu_topology[cpuid].core_id,
@@ -241,5 +230,6 @@ void __init init_cpu_topology(void)
reset_cpu_topology();
smp_wmb();

+ get_coretype_capacity();
parse_dt_topology();
}
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index de8587cc119e..a45aec356ec4 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -295,7 +295,7 @@ static void parsing_done_workfn(struct work_struct *work)
core_initcall(free_raw_capacity);
#endif

-#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
+#if defined(CONFIG_ARM) || defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
/*
* This function returns the logic cpu number of the node.
* There are basically three kinds of return values:
@@ -441,7 +441,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
return 0;
}

-static int __init parse_dt_topology(void)
+int __init parse_dt_topology(void)
{
struct device_node *cn, *map;
int ret = 0;
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 0f6cd6b73a61..cfa5a5072aa0 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -66,6 +66,7 @@ extern struct cpu_topology cpu_topology[NR_CPUS];
#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
void init_cpu_topology(void);
void store_cpu_topology(unsigned int cpuid);
+int __init parse_dt_topology(void);
const struct cpumask *cpu_coregroup_mask(int cpu);
void update_siblings_masks(unsigned int cpu);
void remove_cpu_topology(unsigned int cpuid);
--
2.17.1

2021-04-15 00:24:05

[permalink] [raw]

Subject: Re: [PATCH v2 1/1] arm: topology: parse the topology from the dt

Dear Dietmar,

In the last, update_cpu_capacity(cpuid) must be called in
store_cpu_topology because the cpuid_topo->package_id is always be the
initialization value.

Because of added the DT parsing logic, the cpuid_topo->package_id will
be parse in driver/base/arch_topology and the update_cpu_capacity
can't be called if DT has cpu-map.

Update cpu capacity isn't related with cpu topology, so move it to the
beginning of this function.

Please help to review and test the new patch, thanks.

Ruifeng Zhang <[email protected]> 于2021年4月14日周三下午8:24写道：
>
> From: Ruifeng Zhang <[email protected]>
>
> The arm topology still parse from the MPIDR, but it is incomplete. When
> the armv8.2 or above cpu runs kernel in EL1 with aarch32 mode, it will
> parse out the wrong topology.
>
> Changed:
> 1) Arm using the same parse_dt_topology function as arm64.
> 2) For compatibility to keep the function of get capacity from default
> cputype, renamed arm parse_dt_topology to get_efficiency_capacity and
> delete related logic of parse from dt.
> 3) The update_cpu_capacity function should not depend on the topology, so
> it is moved to the beginning of store_cpu_topology.
>
> The arm device boot step is to look for the efficiency_capacity firstly.
> Then parse the topology and capacity from dt to replace efficiency value.
>
> Signed-off-by: Ruifeng Zhang <[email protected]>
> Signed-off-by: Nianfu Bai <[email protected]>
> ---
> arch/arm/kernel/topology.c | 22 ++++++----------------
> drivers/base/arch_topology.c | 4 ++--
> include/linux/arch_topology.h | 1 +
> 3 files changed, 9 insertions(+), 18 deletions(-)
>
> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
> index ef0058de432b..93d875320cc4 100644
> --- a/arch/arm/kernel/topology.c
> +++ b/arch/arm/kernel/topology.c
> @@ -72,7 +72,6 @@ static unsigned long *__cpu_capacity;
> #define cpu_capacity(cpu) __cpu_capacity[cpu]
>
> static unsigned long middle_capacity = 1;
> -static bool cap_from_dt = true;
>
> /*
> * Iterate all CPUs' descriptor in DT and compute the efficiency
> @@ -82,7 +81,7 @@ static bool cap_from_dt = true;
> * 'average' CPU is of middle capacity. Also see the comments near
> * table_efficiency[] and update_cpu_capacity().
> */
> -static void __init parse_dt_topology(void)
> +static void __init get_coretype_capacity(void)
> {
> const struct cpu_efficiency *cpu_eff;
> struct device_node *cn = NULL;
> @@ -105,13 +104,6 @@ static void __init parse_dt_topology(void)
> continue;
> }
>
> - if (topology_parse_cpu_capacity(cn, cpu)) {
> - of_node_put(cn);
> - continue;
> - }
> -
> - cap_from_dt = false;
> -
> for (cpu_eff = table_efficiency; cpu_eff->compatible; cpu_eff++)
> if (of_device_is_compatible(cn, cpu_eff->compatible))
> break;
> @@ -151,9 +143,6 @@ static void __init parse_dt_topology(void)
> else
> middle_capacity = ((max_capacity / 3)
> >> (SCHED_CAPACITY_SHIFT-1)) + 1;
> -
> - if (cap_from_dt)
> - topology_normalize_cpu_scale();
> }
>
> /*
> @@ -163,7 +152,7 @@ static void __init parse_dt_topology(void)
> */
> static void update_cpu_capacity(unsigned int cpu)
> {
> - if (!cpu_capacity(cpu) || cap_from_dt)
> + if (!cpu_capacity(cpu))
> return;
>
> topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
> @@ -173,7 +162,7 @@ static void update_cpu_capacity(unsigned int cpu)
> }
>
> #else
> -static inline void parse_dt_topology(void) {}
> +static inline void get_cputype_capacity(void) {}
> static inline void update_cpu_capacity(unsigned int cpuid) {}
> #endif
>
> @@ -187,6 +176,8 @@ void store_cpu_topology(unsigned int cpuid)
> struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
> unsigned int mpidr;
>
> + update_cpu_capacity(cpuid);
> +
> if (cpuid_topo->package_id != -1)
> goto topology_populated;
>
> @@ -221,8 +212,6 @@ void store_cpu_topology(unsigned int cpuid)
> cpuid_topo->package_id = -1;
> }
>
> - update_cpu_capacity(cpuid);
> -
> pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
> cpuid, cpu_topology[cpuid].thread_id,
> cpu_topology[cpuid].core_id,
> @@ -241,5 +230,6 @@ void __init init_cpu_topology(void)
> reset_cpu_topology();
> smp_wmb();
>
> + get_coretype_capacity();
> parse_dt_topology();
> }
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index de8587cc119e..a45aec356ec4 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -295,7 +295,7 @@ static void parsing_done_workfn(struct work_struct *work)
> core_initcall(free_raw_capacity);
> #endif
>
> -#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
> +#if defined(CONFIG_ARM) || defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
> /*
> * This function returns the logic cpu number of the node.
> * There are basically three kinds of return values:
> @@ -441,7 +441,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
> return 0;
> }
>
> -static int __init parse_dt_topology(void)
> +int __init parse_dt_topology(void)
> {
> struct device_node *cn, *map;
> int ret = 0;
> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index 0f6cd6b73a61..cfa5a5072aa0 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -66,6 +66,7 @@ extern struct cpu_topology cpu_topology[NR_CPUS];
> #define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
> void init_cpu_topology(void);
> void store_cpu_topology(unsigned int cpuid);
> +int __init parse_dt_topology(void);
> const struct cpumask *cpu_coregroup_mask(int cpu);
> void update_siblings_masks(unsigned int cpu);
> void remove_cpu_topology(unsigned int cpuid);
> --
> 2.17.1
>

2021-04-15 18:10:45

[permalink] [raw]

Subject: Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

On 14/04/21 20:23, Ruifeng Zhang wrote:
> From: Ruifeng Zhang <[email protected]>
>
> In Unisoc, the sc9863a SoC which using cortex-a55, it has two software
> version, one of them is the kernel running on EL1 using aarch32.
> user(EL0) kernel(EL1)
> sc9863a_go aarch32 aarch32
> sc9863a aarch64 aarch64
>
> When kernel runs on EL1 using aarch32, the topology will parse wrong.
> For example,
> The MPIDR has been written to the chip register in armv8.2 format.
> For example,
> core0: 0000000080000000
> core1: 0000000080000100
> core2: 0000000080000200
> ...
>
> It will parse to:
> | | aff2 | packageid | coreid |
> |-------+------+-----------+--------|
> | Core0 | 0 | 0 | 0 |
> | Core1 | 0 | 1 | 0 |
> | Core2 | 0 | 2 | 0 |
> | ... | | | |
>
> The wrong topology is that all of the coreid are 0 and unexpected
> packageid.
>
> The reason is the MPIDR format is different between armv7 and armv8.2.
> armv7 (A7) mpidr is:
> [11:8] [7:2] [1:0]
> cluster reserved cpu
> The cortex-a7 spec DDI0464F 4.3.5
> https://developer.arm.com/documentation/ddi0464/f/?lang=en
>
> armv8.2 (A55) mpidr is:
> [23:16] [15:8] [7:0]
> cluster cpu thread
>

What I had understood from our conversation was that there *isn't* a format
difference (at least for the bottom 32 bits) - arm64/kernel/topopology.c
would parse it the same, except that MPIDR parsing has been deprecated for
arm64.

The problem is that those MPIDR values don't match the actual topology. If
they had the MT bit set, i.e.

core0: 0000000081000000
core1: 0000000081000100
core2: 0000000081000200

then it would be parsed as:

| | package_id | core_id | thread_id |
|-------+------------+---------+-----------|
| Core0 | 0 | 0 | 0 |
| Core1 | 0 | 1 | 0 |
| Core2 | 0 | 2 | 0 |

which would make more sense (wrt the actual, physical topology).

2021-04-15 18:10:54

[permalink] [raw]

Subject: Re: [PATCH v2 1/1] arm: topology: parse the topology from the dt

On 14/04/21 20:23, Ruifeng Zhang wrote:
> From: Ruifeng Zhang <[email protected]>
>
> The arm topology still parse from the MPIDR, but it is incomplete. When
> the armv8.2 or above cpu runs kernel in EL1 with aarch32 mode, it will
> parse out the wrong topology.
>

Per my other email, isn't the problem that MPIDR can't be trusted to
properly represent the topology, and thus a new method of describing the
topology (cpu-map) has to be used?

2021-04-15 20:11:56

[permalink] [raw]

Subject: Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

On 15/04/2021 20:09, Valentin Schneider wrote:
> On 14/04/21 20:23, Ruifeng Zhang wrote:
>> From: Ruifeng Zhang <[email protected]>
>>
>> In Unisoc, the sc9863a SoC which using cortex-a55, it has two software
>> version, one of them is the kernel running on EL1 using aarch32.
>> user(EL0) kernel(EL1)
>> sc9863a_go aarch32 aarch32
>> sc9863a aarch64 aarch64
>>
>> When kernel runs on EL1 using aarch32, the topology will parse wrong.
>> For example,
>> The MPIDR has been written to the chip register in armv8.2 format.
>> For example,
>> core0: 0000000080000000
>> core1: 0000000080000100
>> core2: 0000000080000200
>> ...
>>
>> It will parse to:
>> | | aff2 | packageid | coreid |
>> |-------+------+-----------+--------|
>> | Core0 | 0 | 0 | 0 |
>> | Core1 | 0 | 1 | 0 |
>> | Core2 | 0 | 2 | 0 |
>> | ... | | | |
>>
>> The wrong topology is that all of the coreid are 0 and unexpected
>> packageid.
>>
>> The reason is the MPIDR format is different between armv7 and armv8.2.
>> armv7 (A7) mpidr is:
>> [11:8] [7:2] [1:0]
>> cluster reserved cpu
>> The cortex-a7 spec DDI0464F 4.3.5
>> https://developer.arm.com/documentation/ddi0464/f/?lang=en
>>
>> armv8.2 (A55) mpidr is:
>> [23:16] [15:8] [7:0]
>> cluster cpu thread
>>
>
> What I had understood from our conversation was that there *isn't* a format
> difference (at least for the bottom 32 bits) - arm64/kernel/topopology.c
> would parse it the same, except that MPIDR parsing has been deprecated for
> arm64.
>
> The problem is that those MPIDR values don't match the actual topology. If
> they had the MT bit set, i.e.
>
> core0: 0000000081000000
> core1: 0000000081000100
> core2: 0000000081000200
>
> then it would be parsed as:
>
> | | package_id | core_id | thread_id |
> |-------+------------+---------+-----------|
> | Core0 | 0 | 0 | 0 |
> | Core1 | 0 | 1 | 0 |
> | Core2 | 0 | 2 | 0 |
>
> which would make more sense (wrt the actual, physical topology).

... and this would be in sync with
https://developer.arm.com/documentation/100442/0200/register-descriptions/aarch32-system-registers/mpidr--multiprocessor-affinity-register

MT, [24]

0b1 ...

There is no 0b0 for MT.

2021-04-16 08:35:52

[permalink] [raw]

Subject: Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

Dietmar Eggemann <[email protected]> 于2021年4月16日周五上午4:10写道：
>
> On 15/04/2021 20:09, Valentin Schneider wrote:
> > On 14/04/21 20:23, Ruifeng Zhang wrote:
> >> From: Ruifeng Zhang <[email protected]>
> >>
> >> In Unisoc, the sc9863a SoC which using cortex-a55, it has two software
> >> version, one of them is the kernel running on EL1 using aarch32.
> >> user(EL0) kernel(EL1)
> >> sc9863a_go aarch32 aarch32
> >> sc9863a aarch64 aarch64
> >>
> >> When kernel runs on EL1 using aarch32, the topology will parse wrong.
> >> For example,
> >> The MPIDR has been written to the chip register in armv8.2 format.
> >> For example,
> >> core0: 0000000080000000
> >> core1: 0000000080000100
> >> core2: 0000000080000200
> >> ...
> >>
> >> It will parse to:
> >> | | aff2 | packageid | coreid |
> >> |-------+------+-----------+--------|
> >> | Core0 | 0 | 0 | 0 |
> >> | Core1 | 0 | 1 | 0 |
> >> | Core2 | 0 | 2 | 0 |
> >> | ... | | | |
> >>
> >> The wrong topology is that all of the coreid are 0 and unexpected
> >> packageid.
> >>
> >> The reason is the MPIDR format is different between armv7 and armv8.2.
> >> armv7 (A7) mpidr is:
> >> [11:8] [7:2] [1:0]
> >> cluster reserved cpu
> >> The cortex-a7 spec DDI0464F 4.3.5
> >> https://developer.arm.com/documentation/ddi0464/f/?lang=en
> >>
> >> armv8.2 (A55) mpidr is:
> >> [23:16] [15:8] [7:0]
> >> cluster cpu thread
> >>
> >
> > What I had understood from our conversation was that there *isn't* a format
> > difference (at least for the bottom 32 bits) - arm64/kernel/topopology.c
> > would parse it the same, except that MPIDR parsing has been deprecated for
> > arm64.

I agree, I should change my description.

> >
> > The problem is that those MPIDR values don't match the actual topology. If
> > they had the MT bit set, i.e.
> >
> > core0: 0000000081000000
> > core1: 0000000081000100
> > core2: 0000000081000200
> >
> > then it would be parsed as:
> >
> > | | package_id | core_id | thread_id |
> > |-------+------------+---------+-----------|
> > | Core0 | 0 | 0 | 0 |
> > | Core1 | 0 | 1 | 0 |
> > | Core2 | 0 | 2 | 0 |
> >
> > which would make more sense (wrt the actual, physical topology).
>
> ... and this would be in sync with
> https://developer.arm.com/documentation/100442/0200/register-descriptions/aarch32-system-registers/mpidr--multiprocessor-affinity-register
>
> MT, [24]
>
> 0b1 ...
>
> There is no 0b0 for MT.
>
As you said, the MT must be 0b1, so the {aff1} means coreid for A55.
It's no problem for parsing coreid.

For more requirements, if all cores in one physical cluster, the
{aff2} of all cores are the same value.
i.e. the sc9863a,
core0: 0000000081000000
core1: 0000000081000100
core2: 0000000081000200
core3: 0000000081000300
core4: 0000000081000400
core5: 0000000081000500
core6: 0000000081000600
core7: 0000000081000700

According to MPIDR all cores will parse to the one cluster, but it's
the big.LITTLE system, it's need two logic cluster for schedule or
cpufreq.
So I think it's better to add the logic of parse topology from DT.

2021-04-16 10:31:07

[permalink] [raw]

Subject: Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

On 16/04/21 15:47, Ruifeng Zhang wrote:
> For more requirements, if all cores in one physical cluster, the
> {aff2} of all cores are the same value.
> i.e. the sc9863a,
> core0: 0000000081000000
> core1: 0000000081000100
> core2: 0000000081000200
> core3: 0000000081000300
> core4: 0000000081000400
> core5: 0000000081000500
> core6: 0000000081000600
> core7: 0000000081000700
>
> According to MPIDR all cores will parse to the one cluster, but it's
> the big.LITTLE system, it's need two logic cluster for schedule or
> cpufreq.
> So I think it's better to add the logic of parse topology from DT.

Ah, so it's a slightly different issue, but still one that requires a
different means of specifying topology.

2021-04-16 10:59:41

[permalink] [raw]

Subject: Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

On 16/04/2021 11:32, Valentin Schneider wrote:
> On 16/04/21 15:47, Ruifeng Zhang wrote:
>> For more requirements, if all cores in one physical cluster, the
>> {aff2} of all cores are the same value.
>> i.e. the sc9863a,
>> core0: 0000000081000000
>> core1: 0000000081000100
>> core2: 0000000081000200
>> core3: 0000000081000300
>> core4: 0000000081000400
>> core5: 0000000081000500
>> core6: 0000000081000600
>> core7: 0000000081000700
>>
>> According to MPIDR all cores will parse to the one cluster, but it's
>> the big.LITTLE system, it's need two logic cluster for schedule or
>> cpufreq.
>> So I think it's better to add the logic of parse topology from DT.
>
> Ah, so it's a slightly different issue, but still one that requires a
> different means of specifying topology.

I'm confused. Do you have the MT bit set to 1 then? So the issue that
the mpidr handling in arm32's store_cpu_topology() is not correct does
not exist?

With DynamIQ you have only *one* cluster, you should also be able to run
your big.LITTLE system with only an MC sched domain.

# cat /proc/schedstat
cpu0 ....
domain0 ff ... <- MC
...

You can introduce a cpu-map to create what we called Phantom Domains in
Android products.

# cat /proc/schedstat

cpu0 ....
domain0 0f ... <- MC
domain1 ff ... < DIE

Is this what you need for your arm32 kernel system? Adding the
possibility to parse cpu-map to create Phantom Domains?

2021-04-16 11:35:18

[permalink] [raw]

Subject: Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

Dietmar Eggemann <[email protected]> 于2021年4月16日周五下午6:39写道：
>
> On 16/04/2021 11:32, Valentin Schneider wrote:
> > On 16/04/21 15:47, Ruifeng Zhang wrote:
> >> For more requirements, if all cores in one physical cluster, the
> >> {aff2} of all cores are the same value.
> >> i.e. the sc9863a,
> >> core0: 0000000081000000
> >> core1: 0000000081000100
> >> core2: 0000000081000200
> >> core3: 0000000081000300
> >> core4: 0000000081000400
> >> core5: 0000000081000500
> >> core6: 0000000081000600
> >> core7: 0000000081000700
> >>
> >> According to MPIDR all cores will parse to the one cluster, but it's
> >> the big.LITTLE system, it's need two logic cluster for schedule or
> >> cpufreq.
> >> So I think it's better to add the logic of parse topology from DT.
> >
> > Ah, so it's a slightly different issue, but still one that requires a
> > different means of specifying topology.
>
> I'm confused. Do you have the MT bit set to 1 then? So the issue that
> the mpidr handling in arm32's store_cpu_topology() is not correct does
> not exist?

I have reconfirmed it, the MT bit has been set to 1. I am sorry for
the previous messages.
The mpidr parse by store_cpu_topology is correct, at least for the sc9863a.

>
> With DynamIQ you have only *one* cluster, you should also be able to run
> your big.LITTLE system with only an MC sched domain.
>
> # cat /proc/schedstat
> cpu0 ....
> domain0 ff ... <- MC
> ...
>
> You can introduce a cpu-map to create what we called Phantom Domains in
> Android products.
>
> # cat /proc/schedstat
>
> cpu0 ....
> domain0 0f ... <- MC
> domain1 ff ... < DIE
>
> Is this what you need for your arm32 kernel system? Adding the
> possibility to parse cpu-map to create Phantom Domains?

Yes, I need parse DT cpu-map to create different Phantom Domains.
With it, the dts should be change to:
cpu-map {
cluster0 {
core0 {
cpu = <&CPU0>;
};
core1 {
cpu = <&CPU1>;
};
core2 {
cpu = <&CPU2>;
};
core3 {
cpu = <&CPU3>;
};
};

cluster1 {
core0 {
cpu = <&CPU4>;
};
core1 {
cpu = <&CPU5>;
};
core2 {
cpu = <&CPU6>;
};
core3 {
cpu = <&CPU7>;
};
};
};

2021-04-16 12:07:46

[permalink] [raw]

Subject: Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

On 16/04/21 12:39, Dietmar Eggemann wrote:
> On 16/04/2021 11:32, Valentin Schneider wrote:
>> On 16/04/21 15:47, Ruifeng Zhang wrote:
>>> For more requirements, if all cores in one physical cluster, the
>>> {aff2} of all cores are the same value.
>>> i.e. the sc9863a,
>>> core0: 0000000081000000
>>> core1: 0000000081000100
>>> core2: 0000000081000200
>>> core3: 0000000081000300
>>> core4: 0000000081000400
>>> core5: 0000000081000500
>>> core6: 0000000081000600
>>> core7: 0000000081000700
>>>
>>> According to MPIDR all cores will parse to the one cluster, but it's
>>> the big.LITTLE system, it's need two logic cluster for schedule or
>>> cpufreq.
>>> So I think it's better to add the logic of parse topology from DT.
>>
>> Ah, so it's a slightly different issue, but still one that requires a
>> different means of specifying topology.
>
> I'm confused. Do you have the MT bit set to 1 then? So the issue that
> the mpidr handling in arm32's store_cpu_topology() is not correct does
> not exist?
>
> With DynamIQ you have only *one* cluster, you should also be able to run
> your big.LITTLE system with only an MC sched domain.
>
> # cat /proc/schedstat
> cpu0 ....
> domain0 ff ... <- MC
> ...
>

You're right, this is actually a DynamIQ system, not a (legacy) big.LITTLE
one, so all CPUs are under the same LLC (the DSU). I probably should have
checked this earlier on, but this is quite obvious from sc9863a.dtsi:

cpu-map {
cluster0 {
core0 {
cpu = <&CPU0>;
};
core1 {
cpu = <&CPU1>;
};
core2 {
cpu = <&CPU2>;
};
core3 {
cpu = <&CPU3>;
};
core4 {
cpu = <&CPU4>;
};
core5 {
cpu = <&CPU5>;
};
core6 {
cpu = <&CPU6>;
};
core7 {
cpu = <&CPU7>;
};
};
};

All CPUs are in the same cluster, and the MPIDR values actually match that.

> You can introduce a cpu-map to create what we called Phantom Domains in
> Android products.
>
> # cat /proc/schedstat
>
> cpu0 ....
> domain0 0f ... <- MC
> domain1 ff ... < DIE
>
> Is this what you need for your arm32 kernel system? Adding the
> possibility to parse cpu-map to create Phantom Domains?

2021-04-16 18:51:36

[permalink] [raw]

Subject: Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

On 16/04/2021 13:04, Ruifeng Zhang wrote:
> Dietmar Eggemann <[email protected]> 于2021年4月16日周五下午6:39写道：
>>
>> On 16/04/2021 11:32, Valentin Schneider wrote:
>>> On 16/04/21 15:47, Ruifeng Zhang wrote:

[...]

>> I'm confused. Do you have the MT bit set to 1 then? So the issue that
>> the mpidr handling in arm32's store_cpu_topology() is not correct does
>> not exist?
>
> I have reconfirmed it, the MT bit has been set to 1. I am sorry for
> the previous messages.
> The mpidr parse by store_cpu_topology is correct, at least for the sc9863a.

Nice! This is sorted then.

[...]

>> Is this what you need for your arm32 kernel system? Adding the
>> possibility to parse cpu-map to create Phantom Domains?
>
> Yes, I need parse DT cpu-map to create different Phantom Domains.
> With it, the dts should be change to:
> cpu-map {
> cluster0 {
> core0 {
> cpu = <&CPU0>;
> };
> core1 {
> cpu = <&CPU1>;
> };
> core2 {
> cpu = <&CPU2>;
> };
> core3 {
> cpu = <&CPU3>;
> };
> };
>
> cluster1 {
> core0 {
> cpu = <&CPU4>;
> };
> core1 {
> cpu = <&CPU5>;
> };
> core2 {
> cpu = <&CPU6>;
> };
> core3 {
> cpu = <&CPU7>;
> };
> };
> };
>

I'm afraid that this is now a much weaker case to get this into
mainline.

I'm able to run with an extra cpu-map entry:

diff --git a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
index 012d40a7228c..f60d9b448253 100644
--- a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
+++ b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
@@ -35,6 +35,29 @@ cpus {
#address-cells = <1>;
#size-cells = <0>;

+ cpu-map {
+ cluster0 {
+ core0 {
+ cpu = <&cpu0>;
+ };
+ core1 {
+ cpu = <&cpu1>;
+ };
+ };
+
+ cluster1 {
+ core0 {
+ cpu = <&cpu2>;
+ };
+ core1 {
+ cpu = <&cpu3>;
+ };
+ core2 {
+ cpu = <&cpu4>;
+ };
+ };
+ };
+
cpu0: cpu@0 {

a condensed version (see below) of your patch on my Arm32 TC2.
The move of update_cpu_capacity() in store_cpu_topology() is only
necessary when I use the old clock-frequency based cpu_efficiency
approach for asymmetric CPU capacity (TC2 is a15/a7):

diff --git a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
index f60d9b448253..e0679cca40ed 100644
--- a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
+++ b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
@@ -64,7 +64,7 @@ cpu0: cpu@0 {
reg = <0>;
cci-control-port = <&cci_control1>;
cpu-idle-states = <&CLUSTER_SLEEP_BIG>;
- capacity-dmips-mhz = <1024>;
+ clock-frequency = <1000000000>;
dynamic-power-coefficient = <990>;
};

@@ -74,7 +74,7 @@ cpu1: cpu@1 {
reg = <1>;
cci-control-port = <&cci_control1>;
cpu-idle-states = <&CLUSTER_SLEEP_BIG>;
- capacity-dmips-mhz = <1024>;
+ clock-frequency = <1000000000>;
dynamic-power-coefficient = <990>;
};

@@ -84,7 +84,7 @@ cpu2: cpu@2 {
reg = <0x100>;
cci-control-port = <&cci_control2>;
cpu-idle-states = <&CLUSTER_SLEEP_LITTLE>;
- capacity-dmips-mhz = <516>;
+ clock-frequency = <800000000>;
dynamic-power-coefficient = <133>;
};

@@ -94,7 +94,7 @@ cpu3: cpu@3 {
reg = <0x101>;
cci-control-port = <&cci_control2>;
cpu-idle-states = <&CLUSTER_SLEEP_LITTLE>;
- capacity-dmips-mhz = <516>;
+ clock-frequency = <800000000>;
dynamic-power-coefficient = <133>;
};

@@ -104,7 +104,7 @@ cpu4: cpu@4 {
reg = <0x102>;
cci-control-port = <&cci_control2>;
cpu-idle-states = <&CLUSTER_SLEEP_LITTLE>;
- capacity-dmips-mhz = <516>;
+ clock-frequency = <800000000>;
dynamic-power-coefficient = <133>;
};

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index ef0058de432b..bff56c12c3a6 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -82,7 +82,7 @@ static bool cap_from_dt = true;
* 'average' CPU is of middle capacity. Also see the comments near
* table_efficiency[] and update_cpu_capacity().
*/
-static void __init parse_dt_topology(void)
+static void __init get_coretype_capacity(void)
{
const struct cpu_efficiency *cpu_eff;
struct device_node *cn = NULL;
@@ -173,7 +173,7 @@ static void update_cpu_capacity(unsigned int cpu)
}

#else
-static inline void parse_dt_topology(void) {}
+static inline void get_coretype_capacity(void) {}
static inline void update_cpu_capacity(unsigned int cpuid) {}
#endif

@@ -221,14 +221,13 @@ void store_cpu_topology(unsigned int cpuid)
cpuid_topo->package_id = -1;
}

- update_cpu_capacity(cpuid);
-
pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
cpuid, cpu_topology[cpuid].thread_id,
cpu_topology[cpuid].core_id,
cpu_topology[cpuid].package_id, mpidr);

topology_populated:
+ update_cpu_capacity(cpuid);
update_siblings_masks(cpuid);
}

@@ -241,5 +240,6 @@ void __init init_cpu_topology(void)
reset_cpu_topology();
smp_wmb();

+ get_coretype_capacity();
parse_dt_topology();
}
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index de8587cc119e..a2335da28f2a 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -295,7 +295,6 @@ static void parsing_done_workfn(struct work_struct *work)
core_initcall(free_raw_capacity);
#endif

-#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
/*
* This function returns the logic cpu number of the node.
* There are basically three kinds of return values:
@@ -441,7 +440,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
return 0;
}

-static int __init parse_dt_topology(void)
+int __init parse_dt_topology(void)
{
struct device_node *cn, *map;
int ret = 0;
@@ -481,7 +480,6 @@ static int __init parse_dt_topology(void)
of_node_put(cn);
return ret;
}
-#endif

/*
* cpu topology table
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 0f6cd6b73a61..cfa5a5072aa0 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -66,6 +66,7 @@ extern struct cpu_topology cpu_topology[NR_CPUS];
#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
void init_cpu_topology(void);
void store_cpu_topology(unsigned int cpuid);
+int __init parse_dt_topology(void);
const struct cpumask *cpu_coregroup_mask(int cpu);
void update_siblings_masks(unsigned int cpu);
void remove_cpu_topology(unsigned int cpuid);

2021-04-19 02:56:36

[permalink] [raw]

Subject: Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

Dietmar Eggemann <[email protected]> 于2021年4月17日周六上午1:00写道：
>
> On 16/04/2021 13:04, Ruifeng Zhang wrote:
> > Dietmar Eggemann <[email protected]> 于2021年4月16日周五下午6:39写道：
> >>
> >> On 16/04/2021 11:32, Valentin Schneider wrote:
> >>> On 16/04/21 15:47, Ruifeng Zhang wrote:
>
> [...]
>
> >> I'm confused. Do you have the MT bit set to 1 then? So the issue that
> >> the mpidr handling in arm32's store_cpu_topology() is not correct does
> >> not exist?
> >
> > I have reconfirmed it, the MT bit has been set to 1. I am sorry for
> > the previous messages.
> > The mpidr parse by store_cpu_topology is correct, at least for the sc9863a.
>
> Nice! This is sorted then.
>
> [...]
>
> >> Is this what you need for your arm32 kernel system? Adding the
> >> possibility to parse cpu-map to create Phantom Domains?
> >
> > Yes, I need parse DT cpu-map to create different Phantom Domains.
> > With it, the dts should be change to:
> > cpu-map {
> > cluster0 {
> > core0 {
> > cpu = <&CPU0>;
> > };
> > core1 {
> > cpu = <&CPU1>;
> > };
> > core2 {
> > cpu = <&CPU2>;
> > };
> > core3 {
> > cpu = <&CPU3>;
> > };
> > };
> >
> > cluster1 {
> > core0 {
> > cpu = <&CPU4>;
> > };
> > core1 {
> > cpu = <&CPU5>;
> > };
> > core2 {
> > cpu = <&CPU6>;
> > };
> > core3 {
> > cpu = <&CPU7>;
> > };
> > };
> > };
> >
>
> I'm afraid that this is now a much weaker case to get this into
> mainline.

But it's still a problem and it's not break the original logic ( parse
topology from MPIDR or parse capacity ), only add the support for
parse topology from DT.
I think it should still be merged into the mainline. If don't, the
DynamIQ SoC has some issue in sched and cpufreq.
>
> I'm able to run with an extra cpu-map entry:

Great.
>
> diff --git a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> index 012d40a7228c..f60d9b448253 100644
> --- a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> +++ b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> @@ -35,6 +35,29 @@ cpus {
> #address-cells = <1>;
> #size-cells = <0>;
>
> + cpu-map {
> + cluster0 {
> + core0 {
> + cpu = <&cpu0>;
> + };
> + core1 {
> + cpu = <&cpu1>;
> + };
> + };
> +
> + cluster1 {
> + core0 {
> + cpu = <&cpu2>;
> + };
> + core1 {
> + cpu = <&cpu3>;
> + };
> + core2 {
> + cpu = <&cpu4>;
> + };
> + };
> + };
> +
> cpu0: cpu@0 {
>
> a condensed version (see below) of your patch on my Arm32 TC2.
> The move of update_cpu_capacity() in store_cpu_topology() is only
> necessary when I use the old clock-frequency based cpu_efficiency
> approach for asymmetric CPU capacity (TC2 is a15/a7):
>
> diff --git a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> index f60d9b448253..e0679cca40ed 100644
> --- a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> +++ b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> @@ -64,7 +64,7 @@ cpu0: cpu@0 {
> reg = <0>;
> cci-control-port = <&cci_control1>;
> cpu-idle-states = <&CLUSTER_SLEEP_BIG>;
> - capacity-dmips-mhz = <1024>;
> + clock-frequency = <1000000000>;
> dynamic-power-coefficient = <990>;
> };
>
> @@ -74,7 +74,7 @@ cpu1: cpu@1 {
> reg = <1>;
> cci-control-port = <&cci_control1>;
> cpu-idle-states = <&CLUSTER_SLEEP_BIG>;
> - capacity-dmips-mhz = <1024>;
> + clock-frequency = <1000000000>;
> dynamic-power-coefficient = <990>;
> };
>
> @@ -84,7 +84,7 @@ cpu2: cpu@2 {
> reg = <0x100>;
> cci-control-port = <&cci_control2>;
> cpu-idle-states = <&CLUSTER_SLEEP_LITTLE>;
> - capacity-dmips-mhz = <516>;
> + clock-frequency = <800000000>;
> dynamic-power-coefficient = <133>;
> };
>
> @@ -94,7 +94,7 @@ cpu3: cpu@3 {
> reg = <0x101>;
> cci-control-port = <&cci_control2>;
> cpu-idle-states = <&CLUSTER_SLEEP_LITTLE>;
> - capacity-dmips-mhz = <516>;
> + clock-frequency = <800000000>;
> dynamic-power-coefficient = <133>;
> };
>
> @@ -104,7 +104,7 @@ cpu4: cpu@4 {
> reg = <0x102>;
> cci-control-port = <&cci_control2>;
> cpu-idle-states = <&CLUSTER_SLEEP_LITTLE>;
> - capacity-dmips-mhz = <516>;
> + clock-frequency = <800000000>;
> dynamic-power-coefficient = <133>;
> };
>
>
>
> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
> index ef0058de432b..bff56c12c3a6 100644
> --- a/arch/arm/kernel/topology.c
> +++ b/arch/arm/kernel/topology.c
> @@ -82,7 +82,7 @@ static bool cap_from_dt = true;
> * 'average' CPU is of middle capacity. Also see the comments near
> * table_efficiency[] and update_cpu_capacity().
> */
> -static void __init parse_dt_topology(void)
> +static void __init get_coretype_capacity(void)
> {
> const struct cpu_efficiency *cpu_eff;
> struct device_node *cn = NULL;
> @@ -173,7 +173,7 @@ static void update_cpu_capacity(unsigned int cpu)
> }
>
> #else
> -static inline void parse_dt_topology(void) {}
> +static inline void get_coretype_capacity(void) {}
> static inline void update_cpu_capacity(unsigned int cpuid) {}
> #endif
>
> @@ -221,14 +221,13 @@ void store_cpu_topology(unsigned int cpuid)
> cpuid_topo->package_id = -1;
> }
>
> - update_cpu_capacity(cpuid);
> -
> pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
> cpuid, cpu_topology[cpuid].thread_id,
> cpu_topology[cpuid].core_id,
> cpu_topology[cpuid].package_id, mpidr);
>
> topology_populated:
> + update_cpu_capacity(cpuid);

Agree, it's more better than mine.
> update_siblings_masks(cpuid);
> }
>
> @@ -241,5 +240,6 @@ void __init init_cpu_topology(void)
> reset_cpu_topology();
> smp_wmb();
>
> + get_coretype_capacity();
> parse_dt_topology();
> }
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index de8587cc119e..a2335da28f2a 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -295,7 +295,6 @@ static void parsing_done_workfn(struct work_struct *work)
> core_initcall(free_raw_capacity);
> #endif
>
> -#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)

It seems that the following functions are not defined on other CPU
architectures now, so I agree to remove it.
> /*
> * This function returns the logic cpu number of the node.
> * There are basically three kinds of return values:
> @@ -441,7 +440,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
> return 0;
> }
>
> -static int __init parse_dt_topology(void)
> +int __init parse_dt_topology(void)
> {
> struct device_node *cn, *map;
> int ret = 0;
> @@ -481,7 +480,6 @@ static int __init parse_dt_topology(void)
> of_node_put(cn);
> return ret;
> }
> -#endif
>
> /*
> * cpu topology table
> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index 0f6cd6b73a61..cfa5a5072aa0 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -66,6 +66,7 @@ extern struct cpu_topology cpu_topology[NR_CPUS];
> #define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
> void init_cpu_topology(void);
> void store_cpu_topology(unsigned int cpuid);
> +int __init parse_dt_topology(void);
> const struct cpumask *cpu_coregroup_mask(int cpu);
> void update_siblings_masks(unsigned int cpu);
> void remove_cpu_topology(unsigned int cpuid);

Why do you keep the logic of topology_parse_cpu_capacity in arm
get_coretype_capacity function? The capacity-dmips-mhz will be parsed
by drivers/base/arch_topology.c as following:
parse_dt_topology
parse_cluster
parse_core
get_cpu_for_node
topology_parse_cpu_capacity

2021-04-19 21:29:17

[permalink] [raw]

Subject: Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

On 19/04/2021 04:55, Ruifeng Zhang wrote:
> Dietmar Eggemann <[email protected]> 于2021年4月17日周六上午1:00写道：
>>
>> On 16/04/2021 13:04, Ruifeng Zhang wrote:
>>> Dietmar Eggemann <[email protected]> 于2021年4月16日周五下午6:39写道：
>>>>
>>>> On 16/04/2021 11:32, Valentin Schneider wrote:
>>>>> On 16/04/21 15:47, Ruifeng Zhang wrote:

[...]

>> I'm afraid that this is now a much weaker case to get this into
>> mainline.
>
> But it's still a problem and it's not break the original logic ( parse
> topology from MPIDR or parse capacity ), only add the support for
> parse topology from DT.
> I think it should still be merged into the mainline. If don't, the
> DynamIQ SoC has some issue in sched and cpufreq.

IMHO, not necessarily. Your DynamIQ SoC is one cluster with 8 CPUs. It's
subdivided into 2 Frequency Domains (FDs).

CFS Energy-Aware-Scheduling (EAS, find_energy_efficient_cpu()) and
Capacity-Aware-Scheduling (CAS, select_idle_sibling() ->
select_idle_capacity()) work correctly even in case you only have an MC
sched domain (sd).
No matter which sd (MC, DIE) the sd_asym_cpucapacity is, we always
iterate over all CPUs. Per Performance Domains (i.e. FDs) in EAS and
over sched_domain_span(sd) in CAS.

CFS load-balancing (in case your system is `over-utilized`) might work
slightly different due to the missing DIE sd but not inevitably worse.

Do you have benchmarks or testcases in mind which convince you that
Phantom Domains is something you would need? BTW, they are called
Phantom since they let you use uarch and/or max CPU frequency domain to
fake real topology (like LLC) boundaries.

[...]

> Why do you keep the logic of topology_parse_cpu_capacity in arm
> get_coretype_capacity function? The capacity-dmips-mhz will be parsed
> by drivers/base/arch_topology.c as following:
> parse_dt_topology
> parse_cluster
> parse_core
> get_cpu_for_node
> topology_parse_cpu_capacity

I think we still need it for systems out there w/o cpu-map in dt, like
my arm32 TC2 with mainline vexpress-v2p-ca15_a7.dts.

It's called twice on each CPU in case I add the cpu-map dt entry though.

2021-04-23 06:28:12