2021-04-12 07:21:02

by Ruifeng Zhang

[permalink] [raw]
Subject: [PATCH 1/1] arm: topology: parse the topology from the dt

From: Ruifeng Zhang <[email protected]>

The arm topology still parse from the MPIDR, but it is incomplete. When
the armv8.3 cpu runs in aarch32 mode, it will parse out the wrong topology.

armv7 (A7) mpidr is:
[11:8] [7:2] [1:0]
cluster reserved cpu

armv8.3 (A55) mpidr is:
[23:16] [15:8] [7:0]
cluster cpu thread

For compatibility to keep the function of get capacity from default
cputype, renamed arm parse_dt_topology to get_cputype_capacity and delete
related logic of parse from dt.
Arm using the same parse_dt_topology function as arm64.

The arm device boot step is to look for the default cputype and get cpu
capacity firstly. Then parse the topology and capacity from dt to replace
default values.

Signed-off-by: Ruifeng Zhang <[email protected]>
---
arch/arm/kernel/topology.c | 18 ++++--------------
drivers/base/arch_topology.c | 4 ++--
include/linux/arch_topology.h | 1 +
3 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index ef0058de432b..7a4217367c7e 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -72,7 +72,6 @@ static unsigned long *__cpu_capacity;
#define cpu_capacity(cpu) __cpu_capacity[cpu]

static unsigned long middle_capacity = 1;
-static bool cap_from_dt = true;

/*
* Iterate all CPUs' descriptor in DT and compute the efficiency
@@ -82,7 +81,7 @@ static bool cap_from_dt = true;
* 'average' CPU is of middle capacity. Also see the comments near
* table_efficiency[] and update_cpu_capacity().
*/
-static void __init parse_dt_topology(void)
+static void __init get_coretype_capacity(void)
{
const struct cpu_efficiency *cpu_eff;
struct device_node *cn = NULL;
@@ -105,13 +104,6 @@ static void __init parse_dt_topology(void)
continue;
}

- if (topology_parse_cpu_capacity(cn, cpu)) {
- of_node_put(cn);
- continue;
- }
-
- cap_from_dt = false;
-
for (cpu_eff = table_efficiency; cpu_eff->compatible; cpu_eff++)
if (of_device_is_compatible(cn, cpu_eff->compatible))
break;
@@ -151,9 +143,6 @@ static void __init parse_dt_topology(void)
else
middle_capacity = ((max_capacity / 3)
>> (SCHED_CAPACITY_SHIFT-1)) + 1;
-
- if (cap_from_dt)
- topology_normalize_cpu_scale();
}

/*
@@ -163,7 +152,7 @@ static void __init parse_dt_topology(void)
*/
static void update_cpu_capacity(unsigned int cpu)
{
- if (!cpu_capacity(cpu) || cap_from_dt)
+ if (!cpu_capacity(cpu))
return;

topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
@@ -173,7 +162,7 @@ static void update_cpu_capacity(unsigned int cpu)
}

#else
-static inline void parse_dt_topology(void) {}
+static inline void parse_dt_capacity(void) {}
static inline void update_cpu_capacity(unsigned int cpuid) {}
#endif

@@ -241,5 +230,6 @@ void __init init_cpu_topology(void)
reset_cpu_topology();
smp_wmb();

+ get_coretype_capacity();
parse_dt_topology();
}
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index de8587cc119e..a45aec356ec4 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -295,7 +295,7 @@ static void parsing_done_workfn(struct work_struct *work)
core_initcall(free_raw_capacity);
#endif

-#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
+#if defined(CONFIG_ARM) || defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
/*
* This function returns the logic cpu number of the node.
* There are basically three kinds of return values:
@@ -441,7 +441,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
return 0;
}

-static int __init parse_dt_topology(void)
+int __init parse_dt_topology(void)
{
struct device_node *cn, *map;
int ret = 0;
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 0f6cd6b73a61..cfa5a5072aa0 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -66,6 +66,7 @@ extern struct cpu_topology cpu_topology[NR_CPUS];
#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
void init_cpu_topology(void);
void store_cpu_topology(unsigned int cpuid);
+int __init parse_dt_topology(void);
const struct cpumask *cpu_coregroup_mask(int cpu);
void update_siblings_masks(unsigned int cpu);
void remove_cpu_topology(unsigned int cpuid);
--
2.17.1


2021-04-12 12:44:16

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH 1/1] arm: topology: parse the topology from the dt

On 12/04/2021 14:20, Ruifeng Zhang wrote:
> Valentin Schneider <[email protected]> 于2021年4月12日周一 下午7:32写道:
>>
>>
>> Hi,
>>
>> On 12/04/21 15:08, Ruifeng Zhang wrote:
>>> From: Ruifeng Zhang <[email protected]>
>>>
>>> The arm topology still parse from the MPIDR, but it is incomplete. When
>>> the armv8.3 cpu runs in aarch32 mode, it will parse out the wrong topology.
>>>
>>> armv7 (A7) mpidr is:
>>> [11:8] [7:2] [1:0]
>>> cluster reserved cpu
>>>
>>> armv8.3 (A55) mpidr is:
>>> [23:16] [15:8] [7:0]
>>> cluster cpu thread
>>>
>>> For compatibility to keep the function of get capacity from default
>>> cputype, renamed arm parse_dt_topology to get_cputype_capacity and delete
>>> related logic of parse from dt.
>>> Arm using the same parse_dt_topology function as arm64.
>>>
>>> The arm device boot step is to look for the default cputype and get cpu
>>> capacity firstly. Then parse the topology and capacity from dt to replace
>>> default values.
>>>
>>
>> I'm afraid I don't get it.
>>
>> CONFIG_COMPAT lets you run 32-bit stuff at EL0, but the kernel is still
>> arm64. So if you take your armv8.3 system, the topology parsed by the
>> kernel will be the same regardless of CONFIG_COMPAT.
>>
>> Could you elaborate on what problem you are trying to fix here?
>
> There is a armv8.3 cpu which should work normally both on aarch64 and aarch32.
> The MPIDR has been written to the chip register in armv8.3 format.
> For example,
> core0: 0000000080000000
> core1: 0000000080000100
> core2: 0000000080000200
> ...
>
> Its cpu topology can be parsed normally on aarch64 mode (both
> userspace and kernel work on arm64).
>
> The problem is when it working on aarch32 mode (both userspace and
> kernel work on arm 32-bit), the cpu topology
> will parse error because of the format is different between armv7 and armv8.3.
> The arm 32-bit driver, arch/arm/kernel/topology will parse the MPIDR
> and store to the topology with armv7,
> and the result is all cpu core_id is 0, the bit[1:0] of armv7 MPIDR format.
>
> In addition, I think arm should also allow customers to configure cpu
> topologies via DT.

This patch ruins the CPU capacity detection based on capacity-dmips-mhz
(Documentation/devicetree/bindings/arm/cpu-capacity.txt) on my TC2 [L B
B L L] (armv7).

tip/sched/core with *mainline* multi_v7_defconfig:

root@linaro-nano:~# cat /sys/devices/system/cpu/cpu*/cpu_capacity
516
1024
1024
516
516

your patch with mainline multi_v7_defconfig:

root@linaro-nano:~# cat /sys/devices/system/cpu/cpu*/cpu_capacity
1024
1024
1024
1024
1024


There are 2 capacity detection mechanism in arch/arm/kernel/topology.c:

(1) cpu_efficiency (only for armv7 a15 and a7) based, relies on
clock-frequency dt property

(2) capacity-dmips-mhz dt property based

I currently don't see how this different MPIDR layout leads to you code
changes.


2021-04-12 15:38:07

by Valentin Schneider

[permalink] [raw]
Subject: Re: [PATCH 1/1] arm: topology: parse the topology from the dt

On 12/04/21 20:20, Ruifeng Zhang wrote:
> There is a armv8.3 cpu which should work normally both on aarch64 and aarch32.
> The MPIDR has been written to the chip register in armv8.3 format.
> For example,
> core0: 0000000080000000
> core1: 0000000080000100
> core2: 0000000080000200
> ...
>
> Its cpu topology can be parsed normally on aarch64 mode (both
> userspace and kernel work on arm64).
>
> The problem is when it working on aarch32 mode (both userspace and
> kernel work on arm 32-bit),

I didn't know using aarch32 elsewhere than EL0 was something actually being
used. Do you deploy this somewhere, or do you use it for testing purposes?

> the cpu topology
> will parse error because of the format is different between armv7 and armv8.3.
> The arm 32-bit driver, arch/arm/kernel/topology will parse the MPIDR
> and store to the topology with armv7,
> and the result is all cpu core_id is 0, the bit[1:0] of armv7 MPIDR format.
>

I'm not fluent at all in armv7 (or most aarch32 compat mode stuff), but
I couldn't find anything about MPIDR format differences:

DDI 0487G.a G8.2.113
"""
AArch32 System register MPIDR bits [31:0] are architecturally mapped to
AArch64 System register MPIDR_EL1[31:0].
"""

Peeking at some armv7 doc and arm/kernel/topology.c the layout really looks
just the same, i.e. for both of them, with your example of:

core0: 0000000080000000
core1: 0000000080000100
core2: 0000000080000200
...

we'll get:

| | aff2 | aff1 | aff0 |
|-------+------+------+------|
| Core0 | 0 | 0 | 0 |
| Core1 | 0 | 1 | 0 |
| Core2 | 0 | 2 | 0 |
...

Now, arm64 doesn't fallback to MPIDR for topology information anymore since

3102bc0e6ac7 ("arm64: topology: Stop using MPIDR for topology information")

so without DT we would get:
| | package_id | core_id |
|-------+------------+---------|
| Core0 | 0 | 0 |
| Core1 | 0 | 1 |
| Core2 | 0 | 2 |

Whereas with an arm kernel we'll end up parsing MPIDR as:
| | package_id | core_id |
|-------+------------+---------|
| Core0 | 0 | 0 |
| Core1 | 1 | 0 |
| Core2 | 2 | 0 |

Did I get this right? Is this what you're observing?

> In addition, I think arm should also allow customers to configure cpu
> topologies via DT.

2021-04-13 01:02:11

by Valentin Schneider

[permalink] [raw]
Subject: Re: [PATCH 1/1] arm: topology: parse the topology from the dt


Hi,

On 12/04/21 15:08, Ruifeng Zhang wrote:
> From: Ruifeng Zhang <[email protected]>
>
> The arm topology still parse from the MPIDR, but it is incomplete. When
> the armv8.3 cpu runs in aarch32 mode, it will parse out the wrong topology.
>
> armv7 (A7) mpidr is:
> [11:8] [7:2] [1:0]
> cluster reserved cpu
>
> armv8.3 (A55) mpidr is:
> [23:16] [15:8] [7:0]
> cluster cpu thread
>
> For compatibility to keep the function of get capacity from default
> cputype, renamed arm parse_dt_topology to get_cputype_capacity and delete
> related logic of parse from dt.
> Arm using the same parse_dt_topology function as arm64.
>
> The arm device boot step is to look for the default cputype and get cpu
> capacity firstly. Then parse the topology and capacity from dt to replace
> default values.
>

I'm afraid I don't get it.

CONFIG_COMPAT lets you run 32-bit stuff at EL0, but the kernel is still
arm64. So if you take your armv8.3 system, the topology parsed by the
kernel will be the same regardless of CONFIG_COMPAT.

Could you elaborate on what problem you are trying to fix here?

2021-04-13 01:35:49

by Ruifeng Zhang

[permalink] [raw]
Subject: Re: [PATCH 1/1] arm: topology: parse the topology from the dt

Valentin Schneider <[email protected]> 于2021年4月12日周一 下午7:32写道:
>
>
> Hi,
>
> On 12/04/21 15:08, Ruifeng Zhang wrote:
> > From: Ruifeng Zhang <[email protected]>
> >
> > The arm topology still parse from the MPIDR, but it is incomplete. When
> > the armv8.3 cpu runs in aarch32 mode, it will parse out the wrong topology.
> >
> > armv7 (A7) mpidr is:
> > [11:8] [7:2] [1:0]
> > cluster reserved cpu
> >
> > armv8.3 (A55) mpidr is:
> > [23:16] [15:8] [7:0]
> > cluster cpu thread
> >
> > For compatibility to keep the function of get capacity from default
> > cputype, renamed arm parse_dt_topology to get_cputype_capacity and delete
> > related logic of parse from dt.
> > Arm using the same parse_dt_topology function as arm64.
> >
> > The arm device boot step is to look for the default cputype and get cpu
> > capacity firstly. Then parse the topology and capacity from dt to replace
> > default values.
> >
>
> I'm afraid I don't get it.
>
> CONFIG_COMPAT lets you run 32-bit stuff at EL0, but the kernel is still
> arm64. So if you take your armv8.3 system, the topology parsed by the
> kernel will be the same regardless of CONFIG_COMPAT.
>
> Could you elaborate on what problem you are trying to fix here?

There is a armv8.3 cpu which should work normally both on aarch64 and aarch32.
The MPIDR has been written to the chip register in armv8.3 format.
For example,
core0: 0000000080000000
core1: 0000000080000100
core2: 0000000080000200
...

Its cpu topology can be parsed normally on aarch64 mode (both
userspace and kernel work on arm64).

The problem is when it working on aarch32 mode (both userspace and
kernel work on arm 32-bit), the cpu topology
will parse error because of the format is different between armv7 and armv8.3.
The arm 32-bit driver, arch/arm/kernel/topology will parse the MPIDR
and store to the topology with armv7,
and the result is all cpu core_id is 0, the bit[1:0] of armv7 MPIDR format.

In addition, I think arm should also allow customers to configure cpu
topologies via DT.

2021-04-13 08:58:23

by Ruifeng Zhang

[permalink] [raw]
Subject: Re: [PATCH 1/1] arm: topology: parse the topology from the dt

Valentin Schneider <[email protected]> 于2021年4月12日周一 下午11:33写道:
>
> On 12/04/21 20:20, Ruifeng Zhang wrote:
> > There is a armv8.3 cpu which should work normally both on aarch64 and aarch32.
> > The MPIDR has been written to the chip register in armv8.3 format.
> > For example,
> > core0: 0000000080000000
> > core1: 0000000080000100
> > core2: 0000000080000200
> > ...
> >
> > Its cpu topology can be parsed normally on aarch64 mode (both
> > userspace and kernel work on arm64).
> >
> > The problem is when it working on aarch32 mode (both userspace and
> > kernel work on arm 32-bit),
>
> I didn't know using aarch32 elsewhere than EL0 was something actually being
> used. Do you deploy this somewhere, or do you use it for testing purposes?

In Unisoc, the sc9863a SoC which using cortex-a55, it has two software
version, one
of them is the kernel running on EL1 using aarch32.
user(EL0) kernel(EL1)
sc9863a_go aarch32 aarch32
sc9863a aarch64 aarch64
>
> > the cpu topology
> > will parse error because of the format is different between armv7 and armv8.3.
> > The arm 32-bit driver, arch/arm/kernel/topology will parse the MPIDR
> > and store to the topology with armv7,
> > and the result is all cpu core_id is 0, the bit[1:0] of armv7 MPIDR format.
> >
>
> I'm not fluent at all in armv7 (or most aarch32 compat mode stuff), but
> I couldn't find anything about MPIDR format differences:
>
> DDI 0487G.a G8.2.113
> """
> AArch32 System register MPIDR bits [31:0] are architecturally mapped to
> AArch64 System register MPIDR_EL1[31:0].
> """
>
> Peeking at some armv7 doc and arm/kernel/topology.c the layout really looks
> just the same, i.e. for both of them, with your example of:

The cortex-a7 spec DDI0464F 4.3.5
https://developer.arm.com/documentation/ddi0464/f/?lang=en

The current arch/arm/kernel/topology code parse the MPIDR with a armv7 format.
the parse code is:
void store_cpu_topology(unsigned int cpuid)
{
...
cpuid_topo->thread_id = -1;
cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
...
}
>
> core0: 0000000080000000
> core1: 0000000080000100
> core2: 0000000080000200
> ...
>
> we'll get:
>
> | | aff2 | aff1 | aff0 |
> |-------+------+------+------|
> | Core0 | 0 | 0 | 0 |
> | Core1 | 0 | 1 | 0 |
> | Core2 | 0 | 2 | 0 |
> ...
>
> Now, arm64 doesn't fallback to MPIDR for topology information anymore since
>
> 3102bc0e6ac7 ("arm64: topology: Stop using MPIDR for topology information")
>
> so without DT we would get:
> | | package_id | core_id |
> |-------+------------+---------|
> | Core0 | 0 | 0 |
> | Core1 | 0 | 1 |
> | Core2 | 0 | 2 |
>
> Whereas with an arm kernel we'll end up parsing MPIDR as:
> | | package_id | core_id |
> |-------+------------+---------|
> | Core0 | 0 | 0 |
> | Core1 | 1 | 0 |
> | Core2 | 2 | 0 |
>
> Did I get this right? Is this what you're observing?

Yes, this is a problem if an armv8.2 or above cpu is running a 32-bit
kernel on EL1.
>
> > In addition, I think arm should also allow customers to configure cpu
> > topologies via DT.

2021-04-13 08:58:51

by Ruifeng Zhang

[permalink] [raw]
Subject: Re: [PATCH 1/1] arm: topology: parse the topology from the dt

Dietmar Eggemann <[email protected]> 于2021年4月12日周一 下午8:40写道:
>
> On 12/04/2021 14:20, Ruifeng Zhang wrote:
> > Valentin Schneider <[email protected]> 于2021年4月12日周一 下午7:32写道:
> >>
> >>
> >> Hi,
> >>
> >> On 12/04/21 15:08, Ruifeng Zhang wrote:
> >>> From: Ruifeng Zhang <[email protected]>
> >>>
> >>> The arm topology still parse from the MPIDR, but it is incomplete. When
> >>> the armv8.3 cpu runs in aarch32 mode, it will parse out the wrong topology.
> >>>
> >>> armv7 (A7) mpidr is:
> >>> [11:8] [7:2] [1:0]
> >>> cluster reserved cpu
> >>>
> >>> armv8.3 (A55) mpidr is:
> >>> [23:16] [15:8] [7:0]
> >>> cluster cpu thread
> >>>
> >>> For compatibility to keep the function of get capacity from default
> >>> cputype, renamed arm parse_dt_topology to get_cputype_capacity and delete
> >>> related logic of parse from dt.
> >>> Arm using the same parse_dt_topology function as arm64.
> >>>
> >>> The arm device boot step is to look for the default cputype and get cpu
> >>> capacity firstly. Then parse the topology and capacity from dt to replace
> >>> default values.
> >>>
> >>
> >> I'm afraid I don't get it.
> >>
> >> CONFIG_COMPAT lets you run 32-bit stuff at EL0, but the kernel is still
> >> arm64. So if you take your armv8.3 system, the topology parsed by the
> >> kernel will be the same regardless of CONFIG_COMPAT.
> >>
> >> Could you elaborate on what problem you are trying to fix here?
> >
> > There is a armv8.3 cpu which should work normally both on aarch64 and aarch32.
> > The MPIDR has been written to the chip register in armv8.3 format.
> > For example,
> > core0: 0000000080000000
> > core1: 0000000080000100
> > core2: 0000000080000200
> > ...
> >
> > Its cpu topology can be parsed normally on aarch64 mode (both
> > userspace and kernel work on arm64).
> >
> > The problem is when it working on aarch32 mode (both userspace and
> > kernel work on arm 32-bit), the cpu topology
> > will parse error because of the format is different between armv7 and armv8.3.
> > The arm 32-bit driver, arch/arm/kernel/topology will parse the MPIDR
> > and store to the topology with armv7,
> > and the result is all cpu core_id is 0, the bit[1:0] of armv7 MPIDR format.
> >
> > In addition, I think arm should also allow customers to configure cpu
> > topologies via DT.
>
> This patch ruins the CPU capacity detection based on capacity-dmips-mhz
> (Documentation/devicetree/bindings/arm/cpu-capacity.txt) on my TC2 [L B
> B L L] (armv7).
>
> tip/sched/core with *mainline* multi_v7_defconfig:
>
> root@linaro-nano:~# cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 516
> 1024
> 1024
> 516
> 516
>
> your patch with mainline multi_v7_defconfig:
>
> root@linaro-nano:~# cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 1024
> 1024
> 1024
> 1024
> 1024
>
>
> There are 2 capacity detection mechanism in arch/arm/kernel/topology.c:
>
> (1) cpu_efficiency (only for armv7 a15 and a7) based, relies on
> clock-frequency dt property
>
> (2) capacity-dmips-mhz dt property based
>
> I currently don't see how this different MPIDR layout leads to you code
> changes.

Thanks for your test, I will update patch-V2 to solve this problem.
>
>

2021-04-13 14:51:23

by Valentin Schneider

[permalink] [raw]
Subject: Re: [PATCH 1/1] arm: topology: parse the topology from the dt

On 13/04/21 14:13, Ruifeng Zhang wrote:
> Valentin Schneider <[email protected]> 于2021年4月12日周一 下午11:33写道:
>> I'm not fluent at all in armv7 (or most aarch32 compat mode stuff), but
>> I couldn't find anything about MPIDR format differences:
>>
>> DDI 0487G.a G8.2.113
>> """
>> AArch32 System register MPIDR bits [31:0] are architecturally mapped to
>> AArch64 System register MPIDR_EL1[31:0].
>> """
>>
>> Peeking at some armv7 doc and arm/kernel/topology.c the layout really looks
>> just the same, i.e. for both of them, with your example of:
>
> The cortex-a7 spec DDI0464F 4.3.5
> https://developer.arm.com/documentation/ddi0464/f/?lang=en
>

Ah, so that's where the core_id=bit[1:0] comes from. That does still
conform to the MPIDR format, and as you point out below that's being parsed
the same (aff2, aff1, aff0) == mpidr([23:16][15:8][7:0])

> The current arch/arm/kernel/topology code parse the MPIDR with a armv7 format.
> the parse code is:
> void store_cpu_topology(unsigned int cpuid)
> {
> ...
> cpuid_topo->thread_id = -1;
> cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> ...
> }
>>
>> core0: 0000000080000000
>> core1: 0000000080000100
>> core2: 0000000080000200
>> ...
>>
>> we'll get:
>>
>> | | aff2 | aff1 | aff0 |
>> |-------+------+------+------|
>> | Core0 | 0 | 0 | 0 |
>> | Core1 | 0 | 1 | 0 |
>> | Core2 | 0 | 2 | 0 |
>> ...
>>
>> Now, arm64 doesn't fallback to MPIDR for topology information anymore since
>>
>> 3102bc0e6ac7 ("arm64: topology: Stop using MPIDR for topology information")
>>
>> so without DT we would get:
>> | | package_id | core_id |
>> |-------+------------+---------|
>> | Core0 | 0 | 0 |
>> | Core1 | 0 | 1 |
>> | Core2 | 0 | 2 |
>>
>> Whereas with an arm kernel we'll end up parsing MPIDR as:
>> | | package_id | core_id |
>> |-------+------------+---------|
>> | Core0 | 0 | 0 |
>> | Core1 | 1 | 0 |
>> | Core2 | 2 | 0 |
>>
>> Did I get this right? Is this what you're observing?
>
> Yes, this is a problem if an armv8.2 or above cpu is running a 32-bit
> kernel on EL1.


With the above MPIDR(_EL1) values, you would have the same problem in
aarch64 mode on any kernel predating

3102bc0e6ac7 ("arm64: topology: Stop using MPIDR for topology information")

since all Aff0 values are 0. Arguably those MPIDR(_EL1) values don't
make much sense (cores in the same cluster should have different Aff0
values, unless SMT), but in arm64 that's usually "corrected" by DT.

As you pointed out, arm doesn't currently leverage the cpu-map DT entry. I
don't see any obvious problem with adding support for it, so if you can fix
the capacity issue Dietmar reported, I think we could consider it.

2021-04-13 15:25:58

by Ruifeng Zhang

[permalink] [raw]
Subject: Re: [PATCH 1/1] arm: topology: parse the topology from the dt

Thanks for your review. Patch-v2 that solve the capacity issue will be
uploaded as soon as possible. : )

Valentin Schneider <[email protected]> 于2021年4月13日周二 下午7:40写道:
>
> On 13/04/21 14:13, Ruifeng Zhang wrote:
> > Valentin Schneider <[email protected]> 于2021年4月12日周一 下午11:33写道:
> >> I'm not fluent at all in armv7 (or most aarch32 compat mode stuff), but
> >> I couldn't find anything about MPIDR format differences:
> >>
> >> DDI 0487G.a G8.2.113
> >> """
> >> AArch32 System register MPIDR bits [31:0] are architecturally mapped to
> >> AArch64 System register MPIDR_EL1[31:0].
> >> """
> >>
> >> Peeking at some armv7 doc and arm/kernel/topology.c the layout really looks
> >> just the same, i.e. for both of them, with your example of:
> >
> > The cortex-a7 spec DDI0464F 4.3.5
> > https://developer.arm.com/documentation/ddi0464/f/?lang=en
> >
>
> Ah, so that's where the core_id=bit[1:0] comes from. That does still
> conform to the MPIDR format, and as you point out below that's being parsed
> the same (aff2, aff1, aff0) == mpidr([23:16][15:8][7:0])
>
> > The current arch/arm/kernel/topology code parse the MPIDR with a armv7 format.
> > the parse code is:
> > void store_cpu_topology(unsigned int cpuid)
> > {
> > ...
> > cpuid_topo->thread_id = -1;
> > cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> > cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> > ...
> > }
> >>
> >> core0: 0000000080000000
> >> core1: 0000000080000100
> >> core2: 0000000080000200
> >> ...
> >>
> >> we'll get:
> >>
> >> | | aff2 | aff1 | aff0 |
> >> |-------+------+------+------|
> >> | Core0 | 0 | 0 | 0 |
> >> | Core1 | 0 | 1 | 0 |
> >> | Core2 | 0 | 2 | 0 |
> >> ...
> >>
> >> Now, arm64 doesn't fallback to MPIDR for topology information anymore since
> >>
> >> 3102bc0e6ac7 ("arm64: topology: Stop using MPIDR for topology information")
> >>
> >> so without DT we would get:
> >> | | package_id | core_id |
> >> |-------+------------+---------|
> >> | Core0 | 0 | 0 |
> >> | Core1 | 0 | 1 |
> >> | Core2 | 0 | 2 |
> >>
> >> Whereas with an arm kernel we'll end up parsing MPIDR as:
> >> | | package_id | core_id |
> >> |-------+------------+---------|
> >> | Core0 | 0 | 0 |
> >> | Core1 | 1 | 0 |
> >> | Core2 | 2 | 0 |
> >>
> >> Did I get this right? Is this what you're observing?
> >
> > Yes, this is a problem if an armv8.2 or above cpu is running a 32-bit
> > kernel on EL1.
>
>
> With the above MPIDR(_EL1) values, you would have the same problem in
> aarch64 mode on any kernel predating
>
> 3102bc0e6ac7 ("arm64: topology: Stop using MPIDR for topology information")
>
> since all Aff0 values are 0. Arguably those MPIDR(_EL1) values don't
> make much sense (cores in the same cluster should have different Aff0
> values, unless SMT), but in arm64 that's usually "corrected" by DT.
>
> As you pointed out, arm doesn't currently leverage the cpu-map DT entry. I
> don't see any obvious problem with adding support for it, so if you can fix
> the capacity issue Dietmar reported, I think we could consider it.

2021-04-14 15:06:34

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH 1/1] arm: topology: parse the topology from the dt

On 13/04/2021 15:26, Ruifeng Zhang wrote:
> Thanks for your review. Patch-v2 that solve the capacity issue will be
> uploaded as soon as possible. : )
>
> Valentin Schneider <[email protected]> 于2021年4月13日周二 下午7:40写道:
>>
>> On 13/04/21 14:13, Ruifeng Zhang wrote:
>>> Valentin Schneider <[email protected]> 于2021年4月12日周一 下午11:33写道:
>>>> I'm not fluent at all in armv7 (or most aarch32 compat mode stuff), but
>>>> I couldn't find anything about MPIDR format differences:
>>>>
>>>> DDI 0487G.a G8.2.113
>>>> """
>>>> AArch32 System register MPIDR bits [31:0] are architecturally mapped to
>>>> AArch64 System register MPIDR_EL1[31:0].
>>>> """
>>>>
>>>> Peeking at some armv7 doc and arm/kernel/topology.c the layout really looks
>>>> just the same, i.e. for both of them, with your example of:
>>>
>>> The cortex-a7 spec DDI0464F 4.3.5
>>> https://developer.arm.com/documentation/ddi0464/f/?lang=en
>>>
>>
>> Ah, so that's where the core_id=bit[1:0] comes from. That does still
>> conform to the MPIDR format, and as you point out below that's being parsed
>> the same (aff2, aff1, aff0) == mpidr([23:16][15:8][7:0])
>>
>>> The current arch/arm/kernel/topology code parse the MPIDR with a armv7 format.
>>> the parse code is:
>>> void store_cpu_topology(unsigned int cpuid)
>>> {
>>> ...
>>> cpuid_topo->thread_id = -1;
>>> cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>>> cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
>>> ...
>>> }
>>>>
>>>> core0: 0000000080000000
>>>> core1: 0000000080000100
>>>> core2: 0000000080000200
>>>> ...
>>>>
>>>> we'll get:
>>>>
>>>> | | aff2 | aff1 | aff0 |
>>>> |-------+------+------+------|
>>>> | Core0 | 0 | 0 | 0 |
>>>> | Core1 | 0 | 1 | 0 |
>>>> | Core2 | 0 | 2 | 0 |
>>>> ...
>>>>
>>>> Now, arm64 doesn't fallback to MPIDR for topology information anymore since
>>>>
>>>> 3102bc0e6ac7 ("arm64: topology: Stop using MPIDR for topology information")
>>>>
>>>> so without DT we would get:
>>>> | | package_id | core_id |
>>>> |-------+------------+---------|
>>>> | Core0 | 0 | 0 |
>>>> | Core1 | 0 | 1 |
>>>> | Core2 | 0 | 2 |
>>>>
>>>> Whereas with an arm kernel we'll end up parsing MPIDR as:
>>>> | | package_id | core_id |
>>>> |-------+------------+---------|
>>>> | Core0 | 0 | 0 |
>>>> | Core1 | 1 | 0 |
>>>> | Core2 | 2 | 0 |
>>>>
>>>> Did I get this right? Is this what you're observing?
>>>
>>> Yes, this is a problem if an armv8.2 or above cpu is running a 32-bit
>>> kernel on EL1.
>>
>>
>> With the above MPIDR(_EL1) values, you would have the same problem in
>> aarch64 mode on any kernel predating
>>
>> 3102bc0e6ac7 ("arm64: topology: Stop using MPIDR for topology information")
>>
>> since all Aff0 values are 0. Arguably those MPIDR(_EL1) values don't
>> make much sense (cores in the same cluster should have different Aff0
>> values, unless SMT), but in arm64 that's usually "corrected" by DT.
>>
>> As you pointed out, arm doesn't currently leverage the cpu-map DT entry. I
>> don't see any obvious problem with adding support for it, so if you can fix
>> the capacity issue Dietmar reported, I think we could consider it.

Coming back to your original patch. You want to use parse_dt_topology()
from drivers/base/arch_topology.c to be able detect a cpu-map in dt and
so bypassing the read of mpidr in store_cpu_topology()?

Looks like sc9863a has two frequency domains (1.6 and 1.2GHz). So
technically it's a big.LITTLE system (based only on max CPU frequency
(not on uarch) differences).
But the dts file doesn't contain any `capacity-dmips-mhz` entries? So
asymmetric CPU capacity (even only based on max CPU frequency) detection
won't kick in. Since you don't have any uarch diffs, you would have to
specify `capacity-dmips-mhz = <1024>` for each CPU.

2021-04-14 15:32:49

by Ruifeng Zhang

[permalink] [raw]
Subject: Re: [PATCH 1/1] arm: topology: parse the topology from the dt

Dietmar Eggemann <[email protected]> 于2021年4月14日周三 下午5:43写道:
>
> On 13/04/2021 15:26, Ruifeng Zhang wrote:
> > Thanks for your review. Patch-v2 that solve the capacity issue will be
> > uploaded as soon as possible. : )
> >
> > Valentin Schneider <[email protected]> 于2021年4月13日周二 下午7:40写道:
> >>
> >> On 13/04/21 14:13, Ruifeng Zhang wrote:
> >>> Valentin Schneider <[email protected]> 于2021年4月12日周一 下午11:33写道:
> >>>> I'm not fluent at all in armv7 (or most aarch32 compat mode stuff), but
> >>>> I couldn't find anything about MPIDR format differences:
> >>>>
> >>>> DDI 0487G.a G8.2.113
> >>>> """
> >>>> AArch32 System register MPIDR bits [31:0] are architecturally mapped to
> >>>> AArch64 System register MPIDR_EL1[31:0].
> >>>> """
> >>>>
> >>>> Peeking at some armv7 doc and arm/kernel/topology.c the layout really looks
> >>>> just the same, i.e. for both of them, with your example of:
> >>>
> >>> The cortex-a7 spec DDI0464F 4.3.5
> >>> https://developer.arm.com/documentation/ddi0464/f/?lang=en
> >>>
> >>
> >> Ah, so that's where the core_id=bit[1:0] comes from. That does still
> >> conform to the MPIDR format, and as you point out below that's being parsed
> >> the same (aff2, aff1, aff0) == mpidr([23:16][15:8][7:0])
> >>
> >>> The current arch/arm/kernel/topology code parse the MPIDR with a armv7 format.
> >>> the parse code is:
> >>> void store_cpu_topology(unsigned int cpuid)
> >>> {
> >>> ...
> >>> cpuid_topo->thread_id = -1;
> >>> cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> >>> cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> >>> ...
> >>> }
> >>>>
> >>>> core0: 0000000080000000
> >>>> core1: 0000000080000100
> >>>> core2: 0000000080000200
> >>>> ...
> >>>>
> >>>> we'll get:
> >>>>
> >>>> | | aff2 | aff1 | aff0 |
> >>>> |-------+------+------+------|
> >>>> | Core0 | 0 | 0 | 0 |
> >>>> | Core1 | 0 | 1 | 0 |
> >>>> | Core2 | 0 | 2 | 0 |
> >>>> ...
> >>>>
> >>>> Now, arm64 doesn't fallback to MPIDR for topology information anymore since
> >>>>
> >>>> 3102bc0e6ac7 ("arm64: topology: Stop using MPIDR for topology information")
> >>>>
> >>>> so without DT we would get:
> >>>> | | package_id | core_id |
> >>>> |-------+------------+---------|
> >>>> | Core0 | 0 | 0 |
> >>>> | Core1 | 0 | 1 |
> >>>> | Core2 | 0 | 2 |
> >>>>
> >>>> Whereas with an arm kernel we'll end up parsing MPIDR as:
> >>>> | | package_id | core_id |
> >>>> |-------+------------+---------|
> >>>> | Core0 | 0 | 0 |
> >>>> | Core1 | 1 | 0 |
> >>>> | Core2 | 2 | 0 |
> >>>>
> >>>> Did I get this right? Is this what you're observing?
> >>>
> >>> Yes, this is a problem if an armv8.2 or above cpu is running a 32-bit
> >>> kernel on EL1.
> >>
> >>
> >> With the above MPIDR(_EL1) values, you would have the same problem in
> >> aarch64 mode on any kernel predating
> >>
> >> 3102bc0e6ac7 ("arm64: topology: Stop using MPIDR for topology information")
> >>
> >> since all Aff0 values are 0. Arguably those MPIDR(_EL1) values don't
> >> make much sense (cores in the same cluster should have different Aff0
> >> values, unless SMT), but in arm64 that's usually "corrected" by DT.
> >>
> >> As you pointed out, arm doesn't currently leverage the cpu-map DT entry. I
> >> don't see any obvious problem with adding support for it, so if you can fix
> >> the capacity issue Dietmar reported, I think we could consider it.
>
> Coming back to your original patch. You want to use parse_dt_topology()
> from drivers/base/arch_topology.c to be able detect a cpu-map in dt and
> so bypassing the read of mpidr in store_cpu_topology()?

Yes
>
> Looks like sc9863a has two frequency domains (1.6 and 1.2GHz). So
> technically it's a big.LITTLE system (based only on max CPU frequency
> (not on uarch) differences).
> But the dts file doesn't contain any `capacity-dmips-mhz` entries? So
> asymmetric CPU capacity (even only based on max CPU frequency) detection
> won't kick in. Since you don't have any uarch diffs, you would have to
> specify `capacity-dmips-mhz = <1024>` for each CPU.

Yes, for capacity, the DT should have a capacity-dmips-MHz entry or a
clock-frequency entry (for A7 and A15 only).
The sc9863a dts is a vendor file, in my opinion is not appropriate to
be update with this series.
What do you think if I independently update the sc9863a dts file later?

2021-04-15 00:29:29

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH 1/1] arm: topology: parse the topology from the dt

On 14/04/2021 13:26, Ruifeng Zhang wrote:
> Dietmar Eggemann <[email protected]> 于2021年4月14日周三 下午5:43写道:
>>
>> On 13/04/2021 15:26, Ruifeng Zhang wrote:
>>> Thanks for your review. Patch-v2 that solve the capacity issue will be
>>> uploaded as soon as possible. : )
>>>
>>> Valentin Schneider <[email protected]> 于2021年4月13日周二 下午7:40写道:
>>>>
>>>> On 13/04/21 14:13, Ruifeng Zhang wrote:
>>>>> Valentin Schneider <[email protected]> 于2021年4月12日周一 下午11:33写道:

[...]

>> Looks like sc9863a has two frequency domains (1.6 and 1.2GHz). So
>> technically it's a big.LITTLE system (based only on max CPU frequency
>> (not on uarch) differences).
>> But the dts file doesn't contain any `capacity-dmips-mhz` entries? So
>> asymmetric CPU capacity (even only based on max CPU frequency) detection
>> won't kick in. Since you don't have any uarch diffs, you would have to
>> specify `capacity-dmips-mhz = <1024>` for each CPU.
>
> Yes, for capacity, the DT should have a capacity-dmips-MHz entry or a
> clock-frequency entry (for A7 and A15 only).
> The sc9863a dts is a vendor file, in my opinion is not appropriate to
> be update with this series.
> What do you think if I independently update the sc9863a dts file later?
>

Yes, this is a separate thing. Just wanted to mention it here since this
allows you to test asymmetric CPU capacity on your platform w/ and w/o
your patch on arm64 and arm.

No need to add `clock-frequency` entries, just `capacity-dmips-mhz`
entries should do.