2022-06-21 19:50:26

by Sudeep Holla

Subject: [PATCH v4 00/20] arch_topology: Updates to add socket support and fix cluster ids

Hi All,

This version updates cacheinfo to populate and use the information from
there for all the cache topology.

This series intends to fix some discrepancies we have in the CPU topology
parsing from the device tree /cpu-map node, which currently diverges from
the behaviour on an ACPI enabled platform. The expectation is that both DT
and ACPI enabled systems must present a consistent view of the CPU topology.

Currently we assign the generated cluster count as the physical package
identifier for each CPU, which is wrong. The device tree bindings for CPU
topology support sockets to infer the socket or physical package identifier
for a given CPU. Also, we don't check if all the cores/threads belong to the
same cluster before updating their sibling masks, which is fine as we don't
set the cluster id yet.

These changes also assign the cluster identifier as parsed from the device tree
cluster nodes within /cpu-map, without support for nesting of the clusters.
Finally, the series also adds support for socket nodes in /cpu-map. With this,
the parsing of the exact same information from ACPI PPTT and the /cpu-map DT
node aligns well.

The only exception is that the last level cache id information can be
inferred from the same ACPI PPTT while we need to parse CPU cache nodes
in the device tree.

Hi Greg,

I had not cc-ed you on earlier 3 versions as we had some disagreement
amongst Arm developers which we have not settled. Let me know how you want to
merge this once you agree with the changes. I can send a pull request if
you prefer. Let me know.

v3[3]->v4:
- Updated ACPI PPTT fw_token to use table offset instead of virtual
  address as it could change every time it is mapped before
  the global acpi_permanent_mmap is set
- Added warning for the topology with nested clusters
- Added update to cpu_clustergroup_mask so that introduction of
  correct cluster_id doesn't break existing platforms by limiting
  the span of clustergroup_mask (by Ionela)

v2[2]->v3[3]:
- Dropped support to get the device node for the CPU's LLC
- Updated cacheinfo to support calling of detect_cache_attributes
  early in smp_prepare_cpus stage
- Added support to check if LLC is valid and shared in the cacheinfo
- Used the same in arch_topology

v1[1]->v2[2]:
- Updated ID validity check to include all non-negative values
- Added support to get the device node for the CPU's last level cache
- Added support to build llc_sibling on DT platforms

[1] https://lore.kernel.org/lkml/[email protected]
[2] https://lore.kernel.org/lkml/[email protected]
[3] https://lore.kernel.org/lkml/[email protected]


Ionela Voinescu (1):
  arch_topology: Limit span of cpu_clustergroup_mask()

Sudeep Holla (19):
  ACPI: PPTT: Use table offset as fw_token instead of virtual address
  cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
  cacheinfo: Add helper to access any cache index for a given CPU
  cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
  cacheinfo: Add support to check if last level cache(LLC) is valid or shared
  cacheinfo: Allow early detection and population of cache attributes
  cacheinfo: Use cache identifiers to check if the caches are shared if available
  arch_topology: Add support to parse and detect cache attributes
  arch_topology: Use the last level cache information from the cacheinfo
  arm64: topology: Remove redundant setting of llc_id in CPU topology
  arch_topology: Drop LLC identifier stash from the CPU topology
  arch_topology: Set thread sibling cpumask only within the cluster
  arch_topology: Check for non-negative value rather than -1 for IDs validity
  arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
  arch_topology: Don't set cluster identifier as physical package identifier
  arch_topology: Drop unnecessary check for uninitialised package_id
  arch_topology: Set cluster identifier in each core/thread from /cpu-map
  arch_topology: Add support for parsing sockets in /cpu-map
  arch_topology: Warn that topology for nested clusters is not supported

 arch/arm64/kernel/topology.c  |  14 ----
 drivers/acpi/pptt.c           |   3 +-
 drivers/base/arch_topology.c  | 102 ++++++++++++++++++---------
 drivers/base/cacheinfo.c      | 127 ++++++++++++++++++++++------------
 include/linux/arch_topology.h |   1 -
 include/linux/cacheinfo.h     |   3 +
 6 files changed, 159 insertions(+), 91 deletions(-)

--
2.36.1
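
The ID change described in the cover letter can be illustrated with a small
standalone C sketch. This is not kernel code: struct cpu_topo and both helper
names are hypothetical, and only the field names mirror struct cpu_topology.
It assumes a single-socket DT without socket nodes, where the package
identifier falls back to 0:

```c
/* Hypothetical, simplified stand-in for the kernel's struct cpu_topology. */
struct cpu_topo {
	int core_id;
	int cluster_id;
	int package_id;
};

/*
 * Behaviour the cover letter calls wrong: the generated cluster count
 * is stored as the package ID and the cluster ID is never set.
 */
static void assign_ids_before(struct cpu_topo *t, int cluster_count,
			      int core_id)
{
	t->core_id = core_id;
	t->cluster_id = -1;
	t->package_id = cluster_count;
}

/*
 * Behaviour the series aims for: the package ID comes from the socket
 * node (0 when the DT has none) and the cluster ID from the cluster node.
 */
static void assign_ids_after(struct cpu_topo *t, int socket_id,
			     int cluster_id, int core_id)
{
	t->core_id = core_id;
	t->cluster_id = cluster_id;
	t->package_id = socket_id;
}
```

For a CPU in the second cluster of such a system, the first helper yields
package_id 1 with no cluster_id, while the second yields package_id 0 and
cluster_id 1.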


2022-06-21 19:51:01

by Sudeep Holla

Subject: [PATCH v4 10/20] arm64: topology: Remove redundant setting of llc_id in CPU topology

Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and fetch the LLC ID information only for
ACPI systems.

Just drop the redundant parsing and setting of llc_id in CPU topology
from ACPI PPTT.

Cc: Will Deacon <[email protected]>
Cc: Catalin Marinas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
arch/arm64/kernel/topology.c | 14 --------------
1 file changed, 14 deletions(-)

Hi Will/Catalin,

This is part of a series updating topology to get both ACPI and DT view
aligned. I have not cc-ed you assuming you won't be interested. Let me
know if you are. The only part affecting arm64 is this patch, which removes
some unnecessary ACPI code that is now moved to core arch_topology.c.

Please ack if you are happy with this and OK to take this as part of the
series.

Regards,
Sudeep

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 9ab78ad826e2..869ffc4d4484 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -89,8 +89,6 @@ int __init parse_acpi_topology(void)
 		return 0;

 	for_each_possible_cpu(cpu) {
-		int i, cache_id;
-
 		topology_id = find_acpi_cpu_topology(cpu, 0);
 		if (topology_id < 0)
 			return topology_id;
@@ -107,18 +105,6 @@ int __init parse_acpi_topology(void)
 		cpu_topology[cpu].cluster_id = topology_id;
 		topology_id = find_acpi_cpu_topology_package(cpu);
 		cpu_topology[cpu].package_id = topology_id;
-
-		i = acpi_find_last_cache_level(cpu);
-
-		if (i > 0) {
-			/*
-			 * this is the only part of cpu_topology that has
-			 * a direct relationship with the cache topology
-			 */
-			cache_id = find_acpi_cpu_cache_topology(cpu, i);
-			if (cache_id > 0)
-				cpu_topology[cpu].llc_id = cache_id;
-		}
 	}

 	return 0;
--
2.36.1

2022-06-21 19:52:36

by Sudeep Holla

Subject: [PATCH v4 02/20] cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node

The of_cpu_device_node_get takes care of fetching the CPU's device node
either from cached cpu_dev->of_node if cpu_dev is initialised or uses
of_get_cpu_node to parse and fetch node if cpu_dev isn't available yet.

Just use of_cpu_device_node_get instead of getting the cpu device first
and then using cpu_dev->of_node for two reasons:
1. There is no other use of cpu_dev, so the code can be simplified
2. It enables the use of detect_cache_attributes and hence
   cache_setup_of_node much earlier, before the CPUs are registered
   as devices.

Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/cacheinfo.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index dad296229161..b0bde272e2ae 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -14,7 +14,7 @@
 #include <linux/cpu.h>
 #include <linux/device.h>
 #include <linux/init.h>
-#include <linux/of.h>
+#include <linux/of_device.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/smp.h>
@@ -157,7 +157,6 @@ static int cache_setup_of_node(unsigned int cpu)
 {
 	struct device_node *np;
 	struct cacheinfo *this_leaf;
-	struct device *cpu_dev = get_cpu_device(cpu);
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	unsigned int index = 0;

@@ -166,11 +165,7 @@ static int cache_setup_of_node(unsigned int cpu)
 		return 0;
 	}

-	if (!cpu_dev) {
-		pr_err("No cpu device for CPU %d\n", cpu);
-		return -ENODEV;
-	}
-	np = cpu_dev->of_node;
+	np = of_cpu_device_node_get(cpu);
 	if (!np) {
 		pr_err("Failed to find cpu%d device node\n", cpu);
 		return -ENOENT;
--
2.36.1

2022-06-21 19:55:06

by Sudeep Holla

Subject: [PATCH v4 14/20] arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found

There is no point in continuing to check every CPU's physical package
identifier for validity once a CPU which is outside the topology
(i.e. an outlier CPU) is found.

Let us just break out of the loop early in such case.

Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index ef90d9c00d9e..7a569aefe313 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -642,8 +642,10 @@ static int __init parse_dt_topology(void)
 	 * only mark cores described in the DT as possible.
 	 */
 	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id < 0)
+		if (cpu_topology[cpu].package_id < 0) {
 			ret = -EINVAL;
+			break;
+		}

 out_map:
 	of_node_put(map);
--
2.36.1
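
The early exit added by this patch can be sketched in isolation. The code
below is a standalone model, not the kernel function: struct cpu_topo,
check_topology and the visited counter are hypothetical and exist only to
make the effect of the break observable:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct cpu_topology. */
struct cpu_topo {
	int package_id;	/* negative means "not described in /cpu-map" */
};

/*
 * Return 0 if every CPU has a valid package_id, -1 otherwise.
 * *visited reports how many entries were inspected.
 */
static int check_topology(const struct cpu_topo *topo, size_t ncpus,
			  size_t *visited)
{
	int ret = 0;
	size_t cpu;

	*visited = 0;
	for (cpu = 0; cpu < ncpus; cpu++) {
		(*visited)++;
		if (topo[cpu].package_id < 0) {
			ret = -1;
			break;	/* one outlier is enough to reject */
		}
	}

	return ret;
}
```

With an outlier in the second slot of a four-entry array, the function
returns the error after inspecting two entries instead of all four.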

2022-06-21 19:55:44

by Sudeep Holla

Subject: [PATCH v4 11/20] arch_topology: Drop LLC identifier stash from the CPU topology

Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and store the LLC ID information only for
ACPI systems in the CPU topology.

Remove the redundant LLC ID from the generic CPU arch_topology information.

Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 1 -
include/linux/arch_topology.h | 1 -
2 files changed, 2 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index c314c7064397..b63cc52e12ce 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -752,7 +752,6 @@ void __init reset_cpu_topology(void)
 		cpu_topo->core_id = -1;
 		cpu_topo->cluster_id = -1;
 		cpu_topo->package_id = -1;
-		cpu_topo->llc_id = -1;

 		clear_cpu_topology(cpu);
 	}
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 58cbe18d825c..a07b510e7dc5 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -68,7 +68,6 @@ struct cpu_topology {
 	int core_id;
 	int cluster_id;
 	int package_id;
-	int llc_id;
 	cpumask_t thread_sibling;
 	cpumask_t core_sibling;
 	cpumask_t cluster_sibling;
--
2.36.1

2022-06-21 19:57:02

by Sudeep Holla

Subject: [PATCH v4 06/20] cacheinfo: Allow early detection and population of cache attributes

Some architectures/platforms may need to set up cache properties very
early in the boot along with other CPU topology information, so that all
of this information can be used to build the sched_domains used by the
scheduler.

Allow detect_cache_attributes to be called quite early during the boot.

Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/cacheinfo.c | 51 ++++++++++++++++++++++++++-------------
include/linux/cacheinfo.h | 1 +
2 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index fdc1baa342f1..2aa9e8e341b7 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -193,14 +193,8 @@ static int cache_setup_of_node(unsigned int cpu)
 {
 	struct device_node *np;
 	struct cacheinfo *this_leaf;
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	unsigned int index = 0;

-	/* skip if fw_token is already populated */
-	if (this_cpu_ci->info_list->fw_token) {
-		return 0;
-	}
-
 	np = of_cpu_device_node_get(cpu);
 	if (!np) {
 		pr_err("Failed to find cpu%d device node\n", cpu);
@@ -236,6 +230,18 @@ int __weak cache_setup_acpi(unsigned int cpu)

 unsigned int coherency_max_size;

+static int cache_setup_properties(unsigned int cpu)
+{
+	int ret = 0;
+
+	if (of_have_populated_dt())
+		ret = cache_setup_of_node(cpu);
+	else if (!acpi_disabled)
+		ret = cache_setup_acpi(cpu);
+
+	return ret;
+}
+
 static int cache_shared_cpu_map_setup(unsigned int cpu)
 {
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
@@ -246,21 +252,21 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 	if (this_cpu_ci->cpu_map_populated)
 		return 0;

-	if (of_have_populated_dt())
-		ret = cache_setup_of_node(cpu);
-	else if (!acpi_disabled)
-		ret = cache_setup_acpi(cpu);
-
-	if (ret)
-		return ret;
+	/*
+	 * skip setting up cache properties if LLC is valid, just need
+	 * to update the shared cpu_map if the cache attributes were
+	 * populated early before all the cpus are brought online
+	 */
+	if (!last_level_cache_is_valid(cpu)) {
+		ret = cache_setup_properties(cpu);
+		if (ret)
+			return ret;
+	}

 	for (index = 0; index < cache_leaves(cpu); index++) {
 		unsigned int i;

 		this_leaf = per_cpu_cacheinfo_idx(cpu, index);
-		/* skip if shared_cpu_map is already populated */
-		if (!cpumask_empty(&this_leaf->shared_cpu_map))
-			continue;

 		cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
 		for_each_online_cpu(i) {
@@ -330,10 +336,19 @@ int __weak populate_cache_leaves(unsigned int cpu)
 	return -ENOENT;
 }

-static int detect_cache_attributes(unsigned int cpu)
+int detect_cache_attributes(unsigned int cpu)
 {
 	int ret;

+	/* Since early detection of the cacheinfo is allowed via this
+	 * function and this also gets called as CPU hotplug callbacks via
+	 * cacheinfo_cpu_online, the initialisation can be skipped and only
+	 * CPU maps can be updated as the CPU online status would be update
+	 * if called via cacheinfo_cpu_online path.
+	 */
+	if (per_cpu_cacheinfo(cpu))
+		goto update_cpu_map;
+
 	if (init_cache_level(cpu) || !cache_leaves(cpu))
 		return -ENOENT;

@@ -349,6 +364,8 @@ static int detect_cache_attributes(unsigned int cpu)
 	ret = populate_cache_leaves(cpu);
 	if (ret)
 		goto free_ci;
+
+update_cpu_map:
 	/*
 	 * For systems using DT for cache hierarchy, fw_token
 	 * and shared_cpu_map will be set up here only if they are
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 7e429bc5c1a4..00b7a6ae8617 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -84,6 +84,7 @@ int populate_cache_leaves(unsigned int cpu);
 int cache_setup_acpi(unsigned int cpu);
 bool last_level_cache_is_valid(unsigned int cpu);
 bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
+int detect_cache_attributes(unsigned int cpu);
 #ifndef CONFIG_ACPI_PPTT
 /*
  * acpi_find_last_cache_level is only called on ACPI enabled
--
2.36.1
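
The control flow this patch gives detect_cache_attributes (initialise once,
then only refresh the CPU maps on later calls) can be modelled outside the
kernel. The sketch below is a simplified stand-in, not the kernel
implementation: struct cpu_ci, detect_attrs and the two counters are
hypothetical, and the allocated flag plays the role of a non-NULL
per_cpu_cacheinfo(cpu):

```c
#include <stdbool.h>

#define NR_CPUS 4	/* arbitrary size for the model */

/* Hypothetical per-CPU state; the counters only make the flow visible. */
struct cpu_ci {
	bool allocated;			/* cache info already set up? */
	unsigned int init_calls;	/* times the full setup ran */
	unsigned int map_updates;	/* times the map step was reached */
};

static struct cpu_ci ci[NR_CPUS];

static int detect_attrs(unsigned int cpu)
{
	if (cpu >= NR_CPUS)
		return -1;

	/* Full initialisation only on the first (possibly early) call. */
	if (!ci[cpu].allocated) {
		ci[cpu].allocated = true;
		ci[cpu].init_calls++;
	}

	/* The shared-map step is reached on every call, as with the goto. */
	ci[cpu].map_updates++;
	return 0;
}
```

Calling detect_attrs twice for the same CPU, once early and once from a
hotplug-style path, runs the setup once and reaches the map step twice.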

2022-06-21 19:57:09

by Sudeep Holla

Subject: [PATCH v4 07/20] cacheinfo: Use cache identifiers to check if the caches are shared if available

The cache identifier is an optional property on most platforms. Its
presence must be indicated by the CACHE_ID valid bit in the attributes.

We can use the cache identifiers provided by the firmware to check if any
two cpus share the same cache instead of relying on the fw_token generated
and set in the OS.

Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/cacheinfo.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 2aa9e8e341b7..167abfa6f37d 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -44,6 +44,10 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 	if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
 		return !(this_leaf->level == 1);

+	if ((sib_leaf->attributes & CACHE_ID) &&
+	    (this_leaf->attributes & CACHE_ID))
+		return sib_leaf->id == this_leaf->id;
+
 	return sib_leaf->fw_token == this_leaf->fw_token;
 }

@@ -56,7 +60,8 @@ bool last_level_cache_is_valid(unsigned int cpu)

 	llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);

-	return !!llc->fw_token;
+	return (llc->attributes & CACHE_ID) || !!llc->fw_token;
+
 }

 bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
--
2.36.1
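
The comparison rule introduced here (prefer firmware-provided cache IDs when
both leaves carry one, otherwise fall back to fw_token) can be tried out in
a standalone sketch. struct leaf, the two helpers and the CACHE_ID_VALID
flag value are hypothetical stand-ins for struct cacheinfo and CACHE_ID:

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value; stands in for the kernel's CACHE_ID bit. */
#define CACHE_ID_VALID (1u << 4)

/* Hypothetical, reduced version of struct cacheinfo. */
struct leaf {
	unsigned int attributes;
	unsigned int id;
	const void *fw_token;
};

static bool leaves_are_shared(const struct leaf *a, const struct leaf *b)
{
	/* Both leaves have a firmware-provided ID: compare IDs. */
	if ((a->attributes & CACHE_ID_VALID) &&
	    (b->attributes & CACHE_ID_VALID))
		return a->id == b->id;

	/* Otherwise fall back to the OS-generated token. */
	return a->fw_token == b->fw_token;
}

static bool llc_is_valid(const struct leaf *llc)
{
	return (llc->attributes & CACHE_ID_VALID) || llc->fw_token != NULL;
}
```

Two leaves with the same ID compare as shared even if their tokens differ,
and a leaf with neither an ID nor a token is reported as invalid.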

2022-06-21 19:57:40

by Sudeep Holla

Subject: [PATCH v4 16/20] arch_topology: Drop unnecessary check for uninitialised package_id

With the support of socket node parsing from the device tree and
assigning 0 as package_id in absence of socket nodes, there is no need
to check for invalid package_id. It is always initialised to 0 or values
from the device tree socket nodes.

Just drop that now redundant check for uninitialised package_id.

Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 11 -----------
1 file changed, 11 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 46fa1b70b02b..42448a5a9412 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -611,7 +611,6 @@ static int __init parse_dt_topology(void)
 {
 	struct device_node *cn, *map;
 	int ret = 0;
-	int cpu;

 	cn = of_find_node_by_path("/cpus");
 	if (!cn) {
@@ -633,16 +632,6 @@ static int __init parse_dt_topology(void)

 	topology_normalize_cpu_scale();

-	/*
-	 * Check that all cores are in the topology; the SMP code will
-	 * only mark cores described in the DT as possible.
-	 */
-	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].package_id < 0) {
-			ret = -EINVAL;
-			break;
-		}
-
 out_map:
 	of_node_put(map);
 out:
--
2.36.1

2022-06-23 16:18:01

by Catalin Marinas

Subject: Re: [PATCH v4 10/20] arm64: topology: Remove redundant setting of llc_id in CPU topology

On Tue, Jun 21, 2022 at 08:20:24PM +0100, Sudeep Holla wrote:
> Since the cacheinfo LLC information is used directly in arch_topology,
> there is no need to parse and fetch the LLC ID information only for
> ACPI systems.
>
> Just drop the redundant parsing and setting of llc_id in CPU topology
> from ACPI PPTT.
>
> Cc: Will Deacon <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Reviewed-by: Gavin Shan <[email protected]>
> Signed-off-by: Sudeep Holla <[email protected]>

Acked-by: Catalin Marinas <[email protected]>

> Hi Will/Catalin,
>
> This is part of a series updating topology to get both ACPI and DT view
> aligned. I have not cc-ed you assuming you won't be interested. Let me
> know if you are. The parts affecting arm64 is just this patch removing
> some unnecessary ACPI code that is now moved to core arch_topology.c
>
> Please ack if you are happy with this and OK to take this as part of the
> series.

Yeah, that's fine, keep it with the rest of the series.

--
Catalin

2022-06-27 13:56:15

by Ionela Voinescu

Subject: Re: [PATCH v4 16/20] arch_topology: Drop unnecessary check for uninitialised package_id

On Tuesday 21 Jun 2022 at 20:20:30 (+0100), Sudeep Holla wrote:
> With the support of socket node parsing from the device tree and
> assigning 0 as package_id in absence of socket nodes, there is no need
> to check for invalid package_id. It is always initialised to 0 or values
> from the device tree socket nodes.
>
> Just drop that now redundant check for uninitialised package_id.
>
> Signed-off-by: Sudeep Holla <[email protected]>
> ---
> drivers/base/arch_topology.c | 11 -----------
> 1 file changed, 11 deletions(-)
>
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 46fa1b70b02b..42448a5a9412 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -611,7 +611,6 @@ static int __init parse_dt_topology(void)
> {
> struct device_node *cn, *map;
> int ret = 0;
> - int cpu;
>
> cn = of_find_node_by_path("/cpus");
> if (!cn) {
> @@ -633,16 +632,6 @@ static int __init parse_dt_topology(void)
>
> topology_normalize_cpu_scale();
>
> - /*
> - * Check that all cores are in the topology; the SMP code will
> - * only mark cores described in the DT as possible.
> - */
> - for_each_possible_cpu(cpu)
> - if (cpu_topology[cpu].package_id < 0) {
> - ret = -EINVAL;
> - break;
> - }
> -

Maybe it would still be good to keep this for systems with potential
errors in DT, where one forgets to add a core in cpu-map.

For example, if I modify juno.dts as follows:

--- a/arch/arm64/boot/dts/arm/juno.dts
+++ b/arch/arm64/boot/dts/arm/juno.dts
@@ -51,12 +51,9 @@ core0 {
 					cpu = <&A53_0>;
 				};
 				core1 {
-					cpu = <&A53_1>;
-				};
-				core2 {
 					cpu = <&A53_2>;
 				};
-				core3 {
+				core2 {
 					cpu = <&A53_3>;
 				};
 			};

and miss a little core in cluster1, I would end up with an incomplete
topology: core3 would have cluster_id as -1, while all other CPUs would
have a valid value; also, core3 would have package_id = -1.

Thanks,
Ionela.
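
The failure mode in the example above can be reproduced with a small
standalone model. Everything here is hypothetical (struct names, helper
names and the logical CPU numbering); it only assumes, as in the thread,
that IDs start at -1 and are filled in for CPUs listed in cpu-map, with
package_id 0 when there are no socket nodes:

```c
#define NR_CPUS 6	/* Juno: two big and four little cores */

struct cpu_topo {
	int cluster_id;
	int package_id;
};

/* One core entry from a (hypothetical) parsed cpu-map. */
struct map_entry {
	int cpu;
	int cluster;
};

static void reset_topology(struct cpu_topo *topo)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		topo[cpu].cluster_id = -1;
		topo[cpu].package_id = -1;
	}
}

static void apply_cpu_map(struct cpu_topo *topo,
			  const struct map_entry *map, int n)
{
	for (int i = 0; i < n; i++) {
		topo[map[i].cpu].cluster_id = map[i].cluster;
		topo[map[i].cpu].package_id = 0;	/* no socket nodes */
	}
}

/* Mirrors the retained check: first CPU still lacking a package_id. */
static int first_unmapped_cpu(const struct cpu_topo *topo)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (topo[cpu].package_id < 0)
			return cpu;

	return -1;
}
```

If the map omits one little core, that CPU keeps both IDs at -1 and only a
check of this kind notices it.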

> out_map:
> of_node_put(map);
> out:
> --
> 2.36.1
>

2022-06-27 14:15:37

by Ionela Voinescu

Subject: Re: [PATCH v4 00/20] arch_topology: Updates to add socket support and fix cluster ids

Hi Sudeep,

On Tuesday 21 Jun 2022 at 20:20:14 (+0100), Sudeep Holla wrote:
> Hi All,
>
> This version updates cacheinfo to populate and use the information from
> there for all the cache topology.
>
> This series intends to fix some discrepancies we have in the CPU topology
> parsing from the device tree /cpu-map node. Also this diverges from the
> behaviour on a ACPI enabled platform. The expectation is that both DT
> and ACPI enabled systems must present consistent view of the CPU topology.
>
> Currently we assign generated cluster count as the physical package identifier
> for each CPU which is wrong. The device tree bindings for CPU topology supports
> sockets to infer the socket or physical package identifier for a given CPU.
> Also we don't check if all the cores/threads belong to the same cluster before
> updating their sibling masks which is fine as we don't set the cluster id yet.
>
> These changes also assigns the cluster identifier as parsed from the device tree
> cluster nodes within /cpu-map without support for nesting of the clusters.
> Finally, it also add support for socket nodes in /cpu-map. With this the
> parsing of exact same information from ACPI PPTT and /cpu-map DT node
> aligns well.
>
> The only exception is that the last level cache id information can be
> inferred from the same ACPI PPTT while we need to parse CPU cache nodes
> in the device tree.
>
> Hi Greg,
>
> I had not cc-ed you on earlier 3 versions as we had some disagreement
> amongst Arm developers which we have not settled. Let me know how you want to

s/not/now :)

> merge this once you agree with the changes. I can set pull request if
> you prefer. Let me know.
>
> v4[3]->v4:
> - Updated ACPI PPTT fw_token to use table offset instead of virtual
> address as it could get changed for everytime it is mapped before
> the global acpi_permanent_mmap is set
> - Added warning for the topology with nested clusters
> - Added update to cpu_clustergroup_mask so that introduction of
> correct cluster_id doesn't break existing platforms by limiting
> the span of clustergroup_mask(by Ionela)
>

I've tested v4 on quite a few platforms:
- DT: Juno R0, DB845c, RB5
- ACPI: TX2, Ampere Altra, Kunpeng920

and it all looks good from my point of view (topology and sched domain
hierarchy).

So for the full set (after the changes requested for 16/20 and 20/20):

Tested-by: Ionela Voinescu <[email protected]>

Hope it helps,
Ionela.

2022-06-27 16:52:02

by Sudeep Holla

Subject: Re: [PATCH v4 00/20] arch_topology: Updates to add socket support and fix cluster ids

On Mon, Jun 27, 2022 at 02:54:28PM +0100, Ionela Voinescu wrote:
> Hi Sudeep,
>
> On Tuesday 21 Jun 2022 at 20:20:14 (+0100), Sudeep Holla wrote:
> > Hi All,
> >
> > This version updates cacheinfo to populate and use the information from
> > there for all the cache topology.
> >
> > This series intends to fix some discrepancies we have in the CPU topology
> > parsing from the device tree /cpu-map node. Also this diverges from the
> > behaviour on a ACPI enabled platform. The expectation is that both DT
> > and ACPI enabled systems must present consistent view of the CPU topology.
> >
> > Currently we assign generated cluster count as the physical package identifier
> > for each CPU which is wrong. The device tree bindings for CPU topology supports
> > sockets to infer the socket or physical package identifier for a given CPU.
> > Also we don't check if all the cores/threads belong to the same cluster before
> > updating their sibling masks which is fine as we don't set the cluster id yet.
> >
> > These changes also assigns the cluster identifier as parsed from the device tree
> > cluster nodes within /cpu-map without support for nesting of the clusters.
> > Finally, it also add support for socket nodes in /cpu-map. With this the
> > parsing of exact same information from ACPI PPTT and /cpu-map DT node
> > aligns well.
> >
> > The only exception is that the last level cache id information can be
> > inferred from the same ACPI PPTT while we need to parse CPU cache nodes
> > in the device tree.
> >
> > Hi Greg,
> >
> > I had not cc-ed you on earlier 3 versions as we had some disagreement
> > amongst Arm developers which we have not settled. Let me know how you want to
>
> s/not/now :)
>
> > merge this once you agree with the changes. I can set pull request if
> > you prefer. Let me know.
> >
> > v4[3]->v4:
> > - Updated ACPI PPTT fw_token to use table offset instead of virtual
> > address as it could get changed for everytime it is mapped before
> > the global acpi_permanent_mmap is set
> > - Added warning for the topology with nested clusters
> > - Added update to cpu_clustergroup_mask so that introduction of
> > correct cluster_id doesn't break existing platforms by limiting
> > the span of clustergroup_mask(by Ionela)
> >
>
> I've tested v4 on quite a few platforms:
> - DT: Juno R0, DB845c, RB5
> - ACPI: TX2, Ampere Altra, Kunpeng920
>
> and it all looks good from my point of view (topology and sched domain
> hierarchy).
>
> So for the full set (after the changes requested for 16/20 and 20/20):
>
> Tested-by: Ionela Voinescu <[email protected]>
>

Thanks for all the review and testing. Much appreciated!

--
Regards,
Sudeep

2022-06-27 17:28:04

by Sudeep Holla

Subject: Re: [PATCH v4 16/20] arch_topology: Drop unnecessary check for uninitialised package_id

On Mon, Jun 27, 2022 at 02:12:52PM +0100, Ionela Voinescu wrote:
> On Tuesday 21 Jun 2022 at 20:20:30 (+0100), Sudeep Holla wrote:
> > With the support of socket node parsing from the device tree and
> > assigning 0 as package_id in absence of socket nodes, there is no need
> > to check for invalid package_id. It is always initialised to 0 or values
> > from the device tree socket nodes.
> >
> > Just drop that now redundant check for uninitialised package_id.
> >
> > Signed-off-by: Sudeep Holla <[email protected]>
> > ---
> > drivers/base/arch_topology.c | 11 -----------
> > 1 file changed, 11 deletions(-)
> >
> > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> > index 46fa1b70b02b..42448a5a9412 100644
> > --- a/drivers/base/arch_topology.c
> > +++ b/drivers/base/arch_topology.c
> > @@ -611,7 +611,6 @@ static int __init parse_dt_topology(void)
> > {
> > struct device_node *cn, *map;
> > int ret = 0;
> > - int cpu;
> >
> > cn = of_find_node_by_path("/cpus");
> > if (!cn) {
> > @@ -633,16 +632,6 @@ static int __init parse_dt_topology(void)
> >
> > topology_normalize_cpu_scale();
> >
> > - /*
> > - * Check that all cores are in the topology; the SMP code will
> > - * only mark cores described in the DT as possible.
> > - */
> > - for_each_possible_cpu(cpu)
> > - if (cpu_topology[cpu].package_id < 0) {
> > - ret = -EINVAL;
> > - break;
> > - }
> > -
>
> Maybe it would still be good to keep this for systems with potential
> errors in DT, where one forgets to add a core in cpu-map.
>

Though I would ideally prefer to catch such DT issues with schema, I know
we are not there yet. So I agree to retain this.

> For example, if I modify juno.dts as follows:
>
> --- a/arch/arm64/boot/dts/arm/juno.dts
> +++ b/arch/arm64/boot/dts/arm/juno.dts
> @@ -51,12 +51,9 @@ core0 {
> cpu = <&A53_0>;
> };
> core1 {
> - cpu = <&A53_1>;
> - };
> - core2 {
> cpu = <&A53_2>;
> };
> - core3 {
> + core2 {
> cpu = <&A53_3>;
> };
> };
>
> and miss a little core in cluster1, I would end up with an incomplete
> topology: core3 would have cluster_id as -1, while all other CPUs would
> have a valid value; also, core3 would have package_id = -1.
>

Indeed. I didn't consider the case where DT would have issues when I dropped
this check. I will drop this patch. Thanks again.

--
Regards,
Sudeep