Hi All,
This version updates cacheinfo to populate the cache topology information
and to use it for all the cache topology queries.
This series intends to fix some discrepancies we have in the CPU topology
parsing from the device tree /cpu-map node, where the behaviour also
diverges from that of ACPI enabled platforms. The expectation is that both
DT and ACPI enabled systems must present a consistent view of the CPU
topology.
Currently we assign the generated cluster count as the physical package
identifier for each CPU, which is wrong. The device tree bindings for CPU
topology support sockets to infer the socket or physical package identifier
for a given CPU. Also, we don't check if all the cores/threads belong to the
same cluster before updating their sibling masks, which is fine as we don't
set the cluster id yet.
These changes also assign the cluster identifier as parsed from the device
tree cluster nodes within /cpu-map, without support for nesting of clusters.
Finally, the series adds support for socket nodes in /cpu-map. With this,
the parsing of the exact same information from the ACPI PPTT and the
/cpu-map DT node aligns well.
The only exception is the last level cache ID information, which can be
inferred from the same ACPI PPTT, while we need to parse the CPU cache
nodes in the device tree.
Hi Greg,
I had not cc-ed you on the first 3 versions as we had some disagreement
amongst Arm developers, which we have now (since v4) settled. Let me know
how you want to merge this once you agree with the changes. I can send a
pull request if you prefer. Let me know.
v4[4]->v5:
- Added all the tags received so far. Rafael has acked only the ACPI
  change and Catalin only the arm64 change.
- Addressed all the typos pointed out by Ionela, dropped the patch
  removing the checks for invalid package id as discussed, and updated
  the depth in the nested cluster warning check.
v3[3]->v4[4]:
- Updated the ACPI PPTT fw_token to use the table offset instead of the
  virtual address, as the latter could change every time the table is
  mapped before the global acpi_permanent_mmap is set
- Added warning for the topology with nested clusters
- Updated cpu_clustergroup_mask so that the introduction of a correct
  cluster_id doesn't break existing platforms, by limiting the span of
  the clustergroup_mask (by Ionela)
v2[2]->v3[3]:
- Dropped support to get the device node for the CPU's LLC
- Updated cacheinfo to support calling detect_cache_attributes early,
  in the smp_prepare_cpus stage
- Added support to check if LLC is valid and shared in the cacheinfo
- Used the same in arch_topology
v1[1]->v2[2]:
- Updated the ID validity check to include all non-negative values
- Added support to get the device node for the CPU's last level cache
- Added support to build llc_sibling on DT platforms
[1] https://lore.kernel.org/lkml/[email protected]
[2] https://lore.kernel.org/lkml/[email protected]
[3] https://lore.kernel.org/lkml/[email protected]
[4] https://lore.kernel.org/lkml/[email protected]
Ionela Voinescu (1):
arch_topology: Limit span of cpu_clustergroup_mask()
Sudeep Holla (18):
ACPI: PPTT: Use table offset as fw_token instead of virtual address
cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
cacheinfo: Add helper to access any cache index for a given CPU
cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
cacheinfo: Add support to check if last level cache(LLC) is valid or shared
cacheinfo: Allow early detection and population of cache attributes
cacheinfo: Use cache identifiers to check if the caches are shared if available
arch_topology: Add support to parse and detect cache attributes
arch_topology: Use the last level cache information from the cacheinfo
arm64: topology: Remove redundant setting of llc_id in CPU topology
arch_topology: Drop LLC identifier stash from the CPU topology
arch_topology: Set thread sibling cpumask only within the cluster
arch_topology: Check for non-negative value rather than -1 for IDs validity
arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
arch_topology: Don't set cluster identifier as physical package identifier
arch_topology: Set cluster identifier in each core/thread from /cpu-map
arch_topology: Add support for parsing sockets in /cpu-map
arch_topology: Warn that topology for nested clusters is not supported
arch/arm64/kernel/topology.c | 14 ----
drivers/acpi/pptt.c | 3 +-
drivers/base/arch_topology.c | 97 +++++++++++++++++++-------
drivers/base/cacheinfo.c | 127 ++++++++++++++++++++++------------
include/linux/arch_topology.h | 1 -
include/linux/cacheinfo.h | 3 +
6 files changed, 162 insertions(+), 83 deletions(-)
--
2.36.1
The cacheinfo for a given CPU at a given index is used in quite a few
places by fetching the base pointer for index 0 using the helper
per_cpu_cacheinfo(cpu) and offsetting it by the required index.
Instead, add another helper to fetch the required pointer directly and
use it to simplify and improve readability.
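To illustrate (a restatement of the hunk below, not additional code), the
helper folds the base-plus-index pattern into a single expression:

	/* before: fetch the base pointer, then offset manually */
	this_leaf = per_cpu_cacheinfo(cpu) + index;
	/* after: one helper expresses the same access */
	this_leaf = per_cpu_cacheinfo_idx(cpu, index);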
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/cacheinfo.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index b0bde272e2ae..e13ef41763e4 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -25,6 +25,8 @@ static DEFINE_PER_CPU(struct cpu_cacheinfo, ci_cpu_cacheinfo);
#define ci_cacheinfo(cpu) (&per_cpu(ci_cpu_cacheinfo, cpu))
#define cache_leaves(cpu) (ci_cacheinfo(cpu)->num_leaves)
#define per_cpu_cacheinfo(cpu) (ci_cacheinfo(cpu)->info_list)
+#define per_cpu_cacheinfo_idx(cpu, idx) \
+ (per_cpu_cacheinfo(cpu) + (idx))
struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
{
@@ -172,7 +174,7 @@ static int cache_setup_of_node(unsigned int cpu)
}
while (index < cache_leaves(cpu)) {
- this_leaf = this_cpu_ci->info_list + index;
+ this_leaf = per_cpu_cacheinfo_idx(cpu, index);
if (this_leaf->level != 1)
np = of_find_next_cache_node(np);
else
@@ -231,7 +233,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
for (index = 0; index < cache_leaves(cpu); index++) {
unsigned int i;
- this_leaf = this_cpu_ci->info_list + index;
+ this_leaf = per_cpu_cacheinfo_idx(cpu, index);
/* skip if shared_cpu_map is already populated */
if (!cpumask_empty(&this_leaf->shared_cpu_map))
continue;
@@ -242,7 +244,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
if (i == cpu || !sib_cpu_ci->info_list)
continue;/* skip if itself or no cacheinfo */
- sib_leaf = sib_cpu_ci->info_list + index;
+ sib_leaf = per_cpu_cacheinfo_idx(i, index);
if (cache_leaves_are_shared(this_leaf, sib_leaf)) {
cpumask_set_cpu(cpu, &sib_leaf->shared_cpu_map);
cpumask_set_cpu(i, &this_leaf->shared_cpu_map);
@@ -258,12 +260,11 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
static void cache_shared_cpu_map_remove(unsigned int cpu)
{
- struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
struct cacheinfo *this_leaf, *sib_leaf;
unsigned int sibling, index;
for (index = 0; index < cache_leaves(cpu); index++) {
- this_leaf = this_cpu_ci->info_list + index;
+ this_leaf = per_cpu_cacheinfo_idx(cpu, index);
for_each_cpu(sibling, &this_leaf->shared_cpu_map) {
struct cpu_cacheinfo *sib_cpu_ci;
@@ -274,7 +275,7 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
if (!sib_cpu_ci->info_list)
continue;
- sib_leaf = sib_cpu_ci->info_list + index;
+ sib_leaf = per_cpu_cacheinfo_idx(sibling, index);
cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
}
@@ -609,7 +610,6 @@ static int cache_add_dev(unsigned int cpu)
int rc;
struct device *ci_dev, *parent;
struct cacheinfo *this_leaf;
- struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
const struct attribute_group **cache_groups;
rc = cpu_cache_sysfs_init(cpu);
@@ -618,7 +618,7 @@ static int cache_add_dev(unsigned int cpu)
parent = per_cpu_cache_dev(cpu);
for (i = 0; i < cache_leaves(cpu); i++) {
- this_leaf = this_cpu_ci->info_list + i;
+ this_leaf = per_cpu_cacheinfo_idx(cpu, i);
if (this_leaf->disable_sysfs)
continue;
if (this_leaf->type == CACHE_TYPE_NOCACHE)
--
2.36.1
It is useful to have a helper to check if two given CPUs share the last
level cache. We can do that check by comparing the fw_token or by comparing
the cache ID. Currently we check just the fw_token as the cache ID is
optional.
This helper can be used to build the llc_sibling mask during arch specific
topology parsing and to feed the information to the sched_domains. It also
helps to get rid of llc_id in the CPU topology, as that is duplicate
information.
Also add a helper to check if the LLC information in cacheinfo is valid
or not.
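As a rough usage sketch (mirroring the arch_topology patch later in this
series), the sibling mask update could consume the helper as follows:

	/* sketch: build llc_sibling for 'cpuid' against each online CPU */
	for_each_online_cpu(cpu) {
		if (last_level_cache_is_shared(cpu, cpuid)) {
			cpumask_set_cpu(cpu, &cpu_topology[cpuid].llc_sibling);
			cpumask_set_cpu(cpuid, &cpu_topology[cpu].llc_sibling);
		}
	}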
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/cacheinfo.c | 26 ++++++++++++++++++++++++++
include/linux/cacheinfo.h | 2 ++
2 files changed, 28 insertions(+)
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 2cea9201f31c..fdc1baa342f1 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -47,6 +47,32 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
return sib_leaf->fw_token == this_leaf->fw_token;
}
+bool last_level_cache_is_valid(unsigned int cpu)
+{
+ struct cacheinfo *llc;
+
+ if (!cache_leaves(cpu))
+ return false;
+
+ llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
+
+ return !!llc->fw_token;
+}
+
+bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
+{
+ struct cacheinfo *llc_x, *llc_y;
+
+ if (!last_level_cache_is_valid(cpu_x) ||
+ !last_level_cache_is_valid(cpu_y))
+ return false;
+
+ llc_x = per_cpu_cacheinfo_idx(cpu_x, cache_leaves(cpu_x) - 1);
+ llc_y = per_cpu_cacheinfo_idx(cpu_y, cache_leaves(cpu_y) - 1);
+
+ return cache_leaves_are_shared(llc_x, llc_y);
+}
+
#ifdef CONFIG_OF
/* OF properties to query for a given cache type */
struct cache_type_info {
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 4ff37cb763ae..7e429bc5c1a4 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -82,6 +82,8 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu);
int init_cache_level(unsigned int cpu);
int populate_cache_leaves(unsigned int cpu);
int cache_setup_acpi(unsigned int cpu);
+bool last_level_cache_is_valid(unsigned int cpu);
+bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
#ifndef CONFIG_ACPI_PPTT
/*
* acpi_find_last_cache_level is only called on ACPI enabled
--
2.36.1
Cache identifiers are an optional property on most platforms. The presence
of one must be indicated by the CACHE_ID valid bit in the attributes.
We can use the cache identifiers provided by the firmware to check if any
two CPUs share the same cache, instead of relying on the fw_token generated
and set in the OS.
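The resulting precedence, sketched here for clarity (it matches the hunk
below), is to prefer the firmware-provided IDs and fall back to the
fw_token only when either leaf lacks a valid CACHE_ID:

	/* sketch of cache_leaves_are_shared() after this change */
	if ((sib_leaf->attributes & CACHE_ID) &&
	    (this_leaf->attributes & CACHE_ID))
		return sib_leaf->id == this_leaf->id;	/* firmware cache IDs */

	return sib_leaf->fw_token == this_leaf->fw_token; /* OS-generated token */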
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ionela Voinescu <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/cacheinfo.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 2aa9e8e341b7..167abfa6f37d 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -44,6 +44,10 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
return !(this_leaf->level == 1);
+ if ((sib_leaf->attributes & CACHE_ID) &&
+ (this_leaf->attributes & CACHE_ID))
+ return sib_leaf->id == this_leaf->id;
+
return sib_leaf->fw_token == this_leaf->fw_token;
}
@@ -56,7 +60,8 @@ bool last_level_cache_is_valid(unsigned int cpu)
llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
- return !!llc->fw_token;
+ return (llc->attributes & CACHE_ID) || !!llc->fw_token;
+
}
bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
--
2.36.1
Currently ACPI populates just the minimum information about the last
level cache from the PPTT in order to feed the same to build sched_domains.
Similar support for DT platforms is not present.
In order to enable the same, the entire cache hierarchy information can
be built as part of CPU topology parsing, both on ACPI and DT platforms.
Note that this change builds the cacheinfo early even on ACPI systems,
but the current mechanism of building the llc_sibling mask remains
unchanged.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 579c851a2bd7..23cb52180ce3 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -7,6 +7,7 @@
*/
#include <linux/acpi.h>
+#include <linux/cacheinfo.h>
#include <linux/cpu.h>
#include <linux/cpufreq.h>
#include <linux/device.h>
@@ -780,15 +781,23 @@ __weak int __init parse_acpi_topology(void)
#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
void __init init_cpu_topology(void)
{
+ int ret, cpu;
+
reset_cpu_topology();
+ ret = parse_acpi_topology();
+ if (!ret)
+ ret = of_have_populated_dt() && parse_dt_topology();
- /*
- * Discard anything that was parsed if we hit an error so we
- * don't use partial information.
- */
- if (parse_acpi_topology())
- reset_cpu_topology();
- else if (of_have_populated_dt() && parse_dt_topology())
+ if (ret) {
+ /*
+ * Discard anything that was parsed if we hit an error so we
+ * don't use partial information.
+ */
reset_cpu_topology();
+ return;
+ }
+
+ for_each_possible_cpu(cpu)
+ detect_cache_attributes(cpu);
}
#endif
--
2.36.1
Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and store the LLC ID information in the CPU
topology, which was done only for ACPI systems.
Remove the redundant LLC ID from the generic CPU arch_topology
information.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 1 -
include/linux/arch_topology.h | 1 -
2 files changed, 2 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index c314c7064397..b63cc52e12ce 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -752,7 +752,6 @@ void __init reset_cpu_topology(void)
cpu_topo->core_id = -1;
cpu_topo->cluster_id = -1;
cpu_topo->package_id = -1;
- cpu_topo->llc_id = -1;
clear_cpu_topology(cpu);
}
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 58cbe18d825c..a07b510e7dc5 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -68,7 +68,6 @@ struct cpu_topology {
int core_id;
int cluster_id;
int package_id;
- int llc_id;
cpumask_t thread_sibling;
cpumask_t core_sibling;
cpumask_t cluster_sibling;
--
2.36.1
Currently, as we parse the CPU topology from the /cpu-map node in the
device tree, we assign the generated cluster count as the physical package
identifier for each CPU, which is wrong.
The device tree bindings for CPU topology support sockets to infer the
socket or physical package identifier for a given CPU. Since this is
fairly new and not supported on most of the old and existing systems, we
can assume all such systems have a single socket/physical package.
Fix the physical package identifier to 0 by removing the assignment of
the cluster identifier to it.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Ionela Voinescu <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 7a569aefe313..46fa1b70b02b 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -549,7 +549,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
bool leaf = true;
bool has_cores = false;
struct device_node *c;
- static int package_id __initdata;
int core_id = 0;
int i, ret;
@@ -588,7 +587,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
}
if (leaf) {
- ret = parse_core(c, package_id, core_id++);
+ ret = parse_core(c, 0, core_id++);
} else {
pr_err("%pOF: Non-leaf cluster with core %s\n",
cluster, name);
@@ -605,9 +604,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
if (leaf && !has_cores)
pr_warn("%pOF: empty cluster\n", cluster);
- if (leaf)
- package_id++;
-
return 0;
}
--
2.36.1
Finally, let us add support for socket nodes in /cpu-map in the device
tree. Since these may be absent on all the old platforms and even on most
of the existing ones, we need to assume that the absence of a socket node
indicates a single socket system and handle it appropriately.
Also, it is likely that most single socket systems skip adding the socket
node since it is optional.
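For reference, a hypothetical two-socket /cpu-map fragment following the
bindings would look roughly as below (node names and phandles are made up
for illustration):

	cpu-map {
		socket0 {
			cluster0 {
				core0 { cpu = <&cpu0>; };
				core1 { cpu = <&cpu1>; };
			};
		};
		socket1 {
			cluster0 {
				core0 { cpu = <&cpu2>; };
				core1 { cpu = <&cpu3>; };
			};
		};
	};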
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Ionela Voinescu <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 37 +++++++++++++++++++++++++++++++-----
1 file changed, 32 insertions(+), 5 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 80184c91c919..7cbe21b1b295 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -545,8 +545,8 @@ static int __init parse_core(struct device_node *core, int package_id,
return 0;
}
-static int __init
-parse_cluster(struct device_node *cluster, int cluster_id, int depth)
+static int __init parse_cluster(struct device_node *cluster, int package_id,
+ int cluster_id, int depth)
{
char name[20];
bool leaf = true;
@@ -566,7 +566,7 @@ parse_cluster(struct device_node *cluster, int cluster_id, int depth)
c = of_get_child_by_name(cluster, name);
if (c) {
leaf = false;
- ret = parse_cluster(c, i, depth + 1);
+ ret = parse_cluster(c, package_id, i, depth + 1);
of_node_put(c);
if (ret != 0)
return ret;
@@ -590,7 +590,8 @@ parse_cluster(struct device_node *cluster, int cluster_id, int depth)
}
if (leaf) {
- ret = parse_core(c, 0, cluster_id, core_id++);
+ ret = parse_core(c, package_id, cluster_id,
+ core_id++);
} else {
pr_err("%pOF: Non-leaf cluster with core %s\n",
cluster, name);
@@ -610,6 +611,32 @@ parse_cluster(struct device_node *cluster, int cluster_id, int depth)
return 0;
}
+static int __init parse_socket(struct device_node *socket)
+{
+ char name[20];
+ struct device_node *c;
+ bool has_socket = false;
+ int package_id = 0, ret;
+
+ do {
+ snprintf(name, sizeof(name), "socket%d", package_id);
+ c = of_get_child_by_name(socket, name);
+ if (c) {
+ has_socket = true;
+ ret = parse_cluster(c, package_id, -1, 0);
+ of_node_put(c);
+ if (ret != 0)
+ return ret;
+ }
+ package_id++;
+ } while (c);
+
+ if (!has_socket)
+ ret = parse_cluster(socket, 0, -1, 0);
+
+ return ret;
+}
+
static int __init parse_dt_topology(void)
{
struct device_node *cn, *map;
@@ -630,7 +657,7 @@ static int __init parse_dt_topology(void)
if (!map)
goto out;
- ret = parse_cluster(map, -1, 0);
+ ret = parse_socket(map);
if (ret != 0)
goto out_map;
--
2.36.1
There is a need to use the cache sharing information quite early during
boot, before the secondary cores are up and running. The permanent memory
map for all the ACPI tables (via acpi_permanent_mmap) is turned on in
acpi_early_init(), which is quite late for the above requirement. As a
result, there is a possibility that the ACPI PPTT gets mapped to different
virtual addresses. In such scenarios, using the virtual address as the
fw_token before acpi_permanent_mmap is enabled results in different
fw_token values for the same cache entity, and hence wrong cache sharing
information will be deduced from them.
Instead of using the virtual address, just use the table offset as the
unique firmware token for the caches. The same offset is used as the ACPI
identifier if the firmware has not set a valid one for other entries in
the ACPI PPTT.
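Put differently (a sketch of the one-line change below, not additional
code): the offset of the processor node within the table is stable across
remappings, unlike its virtual address:

	/* before: virtual address, may differ for each early mapping */
	fw_token = cpu_node;
	/* after: table-relative offset, stable however the table is mapped */
	fw_token = ACPI_TO_POINTER(ACPI_PTR_DIFF(cpu_node, table));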
Link: https://lore.kernel.org/r/[email protected]
Cc: [email protected]
Tested-by: Ionela Voinescu <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/acpi/pptt.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 701f61c01359..763f021d45e6 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -437,7 +437,8 @@ static void cache_setup_acpi_cpu(struct acpi_table_header *table,
pr_debug("found = %p %p\n", found_cache, cpu_node);
if (found_cache)
update_cache_properties(this_leaf, found_cache,
- cpu_node, table->revision);
+ ACPI_TO_POINTER(ACPI_PTR_DIFF(cpu_node, table)),
+ table->revision);
index++;
}
--
2.36.1
From: Ionela Voinescu <[email protected]>
Currently the cluster identifier is not set on DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly, the cluster_sibling mask will be
populated and returned by cpu_clustergroup_mask() to contribute in the
creation of the CLS scheduling domain level, if SCHED_CLUSTER is
enabled.
To avoid topologies that will result in questionable or incorrect
scheduling domains, impose restrictions regarding the span of clusters,
as presented to scheduling domains building code: cluster_sibling should
not span more or the same CPUs as cpu_coregroup_mask().
This is needed in order to obtain a strict separation between the MC and
CLS levels, and maintain the same domains for existing platforms in
the presence of CONFIG_SCHED_CLUSTER, where the new cluster information
is redundant and irrelevant for the scheduler.
While previously the scheduling domain builder code would have removed MC
as redundant and kept CLS if SCHED_CLUSTER was enabled and the
cpu_coregroup_mask() and cpu_clustergroup_mask() spanned the same CPUs,
now CLS will be removed and MC kept.
Link: https://lore.kernel.org/r/[email protected]
Cc: Darren Hart <[email protected]>
Tested-by: Ionela Voinescu <[email protected]>
Signed-off-by: Ionela Voinescu <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 46fa1b70b02b..277b65cf3306 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -686,6 +686,14 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
const struct cpumask *cpu_clustergroup_mask(int cpu)
{
+ /*
+ * Forbid cpu_clustergroup_mask() to span more or the same CPUs as
+ * cpu_coregroup_mask().
+ */
+ if (cpumask_subset(cpu_coregroup_mask(cpu),
+ &cpu_topology[cpu].cluster_sibling))
+ return get_cpu_mask(cpu);
+
return &cpu_topology[cpu].cluster_sibling;
}
--
2.36.1
Instead of just comparing the CPU topology IDs with -1 to check their
validity, improve that by checking for a valid non-negative value.
Link: https://lore.kernel.org/r/[email protected]
Suggested-by: Andy Shevchenko <[email protected]>
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 7a5ff1ea5f00..ef90d9c00d9e 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -642,7 +642,7 @@ static int __init parse_dt_topology(void)
* only mark cores described in the DT as possible.
*/
for_each_possible_cpu(cpu)
- if (cpu_topology[cpu].package_id == -1)
+ if (cpu_topology[cpu].package_id < 0)
ret = -EINVAL;
out_map:
@@ -714,7 +714,7 @@ void update_siblings_masks(unsigned int cpuid)
if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
continue;
- if (cpuid_topo->cluster_id != -1) {
+ if (cpuid_topo->cluster_id >= 0) {
cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
}
--
2.36.1
We don't support the topology for clusters of CPU clusters, while the DT
and ACPI bindings theoretically support it. Just warn about this so that
it is clear to the users of arch_topology that nested clusters are not
yet supported.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ionela Voinescu <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 7cbe21b1b295..d7e48e995691 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -567,6 +567,8 @@ static int __init parse_cluster(struct device_node *cluster, int package_id,
if (c) {
leaf = false;
ret = parse_cluster(c, package_id, i, depth + 1);
+ if (depth > 0)
+ pr_warn("Topology for clusters of clusters not yet supported\n");
of_node_put(c);
if (ret != 0)
return ret;
--
2.36.1
The cacheinfo is now initialised early along with the CPU topology
initialisation. Instead of relying on the LLC ID information parsed
separately elsewhere, and only with ACPI PPTT, migrate to using the
similar information from the cacheinfo.
This is generic for both DT and ACPI systems. The ACPI LLC ID information
parsed separately can now be removed from the arch specific code.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 23cb52180ce3..c314c7064397 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -668,7 +668,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
/* not numa in package, lets use the package siblings */
core_mask = &cpu_topology[cpu].core_sibling;
}
- if (cpu_topology[cpu].llc_id != -1) {
+
+ if (last_level_cache_is_valid(cpu)) {
if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
core_mask = &cpu_topology[cpu].llc_sibling;
}
@@ -699,7 +700,7 @@ void update_siblings_masks(unsigned int cpuid)
for_each_online_cpu(cpu) {
cpu_topo = &cpu_topology[cpu];
- if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) {
+ if (last_level_cache_is_shared(cpu, cpuid)) {
cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
}
--
2.36.1
Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and fetch the LLC ID information only for
ACPI systems.
Just drop the redundant parsing and setting of llc_id in CPU topology
from ACPI PPTT.
Link: https://lore.kernel.org/r/[email protected]
Cc: Will Deacon <[email protected]>
Cc: Catalin Marinas <[email protected]>
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
arch/arm64/kernel/topology.c | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 9ab78ad826e2..869ffc4d4484 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -89,8 +89,6 @@ int __init parse_acpi_topology(void)
return 0;
for_each_possible_cpu(cpu) {
- int i, cache_id;
-
topology_id = find_acpi_cpu_topology(cpu, 0);
if (topology_id < 0)
return topology_id;
@@ -107,18 +105,6 @@ int __init parse_acpi_topology(void)
cpu_topology[cpu].cluster_id = topology_id;
topology_id = find_acpi_cpu_topology_package(cpu);
cpu_topology[cpu].package_id = topology_id;
-
- i = acpi_find_last_cache_level(cpu);
-
- if (i > 0) {
- /*
- * this is the only part of cpu_topology that has
- * a direct relationship with the cache topology
- */
- cache_id = find_acpi_cpu_cache_topology(cpu, i);
- if (cache_id > 0)
- cpu_topology[cpu].llc_id = cache_id;
- }
}
return 0;
--
2.36.1
Some architectures/platforms may need to set up cache properties very
early in the boot, along with other CPU topology information, so that all
of it can be used to build the sched_domains consumed by the scheduler.
Allow detect_cache_attributes to be called quite early during boot.
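For example (a sketch based on the arch_topology patch earlier in this
series), an early boot caller can now simply do:

	/* sketch: early call site, before secondary CPUs are brought up */
	for_each_possible_cpu(cpu)
		detect_cache_attributes(cpu);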
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/cacheinfo.c | 51 ++++++++++++++++++++++++++-------------
include/linux/cacheinfo.h | 1 +
2 files changed, 35 insertions(+), 17 deletions(-)
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index fdc1baa342f1..2aa9e8e341b7 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -193,14 +193,8 @@ static int cache_setup_of_node(unsigned int cpu)
{
struct device_node *np;
struct cacheinfo *this_leaf;
- struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
unsigned int index = 0;
- /* skip if fw_token is already populated */
- if (this_cpu_ci->info_list->fw_token) {
- return 0;
- }
-
np = of_cpu_device_node_get(cpu);
if (!np) {
pr_err("Failed to find cpu%d device node\n", cpu);
@@ -236,6 +230,18 @@ int __weak cache_setup_acpi(unsigned int cpu)
unsigned int coherency_max_size;
+static int cache_setup_properties(unsigned int cpu)
+{
+ int ret = 0;
+
+ if (of_have_populated_dt())
+ ret = cache_setup_of_node(cpu);
+ else if (!acpi_disabled)
+ ret = cache_setup_acpi(cpu);
+
+ return ret;
+}
+
static int cache_shared_cpu_map_setup(unsigned int cpu)
{
struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
@@ -246,21 +252,21 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
if (this_cpu_ci->cpu_map_populated)
return 0;
- if (of_have_populated_dt())
- ret = cache_setup_of_node(cpu);
- else if (!acpi_disabled)
- ret = cache_setup_acpi(cpu);
-
- if (ret)
- return ret;
+ /*
+ * skip setting up cache properties if LLC is valid, just need
+ * to update the shared cpu_map if the cache attributes were
+ * populated early before all the cpus are brought online
+ */
+ if (!last_level_cache_is_valid(cpu)) {
+ ret = cache_setup_properties(cpu);
+ if (ret)
+ return ret;
+ }
for (index = 0; index < cache_leaves(cpu); index++) {
unsigned int i;
this_leaf = per_cpu_cacheinfo_idx(cpu, index);
- /* skip if shared_cpu_map is already populated */
- if (!cpumask_empty(&this_leaf->shared_cpu_map))
- continue;
cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
for_each_online_cpu(i) {
@@ -330,10 +336,19 @@ int __weak populate_cache_leaves(unsigned int cpu)
return -ENOENT;
}
-static int detect_cache_attributes(unsigned int cpu)
+int detect_cache_attributes(unsigned int cpu)
{
int ret;
+ /* Since early detection of the cacheinfo is allowed via this
+ * function and this also gets called as CPU hotplug callbacks via
+ * cacheinfo_cpu_online, the initialisation can be skipped and only
+ * CPU maps can be updated as the CPU online status would be updated
+ * if called via cacheinfo_cpu_online path.
+ */
+ if (per_cpu_cacheinfo(cpu))
+ goto update_cpu_map;
+
if (init_cache_level(cpu) || !cache_leaves(cpu))
return -ENOENT;
@@ -349,6 +364,8 @@ static int detect_cache_attributes(unsigned int cpu)
ret = populate_cache_leaves(cpu);
if (ret)
goto free_ci;
+
+update_cpu_map:
/*
* For systems using DT for cache hierarchy, fw_token
* and shared_cpu_map will be set up here only if they are
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 7e429bc5c1a4..00b7a6ae8617 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -84,6 +84,7 @@ int populate_cache_leaves(unsigned int cpu);
int cache_setup_acpi(unsigned int cpu);
bool last_level_cache_is_valid(unsigned int cpu);
bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
+int detect_cache_attributes(unsigned int cpu);
#ifndef CONFIG_ACPI_PPTT
/*
* acpi_find_last_cache_level is only called on ACPI enabled
--
2.36.1
Currently the cluster identifier is not set on DT based platforms. The
reset or default value is -1 for all the CPUs. Once we assign the cluster
identifier values correctly, that may result in getting the thread
siblings wrong, as the core identifiers can be the same for two different
CPUs belonging to two different clusters.
So, in order to get the thread sibling cpumasks correct, we need to
update them only if the cores they belong to are in the same cluster
within the socket. Let us skip updating the thread sibling cpumasks if
the cluster identifier doesn't match.
This change has no effect while the cluster identifiers are not set, but
it will avoid any breakage once we set them correctly.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Gavin Shan <[email protected]>
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index b63cc52e12ce..7a5ff1ea5f00 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -708,15 +708,17 @@ void update_siblings_masks(unsigned int cpuid)
if (cpuid_topo->package_id != cpu_topo->package_id)
continue;
- if (cpuid_topo->cluster_id == cpu_topo->cluster_id &&
- cpuid_topo->cluster_id != -1) {
+ cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
+ cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
+
+ if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
+ continue;
+
+ if (cpuid_topo->cluster_id != -1) {
cpumask_set_cpu(cpu, &cpuid_topo->cluster_sibling);
cpumask_set_cpu(cpuid, &cpu_topo->cluster_sibling);
}
- cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
- cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
-
if (cpuid_topo->core_id != cpu_topo->core_id)
continue;
--
2.36.1
There is no point in looping through all the CPUs' physical package
identifiers to check whether they are valid once a CPU outside the
topology (i.e. an outlier CPU) is found.
Let us just break out of the loop early in such a case.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index ef90d9c00d9e..7a569aefe313 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -642,8 +642,10 @@ static int __init parse_dt_topology(void)
* only mark cores described in the DT as possible.
*/
for_each_possible_cpu(cpu)
- if (cpu_topology[cpu].package_id < 0)
+ if (cpu_topology[cpu].package_id < 0) {
ret = -EINVAL;
+ break;
+ }
out_map:
of_node_put(map);
--
2.36.1
Let us set the cluster identifier as parsed from the device tree cluster
nodes within /cpu-map.
We don't support nesting of clusters yet as there is no real hardware
that supports clusters of clusters.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ionela Voinescu <[email protected]>
Reviewed-by: Ionela Voinescu <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/base/arch_topology.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 277b65cf3306..80184c91c919 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -497,7 +497,7 @@ static int __init get_cpu_for_node(struct device_node *node)
}
static int __init parse_core(struct device_node *core, int package_id,
- int core_id)
+ int cluster_id, int core_id)
{
char name[20];
bool leaf = true;
@@ -513,6 +513,7 @@ static int __init parse_core(struct device_node *core, int package_id,
cpu = get_cpu_for_node(t);
if (cpu >= 0) {
cpu_topology[cpu].package_id = package_id;
+ cpu_topology[cpu].cluster_id = cluster_id;
cpu_topology[cpu].core_id = core_id;
cpu_topology[cpu].thread_id = i;
} else if (cpu != -ENODEV) {
@@ -534,6 +535,7 @@ static int __init parse_core(struct device_node *core, int package_id,
}
cpu_topology[cpu].package_id = package_id;
+ cpu_topology[cpu].cluster_id = cluster_id;
cpu_topology[cpu].core_id = core_id;
} else if (leaf && cpu != -ENODEV) {
pr_err("%pOF: Can't get CPU for leaf core\n", core);
@@ -543,7 +545,8 @@ static int __init parse_core(struct device_node *core, int package_id,
return 0;
}
-static int __init parse_cluster(struct device_node *cluster, int depth)
+static int __init
+parse_cluster(struct device_node *cluster, int cluster_id, int depth)
{
char name[20];
bool leaf = true;
@@ -563,7 +566,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
c = of_get_child_by_name(cluster, name);
if (c) {
leaf = false;
- ret = parse_cluster(c, depth + 1);
+ ret = parse_cluster(c, i, depth + 1);
of_node_put(c);
if (ret != 0)
return ret;
@@ -587,7 +590,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
}
if (leaf) {
- ret = parse_core(c, 0, core_id++);
+ ret = parse_core(c, 0, cluster_id, core_id++);
} else {
pr_err("%pOF: Non-leaf cluster with core %s\n",
cluster, name);
@@ -627,7 +630,7 @@ static int __init parse_dt_topology(void)
if (!map)
goto out;
- ret = parse_cluster(map, 0);
+ ret = parse_cluster(map, -1, 0);
if (ret != 0)
goto out_map;
--
2.36.1
On Mon, 27 Jun 2022 at 18:51, Sudeep Holla <[email protected]> wrote:
>
> From: Ionela Voinescu <[email protected]>
>
> Currently the cluster identifier is not set on DT based platforms.
> The reset or default value is -1 for all the CPUs. Once we assign the
> cluster identifier values correctly, the cluster_sibling mask will be
> populated and returned by cpu_clustergroup_mask() to contribute in the
> creation of the CLS scheduling domain level, if SCHED_CLUSTER is
> enabled.
>
> To avoid topologies that will result in questionable or incorrect
> scheduling domains, impose restrictions regarding the span of clusters,
> as presented to scheduling domains building code: cluster_sibling should
> not span more or the same CPUs as cpu_coregroup_mask().
>
> This is needed in order to obtain a strict separation between the MC and
> CLS levels, and maintain the same domains for existing platforms in
> the presence of CONFIG_SCHED_CLUSTER, where the new cluster information
> is redundant and irrelevant for the scheduler.
>
> While previously the scheduling domain builder code would have removed MC
> as redundant and kept CLS if SCHED_CLUSTER was enabled and the
> cpu_coregroup_mask() and cpu_clustergroup_mask() spanned the same CPUs,
> now CLS will be removed and MC kept.
>
> Link: https://lore.kernel.org/r/[email protected]
> Cc: Darren Hart <[email protected]>
> Tested-by: Ionela Voinescu <[email protected]>
> Signed-off-by: Ionela Voinescu <[email protected]>
> Signed-off-by: Sudeep Holla <[email protected]>
> ---
> drivers/base/arch_topology.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 46fa1b70b02b..277b65cf3306 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -686,6 +686,14 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
>
> const struct cpumask *cpu_clustergroup_mask(int cpu)
> {
> + /*
> + * Forbid cpu_clustergroup_mask() to span more or the same CPUs as
> + * cpu_coregroup_mask().
> + */
> + if (cpumask_subset(cpu_coregroup_mask(cpu),
> + &cpu_topology[cpu].cluster_sibling))
> + return get_cpu_mask(cpu);
AFAICT, this will change the Altra scheduling topology, which will now
have an MC level instead of CLS, but it's probably not a problem as the
flags are the same for now.
Acked-by: Vincent Guittot <[email protected]>
> +
> return &cpu_topology[cpu].cluster_sibling;
> }
>
> --
> 2.36.1
>
The sole user of find_acpi_cpu_cache_topology() was the arm64 topology
code, which is now consolidated into the generic arch_topology without
the need for this function.
Drop the unused function find_acpi_cpu_cache_topology().
Reported-by: Ionela Voinescu <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: [email protected]
Signed-off-by: Sudeep Holla <[email protected]>
---
drivers/acpi/pptt.c | 37 -------------------------------------
include/linux/acpi.h | 5 -----
2 files changed, 42 deletions(-)
Hi Rafael,
This is another patch that I would like to be part of the series[1].
Please ack it if you are OK with routing this via Greg. I am avoiding
reposting the whole series just for this one additional patch for now.
Regards,
Sudeep
[1] https://lore.kernel.org/all/[email protected]/
diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 763f021d45e6..dd3222a15c9c 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -691,43 +691,6 @@ int find_acpi_cpu_topology(unsigned int cpu, int level)
return find_acpi_cpu_topology_tag(cpu, level, 0);
}
-/**
- * find_acpi_cpu_cache_topology() - Determine a unique cache topology value
- * @cpu: Kernel logical CPU number
- * @level: The cache level for which we would like a unique ID
- *
- * Determine a unique ID for each unified cache in the system
- *
- * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
- * Otherwise returns a value which represents a unique topological feature.
- */
-int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
-{
- struct acpi_table_header *table;
- struct acpi_pptt_cache *found_cache;
- acpi_status status;
- u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);
- struct acpi_pptt_processor *cpu_node = NULL;
- int ret = -1;
-
- status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
- if (ACPI_FAILURE(status)) {
- acpi_pptt_warn_missing();
- return -ENOENT;
- }
-
- found_cache = acpi_find_cache_node(table, acpi_cpu_id,
- CACHE_TYPE_UNIFIED,
- level,
- &cpu_node);
- if (found_cache)
- ret = ACPI_PTR_DIFF(cpu_node, table);
-
- acpi_put_table(table);
-
- return ret;
-}
-
/**
* find_acpi_cpu_topology_package() - Determine a unique CPU package value
* @cpu: Kernel logical CPU number
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 4f82a5bc6d98..7b96a8bff6d2 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1429,7 +1429,6 @@ int find_acpi_cpu_topology(unsigned int cpu, int level);
int find_acpi_cpu_topology_cluster(unsigned int cpu);
int find_acpi_cpu_topology_package(unsigned int cpu);
int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
-int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
#else
static inline int acpi_pptt_cpu_is_thread(unsigned int cpu)
{
@@ -1451,10 +1450,6 @@ static inline int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
{
return -EINVAL;
}
-static inline int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
-{
- return -EINVAL;
-}
#endif
#ifdef CONFIG_ACPI_PCC
--
2.37.0
On Wed, Jun 29, 2022 at 3:07 PM Sudeep Holla <[email protected]> wrote:
>
> The sole user of this find_acpi_cpu_cache_topology() was arm64 topology
> which is now consolidated into the generic arch_topology without the need
> of this function.
>
> Drop the unused function find_acpi_cpu_cache_topology().
>
> Reported-by: Ionela Voinescu <[email protected]>
> Cc: Rafael J. Wysocki <[email protected]>
> Cc: [email protected]
> Signed-off-by: Sudeep Holla <[email protected]>
> ---
> drivers/acpi/pptt.c | 37 -------------------------------------
> include/linux/acpi.h | 5 -----
> 2 files changed, 42 deletions(-)
>
> Hi Rafael,
>
> This is another patch that I would like to be part of the series[1].
> Please ack the same if you are OK to route this via Greg. I am avoiding
> to repost the whole series just for this one additional patch for now.
Sure.
Acked-by: Rafael J. Wysocki <[email protected]>
> [1] https://lore.kernel.org/all/[email protected]/
>
On 27/06/2022 17:50, Sudeep Holla wrote:
> The cacheinfo is now initialised early along with the CPU topology
> initialisation. Instead of relying on the LLC ID information parsed
> separately only with ACPI PPTT elsewhere, migrate to use the similar
> information from the cacheinfo.
>
> This is generic for both DT and ACPI systems. The ACPI LLC ID information
> parsed separately can now be removed from arch specific code.
Hey Sudeep,
I bisected broken boot on PolarFire SoC to this patch in next-20220629 :/
I suspect the issue is a missing "next-level-cache" in the dt:
arch/riscv/boot/dts/microchip/mpfs.dtsi
Adding next-level-cache = <&cctrllr> fixes the boot.
Not sure what the resolution here is, old devicetrees are meant to keep
booting, right?
Thanks,
Conor.
>
> Link: https://lore.kernel.org/r/[email protected]
btw, why is this link in the patch? Why is a link to v4 relevant?
Links to both v4 and v5 exist in your for-linux-next branch.
Log:
git bisect start
# bad: [c4ef528bd006febc7de444d9775b28706d924f78] Add linux-next specific files for 20220629
git bisect bad c4ef528bd006febc7de444d9775b28706d924f78
# good: [b13baccc3850ca8b8cccbf8ed9912dbaa0fdf7f3] Linux 5.19-rc2
git bisect good b13baccc3850ca8b8cccbf8ed9912dbaa0fdf7f3
# bad: [95c758a8899c4e8825a35a62a6f31667991217f9] Merge branch 'drm-next' of git://git.freedesktop.org/git/drm/drm.git
git bisect bad 95c758a8899c4e8825a35a62a6f31667991217f9
# bad: [5cbb9aeefe0070b627cd5c5528e6e63701561d57] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git
git bisect bad 5cbb9aeefe0070b627cd5c5528e6e63701561d57
# good: [2e6556bae3e453cf27f3fb9c6144080e2a61707e] Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git
git bisect good 2e6556bae3e453cf27f3fb9c6144080e2a61707e
# good: [17efe76af33f6af09a821acce2e2e4e84819d381] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap.git
git bisect good 17efe76af33f6af09a821acce2e2e4e84819d381
# good: [5aeeaf40d31288e8efa6ff2cbd952b13de077aa9] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip.git
git bisect good 5aeeaf40d31288e8efa6ff2cbd952b13de077aa9
# bad: [f64dfa36b325d107d8aca9727410343bd86d37dc] Merge branch 'stm32-next' of git://git.kernel.org/pub/scm/linux/kernel/git/atorgue/stm32.git
git bisect bad f64dfa36b325d107d8aca9727410343bd86d37dc
# good: [89459a2aef8832f044c8fbbec54b46cec05156c8] Merge branch 'next/dt' into for-next
git bisect good 89459a2aef8832f044c8fbbec54b46cec05156c8
# bad: [24cdefc96973ff1a1f6702470ad91ab019e5fedd] Merge branch 'arch_topology' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into for-linux-next
git bisect bad 24cdefc96973ff1a1f6702470ad91ab019e5fedd
# bad: [0d71f236f0a1067aba7660d056a9061b5877bf52] arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
git bisect bad 0d71f236f0a1067aba7660d056a9061b5877bf52
# good: [be6ab2e822888b8d9983d670fdabc09d753fd24f] cacheinfo: Use cache identifiers to check if the caches are shared if available
git bisect good be6ab2e822888b8d9983d670fdabc09d753fd24f
# bad: [854a3115f9ec0b889015c6854fbc0c1d69a46e4a] arm64: topology: Remove redundant setting of llc_id in CPU topology
git bisect bad 854a3115f9ec0b889015c6854fbc0c1d69a46e4a
# bad: [3b23bb2573e65b11be8f4b89023296dee7f06c0b] arch_topology: Use the last level cache information from the cacheinfo
git bisect bad 3b23bb2573e65b11be8f4b89023296dee7f06c0b
# good: [2f7b757eb69df296554bd39b0b2b2f4da678c736] arch_topology: Add support to parse and detect cache attributes
git bisect good 2f7b757eb69df296554bd39b0b2b2f4da678c736
# first bad commit: [3b23bb2573e65b11be8f4b89023296dee7f06c0b] arch_topology: Use the last level cache information from the cacheinfo
On 29/06/2022 18:49, [email protected] wrote:
> On 27/06/2022 17:50, Sudeep Holla wrote:
>> The cacheinfo is now initialised early along with the CPU topology
>> initialisation. Instead of relying on the LLC ID information parsed
>> separately only with ACPI PPTT elsewhere, migrate to use the similar
>> information from the cacheinfo.
>>
>> This is generic for both DT and ACPI systems. The ACPI LLC ID information
>> parsed separately can now be removed from arch specific code.
>
> Hey Sudeep,
> I bisected broken boot on PolarFire SoC to this patch in next-20220629 :/
> I suspect the issue is a missing "next-level-cache" in the the dt:
> arch/riscv/boot/dts/microchip/mpfs.dtsi
>
> Adding next-level-cache = <&cctrllr> fixes the boot.
No, no it doesn't. Not sure what I was thinking there.
Prob tested that on the last commit that bisect tested
rather than the one it pointed out the problem was with.
Either way, boot is broken in -next.
> Not sure what the resolution here is, old devicetrees are meant to keep
> booting, right?
>
> Thanks,
> Conor.
>
>>
>> Link: https://lore.kernel.org/r/[email protected]
>
> btw, why is this link in the patch? Why is a link to v4 relevant?
> Links to both v4 and v5 exist in your for-linux-next branch.
On Wed, Jun 29, 2022 at 05:49:20PM +0000, [email protected] wrote:
> >
> > Link: https://lore.kernel.org/r/[email protected]
>
> btw, why is this link in the patch? Why is a link to v4 relevant?
> Links to both v4 and v5 exist in your for-linux-next branch.
>
Yes I noticed that and fixed it today.
--
Regards,
Sudeep
On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
>
> No, no it doesn't. Not sure what I was thinking there.
> Prob tested that on the the last commit that bisect tested
> rather than the one it pointed out the problem was with.
>
> Either way, boot is broken in -next.
>
Can you check if the below fixes the issue? I am assuming that presenting
L1 as the LLC might be causing the issue.
Regards,
Sudeep
-->8
diff --git i/drivers/base/cacheinfo.c w/drivers/base/cacheinfo.c
index 167abfa6f37d..a691317f7fdd 100644
--- i/drivers/base/cacheinfo.c
+++ w/drivers/base/cacheinfo.c
@@ -60,7 +60,8 @@ bool last_level_cache_is_valid(unsigned int cpu)
llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
- return (llc->attributes & CACHE_ID) || !!llc->fw_token;
+ return (llc->type == CACHE_TYPE_UNIFIED) &&
+ ((llc->attributes & CACHE_ID) || !!llc->fw_token);
}
On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
> On 29/06/2022 18:49, [email protected] wrote:
> > On 27/06/2022 17:50, Sudeep Holla wrote:
> >> The cacheinfo is now initialised early along with the CPU topology
> >> initialisation. Instead of relying on the LLC ID information parsed
> >> separately only with ACPI PPTT elsewhere, migrate to use the similar
> >> information from the cacheinfo.
> >>
> >> This is generic for both DT and ACPI systems. The ACPI LLC ID information
> >> parsed separately can now be removed from arch specific code.
> >
> > Hey Sudeep,
> > I bisected broken boot on PolarFire SoC to this patch in next-20220629 :/
> > I suspect the issue is a missing "next-level-cache" in the dt:
> > arch/riscv/boot/dts/microchip/mpfs.dtsi
Good that I included this in -next, as I had not received any feedback from
RISC-V even after 5 iterations. I also see this DTS is very odd: it states
CPU0 doesn't have an L1-D$ while the other 4 CPUs do. Is that a mistake or
is it the reality? Another breakage: the userspace cacheinfo sysfs entry of
cpu0 has both I$ and D$.
--
Regards,
Sudeep
On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
> On 29/06/2022 18:49, [email protected] wrote:
> >
> > On 27/06/2022 17:50, Sudeep Holla wrote:
> >>
> >> The cacheinfo is now initialised early along with the CPU topology
> >> initialisation. Instead of relying on the LLC ID information parsed
> >> separately only with ACPI PPTT elsewhere, migrate to use the similar
> >> information from the cacheinfo.
> >>
> >> This is generic for both DT and ACPI systems. The ACPI LLC ID information
> >> parsed separately can now be removed from arch specific code.
> >
> > Hey Sudeep,
> > I bisected broken boot on PolarFire SoC to this patch in next-20220629 :/
> > I suspect the issue is a missing "next-level-cache" in the dt:
> > arch/riscv/boot/dts/microchip/mpfs.dtsi
> >
> > Adding next-level-cache = <&cctrllr> fixes the boot.
>
If the above is missing, even the existing cacheinfo will be incorrect on
that system. But we would then end up with L1 as the LLC; I need to check
if that breaks the boot.
> No, no it doesn't. Not sure what I was thinking there.
> Prob tested that on the last commit that bisect tested
> rather than the one it pointed out the problem was with.
>
So can I assume the commit bisect pointed to is where the boot breaks?
> Either way, boot is broken in -next.
>
OK, that's bad.
> > Not sure what the resolution here is, old devicetrees are meant to keep
> > booting, right?
> >
Ideally, yes. But with this, I assume the cacheinfo exposed to userspace is
also broken on this platform, and I guess that needs fixing, which can only
happen with a DT update.
--
Regards,
Sudeep
On 29/06/2022 19:47, Sudeep Holla wrote:
> On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
>> On 29/06/2022 18:49, [email protected] wrote:
>>>
>>> On 27/06/2022 17:50, Sudeep Holla wrote:
>>>>
>>>> The cacheinfo is now initialised early along with the CPU topology
>>>> initialisation. Instead of relying on the LLC ID information parsed
>>>> separately only with ACPI PPTT elsewhere, migrate to use the similar
>>>> information from the cacheinfo.
>>>>
>>>> This is generic for both DT and ACPI systems. The ACPI LLC ID information
>>>> parsed separately can now be removed from arch specific code.
>>>
>>> Hey Sudeep,
>>> I bisected broken boot on PolarFire SoC to this patch in next-20220629 :/
>>> I suspect the issue is a missing "next-level-cache" in the dt:
>>> arch/riscv/boot/dts/microchip/mpfs.dtsi
>
> Good that I included this in -next, I had not received any feedback from
> RISC-V even after 5 iterations.
I'll be honest, I saw the titles and CC list and made some incorrect
assumptions as to whether looking at it was worthwhile! I have not been at
this very long, and what is/isn't important to look at is often not obvious
to me. But hey, our CI boots -next every day for a reason ;)
> I also see this DTS is very odd. It also
> states CPU0 doesn't have L1-D$ while the other 4 CPUs have L1-D$. Is that
> a mistake or is it the reality ?
AFAIK, reality. It's the same for the SiFive fu540 (with which this shares
a core complex). See page 12:
https://static.dev.sifive.com/FU540-C000-v1.0.pdf
> Another breakage in userspace cacheinfo
> sysfs entry of cpu0 has both I$ and D$.
Could you clarify what this means please?
Thanks,
Conor.
On 29/06/2022 20:12, Sudeep Holla wrote:
> On Wed, Jun 29, 2022 at 06:56:29PM +0000, [email protected] wrote:
>> On 29/06/2022 19:47, Sudeep Holla wrote:
>>> On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
>>>> On 29/06/2022 18:49, [email protected] wrote:
>>>>>
>>>>> On 27/06/2022 17:50, Sudeep Holla wrote:
>>>>>>
>>>>>> The cacheinfo is now initialised early along with the CPU topology
>>>>>> initialisation. Instead of relying on the LLC ID information parsed
>>>>>> separately only with ACPI PPTT elsewhere, migrate to use the similar
>>>>>> information from the cacheinfo.
>>>>>>
>>>>>> This is generic for both DT and ACPI systems. The ACPI LLC ID information
>>>>>> parsed separately can now be removed from arch specific code.
>>>>>
>>>>> Hey Sudeep,
>>>>> I bisected broken boot on PolarFire SoC to this patch in next-20220629 :/
>>>>> I suspect the issue is a missing "next-level-cache" in the dt:
>>>>> arch/riscv/boot/dts/microchip/mpfs.dtsi
>>>
>>> Good that I included this in -next, I had not received any feedback from
>>> RISC-V even after 5 iterations.
>>
>> I'll be honest, I saw the titles and CC list and made some incorrect
>> assumptions as to whether looking at it was worthwhile! I am not at
>> this all too long and what is/isn't important to look at often is not
>> obvious to me.
>
> No worries, that's why I thought better to include in -next to get some
> attention and I did get it this time, hurray! :)
>
>> But hey, our CI boots -next every day for a reason ;)
>>
>
> Good to know and that is really great. Anyways let me know if the diff I sent
> helps. I strongly suspect that is the reason, but I may be wrong.
Aye, I'll get back to you on that one in a moment or two
>
>>> I also see this DTS is very odd. It also
>>> states CPU0 doesn't have L1-D$ while the other 4 CPUs have L1-D$. Is that
>>> a mistake or is it the reality ?
>>
>> AFAIK, reality. It's the same for the SiFive fu540 (with which this shares
>> a core complex). See page 12:
>> https://static.dev.sifive.com/FU540-C000-v1.0.pdf
>>
>>> Another breakage in userspace cacheinfo
>>> sysfs entry of cpu0 has both I$ and D$.
>>
>> Could you clarify what this means please?
>
> Ignore me if the cpu0 really doesn't have L1-D$. However the userspace
> sysfs cacheinfo is incomplete without linking L2, so it can be considered
> as wrong info presented to the user.
Yeah, I'll send a patch hooking up the L2.
It wasn't in the initial fu540 dtsi, so I guess it was added after the
initial dts for my stuff was created based on it.
>
> Check /sys/devices/system/cpu/cpu<n>/cache/index<i>/*.
> L2 won't be present there as the link with next-level-cache is missing.
> So userspace can interpret this as absence of L2.
>
# cat /sys/devices/system/cpu/cpu0/cache/index0/
coherency_line_size shared_cpu_list type
level shared_cpu_map uevent
number_of_sets size ways_of_associativity
# ls /sys/devices/system/cpu/cpu0/cache/
index0 index1 uevent
# cat /sys/devices/system/cpu/cpu0/cache/index0/level
1
# cat /sys/devices/system/cpu/cpu0/cache/index1/level
1
cpu0 is /not/ the one with only an instruction cache; that hart is not
running Linux.
On Wed, Jun 29, 2022 at 06:56:29PM +0000, [email protected] wrote:
> On 29/06/2022 19:47, Sudeep Holla wrote:
> > On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
> >> On 29/06/2022 18:49, [email protected] wrote:
> >>>
> >>> On 27/06/2022 17:50, Sudeep Holla wrote:
> >>>>
> >>>> The cacheinfo is now initialised early along with the CPU topology
> >>>> initialisation. Instead of relying on the LLC ID information parsed
> >>>> separately only with ACPI PPTT elsewhere, migrate to use the similar
> >>>> information from the cacheinfo.
> >>>>
> >>>> This is generic for both DT and ACPI systems. The ACPI LLC ID information
> >>>> parsed separately can now be removed from arch specific code.
> >>>
> >>> Hey Sudeep,
> >>> I bisected broken boot on PolarFire SoC to this patch in next-20220629 :/
> >>> I suspect the issue is a missing "next-level-cache" in the dt:
> >>> arch/riscv/boot/dts/microchip/mpfs.dtsi
> >
> > Good that I included this in -next, I had not received any feedback from
> > RISC-V even after 5 iterations.
>
> I'll be honest, I saw the titles and CC list and made some incorrect
> assumptions as to whether looking at it was worthwhile! I am not at
> this all too long and what is/isn't important to look at often is not
> obvious to me.
No worries, that's why I thought it better to include this in -next to get
some attention, and I did get it this time, hurray! :)
> But hey, our CI boots -next every day for a reason ;)
>
Good to know, and that is really great. Anyway, let me know if the diff I
sent helps. I strongly suspect that is the reason, but I may be wrong.
> > I also see this DTS is very odd. It also
> > states CPU0 doesn't have L1-D$ while the other 4 CPUs have L1-D$. Is that
> > a mistake or is it the reality ?
>
> AFAIK, reality. It's the same for the SiFive fu540 (with which this shares
> a core complex). See page 12:
> https://static.dev.sifive.com/FU540-C000-v1.0.pdf
>
> > Another breakage in userspace cacheinfo
> > sysfs entry of cpu0 has both I$ and D$.
>
> Could you clarify what this means please?
Ignore me if cpu0 really doesn't have an L1-D$. However, the userspace
sysfs cacheinfo is incomplete without linking the L2, so it can be
considered wrong info presented to the user.
Check /sys/devices/system/cpu/cpu<n>/cache/index<i>/*.
L2 won't be present there as the link with next-level-cache is missing.
So userspace can interpret this as absence of L2.
--
Regards,
Sudeep
On 29/06/2022 19:42, Sudeep Holla wrote:
>
> On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
>>
>> No, no it doesn't. Not sure what I was thinking there.
>> Prob tested that on the last commit that bisect tested
>> rather than the one it pointed out the problem was with.
>>
>> Either way, boot is broken in -next.
>>
>
> Can you check if the below fixes the issue?
Unfortunately, no joy.
Applied to a HEAD of 3b23bb2573e6 ("arch_topology: Use the
last level cache information from the cacheinfo").
Thanks,
Conor.
> Assuming presenting L1 as
> LLC might be causing issue.
>
> Regards,
> Sudeep
>
> -->8
> diff --git i/drivers/base/cacheinfo.c w/drivers/base/cacheinfo.c
> index 167abfa6f37d..a691317f7fdd 100644
> --- i/drivers/base/cacheinfo.c
> +++ w/drivers/base/cacheinfo.c
> @@ -60,7 +60,8 @@ bool last_level_cache_is_valid(unsigned int cpu)
>
> llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
>
> - return (llc->attributes & CACHE_ID) || !!llc->fw_token;
> + return (llc->type == CACHE_TYPE_UNIFIED) &&
> + ((llc->attributes & CACHE_ID) || !!llc->fw_token);
>
> }
>
On Wed, Jun 29, 2022 at 07:25:41PM +0000, [email protected] wrote:
>
>
> On 29/06/2022 20:12, Sudeep Holla wrote:
> > On Wed, Jun 29, 2022 at 06:56:29PM +0000, [email protected] wrote:
> >> On 29/06/2022 19:47, Sudeep Holla wrote:
> >>> On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
> >>>> On 29/06/2022 18:49, [email protected] wrote:
> >>>>>
> >>>>> On 27/06/2022 17:50, Sudeep Holla wrote:
> >>>>>>
> >>>>>> The cacheinfo is now initialised early along with the CPU topology
> >>>>>> initialisation. Instead of relying on the LLC ID information parsed
> >>>>>> separately only with ACPI PPTT elsewhere, migrate to use the similar
> >>>>>> information from the cacheinfo.
> >>>>>>
> >>>>>> This is generic for both DT and ACPI systems. The ACPI LLC ID information
> >>>>>> parsed separately can now be removed from arch specific code.
> >>>>>
> >>>>> Hey Sudeep,
> >>>>> I bisected broken boot on PolarFire SoC to this patch in next-20220629 :/
> >>>>> I suspect the issue is a missing "next-level-cache" in the dt:
> >>>>> arch/riscv/boot/dts/microchip/mpfs.dtsi
> >>>
> >>> Good that I included this in -next, I had not received any feedback from
> >>> RISC-V even after 5 iterations.
> >>
> >> I'll be honest, I saw the titles and CC list and made some incorrect
> >> assumptions as to whether looking at it was worthwhile! I am not at
> >> this all too long and what is/isn't important to look at often is not
> >> obvious to me.
> >
> > No worries, that's why I thought better to include in -next to get some
> > attention and I did get it this time, hurray! :)
> >
> >> But hey, our CI boots -next every day for a reason ;)
> >>
> >
> > Good to know and that is really great. Anyways let me know if the diff I sent
> > helps. I strongly suspect that is the reason, but I may be wrong.
>
> Aye, I'll get back to you on that one in a moment or two
>
Sure, take your time.
> >
> >>> I also see this DTS is very odd. It also
> >>> states CPU0 doesn't have L1-D$ while the other 4 CPUs have L1-D$. Is that
> >>> a mistake or is it the reality ?
> >>
> >> AFAIK, reality. It's the same for the SiFive fu540 (with which this shares
> >> a core complex). See page 12:
> >> https://static.dev.sifive.com/FU540-C000-v1.0.pdf
> >>
> >>> Another breakage in userspace cacheinfo
> >>> sysfs entry of cpu0 has both I$ and D$.
> >>
> >> Could you clarify what this means please?
> >
> > Ignore me if the cpu0 really doesn't have L1-D$. However the userspace
> > sysfs cacheinfo is incomplete without linking L2, so it can be considered
> > as wrong info presented to the user.
>
> Yeah, I'll send a patch hooking up the L2.
> It wasn't in the initial fu540 dtsi so I guess it was added after the
> initial dts for my stuff was created based on that.
>
Thanks!
> >
> > Check /sys/devices/system/cpu/cpu<n>/cache/index<i>/*.
> > L2 won't be present there as the link with next-level-cache is missing.
> > So userspace can interpret this as absence of L2.
> >
>
> # cat /sys/devices/system/cpu/cpu0/cache/index0/
> coherency_line_size shared_cpu_list type
> level shared_cpu_map uevent
> number_of_sets size ways_of_associativity
> # ls /sys/devices/system/cpu/cpu0/cache/
> index0 index1 uevent
> # cat /sys/devices/system/cpu/cpu0/cache/index0/level
> 1
> # cat /sys/devices/system/cpu/cpu0/cache/index1/level
> 1
>
Ideally there should be a /sys/devices/system/cpu/cpu*/cache/index2/level
which reads 2 once you link it in the DT.
> cpu0 is /not/ the one with only instruction cache, that is not
> running Linux.
Ah, so Linux runs only on cpus 1-4 there?
--
Regards,
Sudeep
On 29/06/2022 20:43, Sudeep Holla wrote:
>
> On Wed, Jun 29, 2022 at 07:25:41PM +0000, [email protected] wrote:
>>
>>
>> On 29/06/2022 20:12, Sudeep Holla wrote:
>>> On Wed, Jun 29, 2022 at 06:56:29PM +0000, [email protected] wrote:
>>>> On 29/06/2022 19:47, Sudeep Holla wrote:
>>>>> On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
>>>>>> On 29/06/2022 18:49, [email protected] wrote:
>>>>>>>
>>>>>>> On 27/06/2022 17:50, Sudeep Holla wrote:
>>>>>>>>
>>>>>>>> The cacheinfo is now initialised early along with the CPU topology
>>>>>>>> initialisation. Instead of relying on the LLC ID information parsed
>>>>>>>> separately only with ACPI PPTT elsewhere, migrate to use the similar
>>>>>>>> information from the cacheinfo.
>>>>>>>>
>>>>>>>> This is generic for both DT and ACPI systems. The ACPI LLC ID information
>>>>>>>> parsed separately can now be removed from arch specific code.
>>>>>>>
>>>>>>> Hey Sudeep,
>>>>>>> I bisected broken boot on PolarFire SoC to this patch in next-20220629 :/
>>>>>>> I suspect the issue is a missing "next-level-cache" in the dt:
>>>>>>> arch/riscv/boot/dts/microchip/mpfs.dtsi
>>>>>
>>>>> Good that I included this in -next, I had not received any feedback from
>>>>> RISC-V even after 5 iterations.
>>>>
>>>> I'll be honest, I saw the titles and CC list and made some incorrect
>>>> assumptions as to whether looking at it was worthwhile! I am not at
>>>> this all too long and what is/isn't important to look at often is not
>>>> obvious to me.
>>>
>>> No worries, that's why I thought better to include in -next to get some
>>> attention and I did get it this time, hurray! :)
>>>
>>>> But hey, our CI boots -next every day for a reason ;)
>>>>
>>>
>>> Good to know and that is really great. Anyways let me know if the diff I sent
>>> helps. I strongly suspect that is the reason, but I may be wrong.
>>
>> Aye, I'll get back to you on that one in a moment or two
>>
>
> Sure, take your time.
>
>>>
>>>>> I also see this DTS is very odd. It also
>>>>> states CPU0 doesn't have L1-D$ while the other 4 CPUs have L1-D$. Is that
>>>>> a mistake or is it the reality ?
>>>>
>>>> AFAIK, reality. It's the same for the SiFive fu540 (with which this shares
>>>> a core complex). See page 12:
>>>> https://static.dev.sifive.com/FU540-C000-v1.0.pdf
>>>>
>>>>> Another breakage in userspace cacheinfo
>>>>> sysfs entry of cpu0 has both I$ and D$.
>>>>
>>>> Could you clarify what this means please?
>>>
>>> Ignore me if the cpu0 really doesn't have L1-D$. However the userspace
>>> sysfs cacheinfo is incomplete without linking L2, so it can be considered
>>> as wrong info presented to the user.
>>
>> Yeah, I'll send a patch hooking up the L2.
>> It wasn't in the initial fu540 dtsi so I guess it was added after the
>> initial dts for my stuff was created based on that.
>>
>
> Thanks!
>
>>>
>>> Check /sys/devices/system/cpu/cpu<n>/cache/index<i>/*.
>>> L2 won't be present there as the link with next-level-cache is missing.
>>> So userspace can interpret this as absence of L2.
>>>
>>
>> # cat /sys/devices/system/cpu/cpu0/cache/index0/
>> coherency_line_size shared_cpu_list type
>> level shared_cpu_map uevent
>> number_of_sets size ways_of_associativity
>> # ls /sys/devices/system/cpu/cpu0/cache/
>> index0 index1 uevent
>> # cat /sys/devices/system/cpu/cpu0/cache/index0/level
>> 1
>> # cat /sys/devices/system/cpu/cpu0/cache/index1/level
>> 1
>>
> Ideally there must /sys/devices/system/cpu/cpu*/cache/index2/level
> which reads 2 once you link it in the DT.
# ls /sys/devices/system/cpu/cpu0/cache/
index0 index1 index2 uevent
# cat /sys/devices/system/cpu/cpu0/cache/index2/level
2
# cat /sys/devices/system/cpu/cpu0/cache/index1/level
1
# cat /sys/devices/system/cpu/cpu0/cache/index0/level
1
Sweet :)
>
>> cpu0 is /not/ the one with only instruction cache, that is not
>> running Linux.
>
> Ah, so there Linux runs only on cpu 1-4 ?
Correct. cpu0 supports fewer instructions & we just use it as a
monitor core for opensbi etc.
>
> --
> Regards,
> Sudeep
On Wed, Jun 29, 2022 at 07:39:43PM +0000, [email protected] wrote:
> On 29/06/2022 19:42, Sudeep Holla wrote:
> >
> > On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
> >>
> >> No, no it doesn't. Not sure what I was thinking there.
> >> Prob tested that on the last commit that bisect tested
> >> rather than the one it pointed out the problem was with.
> >>
> >> Either way, boot is broken in -next.
> >>
> >
> > Can you check if the below fixes the issue?
>
> Unfortunately, no joy.
> Applied to a HEAD of 3b23bb2573e6 ("arch_topology: Use the
> last level cache information from the cacheinfo").
That's bad. Does the system boot with
commit 2f7b757eb69d ("arch_topology: Add support to parse and detect cache
attributes")?
--
Regards,
Sudeep
On 29/06/2022 20:54, Sudeep Holla wrote:
>
> On Wed, Jun 29, 2022 at 07:39:43PM +0000, [email protected] wrote:
>> On 29/06/2022 19:42, Sudeep Holla wrote:
>>>
>>> On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
>>>>
>>>> No, no it doesn't. Not sure what I was thinking there.
>>>> Prob tested that on the last commit that bisect tested
>>>> rather than the one it pointed out the problem was with.
>>>>
>>>> Either way, boot is broken in -next.
>>>>
>>>
>>> Can you check if the below fixes the issue?
>>
>> Unfortunately, no joy.
>> Applied to a HEAD of 3b23bb2573e6 ("arch_topology: Use the
>> last level cache information from the cacheinfo").
>
> That's bad. Does the system boot with
> Commit 2f7b757eb69d ("arch_topology: Add support to parse and detect cache
> attributes") ?
It does.
>
> --
> Regards,
> Sudeep
On 29/06/2022 21:32, [email protected] wrote:
>
> On 29/06/2022 20:54, Sudeep Holla wrote:
>>
>> On Wed, Jun 29, 2022 at 07:39:43PM +0000, [email protected] wrote:
>>> On 29/06/2022 19:42, Sudeep Holla wrote:
>>>>
>>>> On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
>>>>>
>>>>> No, no it doesn't. Not sure what I was thinking there.
>>>>> Prob tested that on the the last commit that bisect tested
>>>>> rather than the one it pointed out the problem was with.
>>>>>
>>>>> Either way, boot is broken in -next.
>>>>>
>>>>
>>>> Can you check if the below fixes the issue?
>>>
>>> Unfortunately, no joy.
>>> Applied to a HEAD of 3b23bb2573e6 ("arch_topology: Use the
>>> last level cache information from the cacheinfo").
>>
>> That's bad. Does the system boot with
>> Commit 2f7b757eb69d ("arch_topology: Add support to parse and detect cache
>> attributes") ?
>
> It does.
FWIW boot log of the failure:
[ 0.000000] Linux version 5.19.0-rc4-00009-g3b23bb2573e6-dirty (conor@) (riscv64-unknown-linux-gnu-gcc (g5964b5cd727) 11.1.0, GNU ld (GNU Binutils) 2.37) #1 SMP Thu Jun 30 00:22:42 IST 2022
[ 0.000000] OF: fdt: Ignoring memory range 0x80000000 - 0x80200000
[ 0.000000] Machine model: Microchip PolarFire-SoC Icicle Kit
[ 0.000000] earlycon: ns16550a0 at MMIO32 0x0000000020100000 (options '115200n8')
[ 0.000000] printk: bootconsole [ns16550a0] enabled
[ 0.000000] printk: debug: skip boot console de-registration.
[ 0.000000] efi: UEFI not found.
[ 0.000000] Zone ranges:
[ 0.000000] DMA32 [mem 0x0000000080200000-0x00000000ffffffff]
[ 0.000000] Normal [mem 0x0000000100000000-0x000000103fffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000080200000-0x00000000adffffff]
[ 0.000000] node 0: [mem 0x0000001000000000-0x000000103fffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000080200000-0x000000103fffffff]
[ 0.000000] On node 0, zone Normal: 16064512 pages in unavailable ranges
[ 0.000000] SBI specification v0.3 detected
[ 0.000000] SBI implementation ID=0x1 Version=0x9
[ 0.000000] SBI TIME extension detected
[ 0.000000] SBI IPI extension detected
[ 0.000000] SBI RFENCE extension detected
[ 0.000000] SBI HSM extension detected
[ 0.000000] CPU with hartid=0 is not available
[ 0.000000] CPU with hartid=0 is not available
[ 0.000000] riscv: base ISA extensions acdfim
[ 0.000000] riscv: ELF capabilities acdfim
[ 0.000000] percpu: Embedded 18 pages/cpu s34104 r8192 d31432 u73728
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 224263
[ 0.000000] Kernel command line: earlycon keep_bootcon
[ 0.000000] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] software IO TLB: mapped [mem 0x00000000aa000000-0x00000000ae000000] (64MB)
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] fixmap : 0xffffffc6fee00000 - 0xffffffc6ff000000 (2048 kB)
[ 0.000000] pci io : 0xffffffc6ff000000 - 0xffffffc700000000 ( 16 MB)
[ 0.000000] vmemmap : 0xffffffc700000000 - 0xffffffc800000000 (4096 MB)
[ 0.000000] vmalloc : 0xffffffc800000000 - 0xffffffd800000000 ( 64 GB)
[ 0.000000] lowmem : 0xffffffd800000000 - 0xffffffe7bfe00000 ( 62 GB)
[ 0.000000] kernel : 0xffffffff80000000 - 0xffffffffffffffff (2047 MB)
[ 0.000000] Memory: 807748K/1800192K available (6523K kernel code, 4857K rwdata, 2048K rodata, 2171K init, 396K bss, 992444K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[ 0.000000] rcu: Hierarchical RCU implementation.
[ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=4.
[ 0.000000] rcu: RCU debug extended QS entry/exit.
[ 0.000000] Tracing variant of Tasks RCU enabled.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[ 0.000000] CPU with hartid=0 is not available
[ 0.000000] riscv-intc: unable to find hart id for /cpus/cpu@0/interrupt-controller
[ 0.000000] riscv-intc: 64 local interrupts mapped
[ 0.000000] plic: interrupt-controller@c000000: mapped 186 interrupts with 4 handlers for 9 contexts.
[ 0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[ 0.000000] riscv_timer_init_dt: Registering clocksource cpuid [0] hartid [4]
[ 0.000000] clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycles: 0x1d854df40, max_idle_ns: 3526361616960 ns
[ 0.000003] sched_clock: 64 bits at 1000kHz, resolution 1000ns, wraps every 2199023255500ns
[ 0.009639] Console: colour dummy device 80x25
[ 0.014583] printk: console [tty0] enabled
[ 0.019220] Calibrating delay loop (skipped), value calculated using timer frequency.. 2.00 BogoMIPS (lpj=4000)
[ 0.030377] pid_max: default: 32768 minimum: 301
[ 0.035905] Mount-cache hash table entries: 2048 (order: 2, 16384 bytes, linear)
[ 0.044169] Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes, linear)
[ 0.057018] cblist_init_generic: Setting adjustable number of callback queues.
[ 0.065031] cblist_init_generic: Setting shift to 2 and lim to 1.
[ 0.072068] riscv: ELF compat mode failed
[ 0.072084] ASID allocator disabled (0 bits)
[ 0.081558] rcu: Hierarchical SRCU implementation.
[ 0.087482] EFI services will not be available.
[ 0.093816] smp: Bringing up secondary CPUs ...
On Wed, Jun 29, 2022 at 11:25:41PM +0000, [email protected] wrote:
> On 29/06/2022 21:32, [email protected] wrote:
> >
> > On 29/06/2022 20:54, Sudeep Holla wrote:
> >>
> >> On Wed, Jun 29, 2022 at 07:39:43PM +0000, [email protected] wrote:
> >>> On 29/06/2022 19:42, Sudeep Holla wrote:
> >>>>
> >>>> On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
> >>>>>
> >>>>> No, no it doesn't. Not sure what I was thinking there.
> >>>>> Prob tested that on the the last commit that bisect tested
> >>>>> rather than the one it pointed out the problem was with.
> >>>>>
> >>>>> Either way, boot is broken in -next.
> >>>>>
> >>>>
> >>>> Can you check if the below fixes the issue?
> >>>
> >>> Unfortunately, no joy.
> >>> Applied to a HEAD of 3b23bb2573e6 ("arch_topology: Use the
> >>> last level cache information from the cacheinfo").
> >>
> >> That's bad. Does the system boot with
> >> Commit 2f7b757eb69d ("arch_topology: Add support to parse and detect cache
> >> attributes") ?
> >
> > It does.
>
I can't think of any reason for that to happen unless detect_cache_attributes
is failing from init_cpu_topology and we are ignoring that.
Are all RISC-V platforms failing on -next, or is it just this platform?
We may have to try with some logs in detect_cache_attributes,
last_level_cache_is_valid and last_level_cache_is_shared to check where it
is going wrong.
It must be crashing in smp_callin->update_siblings_masks->last_level_cache_is_shared.
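For example, something along these lines; this is a minimal sketch, with
the pr_info() being an illustrative addition of mine on top of the function
body as it stands in -next:

bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
{
        struct cacheinfo *llc_x, *llc_y;

        /* Bail out early if either CPU has no usable last level cache */
        if (!last_level_cache_is_valid(cpu_x) ||
            !last_level_cache_is_valid(cpu_y))
                return false;

        llc_x = per_cpu_cacheinfo_idx(cpu_x, cache_leaves(cpu_x) - 1);
        llc_y = per_cpu_cacheinfo_idx(cpu_y, cache_leaves(cpu_y) - 1);

        /* Illustrative debug output, not part of the upstream function */
        pr_info("cpu%u llc=%px cpu%u llc=%px\n", cpu_x, llc_x, cpu_y, llc_y);

        return cache_leaves_are_shared(llc_x, llc_y);
}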
--
Regards,
Sudeep
On 30/06/2022 11:39, Sudeep Holla wrote:
>
> On Wed, Jun 29, 2022 at 11:25:41PM +0000, [email protected] wrote:
>> On 29/06/2022 21:32, [email protected] wrote:
>>>
>>> On 29/06/2022 20:54, Sudeep Holla wrote:
>>>>
>>>> On Wed, Jun 29, 2022 at 07:39:43PM +0000, [email protected] wrote:
>>>>> On 29/06/2022 19:42, Sudeep Holla wrote:
>>>>>>
>>>>>> On Wed, Jun 29, 2022 at 06:18:25PM +0000, [email protected] wrote:
>>>>>>>
>>>>>>> No, no it doesn't. Not sure what I was thinking there.
>>>>>>> Prob tested that on the the last commit that bisect tested
>>>>>>> rather than the one it pointed out the problem was with.
>>>>>>>
>>>>>>> Either way, boot is broken in -next.
>>>>>>>
>>>>>>
>>>>>> Can you check if the below fixes the issue?
>>>>>
>>>>> Unfortunately, no joy.
>>>>> Applied to a HEAD of 3b23bb2573e6 ("arch_topology: Use the
>>>>> last level cache information from the cacheinfo").
>>>>
>>>> That's bad. Does the system boot with
>>>> Commit 2f7b757eb69d ("arch_topology: Add support to parse and detect cache
>>>> attributes") ?
>>>
>>> It does.
>>
>
> I can't think of any reason for that to happen unless detect_cache_attributes
> is failing from init_cpu_topology and we are ignoring that.
>
> Are all RISC-V platforms failing on -next or is it just this platform ?
I don't know. I only have SoCs with this core complex & one that does not
work with upstream. I can try my other board with this SoC - but I am on
leave at the moment w/o a computer or internet access during the day, so it
may be a few days before I can try it.
However, Niklas Cassel has tried to use the Canaan K210 on next-20220630
but had issues with RCU stalling:
https://lore.kernel.org/linux-riscv/Yr3PKR0Uj1bE5Y6O@x1-carbon/T/#m52016996fcf5fa0501066d73352ed8e806803e06
Not going to claim any relation, but that's minus 1 to the platforms that
can be used to test this on upstream RISC-V.
> We may have to try with some logs in detect_cache_attributes,
> last_level_cache_is_valid and last_level_cache_is_shared to check where it
> is going wrong.
>
> It must be crashing in smp_callin->update_siblings_masks->last_level_cache_is_shared
Yeah, I was playing around last night for a while but didn't figure out the
root cause. I'll try again tonight.
In the meantime, would you mind taking the patches out of -next?
FWIW I repro'd the failure on next-20220630.
Thanks,
Conor.
On Thu, Jun 30, 2022 at 04:37:50PM +0000, [email protected] wrote:
> On 30/06/2022 11:39, Sudeep Holla wrote:
> >
> > I can't think of any reason for that to happen unless detect_cache_attributes
> > is failing from init_cpu_topology and we are ignoring that.
> >
> > Are all RISC-V platforms failing on -next or is it just this platform ?
>
> I don't know. I only have SoCs with this core complex & one that does not
> work with upstream. I can try my other board with this SoC - but I am on
> leave at the moment w/ a computer or internet during the day so it may be
> a few days before I can try it.
>
Sure, no worries.
> However, Niklas Cassel has tried to use the Canaan K210 on next-20220630
> but had issues with RCU stalling:
> https://lore.kernel.org/linux-riscv/Yr3PKR0Uj1bE5Y6O@x1-carbon/T/#m52016996fcf5fa0501066d73352ed8e806803e06
> Not going to claim any relation, but that's minus 1 to the platforms that
> can be used to test this on upstream RISC-V.
>
Ah OK, I will check and ask for full logs to see if there is any relation.
> > We may have to try with some logs in detect_cache_attributes,
> > last_level_cache_is_valid and last_level_cache_is_shared to check where it
> > is going wrong.
> >
> > It must be crashing in smp_callin->update_siblings_masks->last_level_cache_is_shared
>
> Yeah, I was playing around last night for a while but didn't figure out the
> root cause. I'll try again tonight.
>
OK, thanks for that. I tried qemu, but there is no cache info in the DT
provided by qemu itself. The other sifive_u machine didn't work on qemu;
only virt booted with mainline as well.
> In the meantime, would you mind taking the patches out of -next?
I don't want to take them out, as we will lose all the test coverage.
I would like to know whether any other RISC-V platform is affected
before removing them.
> FWIW I repro'd the failure on next-20220630.
Yes, nothing has changed yet.
--
Regards,
Sudeep
On 30/06/2022 18:35, Sudeep Holla wrote:
> On Thu, Jun 30, 2022 at 04:37:50PM +0000, [email protected] wrote:
>> On 30/06/2022 11:39, Sudeep Holla wrote:
>>>
>>> I can't think of any reason for that to happen unless detect_cache_attributes
>>> is failing from init_cpu_topology and we are ignoring that.
>>>
>>> Are all RISC-V platforms failing on -next or is it just this platform ?
>>
>> I don't know. I only have SoCs with this core complex & one that does not
>> work with upstream. I can try my other board with this SoC - but I am on
>> leave at the moment w/ a computer or internet during the day so it may be
>> a few days before I can try it.
>>
>
> Sure, no worries.
>
>> However, Niklas Cassel has tried to use the Canaan K210 on next-20220630
>> but had issues with RCU stalling:
>> https://lore.kernel.org/linux-riscv/Yr3PKR0Uj1bE5Y6O@x1-carbon/T/#m52016996fcf5fa0501066d73352ed8e806803e06
>> Not going to claim any relation, but that's minus 1 to the platforms that
>> can be used to test this on upstream RISC-V.
>>
>
> Ah OK, will check and ask full logs to see if there is any relation.
>
>>> We may have to try with some logs in detect_cache_attributes,
>>> last_level_cache_is_valid and last_level_cache_is_shared to check where it
>>> is going wrong.
>>>
>>> It must be crashing in smp_callin->update_siblings_masks->last_level_cache_is_shared
So, it looks like there's a problem in cache_leaves_are_shared(), which is
hit by the above path. Both of the if clauses are false, and the function
falls through to return sib_leaf->fw_token == this_leaf->fw_token;
Both sib_leaf & this_leaf seem to be NULL.
static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
struct cacheinfo *sib_leaf)
{
/*
* For non DT/ACPI systems, assume unique level 1 caches,
* system-wide shared caches for all other levels. This will be used
* only if arch specific code has not populated shared_cpu_map
*/
if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
return !(this_leaf->level == 1);
if ((sib_leaf->attributes & CACHE_ID) &&
(this_leaf->attributes & CACHE_ID))
return sib_leaf->id == this_leaf->id;
return sib_leaf->fw_token == this_leaf->fw_token;
}
Any ideas what to look at next?
On 30/06/2022 21:07, Sudeep Holla wrote:
> On Thu, Jun 30, 2022 at 07:20:04PM +0000, [email protected] wrote:
>>
>>
>> On 30/06/2022 18:35, Sudeep Holla wrote:
>>> On Thu, Jun 30, 2022 at 04:37:50PM +0000, [email protected] wrote:
>>>> On 30/06/2022 11:39, Sudeep Holla wrote:
>>>>>
>>>>> I can't think of any reason for that to happen unless detect_cache_attributes
>>>>> is failing from init_cpu_topology and we are ignoring that.
>>>>>
>>>>> Are all RISC-V platforms failing on -next or is it just this platform ?
>>>>
>>>> I don't know. I only have SoCs with this core complex & one that does not
>>>> work with upstream. I can try my other board with this SoC - but I am on
>>>> leave at the moment w/ a computer or internet during the day so it may be
>>>> a few days before I can try it.
>>>>
>>>
>>> Sure, no worries.
>>>
>>>> However, Niklas Cassel has tried to use the Canaan K210 on next-20220630
>>>> but had issues with RCU stalling:
>>>> https://lore.kernel.org/linux-riscv/Yr3PKR0Uj1bE5Y6O@x1-carbon/T/#m52016996fcf5fa0501066d73352ed8e806803e06
>>>> Not going to claim any relation, but that's minus 1 to the platforms that
>>>> can be used to test this on upstream RISC-V.
>>>>
>>>
>>> Ah OK, will check and ask full logs to see if there is any relation.
>>>
>>>>> We may have to try with some logs in detect_cache_attributes,
>>>>> last_level_cache_is_valid and last_level_cache_is_shared to check where it
>>>>> is going wrong.
>>>>>
>>>>> It must be crashing in smp_callin->update_siblings_masks->last_level_cache_is_shared
>>
>>
>> So, looks like there's a problem in cache_leaves_are_shared() which is hit
>> by the above path. Both of the if clauses are false, and the function falls
>> through to return sib_leaf->fw_token == this_leaf->fw_token;
>
> Both if() failing is expected and that statement
> return sib_leaf->fw_token == this_leaf->fw_token;
> execution is correct.
>
>> Both sib_leaf & this_leaf seem to be null.
>>
>
> But this is wrong as last_level_cache_is_shared checks for
> last_level_cache_is_valid which must return false if the fw_token = NULL
> So we must not hit the above return statement with NULL fw_token.
>
>> static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
>> struct cacheinfo *sib_leaf)
>> {
>> /*
>> * For non DT/ACPI systems, assume unique level 1 caches,
>> * system-wide shared caches for all other levels. This will be used
>> * only if arch specific code has not populated shared_cpu_map
>> */
>> if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
>> return !(this_leaf->level == 1);
>>
>> if ((sib_leaf->attributes & CACHE_ID) &&
>> (this_leaf->attributes & CACHE_ID))
>> return sib_leaf->id == this_leaf->id;
>>
>> return sib_leaf->fw_token == this_leaf->fw_token;
>> }
>>
>> Any ideas what to look at next?
>
> I wonder how did we not get last_level_cache_is_valid as false if the
> fw_node is NULL. But it should not be NULL at the first place.
>
I didn't have the time to go digging into things, but the following
macro looked odd:
#define per_cpu_cacheinfo_idx(cpu, idx) \
(per_cpu_cacheinfo(cpu) + (idx))
Maybe it is just badly named, but is this getting the per_cpu_cacheinfo
and then incrementing intentionally, or is it meant to get the
per_cpu_cacheinfo of cpu + idx?
On Thu, Jun 30, 2022 at 07:20:04PM +0000, [email protected] wrote:
>
>
> On 30/06/2022 18:35, Sudeep Holla wrote:
> > On Thu, Jun 30, 2022 at 04:37:50PM +0000, [email protected] wrote:
> >> On 30/06/2022 11:39, Sudeep Holla wrote:
> >>>
> >>> I can't think of any reason for that to happen unless detect_cache_attributes
> >>> is failing from init_cpu_topology and we are ignoring that.
> >>>
> >>> Are all RISC-V platforms failing on -next or is it just this platform ?
> >>
> >> I don't know. I only have SoCs with this core complex & one that does not
> >> work with upstream. I can try my other board with this SoC - but I am on
> >> leave at the moment w/ a computer or internet during the day so it may be
> >> a few days before I can try it.
> >>
> >
> > Sure, no worries.
> >
> >> However, Niklas Cassel has tried to use the Canaan K210 on next-20220630
> >> but had issues with RCU stalling:
> >> https://lore.kernel.org/linux-riscv/Yr3PKR0Uj1bE5Y6O@x1-carbon/T/#m52016996fcf5fa0501066d73352ed8e806803e06
> >> Not going to claim any relation, but that's minus 1 to the platforms that
> >> can be used to test this on upstream RISC-V.
> >>
> >
> > Ah OK, will check and ask full logs to see if there is any relation.
> >
> >>> We may have to try with some logs in detect_cache_attributes,
> >>> last_level_cache_is_valid and last_level_cache_is_shared to check where it
> >>> is going wrong.
> >>>
> >>> It must be crashing in smp_callin->update_siblings_masks->last_level_cache_is_shared
>
>
> So, looks like there's a problem in cache_leaves_are_shared() which is hit
> by the above path. Both of the if clauses are false, and the function falls
> through to return sib_leaf->fw_token == this_leaf->fw_token;
Both if() clauses failing is expected, and executing the statement
return sib_leaf->fw_token == this_leaf->fw_token;
is correct.
> Both sib_leaf & this_leaf seem to be null.
>
But this is wrong, as last_level_cache_is_shared() checks
last_level_cache_is_valid(), which must return false if the fw_token is
NULL. So we must not hit the above return statement with a NULL fw_token.
> static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
> struct cacheinfo *sib_leaf)
> {
> /*
> * For non DT/ACPI systems, assume unique level 1 caches,
> * system-wide shared caches for all other levels. This will be used
> * only if arch specific code has not populated shared_cpu_map
> */
> if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
> return !(this_leaf->level == 1);
>
> if ((sib_leaf->attributes & CACHE_ID) &&
> (this_leaf->attributes & CACHE_ID))
> return sib_leaf->id == this_leaf->id;
>
> return sib_leaf->fw_token == this_leaf->fw_token;
> }
>
> Any ideas what to look at next?
I wonder how we did not get last_level_cache_is_valid() returning false if
the fw_token is NULL. But it should not be NULL in the first place.
--
Regards,
Sudeep
On Thu, Jun 30, 2022 at 08:13:55PM +0000, [email protected] wrote:
>
> I didn't have the time to go digging into things, but the following
> macro looked odd:
> #define per_cpu_cacheinfo_idx(cpu, idx) \
> (per_cpu_cacheinfo(cpu) + (idx))
> Maybe it is just badly named, but is this getting the per_cpu_cacheinfo
> and then incrementing intentionally, or is it meant to get the
> per_cpu_cacheinfo of cpu + idx?
OK, basically per_cpu_cacheinfo(cpu) gets the information for a cpu, while
per_cpu_cacheinfo_idx(cpu, idx) fetches the information for a given index
within that cpu. So we are incrementing the pointer by the index, not the
cpu. These work just fine on arm64 platforms.
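In other words, something like this minimal sketch; the info_list member
name comes from the macro definitions in cacheinfo.c, so take it as context
rather than new API:

        /*
         * per_cpu_cacheinfo(cpu) expands to this CPU's info_list, an array
         * of struct cacheinfo leaves, so adding idx steps through that
         * CPU's own leaves and never into another CPU's data:
         */
        struct cacheinfo *base = per_cpu_cacheinfo(cpu);          /* leaf 0 */
        struct cacheinfo *leaf = per_cpu_cacheinfo_idx(cpu, idx); /* &base[idx] */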
Not sure if the compiler is optimising something, as I still can't
understand how we can end up with a valid llc but fw_token as NULL.
--
Regards,
Sudeep
On 30/06/2022 21:21, Sudeep Holla wrote:
> On Thu, Jun 30, 2022 at 08:13:55PM +0000, [email protected] wrote:
>>
>> I didn't have the time to go digging into things, but the following
>> macro looked odd:
>> #define per_cpu_cacheinfo_idx(cpu, idx) \
>> (per_cpu_cacheinfo(cpu) + (idx))
>> Maybe it is just badly named, but is this getting the per_cpu_cacheinfo
>> and then incrementing intentionally, or is it meant to get the
>> per_cpu_cacheinfo of cpu + idx?
>
> OK, basically per_cpu_cacheinfo(cpu) get the information for a cpu
> while per_cpu_cacheinfo_idx(cpu, idx) will fetch the information for a
> given cpu and given index within the cpu. So we are incrementing the
> pointer by the index. These work just fine on arm64 platform.
Right, that's what I figured but wanted to be sure.
>
> Not sure if compiler is optimising something as I still can't understand
> how we can end up with valid llc but fw_token as NULL.
See idk about that. The following fails to boot.
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 167abfa6f37d..9d45c37fb004 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -36,6 +36,8 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
struct cacheinfo *sib_leaf)
{
+ if (!this_leaf || !sib_leaf)
+ return false;
/*
* For non DT/ACPI systems, assume unique level 1 caches,
* system-wide shared caches for all other levels. This will be used
@@ -74,8 +76,12 @@ bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
llc_x = per_cpu_cacheinfo_idx(cpu_x, cache_leaves(cpu_x) - 1);
llc_y = per_cpu_cacheinfo_idx(cpu_y, cache_leaves(cpu_y) - 1);
+ if (!llc_x || !llc_y){
+ printk("llc was null\n");
+ return false;
+ }
- return cache_leaves_are_shared(llc_x, llc_y);
+ return false; //cache_leaves_are_shared(llc_x, llc_y);
}
#ifdef CONFIG_OF
and this boots:
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 167abfa6f37d..01900908fe31 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -36,6 +36,8 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
struct cacheinfo *sib_leaf)
{
+ if (!this_leaf || !sib_leaf)
+ return false;
/*
* For non DT/ACPI systems, assume unique level 1 caches,
* system-wide shared caches for all other levels. This will be used
@@ -75,7 +77,7 @@ bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
llc_x = per_cpu_cacheinfo_idx(cpu_x, cache_leaves(cpu_x) - 1);
llc_y = per_cpu_cacheinfo_idx(cpu_y, cache_leaves(cpu_y) - 1);
- return cache_leaves_are_shared(llc_x, llc_y);
+ return false; //cache_leaves_are_shared(llc_x, llc_y);
}
#ifdef CONFIG_OF
On Thu, Jun 30, 2022 at 10:07:49PM +0000, [email protected] wrote:
>
>
> On 30/06/2022 21:21, Sudeep Holla wrote:
> > On Thu, Jun 30, 2022 at 08:13:55PM +0000, [email protected] wrote:
> >>
> >> I didn't have the time to go digging into things, but the following
> >> macro looked odd:
> >> #define per_cpu_cacheinfo_idx(cpu, idx) \
> >> (per_cpu_cacheinfo(cpu) + (idx))
> >> Maybe it is just badly named, but is this getting the per_cpu_cacheinfo
> >> and then incrementing intentionally, or is it meant to get the
> >> per_cpu_cacheinfo of cpu + idx?
> >
> > OK, basically per_cpu_cacheinfo(cpu) get the information for a cpu
> > while per_cpu_cacheinfo_idx(cpu, idx) will fetch the information for a
> > given cpu and given index within the cpu. So we are incrementing the
> > pointer by the index. These work just fine on arm64 platform.
>
> Right, that's what I figured but wanted to be sure.
>
OK
> >
> > Not sure if compiler is optimising something as I still can't understand
> > how we can end up with valid llc but fw_token as NULL.
> See idk about that. The following fails to boot.
> index 167abfa6f37d..9d45c37fb004 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -36,6 +36,8 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
> static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
> struct cacheinfo *sib_leaf)
> {
> + if (!this_leaf || !sib_leaf)
> + return false;
Did you hit this?
> /*
> * For non DT/ACPI systems, assume unique level 1 caches,
> * system-wide shared caches for all other levels. This will be used
> @@ -74,8 +76,12 @@ bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
>
> llc_x = per_cpu_cacheinfo_idx(cpu_x, cache_leaves(cpu_x) - 1);
> llc_y = per_cpu_cacheinfo_idx(cpu_y, cache_leaves(cpu_y) - 1);
> + if (!llc_x || !llc_y){
> + printk("llc was null\n");
Or this?
> + return false;
> + }
>
> - return cache_leaves_are_shared(llc_x, llc_y);
> + return false; //cache_leaves_are_shared(llc_x, llc_y);
Even the above change fails to boot? Coz you are always returning false
here too.
> }
>
> #ifdef CONFIG_OF
>
> and this boots:
>
> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index 167abfa6f37d..01900908fe31 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -36,6 +36,8 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
> static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
> struct cacheinfo *sib_leaf)
> {
> + if (!this_leaf || !sib_leaf)
> + return false;
> /*
> * For non DT/ACPI systems, assume unique level 1 caches,
> * system-wide shared caches for all other levels. This will be used
> @@ -75,7 +77,7 @@ bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
> llc_x = per_cpu_cacheinfo_idx(cpu_x, cache_leaves(cpu_x) - 1);
> llc_y = per_cpu_cacheinfo_idx(cpu_y, cache_leaves(cpu_y) - 1);
>
You are just missing the checks for llc_x and llc_y, and it works, which
means llc_x and llc_y are where things are going wrong.
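One guess at how that could happen, purely an assumption on my part and not
something proven in this thread: if info_list was never allocated for a
secondary CPU, per_cpu_cacheinfo(cpu) is NULL and the index arithmetic then
produces a bogus but non-NULL pointer:

        /*
         * Assumption: per_cpu_cacheinfo(cpu) is NULL because cacheinfo was
         * never populated for this CPU before update_siblings_masks() ran.
         */
        struct cacheinfo *llc;

        llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
        /*
         * llc is now NULL plus a small offset: non-NULL, so checks like
         * "if (!llc)" never fire, and the crash only happens on the first
         * dereference, e.g. llc->fw_token in cache_leaves_are_shared().
         */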
--
Regards,
Sudeep
On 01/07/2022 12:11, Sudeep Holla wrote:
>
> On Thu, Jun 30, 2022 at 10:07:49PM +0000, [email protected] wrote:
>>
>>
>> On 30/06/2022 21:21, Sudeep Holla wrote:
>>> On Thu, Jun 30, 2022 at 08:13:55PM +0000, [email protected] wrote:
>>>>
>>>> I didn't have the time to go digging into things, but the following
>>>> macro looked odd:
>>>> #define per_cpu_cacheinfo_idx(cpu, idx) \
>>>> (per_cpu_cacheinfo(cpu) + (idx))
>>>> Maybe it is just badly named, but is this getting the per_cpu_cacheinfo
>>>> and then incrementing intentionally, or is it meant to get the
>>>> per_cpu_cacheinfo of cpu + idx?
>>>
>>> OK, basically per_cpu_cacheinfo(cpu) get the information for a cpu
>>> while per_cpu_cacheinfo_idx(cpu, idx) will fetch the information for a
>>> given cpu and given index within the cpu. So we are incrementing the
>>> pointer by the index. These work just fine on arm64 platform.
>>
>> Right, that's what I figured but wanted to be sure.
>>
>
> OK
>
>>>
>>> Not sure if compiler is optimising something as I still can't understand
>>> how we can end up with valid llc but fw_token as NULL.
>> See idk about that. The following fails to boot.
>> index 167abfa6f37d..9d45c37fb004 100644
>> --- a/drivers/base/cacheinfo.c
>> +++ b/drivers/base/cacheinfo.c
>> @@ -36,6 +36,8 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
>> static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
>> struct cacheinfo *sib_leaf)
>> {
>> + if (!this_leaf || !sib_leaf)
>> + return false;
>
> Did you hit this ?
Ah, I forgot to remove this. I had added it (since I knew a return value
of false was correct) but it was still failing to boot. It was my step
prior to adding the if(!llc...
>
>> /*
>> * For non DT/ACPI systems, assume unique level 1 caches,
>> * system-wide shared caches for all other levels. This will be used
>> @@ -74,8 +76,12 @@ bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
>>
>> llc_x = per_cpu_cacheinfo_idx(cpu_x, cache_leaves(cpu_x) - 1);
>> llc_y = per_cpu_cacheinfo_idx(cpu_y, cache_leaves(cpu_y) - 1);
>> + if (!llc_x || !llc_y){
>> + printk("llc was null\n");
>
> Or this ?
This printk never prints out.
>
>> + return false;
>> + }
>>
>> - return cache_leaves_are_shared(llc_x, llc_y);
>> + return false; //cache_leaves_are_shared(llc_x, llc_y);
>
> Even the above change fails to boot ? Coz you are always returning false here
> too.
Correct, fails to boot.
>
>> }
>>
>> #ifdef CONFIG_OF
>>
>> and this boots:
>>
>> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
>> index 167abfa6f37d..01900908fe31 100644
>> --- a/drivers/base/cacheinfo.c
>> +++ b/drivers/base/cacheinfo.c
>> @@ -36,6 +36,8 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
>> static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
>> struct cacheinfo *sib_leaf)
>> {
>> + if (!this_leaf || !sib_leaf)
>> + return false;
>> /*
>> * For non DT/ACPI systems, assume unique level 1 caches,
>> * system-wide shared caches for all other levels. This will be used
>> @@ -75,7 +77,7 @@ bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
>> llc_x = per_cpu_cacheinfo_idx(cpu_x, cache_leaves(cpu_x) - 1);
>> llc_y = per_cpu_cacheinfo_idx(cpu_y, cache_leaves(cpu_y) - 1);
>>
>
> You are just missing the checks for llc_x and llc_y and it works which
> means llc_x and llc_y is where things are going wrong.
>
Sounds about right to me.