LinuxLists.cc - [PATCH v4 0/6] arch_topology: Build cacheinfo from primary CPU

2023-01-04 18:42:18

Subject: [PATCH v4 0/6] arch_topology: Build cacheinfo from primary CPU

v2:
- Applied renaming/formatting comments from v1.
- Check CACHE_TYPE_VALID flag in pppt.c.
v3:
- Applied Sudeep's suggestions (for patch 5/5):
- Renaming allocate_cache_info() -> fecth_cache_info()
- Updated error message
- Extract an inline allocate_cache_info() function
- Re-run checkpatch with --strict option
v4:
- Remove RISC-V's implementation of init_cache_level() as not
necessary.
- Add patch: 'cacheinfo: Check 'cache-unified' property to count
cache leaves' to increase the number of leaves at a cache level
when no cache-size property is found.
- In cacheinfo: Use RISC-V's init_cache_level() [...],
make 'levels', 'leaves' and 'level' unsigned int to match
of_property_read_u32()'s parameters signedness.

Note:
This patchset requires the following patch to be applied first in
order to avoid the same bug described in the commit message:
https://lore.kernel.org/all/[email protected]/

[1] and [2] build the CPU topology from the cacheinfo information for
both DT/ACPI based systems and remove (struct cpu_topology).llc_id
which was used by ACPI only.

Creating the cacheinfo for secondary CPUs is done during early boot.
Preemption and interrupts are disabled at this stage. On PREEMPT_RT
kernels, allocating memory (and parsing the PPTT table for ACPI based
systems) triggers a:
'BUG: sleeping function called from invalid context' [4]

To prevent this bug, allocate the cacheinfo from the primary CPU when
preemption and interrupts are enabled and before booting secondary
CPUs. The cache levels/leaves are computed from DT/ACPI PPTT information
only, without relying on the arm64 CLIDR_EL1 register.
If no cache information is found in the DT/ACPI PPTT, then fallback
to the current state, triggering [4] on PREEMPT_RT kernels.

Patches to update the arm64 device trees that have incomplete cacheinfo
(mostly for missing the 'cache-level' or 'cache-unified' property)
have been sent at [3].

Tested platforms:
- ACPI + PPTT: Ampere Altra, Ampere eMAG, Cavium ThunderX2,
Kunpeng 920, Juno-r2
- DT: rb5, db845c, Juno-r2

[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/
[3] https://lore.kernel.org/all/[email protected]/
[4] On an Ampere Altra, with PREEMPT_RT kernel based on v6.0.0-rc4:

[ 7.560791] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
[ 7.560794] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
[ 7.560796] preempt_count: 1, expected: 0
[ 7.560797] RCU nest depth: 1, expected: 1
[ 7.560799] 3 locks held by swapper/111/0:
[ 7.560800] #0: ffff403e406cae98 (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
[ 7.560811] #1: ffffc5f8ed09f8e8 (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
[ 7.560820] #2: ffff403f400b4fd8 (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
[ 7.560824] irq event stamp: 0
[ 7.560825] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 7.560827] hardirqs last disabled at (0): [<ffffc5f8e9f7d594>] copy_process+0x5dc/0x1ab8
[ 7.560830] softirqs last enabled at (0): [<ffffc5f8e9f7d594>] copy_process+0x5dc/0x1ab8
[ 7.560833] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 7.560834] Preemption disabled at:
[ 7.560835] [<ffffc5f8e9fd3c28>] migrate_enable+0x30/0x130
[ 7.560838] CPU: 111 PID: 0 Comm: swapper/111 Tainted: G W 6.0.0-rc4-[...]
[ 7.560841] Call trace:
[...]
[ 7.560870] __kmalloc+0xbc/0x1e8
[ 7.560873] detect_cache_attributes+0x2d4/0x5f0
[ 7.560876] update_siblings_masks+0x30/0x368
[ 7.560880] store_cpu_topology+0x78/0xb8
[ 7.560883] secondary_start_kernel+0xd0/0x198
[ 7.560885] __secondary_switched+0xb0/0xb4

Pierre Gondois (6):
cacheinfo: Use RISC-V's init_cache_level() as generic OF
implementation
cacheinfo: Return error code in init_of_cache_level()
cacheinfo: Check 'cache-unified' property to count cache leaves
ACPI: PPTT: Remove acpi_find_cache_levels()
ACPI: PPTT: Update acpi_find_last_cache_level() to
acpi_get_cache_info()
arch_topology: Build cacheinfo from primary CPU

arch/arm64/kernel/cacheinfo.c | 11 ++-
arch/riscv/kernel/cacheinfo.c | 42 -----------
drivers/acpi/pptt.c | 93 +++++++++++++----------
drivers/base/arch_topology.c | 12 ++-
drivers/base/cacheinfo.c | 134 +++++++++++++++++++++++++++++-----
include/linux/cacheinfo.h | 11 ++-
6 files changed, 196 insertions(+), 107 deletions(-)

--
2.25.1

2023-01-04 18:42:53

by Pierre Gondois

[permalink] [raw]

Subject: [PATCH v4 2/6] cacheinfo: Return error code in init_of_cache_level()

Make init_of_cache_level() return an error code when the cache
information parsing fails to help detecting missing information.

init_of_cache_level() is only called for riscv. Returning an error
code instead of 0 will prevent detect_cache_attributes() to allocate
memory if an incomplete DT is parsed.

Signed-off-by: Pierre Gondois <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
Acked-by: Palmer Dabbelt <[email protected]>
---
drivers/base/cacheinfo.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 5ae58e32a77f..3e965ef6ee42 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -246,11 +246,11 @@ int init_of_cache_level(unsigned int cpu)
of_node_put(prev);
prev = np;
if (!of_device_is_compatible(np, "cache"))
- break;
+ goto err_out;
if (of_property_read_u32(np, "cache-level", &level))
- break;
+ goto err_out;
if (level <= levels)
- break;
+ goto err_out;
if (of_property_read_bool(np, "cache-size"))
++leaves;
if (of_property_read_bool(np, "i-cache-size"))
@@ -265,6 +265,10 @@ int init_of_cache_level(unsigned int cpu)
this_cpu_ci->num_leaves = leaves;

return 0;
+
+err_out:
+ of_node_put(np);
+ return -EINVAL;
}

#else
--
2.25.1

2023-01-04 18:52:17

by Pierre Gondois

[permalink] [raw]

Subject: [PATCH v4 5/6] ACPI: PPTT: Update acpi_find_last_cache_level() to acpi_get_cache_info()

acpi_find_last_cache_level() allows to find the last level of cache
for a given CPU. The function is only called on arm64 ACPI based
platforms to check for cache information that would be missing in
the CLIDR_EL1 register.
To allow populating (struct cpu_cacheinfo).num_leaves by only parsing
a PPTT, update acpi_find_last_cache_level() to get the 'split_levels',
i.e. the number of cache levels being split in data/instruction
caches.

It is assumed that there will not be data/instruction caches above a
unified cache.
If a split level consist of one data cache and no instruction cache
(or opposite), then the missing cache will still be populated
by default with minimal cache information, and maximal cpumask
(all non-existing caches have the same fw_token).

Suggested-by: Jeremy Linton <[email protected]>
Signed-off-by: Pierre Gondois <[email protected]>
Reviewed-by: Jeremy Linton <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Acked-by: Palmer Dabbelt <[email protected]>
---
arch/arm64/kernel/cacheinfo.c | 11 +++--
drivers/acpi/pptt.c | 76 +++++++++++++++++++++++------------
include/linux/cacheinfo.h | 9 +++--
3 files changed, 63 insertions(+), 33 deletions(-)

diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
index 97c42be71338..36c3b07cdf2d 100644
--- a/arch/arm64/kernel/cacheinfo.c
+++ b/arch/arm64/kernel/cacheinfo.c
@@ -46,7 +46,7 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
int init_cache_level(unsigned int cpu)
{
unsigned int ctype, level, leaves;
- int fw_level;
+ int fw_level, ret;
struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);

for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
@@ -59,10 +59,13 @@ int init_cache_level(unsigned int cpu)
leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
}

- if (acpi_disabled)
+ if (acpi_disabled) {
fw_level = of_find_last_cache_level(cpu);
- else
- fw_level = acpi_find_last_cache_level(cpu);
+ } else {
+ ret = acpi_get_cache_info(cpu, &fw_level, NULL);
+ if (ret < 0)
+ return ret;
+ }

if (fw_level < 0)
return fw_level;
diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 97c1d33822d1..10975bb603fb 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -81,6 +81,7 @@ static inline bool acpi_pptt_match_type(int table_type, int type)
* acpi_pptt_walk_cache() - Attempt to find the requested acpi_pptt_cache
* @table_hdr: Pointer to the head of the PPTT table
* @local_level: passed res reflects this cache level
+ * @split_levels: Number of split cache levels (data/instruction).
* @res: cache resource in the PPTT we want to walk
* @found: returns a pointer to the requested level if found
* @level: the requested cache level
@@ -100,6 +101,7 @@ static inline bool acpi_pptt_match_type(int table_type, int type)
*/
static unsigned int acpi_pptt_walk_cache(struct acpi_table_header *table_hdr,
unsigned int local_level,
+ unsigned int *split_levels,
struct acpi_subtable_header *res,
struct acpi_pptt_cache **found,
unsigned int level, int type)
@@ -113,8 +115,17 @@ static unsigned int acpi_pptt_walk_cache(struct acpi_table_header *table_hdr,
while (cache) {
local_level++;

+ if (!(cache->flags & ACPI_PPTT_CACHE_TYPE_VALID)) {
+ cache = fetch_pptt_cache(table_hdr, cache->next_level_of_cache);
+ continue;
+ }
+
+ if (split_levels &&
+ (acpi_pptt_match_type(cache->attributes, ACPI_PPTT_CACHE_TYPE_DATA) ||
+ acpi_pptt_match_type(cache->attributes, ACPI_PPTT_CACHE_TYPE_INSTR)))
+ *split_levels = local_level;
+
if (local_level == level &&
- cache->flags & ACPI_PPTT_CACHE_TYPE_VALID &&
acpi_pptt_match_type(cache->attributes, type)) {
if (*found != NULL && cache != *found)
pr_warn("Found duplicate cache level/type unable to determine uniqueness\n");
@@ -135,8 +146,8 @@ static unsigned int acpi_pptt_walk_cache(struct acpi_table_header *table_hdr,
static struct acpi_pptt_cache *
acpi_find_cache_level(struct acpi_table_header *table_hdr,
struct acpi_pptt_processor *cpu_node,
- unsigned int *starting_level, unsigned int level,
- int type)
+ unsigned int *starting_level, unsigned int *split_levels,
+ unsigned int level, int type)
{
struct acpi_subtable_header *res;
unsigned int number_of_levels = *starting_level;
@@ -149,7 +160,8 @@ acpi_find_cache_level(struct acpi_table_header *table_hdr,
resource++;

local_level = acpi_pptt_walk_cache(table_hdr, *starting_level,
- res, &ret, level, type);
+ split_levels, res, &ret,
+ level, type);
/*
* we are looking for the max depth. Since its potentially
* possible for a given node to have resources with differing
@@ -165,29 +177,29 @@ acpi_find_cache_level(struct acpi_table_header *table_hdr,
}

/**
- * acpi_count_levels() - Given a PPTT table, and a CPU node, count the caches
+ * acpi_count_levels() - Given a PPTT table, and a CPU node, count the cache
+ * levels and split cache levels (data/instruction).
* @table_hdr: Pointer to the head of the PPTT table
* @cpu_node: processor node we wish to count caches for
+ * @levels: Number of levels if success.
+ * @split_levels: Number of split cache levels (data/instruction) if
+ * success. Can by NULL.
*
* Given a processor node containing a processing unit, walk into it and count
* how many levels exist solely for it, and then walk up each level until we hit
* the root node (ignore the package level because it may be possible to have
- * caches that exist across packages). Count the number of cache levels that
- * exist at each level on the way up.
- *
- * Return: Total number of levels found.
+ * caches that exist across packages). Count the number of cache levels and
+ * split cache levels (data/instruction) that exist at each level on the way
+ * up.
*/
-static int acpi_count_levels(struct acpi_table_header *table_hdr,
- struct acpi_pptt_processor *cpu_node)
+static void acpi_count_levels(struct acpi_table_header *table_hdr,
+ struct acpi_pptt_processor *cpu_node,
+ unsigned int *levels, unsigned int *split_levels)
{
- int total_levels = 0;
-
do {
- acpi_find_cache_level(table_hdr, cpu_node, &total_levels, 0, 0);
+ acpi_find_cache_level(table_hdr, cpu_node, levels, split_levels, 0, 0);
cpu_node = fetch_pptt_node(table_hdr, cpu_node->parent);
} while (cpu_node);
-
- return total_levels;
}

/**
@@ -321,7 +333,7 @@ static struct acpi_pptt_cache *acpi_find_cache_node(struct acpi_table_header *ta

while (cpu_node && !found) {
found = acpi_find_cache_level(table_hdr, cpu_node,
- &total_levels, level, acpi_type);
+ &total_levels, NULL, level, acpi_type);
*node = cpu_node;
cpu_node = fetch_pptt_node(table_hdr, cpu_node->parent);
}
@@ -589,36 +601,48 @@ static int check_acpi_cpu_flag(unsigned int cpu, int rev, u32 flag)
}

/**
- * acpi_find_last_cache_level() - Determines the number of cache levels for a PE
+ * acpi_get_cache_info() - Determine the number of cache levels and
+ * split cache levels (data/instruction) and for a PE.
* @cpu: Kernel logical CPU number
+ * @levels: Number of levels if success.
+ * @split_levels: Number of levels being split (i.e. data/instruction)
+ * if success. Can by NULL.
*
* Given a logical CPU number, returns the number of levels of cache represented
* in the PPTT. Errors caused by lack of a PPTT table, or otherwise, return 0
* indicating we didn't find any cache levels.
*
- * Return: Cache levels visible to this core.
+ * Return: -ENOENT if no PPTT table or no PPTT processor struct found.
+ * 0 on success.
*/
-int acpi_find_last_cache_level(unsigned int cpu)
+int acpi_get_cache_info(unsigned int cpu, unsigned int *levels,
+ unsigned int *split_levels)
{
struct acpi_pptt_processor *cpu_node;
struct acpi_table_header *table;
- int number_of_levels = 0;
u32 acpi_cpu_id;

+ *levels = 0;
+ if (split_levels)
+ *split_levels = 0;
+
table = acpi_get_pptt();
if (!table)
return -ENOENT;

- pr_debug("Cache Setup find last level CPU=%d\n", cpu);
+ pr_debug("Cache Setup: find cache levels for CPU=%d\n", cpu);

acpi_cpu_id = get_acpi_id_for_cpu(cpu);
cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
- if (cpu_node)
- number_of_levels = acpi_count_levels(table, cpu_node);
+ if (!cpu_node)
+ return -ENOENT;

- pr_debug("Cache Setup find last level level=%d\n", number_of_levels);
+ acpi_count_levels(table, cpu_node, levels, split_levels);

- return number_of_levels;
+ pr_debug("Cache Setup: last_level=%d split_levels=%d\n",
+ *levels, split_levels ? *split_levels : -1);
+
+ return 0;
}

/**
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index ff0328f3fbb0..00d8e7f9d1c6 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -88,19 +88,22 @@ bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
int detect_cache_attributes(unsigned int cpu);
#ifndef CONFIG_ACPI_PPTT
/*
- * acpi_find_last_cache_level is only called on ACPI enabled
+ * acpi_get_cache_info() is only called on ACPI enabled
* platforms using the PPTT for topology. This means that if
* the platform supports other firmware configuration methods
* we need to stub out the call when ACPI is disabled.
* ACPI enabled platforms not using PPTT won't be making calls
* to this function so we need not worry about them.
*/
-static inline int acpi_find_last_cache_level(unsigned int cpu)
+static inline
+int acpi_get_cache_info(unsigned int cpu,
+ unsigned int *levels, unsigned int *split_levels)
{
return 0;
}
#else
-int acpi_find_last_cache_level(unsigned int cpu);
+int acpi_get_cache_info(unsigned int cpu,
+ unsigned int *levels, unsigned int *split_levels);
#endif

const struct attribute_group *cache_get_priv_group(struct cacheinfo *this_leaf);
--
2.25.1

2023-01-04 18:53:01

by Pierre Gondois

[permalink] [raw]

Subject: [PATCH v4 1/6] cacheinfo: Use RISC-V's init_cache_level() as generic OF implementation

RISC-V's implementation of init_of_cache_level() is following
the Devicetree Specification v0.3 regarding caches, cf.:
- s3.7.3 'Internal (L1) Cache Properties'
- s3.8 'Multi-level and Shared Cache Nodes'

Allow reusing the implementation by moving it.

Also make 'levels', 'leaves' and 'level' unsigned int.

Signed-off-by: Pierre Gondois <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
Acked-by: Palmer Dabbelt <[email protected]>
---
arch/riscv/kernel/cacheinfo.c | 39 +------------------------------
drivers/base/cacheinfo.c | 44 +++++++++++++++++++++++++++++++++++
include/linux/cacheinfo.h | 1 +
3 files changed, 46 insertions(+), 38 deletions(-)

diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
index 90deabfe63ea..440a3df5944c 100644
--- a/arch/riscv/kernel/cacheinfo.c
+++ b/arch/riscv/kernel/cacheinfo.c
@@ -115,44 +115,7 @@ static void fill_cacheinfo(struct cacheinfo **this_leaf,

int init_cache_level(unsigned int cpu)
{
- struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
- struct device_node *np = of_cpu_device_node_get(cpu);
- struct device_node *prev = NULL;
- int levels = 0, leaves = 0, level;
-
- if (of_property_read_bool(np, "cache-size"))
- ++leaves;
- if (of_property_read_bool(np, "i-cache-size"))
- ++leaves;
- if (of_property_read_bool(np, "d-cache-size"))
- ++leaves;
- if (leaves > 0)
- levels = 1;
-
- prev = np;
- while ((np = of_find_next_cache_node(np))) {
- of_node_put(prev);
- prev = np;
- if (!of_device_is_compatible(np, "cache"))
- break;
- if (of_property_read_u32(np, "cache-level", &level))
- break;
- if (level <= levels)
- break;
- if (of_property_read_bool(np, "cache-size"))
- ++leaves;
- if (of_property_read_bool(np, "i-cache-size"))
- ++leaves;
- if (of_property_read_bool(np, "d-cache-size"))
- ++leaves;
- levels = level;
- }
-
- of_node_put(np);
- this_cpu_ci->num_levels = levels;
- this_cpu_ci->num_leaves = leaves;
-
- return 0;
+ return init_of_cache_level(cpu);
}

int populate_cache_leaves(unsigned int cpu)
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 4b5cd08c5a65..5ae58e32a77f 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -224,8 +224,52 @@ static int cache_setup_of_node(unsigned int cpu)

return 0;
}
+
+int init_of_cache_level(unsigned int cpu)
+{
+ struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+ struct device_node *np = of_cpu_device_node_get(cpu);
+ struct device_node *prev = NULL;
+ unsigned int levels = 0, leaves = 0, level;
+
+ if (of_property_read_bool(np, "cache-size"))
+ ++leaves;
+ if (of_property_read_bool(np, "i-cache-size"))
+ ++leaves;
+ if (of_property_read_bool(np, "d-cache-size"))
+ ++leaves;
+ if (leaves > 0)
+ levels = 1;
+
+ prev = np;
+ while ((np = of_find_next_cache_node(np))) {
+ of_node_put(prev);
+ prev = np;
+ if (!of_device_is_compatible(np, "cache"))
+ break;
+ if (of_property_read_u32(np, "cache-level", &level))
+ break;
+ if (level <= levels)
+ break;
+ if (of_property_read_bool(np, "cache-size"))
+ ++leaves;
+ if (of_property_read_bool(np, "i-cache-size"))
+ ++leaves;
+ if (of_property_read_bool(np, "d-cache-size"))
+ ++leaves;
+ levels = level;
+ }
+
+ of_node_put(np);
+ this_cpu_ci->num_levels = levels;
+ this_cpu_ci->num_leaves = leaves;
+
+ return 0;
+}
+
#else
static inline int cache_setup_of_node(unsigned int cpu) { return 0; }
+int init_of_cache_level(unsigned int cpu) { return 0; }
#endif

int __weak cache_setup_acpi(unsigned int cpu)
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 00b7a6ae8617..ff0328f3fbb0 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -80,6 +80,7 @@ struct cpu_cacheinfo {

struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu);
int init_cache_level(unsigned int cpu);
+int init_of_cache_level(unsigned int cpu);
int populate_cache_leaves(unsigned int cpu);
int cache_setup_acpi(unsigned int cpu);
bool last_level_cache_is_valid(unsigned int cpu);
--
2.25.1

2023-01-04 18:57:30

by Pierre Gondois

[permalink] [raw]

Subject: [PATCH v4 3/6] cacheinfo: Check 'cache-unified' property to count cache leaves

The DeviceTree Specification v0.3 specifies that the cache node
'[d-|i-|]cache-size' property is required. The 'cache-unified'
property is specifies whether the cache level is separate
or unified.

If the cache-size property is missing, no cache leaves is accounted.
This can lead to a 'BUG: KASAN: slab-out-of-bounds' [1] bug.

Check 'cache-unified' property and always account for at least
one cache leave when parsing the device tree.

[1] https://lore.kernel.org/all/[email protected]/

Reported-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Pierre Gondois <[email protected]>
---
drivers/base/cacheinfo.c | 37 ++++++++++++++++++++++++++-----------
1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 3e965ef6ee42..339cd1e3d5af 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -225,12 +225,9 @@ static int cache_setup_of_node(unsigned int cpu)
return 0;
}

-int init_of_cache_level(unsigned int cpu)
+static int of_count_cache_leaves(struct device_node *np)
{
- struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
- struct device_node *np = of_cpu_device_node_get(cpu);
- struct device_node *prev = NULL;
- unsigned int levels = 0, leaves = 0, level;
+ unsigned int leaves = 0;

if (of_property_read_bool(np, "cache-size"))
++leaves;
@@ -238,6 +235,28 @@ int init_of_cache_level(unsigned int cpu)
++leaves;
if (of_property_read_bool(np, "d-cache-size"))
++leaves;
+
+ if (!leaves) {
+ /* The '[i-|d-|]cache-size' property is required, but
+ * if absent, fallback on the 'cache-unified' property.
+ */
+ if (of_property_read_bool(np, "cache-unified"))
+ return 1;
+ else
+ return 2;
+ }
+
+ return leaves;
+}
+
+int init_of_cache_level(unsigned int cpu)
+{
+ struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+ struct device_node *np = of_cpu_device_node_get(cpu);
+ struct device_node *prev = NULL;
+ unsigned int levels = 0, leaves, level;
+
+ leaves = of_count_cache_leaves(np);
if (leaves > 0)
levels = 1;

@@ -251,12 +270,8 @@ int init_of_cache_level(unsigned int cpu)
goto err_out;
if (level <= levels)
goto err_out;
- if (of_property_read_bool(np, "cache-size"))
- ++leaves;
- if (of_property_read_bool(np, "i-cache-size"))
- ++leaves;
- if (of_property_read_bool(np, "d-cache-size"))
- ++leaves;
+
+ leaves += of_count_cache_leaves(np);
levels = level;
}

--
2.25.1

2023-01-04 19:26:16

by Pierre Gondois

[permalink] [raw]

Subject: [PATCH v4 6/6] arch_topology: Build cacheinfo from primary CPU

commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection
in the CPU hotplug path")
adds a call to detect_cache_attributes() to populate the cacheinfo
before updating the siblings mask. detect_cache_attributes() allocates
memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT
kernels, on secondary CPUs, this triggers a:
'BUG: sleeping function called from invalid context' [1]
as the code is executed with preemption and interrupts disabled.

The primary CPU was previously storing the cache information using
the now removed (struct cpu_topology).llc_id:
commit 5b8dc787ce4a ("arch_topology: Drop LLC identifier stash from
the CPU topology")

allocate_cache_info() tries to build the cacheinfo from the primary
CPU prior secondary CPUs boot, if the DT/ACPI description
contains cache information.
If allocate_cache_info() fails, then fallback to the current state
for the cacheinfo allocation. [1] will be triggered in such case.

When unplugging a CPU, the cacheinfo memory cannot be freed. If it
was, then the memory would be allocated early by the re-plugged
CPU and would trigger [1].

Note that populate_cache_leaves() might be called multiple times
due to populate_leaves being moved up. This is required since
detect_cache_attributes() might be called with per_cpu_cacheinfo(cpu)
being allocated but not populated.

[1]:
[ 7.560791] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
[ 7.560794] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
[ 7.560796] preempt_count: 1, expected: 0
[ 7.560797] RCU nest depth: 1, expected: 1
[ 7.560799] 3 locks held by swapper/111/0:
[ 7.560800] #0: ffff403e406cae98 (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
[ 7.560811] #1: ffffc5f8ed09f8e8 (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
[ 7.560820] #2: ffff403f400b4fd8 (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
[ 7.560824] irq event stamp: 0
[ 7.560825] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 7.560827] hardirqs last disabled at (0): [<ffffc5f8e9f7d594>] copy_process+0x5dc/0x1ab8
[ 7.560830] softirqs last enabled at (0): [<ffffc5f8e9f7d594>] copy_process+0x5dc/0x1ab8
[ 7.560833] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 7.560834] Preemption disabled at:
[ 7.560835] [<ffffc5f8e9fd3c28>] migrate_enable+0x30/0x130
[ 7.560838] CPU: 111 PID: 0 Comm: swapper/111 Tainted: G W 6.0.0-rc4-rt6-[...]
[ 7.560841] Call trace:
[...]
[ 7.560870] __kmalloc+0xbc/0x1e8
[ 7.560873] detect_cache_attributes+0x2d4/0x5f0
[ 7.560876] update_siblings_masks+0x30/0x368
[ 7.560880] store_cpu_topology+0x78/0xb8
[ 7.560883] secondary_start_kernel+0xd0/0x198
[ 7.560885] __secondary_switched+0xb0/0xb4

Signed-off-by: Pierre Gondois <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
Acked-by: Palmer Dabbelt <[email protected]>
---
arch/riscv/kernel/cacheinfo.c | 5 ---
drivers/base/arch_topology.c | 12 +++++-
drivers/base/cacheinfo.c | 71 ++++++++++++++++++++++++++---------
include/linux/cacheinfo.h | 1 +
4 files changed, 65 insertions(+), 24 deletions(-)

diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
index 440a3df5944c..3a13113f1b29 100644
--- a/arch/riscv/kernel/cacheinfo.c
+++ b/arch/riscv/kernel/cacheinfo.c
@@ -113,11 +113,6 @@ static void fill_cacheinfo(struct cacheinfo **this_leaf,
}
}

-int init_cache_level(unsigned int cpu)
-{
- return init_of_cache_level(cpu);
-}
-
int populate_cache_leaves(unsigned int cpu)
{
struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index e7d6e6657ffa..b1c1dd38ab01 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -736,7 +736,7 @@ void update_siblings_masks(unsigned int cpuid)

ret = detect_cache_attributes(cpuid);
if (ret && ret != -ENOENT)
- pr_info("Early cacheinfo failed, ret = %d\n", ret);
+ pr_info("Early cacheinfo allocation failed, ret = %d\n", ret);

/* update core and thread sibling masks */
for_each_online_cpu(cpu) {
@@ -825,7 +825,7 @@ __weak int __init parse_acpi_topology(void)
#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
void __init init_cpu_topology(void)
{
- int ret;
+ int cpu, ret;

reset_cpu_topology();
ret = parse_acpi_topology();
@@ -840,6 +840,14 @@ void __init init_cpu_topology(void)
reset_cpu_topology();
return;
}
+
+ for_each_possible_cpu(cpu) {
+ ret = fetch_cache_info(cpu);
+ if (ret) {
+ pr_err("Early cacheinfo failed, ret = %d\n", ret);
+ break;
+ }
+ }
}

void store_cpu_topology(unsigned int cpuid)
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 339cd1e3d5af..b57fbd0d7114 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -386,10 +386,6 @@ static void free_cache_attributes(unsigned int cpu)
return;

cache_shared_cpu_map_remove(cpu);
-
- kfree(per_cpu_cacheinfo(cpu));
- per_cpu_cacheinfo(cpu) = NULL;
- cache_leaves(cpu) = 0;
}

int __weak init_cache_level(unsigned int cpu)
@@ -402,29 +398,71 @@ int __weak populate_cache_leaves(unsigned int cpu)
return -ENOENT;
}

+static inline
+int allocate_cache_info(int cpu)
+{
+ per_cpu_cacheinfo(cpu) = kcalloc(cache_leaves(cpu),
+ sizeof(struct cacheinfo), GFP_ATOMIC);
+ if (!per_cpu_cacheinfo(cpu)) {
+ cache_leaves(cpu) = 0;
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+int fetch_cache_info(unsigned int cpu)
+{
+ struct cpu_cacheinfo *this_cpu_ci;
+ unsigned int levels, split_levels;
+ int ret;
+
+ if (acpi_disabled) {
+ ret = init_of_cache_level(cpu);
+ if (ret < 0)
+ return ret;
+ } else {
+ ret = acpi_get_cache_info(cpu, &levels, &split_levels);
+ if (ret < 0)
+ return ret;
+
+ this_cpu_ci = get_cpu_cacheinfo(cpu);
+ this_cpu_ci->num_levels = levels;
+ /*
+ * This assumes that:
+ * - there cannot be any split caches (data/instruction)
+ * above a unified cache
+ * - data/instruction caches come by pair
+ */
+ this_cpu_ci->num_leaves = levels + split_levels;
+ }
+ if (!cache_leaves(cpu))
+ return -ENOENT;
+
+ return allocate_cache_info(cpu);
+}
+
int detect_cache_attributes(unsigned int cpu)
{
int ret;

- /* Since early detection of the cacheinfo is allowed via this
- * function and this also gets called as CPU hotplug callbacks via
- * cacheinfo_cpu_online, the initialisation can be skipped and only
- * CPU maps can be updated as the CPU online status would be update
- * if called via cacheinfo_cpu_online path.
+ /* Since early initialization/allocation of the cacheinfo is allowed
+ * via fetch_cache_info() and this also gets called as CPU hotplug
+ * callbacks via cacheinfo_cpu_online, the init/alloc can be skipped
+ * as it will happen only once (the cacheinfo memory is never freed).
+ * Just populate the cacheinfo.
*/
if (per_cpu_cacheinfo(cpu))
- goto update_cpu_map;
+ goto populate_leaves;

if (init_cache_level(cpu) || !cache_leaves(cpu))
return -ENOENT;

- per_cpu_cacheinfo(cpu) = kcalloc(cache_leaves(cpu),
- sizeof(struct cacheinfo), GFP_ATOMIC);
- if (per_cpu_cacheinfo(cpu) == NULL) {
- cache_leaves(cpu) = 0;
- return -ENOMEM;
- }
+ ret = allocate_cache_info(cpu);
+ if (ret)
+ return ret;

+populate_leaves:
/*
* populate_cache_leaves() may completely setup the cache leaves and
* shared_cpu_map or it may leave it partially setup.
@@ -433,7 +471,6 @@ int detect_cache_attributes(unsigned int cpu)
if (ret)
goto free_ci;

-update_cpu_map:
/*
* For systems using DT for cache hierarchy, fw_token
* and shared_cpu_map will be set up here only if they are
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 00d8e7f9d1c6..dfef57077cd0 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -85,6 +85,7 @@ int populate_cache_leaves(unsigned int cpu);
int cache_setup_acpi(unsigned int cpu);
bool last_level_cache_is_valid(unsigned int cpu);
bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
+int fetch_cache_info(unsigned int cpu);
int detect_cache_attributes(unsigned int cpu);
#ifndef CONFIG_ACPI_PPTT
/*
--
2.25.1

2023-01-09 16:19:38

by Krzysztof Kozlowski

[permalink] [raw]

Subject: Re: [PATCH v4 3/6] cacheinfo: Check 'cache-unified' property to count cache leaves

On 04/01/2023 19:30, Pierre Gondois wrote:
> The DeviceTree Specification v0.3 specifies that the cache node
> '[d-|i-|]cache-size' property is required. The 'cache-unified'
> property is specifies whether the cache level is separate
> or unified.
>
> If the cache-size property is missing, no cache leaves is accounted.
> This can lead to a 'BUG: KASAN: slab-out-of-bounds' [1] bug.
>
> Check 'cache-unified' property and always account for at least
> one cache leave when parsing the device tree.
>
> [1] https://lore.kernel.org/all/[email protected]/
>
> Reported-by: Krzysztof Kozlowski <[email protected]>
> Signed-off-by: Pierre Gondois <[email protected]>

Thanks, solves the issue:

Tested-by: Krzysztof Kozlowski <[email protected]>

Best regards,
Krzysztof

2023-01-18 13:10:24

by Sudeep Holla

[permalink] [raw]

Subject: Re: [PATCH v4 0/6] arch_topology: Build cacheinfo from primary CPU

On Wed, 4 Jan 2023 19:30:23 +0100, Pierre Gondois wrote:
> v2:
> - Applied renaming/formatting comments from v1.
> - Check CACHE_TYPE_VALID flag in pppt.c.
> v3:
> - Applied Sudeep's suggestions (for patch 5/5):
> - Renaming allocate_cache_info() -> fecth_cache_info()
> - Updated error message
> - Extract an inline allocate_cache_info() function
> - Re-run checkpatch with --strict option
> v4:
> - Remove RISC-V's implementation of init_cache_level() as not
> necessary.
> - Add patch: 'cacheinfo: Check 'cache-unified' property to count
> cache leaves' to increase the number of leaves at a cache level
> when no cache-size property is found.
> - In cacheinfo: Use RISC-V's init_cache_level() [...],
> make 'levels', 'leaves' and 'level' unsigned int to match
> of_property_read_u32()'s parameters signedness.
>
> [...]

Applied to sudeep.holla/linux (for-next/cacheinfo), thanks!

[1/6] cacheinfo: Use RISC-V's init_cache_level() as generic OF implementation
https://git.kernel.org/sudeep.holla/c/c3719bd9eeb2
[2/6] cacheinfo: Return error code in init_of_cache_level()
https://git.kernel.org/sudeep.holla/c/8844c3df001b
[3/6] cacheinfo: Check 'cache-unified' property to count cache leaves
https://git.kernel.org/sudeep.holla/c/de0df442ee49
[4/6] ACPI: PPTT: Remove acpi_find_cache_levels()
https://git.kernel.org/sudeep.holla/c/fa4d566a605b
[5/6] ACPI: PPTT: Update acpi_find_last_cache_level() to acpi_get_cache_info()
https://git.kernel.org/sudeep.holla/c/bd500361a937
[6/6] arch_topology: Build cacheinfo from primary CPU
https://git.kernel.org/sudeep.holla/c/5944ce092b97

--
Regards,
Sudeep

2023-01-24 13:53:36

by Geert Uytterhoeven

[permalink] [raw]

Subject: Re: [PATCH v4 6/6] arch_topology: Build cacheinfo from primary CPU

Hi Pierre,

On Wed, Jan 4, 2023 at 7:35 PM Pierre Gondois <[email protected]> wrote:
> commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection
> in the CPU hotplug path")
> adds a call to detect_cache_attributes() to populate the cacheinfo
> before updating the siblings mask. detect_cache_attributes() allocates
> memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT
> kernels, on secondary CPUs, this triggers a:
> 'BUG: sleeping function called from invalid context' [1]
> as the code is executed with preemption and interrupts disabled.
>
> The primary CPU was previously storing the cache information using
> the now removed (struct cpu_topology).llc_id:
> commit 5b8dc787ce4a ("arch_topology: Drop LLC identifier stash from
> the CPU topology")
>
> allocate_cache_info() tries to build the cacheinfo from the primary
> CPU prior secondary CPUs boot, if the DT/ACPI description
> contains cache information.
> If allocate_cache_info() fails, then fallback to the current state
> for the cacheinfo allocation. [1] will be triggered in such case.
>
> When unplugging a CPU, the cacheinfo memory cannot be freed. If it
> was, then the memory would be allocated early by the re-plugged
> CPU and would trigger [1].
>
> Note that populate_cache_leaves() might be called multiple times
> due to populate_leaves being moved up. This is required since
> detect_cache_attributes() might be called with per_cpu_cacheinfo(cpu)
> being allocated but not populated.
>
> [1]:
> [ 7.560791] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
> [ 7.560794] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
> [ 7.560796] preempt_count: 1, expected: 0
> [ 7.560797] RCU nest depth: 1, expected: 1
> [ 7.560799] 3 locks held by swapper/111/0:
> [ 7.560800] #0: ffff403e406cae98 (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
> [ 7.560811] #1: ffffc5f8ed09f8e8 (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
> [ 7.560820] #2: ffff403f400b4fd8 (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
> [ 7.560824] irq event stamp: 0
> [ 7.560825] hardirqs last enabled at (0): [<0000000000000000>] 0x0
> [ 7.560827] hardirqs last disabled at (0): [<ffffc5f8e9f7d594>] copy_process+0x5dc/0x1ab8
> [ 7.560830] softirqs last enabled at (0): [<ffffc5f8e9f7d594>] copy_process+0x5dc/0x1ab8
> [ 7.560833] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [ 7.560834] Preemption disabled at:
> [ 7.560835] [<ffffc5f8e9fd3c28>] migrate_enable+0x30/0x130
> [ 7.560838] CPU: 111 PID: 0 Comm: swapper/111 Tainted: G W 6.0.0-rc4-rt6-[...]
> [ 7.560841] Call trace:
> [...]
> [ 7.560870] __kmalloc+0xbc/0x1e8
> [ 7.560873] detect_cache_attributes+0x2d4/0x5f0
> [ 7.560876] update_siblings_masks+0x30/0x368
> [ 7.560880] store_cpu_topology+0x78/0xb8
> [ 7.560883] secondary_start_kernel+0xd0/0x198
> [ 7.560885] __secondary_switched+0xb0/0xb4
>
> Signed-off-by: Pierre Gondois <[email protected]>
> Reviewed-by: Sudeep Holla <[email protected]>
> Acked-by: Palmer Dabbelt <[email protected]>

Thanks for your patch, which is now commit 5944ce092b97caed
("arch_topology: Build cacheinfo from primary CPU") in
driver-core/driver-core-next.

> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -736,7 +736,7 @@ void update_siblings_masks(unsigned int cpuid)
>
> ret = detect_cache_attributes(cpuid);
> if (ret && ret != -ENOENT)
> - pr_info("Early cacheinfo failed, ret = %d\n", ret);
> + pr_info("Early cacheinfo allocation failed, ret = %d\n", ret);
>
> /* update core and thread sibling masks */
> for_each_online_cpu(cpu) {
> @@ -825,7 +825,7 @@ __weak int __init parse_acpi_topology(void)
> #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
> void __init init_cpu_topology(void)
> {
> - int ret;
> + int cpu, ret;
>
> reset_cpu_topology();
> ret = parse_acpi_topology();
> @@ -840,6 +840,14 @@ void __init init_cpu_topology(void)
> reset_cpu_topology();
> return;
> }
> +
> + for_each_possible_cpu(cpu) {
> + ret = fetch_cache_info(cpu);
> + if (ret) {
> + pr_err("Early cacheinfo failed, ret = %d\n", ret);

This triggers on all my RV64 platforms (K210, Icicle, Starlight,
RZ/Five).

This seems to be a respin of
https://lore.kernel.org/all/CAMuHMdUBZ791fxCPkKQ6HCwLE4GJB2S35QC=SQ+X8w5Q4C_70g@mail.gmail.com
which had the same issue.

> + break;
> + }
> + }
> }
>

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2023-01-24 14:05:18

by Sudeep Holla

[permalink] [raw]

Subject: Re: [PATCH v4 6/6] arch_topology: Build cacheinfo from primary CPU

On Tue, Jan 24, 2023 at 02:50:16PM +0100, Geert Uytterhoeven wrote:
> Hi Pierre,
>

[...]

> > @@ -840,6 +840,14 @@ void __init init_cpu_topology(void)
> > reset_cpu_topology();
> > return;
> > }
> > +
> > + for_each_possible_cpu(cpu) {
> > + ret = fetch_cache_info(cpu);
> > + if (ret) {
> > + pr_err("Early cacheinfo failed, ret = %d\n", ret);
>
> This triggers on all my RV64 platforms (K210, Icicle, Starlight,
> RZ/Five).
>
> This seems to be a respin of
> https://lore.kernel.org/all/CAMuHMdUBZ791fxCPkKQ6HCwLE4GJB2S35QC=SQ+X8w5Q4C_70g@mail.gmail.com
> which had the same issue.
>

I need to recollect my memories reading all the thread, but even after the
fixes there were few platforms that failed with so early allocation but were
fine with initcalls. Are these such platforms or am I mixing up things here ?
Do you still see all the cacheinfo in the sysfs with initcalls that happen
later in the boot ?

Conor might help me remember the details.

--
Regards,
Sudeep

2023-01-24 14:40:04

by Conor Dooley

[permalink] [raw]

Subject: Re: [PATCH v4 6/6] arch_topology: Build cacheinfo from primary CPU

On Tue, Jan 24, 2023 at 02:04:20PM +0000, Sudeep Holla wrote:
> Conor might help me remember the details.

And I can't shirk either since you know I just replied to Pierre!

> On Tue, Jan 24, 2023 at 02:50:16PM +0100, Geert Uytterhoeven wrote:

> > > @@ -840,6 +840,14 @@ void __init init_cpu_topology(void)
> > > reset_cpu_topology();
> > > return;
> > > }
> > > +
> > > + for_each_possible_cpu(cpu) {
> > > + ret = fetch_cache_info(cpu);
> > > + if (ret) {
> > > + pr_err("Early cacheinfo failed, ret = %d\n", ret);
> >
> > This triggers on all my RV64 platforms (K210, Icicle, Starlight,
> > RZ/Five).
> >
> > This seems to be a respin of
> > https://lore.kernel.org/all/CAMuHMdUBZ791fxCPkKQ6HCwLE4GJB2S35QC=SQ+X8w5Q4C_70g@mail.gmail.com
> > which had the same issue.
> >
>
> I need to recollect my memories reading all the thread, but even after the
> fixes there were few platforms that failed with so early allocation but were
> fine with initcalls. Are these such platforms or am I mixing up things here ?
> Do you still see all the cacheinfo in the sysfs with initcalls that happen
> later in the boot ?

IIRC that stuff was failing back then because riscv calls
init_cpu_topology() far sooner in boot than arm64 does, and therefore
caused allocation failures. You made that warning go away in the below
patch by moving detect_cache_attributes() to update_siblings_masks(),
which both arches call later during boot IIRC:
https://lore.kernel.org/all/[email protected]

Pierre's patch has added fetch_cache_info() to the problematic
init_cpu_topology() which is called before we can actually do any
allocation in smp_prepare_boot_cpu() or something like that.

That's what I get for only reviewing the patch that was specifically for
riscv, and not the rest of the series... D'oh.

This actually came up a few weeks ago, although I kinda considered the
reason it was triggered to be a bit bogus there, since that dmips property
is not (yet?) a valid property on RISC-V. The patch for that is here:
https://patchwork.kernel.org/project/linux-riscv/patch/[email protected]/
I tried it on a PolarFire SoC (unfortunately not an Icicle, I just went
and bricked mine an hour ago) & it should be a fix for this problem too.

My suggested commit message for that is somewhat prophetic now that I
look back at it:
https://lore.kernel.org/all/Y7V4byskevAWKM3G@spud/

I'll ping Palmer about that patch I guess...

Cheers,
Conor.

Attachments:

(No filename) (2.53 kB)
signature.asc (228.00 B)
Download all attachments

2023-01-24 14:49:17

by Sudeep Holla

[permalink] [raw]

Subject: Re: [PATCH v4 6/6] arch_topology: Build cacheinfo from primary CPU

On Tue, Jan 24, 2023 at 02:39:19PM +0000, Conor Dooley wrote:
> On Tue, Jan 24, 2023 at 02:04:20PM +0000, Sudeep Holla wrote:
> > Conor might help me remember the details.
>
> And I can't shirk either since you know I just replied to Pierre!
>
> > On Tue, Jan 24, 2023 at 02:50:16PM +0100, Geert Uytterhoeven wrote:
>
> > > > @@ -840,6 +840,14 @@ void __init init_cpu_topology(void)
> > > > reset_cpu_topology();
> > > > return;
> > > > }
> > > > +
> > > > + for_each_possible_cpu(cpu) {
> > > > + ret = fetch_cache_info(cpu);
> > > > + if (ret) {
> > > > + pr_err("Early cacheinfo failed, ret = %d\n", ret);
> > >
> > > This triggers on all my RV64 platforms (K210, Icicle, Starlight,
> > > RZ/Five).
> > >
> > > This seems to be a respin of
> > > https://lore.kernel.org/all/CAMuHMdUBZ791fxCPkKQ6HCwLE4GJB2S35QC=SQ+X8w5Q4C_70g@mail.gmail.com
> > > which had the same issue.
> > >
> >
> > I need to recollect my memories reading all the thread, but even after the
> > fixes there were few platforms that failed with so early allocation but were
> > fine with initcalls. Are these such platforms or am I mixing up things here ?
> > Do you still see all the cacheinfo in the sysfs with initcalls that happen
> > later in the boot ?
>
> IIRC that stuff was failing back then because riscv calls
> init_cpu_topology() far sooner in boot than arm64 does, and therefore
> caused allocation failures. You made that warning go away in the below
> patch by moving detect_cache_attributes() to update_siblings_masks(),
> which both arches call later during boot IIRC:
> https://lore.kernel.org/all/[email protected]
>
> Pierre's patch has added fetch_cache_info() to the problematic
> init_cpu_topology() which is called before we can actually do any
> allocation in smp_prepare_boot_cpu() or something like that.
>
> That's what I get for only reviewing the patch that was specifically for
> riscv, and not the rest of the series... D'oh.
>
> This actually came up a few weeks ago, although I kinda considered the
> reason it was triggered to be a bit bogus there, since that dmips property
> is not (yet?) a valid property on RISC-V. The patch for that is here:
> https://patchwork.kernel.org/project/linux-riscv/patch/[email protected]/
> I tried it on a PolarFire SoC (unfortunately not an Icicle, I just went
> and bricked mine an hour ago) & it should be a fix for this problem too.
>
> My suggested commit message for that is somewhat prophetic now that I
> look back at it:
> https://lore.kernel.org/all/Y7V4byskevAWKM3G@spud/
>

Ah, that thread, I remember that :).

I still need to understand how this is related to memory allocation.
Pierre was suggesting(in private) if we need to keep fetch_cache_info()
arch specific but I really don't want to go down that patch until I
understand and there is no other option.

Thanks for your time. I will try to recall boot flow and see if I can
gather the reasoning for the seen behaviour.

--
Regards,
Sudeep

2023-01-24 14:55:56

by Sudeep Holla

[permalink] [raw]

Subject: Re: [PATCH v4 6/6] arch_topology: Build cacheinfo from primary CPU

On Tue, Jan 24, 2023 at 02:48:39PM +0000, Sudeep Holla wrote:
>
> Ah, that thread, I remember that :).
>
> I still need to understand how this is related to memory allocation.
> Pierre was suggesting(in private) if we need to keep fetch_cache_info()
> arch specific but I really don't want to go down that patch until I
> understand and there is no other option.
>
> Thanks for your time. I will try to recall boot flow and see if I can
> gather the reasoning for the seen behaviour.
>

OK, I must have atleast taken a look at the code before I replied.
smp_prepare_boot_cpu() is called quite early before page_alloc_init()
and mm_init()(in init_main.c) while smp_prepare_cpus() get called
quite late from kernel_init->kernel_init_freeable().

Geert, can you please try with the patch Conor pointed out and see if
that helps to fix the allocation failures[1]

--
Regards,
Sudeep

[1] https://lore.kernel.org/all/[email protected]/

2023-01-25 14:54:32

by Sudeep Holla

[permalink] [raw]

Subject: Re: [PATCH v4 6/6] arch_topology: Build cacheinfo from primary CPU

Hi Geert,

On Tue, Jan 24, 2023 at 02:55:41PM +0000, Sudeep Holla wrote:
>
> Geert, can you please try with the patch Conor pointed out and see if
> that helps to fix the allocation failures[1]
>

Sorry for the nag, but did you get the chance to test -next with [1]
and see if it fixes the cacheinfo memory failure you were observing ?

--
Regards,
Sudeep

[1] https://lore.kernel.org/all/[email protected]/

2023-01-25 15:28:25

by Geert Uytterhoeven

[permalink] [raw]

Subject: Re: [PATCH v4 6/6] arch_topology: Build cacheinfo from primary CPU

Hi Sudeep,

On Wed, Jan 25, 2023 at 3:54 PM Sudeep Holla <[email protected]> wrote:
> On Tue, Jan 24, 2023 at 02:55:41PM +0000, Sudeep Holla wrote:
> > Geert, can you please try with the patch Conor pointed out and see if
> > that helps to fix the allocation failures[1]
> >
>
> Sorry for the nag, but did you get the chance to test -next with [1]
> and see if it fixes the cacheinfo memory failure you were observing ?

> [1] https://lore.kernel.org/all/[email protected]/

After applying that patch, the issue is gone.
Thanks, sending my Tb!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2023-01-25 16:36:39

by Sudeep Holla

[permalink] [raw]

Subject: Re: [PATCH v4 6/6] arch_topology: Build cacheinfo from primary CPU

On Wed, Jan 25, 2023 at 04:28:04PM +0100, Geert Uytterhoeven wrote:
> Hi Sudeep,
>
> On Wed, Jan 25, 2023 at 3:54 PM Sudeep Holla <[email protected]> wrote:
> > On Tue, Jan 24, 2023 at 02:55:41PM +0000, Sudeep Holla wrote:
> > > Geert, can you please try with the patch Conor pointed out and see if
> > > that helps to fix the allocation failures[1]
> > >
> >
> > Sorry for the nag, but did you get the chance to test -next with [1]
> > and see if it fixes the cacheinfo memory failure you were observing ?
>
> > [1] https://lore.kernel.org/all/[email protected]/
>
> After applying that patch, the issue is gone.
> Thanks, sending my Tb!
>

Thanks for testing and sorry for wrong information in my PR. I had mixed
up things and assumed it was tested. I usually leave it in -next for a week
before sending PR but rushed things once I was back from the holidays, won't
repeat hopefully ????.

--
Regards,
Sudeep

2023-01-26 08:55:27

by Pierre Gondois

[permalink] [raw]

Subject: Re: [PATCH v4 6/6] arch_topology: Build cacheinfo from primary CPU

Hello Geert, Sudeep,

On 1/25/23 16:28, Geert Uytterhoeven wrote:
> Hi Sudeep,
>
> On Wed, Jan 25, 2023 at 3:54 PM Sudeep Holla <[email protected]> wrote:
>> On Tue, Jan 24, 2023 at 02:55:41PM +0000, Sudeep Holla wrote:
>>> Geert, can you please try with the patch Conor pointed out and see if
>>> that helps to fix the allocation failures[1]
>>>
>>
>> Sorry for the nag, but did you get the chance to test -next with [1]
>> and see if it fixes the cacheinfo memory failure you were observing ?
>
>> [1] https://lore.kernel.org/all/[email protected]/
>
> After applying that patch, the issue is gone.
> Thanks, sending my Tb!
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds

Happy to hear the issue disappears with the patch, thanks for solving the issue,
Regards,
Pierre