2023-04-13 09:17:13

by Pierre Gondois

[permalink] [raw]
Subject: [PATCH v3 0/4] cacheinfo: Correctly fallback to using clidr_el1's information

v3:
cacheinfo: Check sib_leaf in cache_leaves_are_shared():
- Reformulate commit message
- Fix rebase issue and move '&&' condition which was in the last patch
to this patch.
cacheinfo: Add use_arch[|_cache]_info field/function:
- Put the function declaration in one line.
arch_topology: Remove early cacheinfo error message:
- New patch.

v2:
cacheinfo: Check sib_leaf in cache_leaves_are_shared()
- Reformulate commit message
- Add 'Fixes: f16d1becf96f ("cacheinfo: Use cache identifiers [...]'
cacheinfo: Check cache properties are present in DT
- Use of_property_present()
- Add 'Reported-by: Alexandre Ghiti <[email protected]>'
cacheinfo: Add use_arch[|_cache]_info field/function:
- Make use_arch_cache_info() a static inline function

The cache information can be extracted from either a Device
Tree (DT), the PPTT ACPI table, or arch registers (clidr_el1
for arm64).

When the DT is used but no cache properties are advertised,
the current code doesn't correctly fallback to using arch information.

Correct this. Also use the assumption that L1 data/instruction caches
are private and L2/higher caches are shared when the cache information
is coming form clidr_el1.

As suggested by Alexandre, this serie should ideally go to 6.3 fixes.

Pierre Gondois (4):
cacheinfo: Check sib_leaf in cache_leaves_are_shared()
cacheinfo: Check cache properties are present in DT
arch_topology: Remove early cacheinfo error message
cacheinfo: Add use_arch[|_cache]_info field/function

drivers/base/arch_topology.c | 4 +--
drivers/base/cacheinfo.c | 48 +++++++++++++++++++++++++++++++++---
include/linux/cacheinfo.h | 10 ++++++++
3 files changed, 55 insertions(+), 7 deletions(-)

--
2.25.1


2023-04-13 09:17:20

by Pierre Gondois

[permalink] [raw]
Subject: [PATCH v3 1/4] cacheinfo: Check sib_leaf in cache_leaves_are_shared()

If there is no ACPI/DT information, it is assumed that L1 caches
are private and L2 (and higher) caches are shared. A cache is
'shared' between two CPUs if it is accessible from these two
CPUs.

Each CPU owns a representation (i.e. has a dedicated cacheinfo struct)
of the caches it has access to. cache_leaves_are_shared() tries to
identify whether two representations are designating the same actual
cache.

In cache_leaves_are_shared(), if 'this_leaf' is a L2 cache (or higher)
and 'sib_leaf' is a L1 cache, the caches are detected as shared as
only this_leaf's cache level is checked.
This is leads to setting sib_leaf as being shared with another CPU,
which is incorrect as this is a L1 cache.

Check 'sib_leaf->level'. Also update the comment as the function is
called when populating 'shared_cpu_map'.

Fixes: f16d1becf96f ("cacheinfo: Use cache identifiers to check if the caches are shared if available")
Signed-off-by: Pierre Gondois <[email protected]>
---
drivers/base/cacheinfo.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index f3903d002819..c5d2293ac2a6 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -38,11 +38,10 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
{
/*
* For non DT/ACPI systems, assume unique level 1 caches,
- * system-wide shared caches for all other levels. This will be used
- * only if arch specific code has not populated shared_cpu_map
+ * system-wide shared caches for all other levels.
*/
if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
- return !(this_leaf->level == 1);
+ return (this_leaf->level != 1) && (sib_leaf->level != 1);

if ((sib_leaf->attributes & CACHE_ID) &&
(this_leaf->attributes & CACHE_ID))
--
2.25.1

2023-04-13 09:17:25

by Pierre Gondois

[permalink] [raw]
Subject: [PATCH v3 4/4] cacheinfo: Add use_arch[|_cache]_info field/function

The cache information can be extracted from either a Device
Tree (DT), the PPTT ACPI table, or arch registers (clidr_el1
for arm64).

The clidr_el1 register is used only if DT/ACPI information is not
available. It does not states how caches are shared among CPUs.

Add a use_arch_cache_info field/function to identify when the
DT/ACPI doesn't provide cache information. Use this information
to assume L1 caches are privates and L2 and higher are shared among
all CPUs.

Signed-off-by: Pierre Gondois <[email protected]>
---
drivers/base/cacheinfo.c | 13 ++++++++++++-
include/linux/cacheinfo.h | 10 ++++++++++
2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 06de9a468958..49dbb4357911 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -40,7 +40,8 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
* For non DT/ACPI systems, assume unique level 1 caches,
* system-wide shared caches for all other levels.
*/
- if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
+ if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)) ||
+ this_leaf->use_arch_info)
return (this_leaf->level != 1) && (sib_leaf->level != 1);

if ((sib_leaf->attributes & CACHE_ID) &&
@@ -349,6 +350,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
struct cacheinfo *this_leaf, *sib_leaf;
unsigned int index, sib_index;
+ bool use_arch_info = false;
int ret = 0;

if (this_cpu_ci->cpu_map_populated)
@@ -361,6 +363,12 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
*/
if (!last_level_cache_is_valid(cpu)) {
ret = cache_setup_properties(cpu);
+ if (ret && use_arch_cache_info()) {
+ // Possibility to rely on arch specific information.
+ use_arch_info = true;
+ ret = 0;
+ }
+
if (ret)
return ret;
}
@@ -370,6 +378,9 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)

this_leaf = per_cpu_cacheinfo_idx(cpu, index);

+ if (use_arch_info)
+ this_leaf->use_arch_info = true;
+
cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
for_each_online_cpu(i) {
struct cpu_cacheinfo *sib_cpu_ci = get_cpu_cacheinfo(i);
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 908e19d17f49..fed675b251a2 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -66,6 +66,7 @@ struct cacheinfo {
#define CACHE_ALLOCATE_POLICY_MASK \
(CACHE_READ_ALLOCATE | CACHE_WRITE_ALLOCATE)
#define CACHE_ID BIT(4)
+ bool use_arch_info;
void *fw_token;
bool disable_sysfs;
void *priv;
@@ -129,4 +130,13 @@ static inline int get_cpu_cacheinfo_id(int cpu, int level)
return -1;
}

+static inline bool use_arch_cache_info(void)
+{
+#if defined(CONFIG_ARM64)
+ return true;
+#else
+ return false;
+#endif
+}
+
#endif /* _LINUX_CACHEINFO_H */
--
2.25.1

2023-04-13 09:17:38

by Pierre Gondois

[permalink] [raw]
Subject: [PATCH v3 3/4] arch_topology: Remove early cacheinfo error message

fetch_cache_info() tries to get the number of cache leaves/levels
for each CPU in order to pre-allocate memory for cacheinfo struct.
Allocating this memory later triggers a:
'BUG: sleeping function called from invalid context'
in PREEMPT_RT kernels.

If there is no cache related information available in DT or ACPI,
fetch_cache_info() fails and an error message is printed:
'Early cacheinfo failed, ret = ...'

Not having cache information should be a valid configuration.
Remove the error message if fetch_cache_info() fails.

Suggested-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
Signed-off-by: Pierre Gondois <[email protected]>
---
drivers/base/arch_topology.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index b1c1dd38ab01..1f071eaede5b 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -843,10 +843,8 @@ void __init init_cpu_topology(void)

for_each_possible_cpu(cpu) {
ret = fetch_cache_info(cpu);
- if (ret) {
- pr_err("Early cacheinfo failed, ret = %d\n", ret);
+ if (ret)
break;
- }
}
}

--
2.25.1

2023-04-13 09:17:45

by Pierre Gondois

[permalink] [raw]
Subject: [PATCH v3 2/4] cacheinfo: Check cache properties are present in DT

If a Device Tree (DT) is used, the presence of cache properties is
assumed. Not finding any is not considered. For arm64 platforms,
cache information can be fetched from the clidr_el1 register.
Checking whether cache information is available in the DT
allows to switch to using clidr_el1.

init_of_cache_level()
\-of_count_cache_leaves()
will assume there a 2 cache leaves (L1 data/instruction caches), which
can be different from clidr_el1 information.

cache_setup_of_node() tries to read cache properties in the DT.
If there are none, this is considered a success. Knowing no
information was available would allow to switch to using clidr_el1.

Fixes: de0df442ee49 ("cacheinfo: Check 'cache-unified' property to count cache leaves")
Reported-by: Alexandre Ghiti <[email protected]>
Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
Signed-off-by: Pierre Gondois <[email protected]>
---
drivers/base/cacheinfo.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index c5d2293ac2a6..06de9a468958 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -78,6 +78,9 @@ bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
}

#ifdef CONFIG_OF
+
+static bool of_check_cache_nodes(struct device_node *np);
+
/* OF properties to query for a given cache type */
struct cache_type_info {
const char *size_prop;
@@ -205,6 +208,11 @@ static int cache_setup_of_node(unsigned int cpu)
return -ENOENT;
}

+ if (!of_check_cache_nodes(np)) {
+ of_node_put(np);
+ return -ENOENT;
+ }
+
prev = np;

while (index < cache_leaves(cpu)) {
@@ -229,6 +237,25 @@ static int cache_setup_of_node(unsigned int cpu)
return 0;
}

+static bool of_check_cache_nodes(struct device_node *np)
+{
+ struct device_node *next;
+
+ if (of_property_present(np, "cache-size") ||
+ of_property_present(np, "i-cache-size") ||
+ of_property_present(np, "d-cache-size") ||
+ of_property_present(np, "cache-unified"))
+ return true;
+
+ next = of_find_next_cache_node(np);
+ if (next) {
+ of_node_put(next);
+ return true;
+ }
+
+ return false;
+}
+
static int of_count_cache_leaves(struct device_node *np)
{
unsigned int leaves = 0;
@@ -260,6 +287,9 @@ int init_of_cache_level(unsigned int cpu)
struct device_node *prev = NULL;
unsigned int levels = 0, leaves, level;

+ if (!of_check_cache_nodes(np))
+ goto err_out;
+
leaves = of_count_cache_leaves(np);
if (leaves > 0)
levels = 1;
--
2.25.1

2023-04-13 09:53:44

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] cacheinfo: Add use_arch[|_cache]_info field/function

On Thu, Apr 13, 2023 at 11:14:34AM +0200, Pierre Gondois wrote:
> The cache information can be extracted from either a Device
> Tree (DT), the PPTT ACPI table, or arch registers (clidr_el1
> for arm64).
>
> The clidr_el1 register is used only if DT/ACPI information is not
> available. It does not states how caches are shared among CPUs.
>
> Add a use_arch_cache_info field/function to identify when the
> DT/ACPI doesn't provide cache information. Use this information
> to assume L1 caches are privates and L2 and higher are shared among
> all CPUs.
>

I have tentatively merged first 3 patches along with Radu's series(waiting
for build tests still before confirming). I am not yet sure on this.

> Signed-off-by: Pierre Gondois <[email protected]>
> ---
> drivers/base/cacheinfo.c | 13 ++++++++++++-
> include/linux/cacheinfo.h | 10 ++++++++++
> 2 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index 06de9a468958..49dbb4357911 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -40,7 +40,8 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
> * For non DT/ACPI systems, assume unique level 1 caches,
> * system-wide shared caches for all other levels.
> */
> - if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
> + if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)) ||
> + this_leaf->use_arch_info)

Can't we just use use_arch_cache_info() here ?

> return (this_leaf->level != 1) && (sib_leaf->level != 1);
>
> if ((sib_leaf->attributes & CACHE_ID) &&
> @@ -349,6 +350,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
> struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> struct cacheinfo *this_leaf, *sib_leaf;
> unsigned int index, sib_index;
> + bool use_arch_info = false;
> int ret = 0;
>
> if (this_cpu_ci->cpu_map_populated)
> @@ -361,6 +363,12 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
> */
> if (!last_level_cache_is_valid(cpu)) {
> ret = cache_setup_properties(cpu);
> + if (ret && use_arch_cache_info()) {
> + // Possibility to rely on arch specific information.
> + use_arch_info = true;
> + ret = 0;
> + }
> +
> if (ret)
> return ret;
> }
> @@ -370,6 +378,9 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
>
> this_leaf = per_cpu_cacheinfo_idx(cpu, index);
>
> + if (use_arch_info)
> + this_leaf->use_arch_info = true;
> +
> cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
> for_each_online_cpu(i) {
> struct cpu_cacheinfo *sib_cpu_ci = get_cpu_cacheinfo(i);
> diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
> index 908e19d17f49..fed675b251a2 100644
> --- a/include/linux/cacheinfo.h
> +++ b/include/linux/cacheinfo.h
> @@ -66,6 +66,7 @@ struct cacheinfo {
> #define CACHE_ALLOCATE_POLICY_MASK \
> (CACHE_READ_ALLOCATE | CACHE_WRITE_ALLOCATE)
> #define CACHE_ID BIT(4)
> + bool use_arch_info;

Do you see the need to stash this value as it is either globally true or
false based on the arch ?

> void *fw_token;
> bool disable_sysfs;
> void *priv;
> @@ -129,4 +130,13 @@ static inline int get_cpu_cacheinfo_id(int cpu, int level)
> return -1;
> }
>
> +static inline bool use_arch_cache_info(void)
> +{
> +#if defined(CONFIG_ARM64)
> + return true;
> +#else
> + return false;
> +#endif
> +}
> +

Can we just have it as:
#ifdef CONFIG_ARM64
#define use_arch_cache_info() (true)
#else
#define use_arch_cache_info() (false)
#endif

> #endif /* _LINUX_CACHEINFO_H */
> --
> 2.25.1
>

--
Regards,
Sudeep

2023-04-13 10:10:44

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v3 3/4] arch_topology: Remove early cacheinfo error message

On Thu, Apr 13, 2023 at 11:14:33AM +0200, Pierre Gondois wrote:
> fetch_cache_info() tries to get the number of cache leaves/levels
> for each CPU in order to pre-allocate memory for cacheinfo struct.
> Allocating this memory later triggers a:
> 'BUG: sleeping function called from invalid context'
> in PREEMPT_RT kernels.
>
> If there is no cache related information available in DT or ACPI,
> fetch_cache_info() fails and an error message is printed:
> 'Early cacheinfo failed, ret = ...'
>
> Not having cache information should be a valid configuration.
> Remove the error message if fetch_cache_info() fails.
>
> Suggested-by: Conor Dooley <[email protected]>

Not that it really matters for suggested-by, and there's no way really
for you to know, but the corporate overlords prefer:
s/[email protected]/[email protected]/

> Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
> Signed-off-by: Pierre Gondois <[email protected]>
> ---
> drivers/base/arch_topology.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index b1c1dd38ab01..1f071eaede5b 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -843,10 +843,8 @@ void __init init_cpu_topology(void)
>
> for_each_possible_cpu(cpu) {
> ret = fetch_cache_info(cpu);
> - if (ret) {
> - pr_err("Early cacheinfo failed, ret = %d\n", ret);

Hmm do you really want to remove the print altogether? This can fail
with -EINVAL and -ENOMEM too, so should we just check for
| if (ret && ret != -ENOENT)
instead, since in the other cases it really did fail?

Cheers,
Conor.


Attachments:
(No filename) (1.70 kB)
signature.asc (235.00 B)
Download all attachments

2023-04-13 10:10:46

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] cacheinfo: Check sib_leaf in cache_leaves_are_shared()

On Thu, Apr 13, 2023 at 11:14:31AM +0200, Pierre Gondois wrote:
> If there is no ACPI/DT information, it is assumed that L1 caches
> are private and L2 (and higher) caches are shared. A cache is
> 'shared' between two CPUs if it is accessible from these two
> CPUs.
>
> Each CPU owns a representation (i.e. has a dedicated cacheinfo struct)
> of the caches it has access to. cache_leaves_are_shared() tries to
> identify whether two representations are designating the same actual
> cache.
>
> In cache_leaves_are_shared(), if 'this_leaf' is a L2 cache (or higher)
> and 'sib_leaf' is a L1 cache, the caches are detected as shared as
> only this_leaf's cache level is checked.
> This is leads to setting sib_leaf as being shared with another CPU,
> which is incorrect as this is a L1 cache.
>
> Check 'sib_leaf->level'. Also update the comment as the function is
> called when populating 'shared_cpu_map'.
>
> Fixes: f16d1becf96f ("cacheinfo: Use cache identifiers to check if the caches are shared if available")

Cool, thanks for the updated commit message. Easier to twig the
rationale this way.
Reviewed-by: Conor Dooley <[email protected]>

Thanks,
Conor.


Attachments:
(No filename) (1.17 kB)
signature.asc (235.00 B)
Download all attachments

2023-04-13 10:10:49

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] cacheinfo: Check cache properties are present in DT

On Thu, Apr 13, 2023 at 11:14:32AM +0200, Pierre Gondois wrote:
> If a Device Tree (DT) is used, the presence of cache properties is
> assumed. Not finding any is not considered. For arm64 platforms,
> cache information can be fetched from the clidr_el1 register.
> Checking whether cache information is available in the DT
> allows to switch to using clidr_el1.
>
> init_of_cache_level()
> \-of_count_cache_leaves()
> will assume there a 2 cache leaves (L1 data/instruction caches), which
> can be different from clidr_el1 information.
>
> cache_setup_of_node() tries to read cache properties in the DT.
> If there are none, this is considered a success. Knowing no
> information was available would allow to switch to using clidr_el1.
>
> Fixes: de0df442ee49 ("cacheinfo: Check 'cache-unified' property to count cache leaves")
> Reported-by: Alexandre Ghiti <[email protected]>
> Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
> Signed-off-by: Pierre Gondois <[email protected]>


Since there's no longer an error printed
Reviewed-by: Conor Dooley <[email protected]>

Thanks for the update Pierre,
Conor.


Attachments:
(No filename) (1.16 kB)
signature.asc (235.00 B)
Download all attachments

2023-04-13 10:18:19

by Pierre Gondois

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] cacheinfo: Add use_arch[|_cache]_info field/function



On 4/13/23 11:49, Sudeep Holla wrote:
> On Thu, Apr 13, 2023 at 11:14:34AM +0200, Pierre Gondois wrote:
>> The cache information can be extracted from either a Device
>> Tree (DT), the PPTT ACPI table, or arch registers (clidr_el1
>> for arm64).
>>
>> The clidr_el1 register is used only if DT/ACPI information is not
>> available. It does not states how caches are shared among CPUs.
>>
>> Add a use_arch_cache_info field/function to identify when the
>> DT/ACPI doesn't provide cache information. Use this information
>> to assume L1 caches are privates and L2 and higher are shared among
>> all CPUs.
>>
>
> I have tentatively merged first 3 patches along with Radu's series(waiting
> for build tests still before confirming). I am not yet sure on this.
>
>> Signed-off-by: Pierre Gondois <[email protected]>
>> ---
>> drivers/base/cacheinfo.c | 13 ++++++++++++-
>> include/linux/cacheinfo.h | 10 ++++++++++
>> 2 files changed, 22 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
>> index 06de9a468958..49dbb4357911 100644
>> --- a/drivers/base/cacheinfo.c
>> +++ b/drivers/base/cacheinfo.c
>> @@ -40,7 +40,8 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
>> * For non DT/ACPI systems, assume unique level 1 caches,
>> * system-wide shared caches for all other levels.
>> */
>> - if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
>> + if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)) ||
>> + this_leaf->use_arch_info)
>
> Can't we just use use_arch_cache_info() here ?

I think that if we use use_arch_cache_info() here, then arm64 platforms
will always return here and never check fw_token/this_leaf->id values.
Indeed, we also need to know that no cache information is available in
DT/ACPI, cf. [1]

>
>> return (this_leaf->level != 1) && (sib_leaf->level != 1);
>>
>> if ((sib_leaf->attributes & CACHE_ID) &&
>> @@ -349,6 +350,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
>> struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>> struct cacheinfo *this_leaf, *sib_leaf;
>> unsigned int index, sib_index;
>> + bool use_arch_info = false;
>> int ret = 0;
>>
>> if (this_cpu_ci->cpu_map_populated)
>> @@ -361,6 +363,12 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
>> */
>> if (!last_level_cache_is_valid(cpu)) {
>> ret = cache_setup_properties(cpu);
>> + if (ret && use_arch_cache_info()) {
>> + // Possibility to rely on arch specific information.

[1]

>> + use_arch_info = true;
>> + ret = 0;
>> + }
>> +
>> if (ret)
>> return ret;
>> }
>> @@ -370,6 +378,9 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
>>
>> this_leaf = per_cpu_cacheinfo_idx(cpu, index);
>>
>> + if (use_arch_info)
>> + this_leaf->use_arch_info = true;
>> +
>> cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
>> for_each_online_cpu(i) {
>> struct cpu_cacheinfo *sib_cpu_ci = get_cpu_cacheinfo(i);
>> diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
>> index 908e19d17f49..fed675b251a2 100644
>> --- a/include/linux/cacheinfo.h
>> +++ b/include/linux/cacheinfo.h
>> @@ -66,6 +66,7 @@ struct cacheinfo {
>> #define CACHE_ALLOCATE_POLICY_MASK \
>> (CACHE_READ_ALLOCATE | CACHE_WRITE_ALLOCATE)
>> #define CACHE_ID BIT(4)
>> + bool use_arch_info;
>
> Do you see the need to stash this value as it is either globally true or
> false based on the arch ?

A static variable could be used instead and set to true if we fail to fetch the
cache information from DT/ACPI, cf. [1]. The only possible transition for this
variable would be from false->true. I'll check if this works like this.

>
>> void *fw_token;
>> bool disable_sysfs;
>> void *priv;
>> @@ -129,4 +130,13 @@ static inline int get_cpu_cacheinfo_id(int cpu, int level)
>> return -1;
>> }
>>
>> +static inline bool use_arch_cache_info(void)
>> +{
>> +#if defined(CONFIG_ARM64)
>> + return true;
>> +#else
>> + return false;
>> +#endif
>> +}
>> +
>
> Can we just have it as:
> #ifdef CONFIG_ARM64
> #define use_arch_cache_info() (true)
> #else
> #define use_arch_cache_info() (false)
> #endif

Yes sure, I'll post a v4 with this along Conor's requested change.

>
>> #endif /* _LINUX_CACHEINFO_H */
>> --
>> 2.25.1
>>
>

2023-04-13 10:26:08

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] cacheinfo: Add use_arch[|_cache]_info field/function

On Thu, Apr 13, 2023 at 12:17:09PM +0200, Pierre Gondois wrote:
>
>
> On 4/13/23 11:49, Sudeep Holla wrote:
> > On Thu, Apr 13, 2023 at 11:14:34AM +0200, Pierre Gondois wrote:
> > > The cache information can be extracted from either a Device
> > > Tree (DT), the PPTT ACPI table, or arch registers (clidr_el1
> > > for arm64).
> > >
> > > The clidr_el1 register is used only if DT/ACPI information is not
> > > available. It does not states how caches are shared among CPUs.
> > >
> > > Add a use_arch_cache_info field/function to identify when the
> > > DT/ACPI doesn't provide cache information. Use this information
> > > to assume L1 caches are privates and L2 and higher are shared among
> > > all CPUs.
> > >
> >
> > I have tentatively merged first 3 patches along with Radu's series(waiting
> > for build tests still before confirming). I am not yet sure on this.
> >
> > > Signed-off-by: Pierre Gondois <[email protected]>
> > > ---
> > > drivers/base/cacheinfo.c | 13 ++++++++++++-
> > > include/linux/cacheinfo.h | 10 ++++++++++
> > > 2 files changed, 22 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> > > index 06de9a468958..49dbb4357911 100644
> > > --- a/drivers/base/cacheinfo.c
> > > +++ b/drivers/base/cacheinfo.c
> > > @@ -40,7 +40,8 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
> > > * For non DT/ACPI systems, assume unique level 1 caches,
> > > * system-wide shared caches for all other levels.
> > > */
> > > - if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)))
> > > + if (!(IS_ENABLED(CONFIG_OF) || IS_ENABLED(CONFIG_ACPI)) ||
> > > + this_leaf->use_arch_info)
> >
> > Can't we just use use_arch_cache_info() here ?
>
> I think that if we use use_arch_cache_info() here, then arm64 platforms
> will always return here and never check fw_token/this_leaf->id values.
> Indeed, we also need to know that no cache information is available in
> DT/ACPI, cf. [1]
>

Ah right, I missed to see that. I was sure there is a reason but couldn't
figure out myself quickly.

> >
> > > return (this_leaf->level != 1) && (sib_leaf->level != 1);
> > > if ((sib_leaf->attributes & CACHE_ID) &&
> > > @@ -349,6 +350,7 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
> > > struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> > > struct cacheinfo *this_leaf, *sib_leaf;
> > > unsigned int index, sib_index;
> > > + bool use_arch_info = false;
> > > int ret = 0;
> > > if (this_cpu_ci->cpu_map_populated)
> > > @@ -361,6 +363,12 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
> > > */
> > > if (!last_level_cache_is_valid(cpu)) {
> > > ret = cache_setup_properties(cpu);
> > > + if (ret && use_arch_cache_info()) {
> > > + // Possibility to rely on arch specific information.
>
> [1]
>
> > > + use_arch_info = true;
> > > + ret = 0;
> > > + }
> > > +
> > > if (ret)
> > > return ret;
> > > }
> > > @@ -370,6 +378,9 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
> > > this_leaf = per_cpu_cacheinfo_idx(cpu, index);
> > > + if (use_arch_info)
> > > + this_leaf->use_arch_info = true;
> > > +
> > > cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
> > > for_each_online_cpu(i) {
> > > struct cpu_cacheinfo *sib_cpu_ci = get_cpu_cacheinfo(i);
> > > diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
> > > index 908e19d17f49..fed675b251a2 100644
> > > --- a/include/linux/cacheinfo.h
> > > +++ b/include/linux/cacheinfo.h
> > > @@ -66,6 +66,7 @@ struct cacheinfo {
> > > #define CACHE_ALLOCATE_POLICY_MASK \
> > > (CACHE_READ_ALLOCATE | CACHE_WRITE_ALLOCATE)
> > > #define CACHE_ID BIT(4)
> > > + bool use_arch_info;
> >
> > Do you see the need to stash this value as it is either globally true or
> > false based on the arch ?
>
> A static variable could be used instead and set to true if we fail to fetch the
> cache information from DT/ACPI, cf. [1]. The only possible transition for this
> variable would be from false->true. I'll check if this works like this.
>

Yes that would be good.

> >
> > > void *fw_token;
> > > bool disable_sysfs;
> > > void *priv;
> > > @@ -129,4 +130,13 @@ static inline int get_cpu_cacheinfo_id(int cpu, int level)
> > > return -1;
> > > }
> > > +static inline bool use_arch_cache_info(void)
> > > +{
> > > +#if defined(CONFIG_ARM64)
> > > + return true;
> > > +#else
> > > + return false;
> > > +#endif
> > > +}
> > > +
> >
> > Can we just have it as:
> > #ifdef CONFIG_ARM64
> > #define use_arch_cache_info() (true)
> > #else
> > #define use_arch_cache_info() (false)
> > #endif
>
> Yes sure, I'll post a v4 with this along Conor's requested change.
>

Sure.

--
Regards,
Sudeep

2023-04-13 10:38:34

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v3 3/4] arch_topology: Remove early cacheinfo error message

On Thu, Apr 13, 2023 at 11:02:49AM +0100, Conor Dooley wrote:
> On Thu, Apr 13, 2023 at 11:14:33AM +0200, Pierre Gondois wrote:
> > fetch_cache_info() tries to get the number of cache leaves/levels
> > for each CPU in order to pre-allocate memory for cacheinfo struct.
> > Allocating this memory later triggers a:
> > 'BUG: sleeping function called from invalid context'
> > in PREEMPT_RT kernels.
> >
> > If there is no cache related information available in DT or ACPI,
> > fetch_cache_info() fails and an error message is printed:
> > 'Early cacheinfo failed, ret = ...'
> >
> > Not having cache information should be a valid configuration.
> > Remove the error message if fetch_cache_info() fails.
> >
> > Suggested-by: Conor Dooley <[email protected]>
>
> Not that it really matters for suggested-by, and there's no way really
> for you to know, but the corporate overlords prefer:
> s/[email protected]/[email protected]/
>
> > Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
> > Signed-off-by: Pierre Gondois <[email protected]>
> > ---
> > drivers/base/arch_topology.c | 4 +---
> > 1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> > index b1c1dd38ab01..1f071eaede5b 100644
> > --- a/drivers/base/arch_topology.c
> > +++ b/drivers/base/arch_topology.c
> > @@ -843,10 +843,8 @@ void __init init_cpu_topology(void)
> >
> > for_each_possible_cpu(cpu) {
> > ret = fetch_cache_info(cpu);
> > - if (ret) {
> > - pr_err("Early cacheinfo failed, ret = %d\n", ret);
>
> Hmm do you really want to remove the print altogether? This can fail
> with -EINVAL and -ENOMEM too, so should we just check for
> | if (ret && ret != -ENOENT)
> instead, since in the other cases it really did fail?

To save Sudeep (potentially) waiting for me when you resubmit, with that
change:
Reviewed-by: Conor Dooley <[email protected]>

Thanks,
Conor.


Attachments:
(No filename) (1.98 kB)
signature.asc (235.00 B)
Download all attachments

2023-04-13 15:26:34

by Pierre Gondois

[permalink] [raw]
Subject: Re: [PATCH v3 3/4] arch_topology: Remove early cacheinfo error message



On 4/13/23 12:02, Conor Dooley wrote:
> On Thu, Apr 13, 2023 at 11:14:33AM +0200, Pierre Gondois wrote:
>> fetch_cache_info() tries to get the number of cache leaves/levels
>> for each CPU in order to pre-allocate memory for cacheinfo struct.
>> Allocating this memory later triggers a:
>> 'BUG: sleeping function called from invalid context'
>> in PREEMPT_RT kernels.
>>
>> If there is no cache related information available in DT or ACPI,
>> fetch_cache_info() fails and an error message is printed:
>> 'Early cacheinfo failed, ret = ...'
>>
>> Not having cache information should be a valid configuration.
>> Remove the error message if fetch_cache_info() fails.
>>
>> Suggested-by: Conor Dooley <[email protected]>
>
> Not that it really matters for suggested-by, and there's no way really
> for you to know, but the corporate overlords prefer:
> s/[email protected]/[email protected]/
>
>> Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
>> Signed-off-by: Pierre Gondois <[email protected]>
>> ---
>> drivers/base/arch_topology.c | 4 +---
>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
>> index b1c1dd38ab01..1f071eaede5b 100644
>> --- a/drivers/base/arch_topology.c
>> +++ b/drivers/base/arch_topology.c
>> @@ -843,10 +843,8 @@ void __init init_cpu_topology(void)
>>
>> for_each_possible_cpu(cpu) {
>> ret = fetch_cache_info(cpu);
>> - if (ret) {
>> - pr_err("Early cacheinfo failed, ret = %d\n", ret);
>
> Hmm do you really want to remove the print altogether? This can fail
> with -EINVAL and -ENOMEM too, so should we just check for
> | if (ret && ret != -ENOENT)
> instead, since in the other cases it really did fail?

I think [PATCH 2/4] requires the following update in this case:

--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -288,8 +288,10 @@ int init_of_cache_level(unsigned int cpu)
struct device_node *prev = NULL;
unsigned int levels = 0, leaves, level;

- if (!of_check_cache_nodes(np))
- goto err_out;
+ if (!of_check_cache_nodes(np)) {
+ of_node_put(np);
+ return -ENOENT;
+ }

leaves = of_count_cache_leaves(np);
if (leaves > 0)

Is it ok to do this and keep your Reviewed-by ?

Thanks for the review,
Regards,
Pierre

2023-04-13 16:34:35

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v3 3/4] arch_topology: Remove early cacheinfo error message

On Thu, Apr 13, 2023 at 05:25:25PM +0200, Pierre Gondois wrote:

> Is it ok to do this and keep your Reviewed-by ?

Yah, should be grand chief.


Attachments:
(No filename) (149.00 B)
signature.asc (235.00 B)
Download all attachments

2023-04-13 18:24:14

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] cacheinfo: Check cache properties are present in DT

On 4/13/23 02:14, Pierre Gondois wrote:
> If a Device Tree (DT) is used, the presence of cache properties is
> assumed. Not finding any is not considered. For arm64 platforms,
> cache information can be fetched from the clidr_el1 register.
> Checking whether cache information is available in the DT
> allows to switch to using clidr_el1.
>
> init_of_cache_level()
> \-of_count_cache_leaves()
> will assume there a 2 cache leaves (L1 data/instruction caches), which
> can be different from clidr_el1 information.
>
> cache_setup_of_node() tries to read cache properties in the DT.
> If there are none, this is considered a success. Knowing no
> information was available would allow to switch to using clidr_el1.
>
> Fixes: de0df442ee49 ("cacheinfo: Check 'cache-unified' property to count cache leaves")
> Reported-by: Alexandre Ghiti <[email protected]>
> Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
> Signed-off-by: Pierre Gondois <[email protected]>

Humm, it would appear that the cache levels and topology is still
provided, despite the lack of cache properties in the Device Tree which
is intended by this patch set however we lost the size/ways/sets
information, could we not complement the missing properties here?

If this is out of the scope of what you are doing:

Tested-by: Florian Fainelli <[email protected]>

Before:

# lscpu -C
NAME ONE-SIZE ALL-SIZE WAYS TYPE LEVEL SETS PHY-LINE COHERENCY-SIZE
L1d 32K 128K 4 Data 1 128 64
L1i 32K 128K 2 Instruction 1 256 64
L2 512K 512K 16 Unified 2 512 64

After:

# lscpu -C
NAME ONE-SIZE ALL-SIZE WAYS TYPE LEVEL SETS PHY-LINE COHERENCY-SIZE
L1d Data 1
L1i Instruction 1
L2 Unified 2


--
Florian

2023-04-13 19:54:00

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] cacheinfo: Check cache properties are present in DT

On Thu, Apr 13, 2023 at 11:16:37AM -0700, Florian Fainelli wrote:
> On 4/13/23 02:14, Pierre Gondois wrote:
> > If a Device Tree (DT) is used, the presence of cache properties is
> > assumed. Not finding any is not considered. For arm64 platforms,
> > cache information can be fetched from the clidr_el1 register.
> > Checking whether cache information is available in the DT
> > allows to switch to using clidr_el1.
> >
> > init_of_cache_level()
> > \-of_count_cache_leaves()
> > will assume there a 2 cache leaves (L1 data/instruction caches), which
> > can be different from clidr_el1 information.
> >
> > cache_setup_of_node() tries to read cache properties in the DT.
> > If there are none, this is considered a success. Knowing no
> > information was available would allow to switch to using clidr_el1.
> >
> > Fixes: de0df442ee49 ("cacheinfo: Check 'cache-unified' property to count cache leaves")
> > Reported-by: Alexandre Ghiti <[email protected]>
> > Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
> > Signed-off-by: Pierre Gondois <[email protected]>
>
> Humm, it would appear that the cache levels and topology is still provided,
> despite the lack of cache properties in the Device Tree which is intended by
> this patch set however we lost the size/ways/sets information, could we not
> complement the missing properties here?
>

I am confused. How and from where the information was fetched before this
change ?

> If this is out of the scope of what you are doing:
>
> Tested-by: Florian Fainelli <[email protected]>
>

Just looking at the lscpu output before and after, it looks something is
broken. What am I missing here ?

--
Regards,
Sudeep

2023-04-13 20:21:10

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] cacheinfo: Check cache properties are present in DT

On 4/13/23 12:50, Sudeep Holla wrote:
> On Thu, Apr 13, 2023 at 11:16:37AM -0700, Florian Fainelli wrote:
>> On 4/13/23 02:14, Pierre Gondois wrote:
>>> If a Device Tree (DT) is used, the presence of cache properties is
>>> assumed. Not finding any is not considered. For arm64 platforms,
>>> cache information can be fetched from the clidr_el1 register.
>>> Checking whether cache information is available in the DT
>>> allows to switch to using clidr_el1.
>>>
>>> init_of_cache_level()
>>> \-of_count_cache_leaves()
>>> will assume there a 2 cache leaves (L1 data/instruction caches), which
>>> can be different from clidr_el1 information.
>>>
>>> cache_setup_of_node() tries to read cache properties in the DT.
>>> If there are none, this is considered a success. Knowing no
>>> information was available would allow to switch to using clidr_el1.
>>>
>>> Fixes: de0df442ee49 ("cacheinfo: Check 'cache-unified' property to count cache leaves")
>>> Reported-by: Alexandre Ghiti <[email protected]>
>>> Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
>>> Signed-off-by: Pierre Gondois <[email protected]>
>>
>> Humm, it would appear that the cache levels and topology is still provided,
>> despite the lack of cache properties in the Device Tree which is intended by
>> this patch set however we lost the size/ways/sets information, could we not
>> complement the missing properties here?
>>
>
> I am confused. How and from where the information was fetched before this
> change ?

I applied Pierre's patches to my tree and then did the following:

- before means booting with the patches applied and the Device Tree
providing cache information: {d,i}-cache-{size,line-size,sets} and
next-level-cache

- after means removing all of those properties still with the patches
applied

My expectation is that if we omit the properties in the Device Tree, we
will fallback to reading that information out of clidr_el1. However as
can be seen from the "before" and "after" outputs, there is loss of
information, as we no longer have the cacheline size, number of
sets/ways, the rest is valid though.

So my question is whether this is expected and in scope of what is being
done here, or not.

>
>> If this is out of the scope of what you are doing:
>>
>> Tested-by: Florian Fainelli <[email protected]>
>>
>
> Just looking at the lscpu output before and after, it looks something is
> broken. What am I missing here ?
>

What is broken in the "before" output? It contains the entire set of
possible information we know about the caches. As for the "after", well
yes there is information missing, the whole point of my email actually...
--
Florian

2023-04-14 07:34:20

by Pierre Gondois

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] cacheinfo: Check cache properties are present in DT



On 4/13/23 22:06, Florian Fainelli wrote:
> On 4/13/23 12:50, Sudeep Holla wrote:
>> On Thu, Apr 13, 2023 at 11:16:37AM -0700, Florian Fainelli wrote:
>>> On 4/13/23 02:14, Pierre Gondois wrote:
>>>> If a Device Tree (DT) is used, the presence of cache properties is
>>>> assumed. Not finding any is not considered. For arm64 platforms,
>>>> cache information can be fetched from the clidr_el1 register.
>>>> Checking whether cache information is available in the DT
>>>> allows to switch to using clidr_el1.
>>>>
>>>> init_of_cache_level()
>>>> \-of_count_cache_leaves()
>>>> will assume there a 2 cache leaves (L1 data/instruction caches), which
>>>> can be different from clidr_el1 information.
>>>>
>>>> cache_setup_of_node() tries to read cache properties in the DT.
>>>> If there are none, this is considered a success. Knowing no
>>>> information was available would allow to switch to using clidr_el1.
>>>>
>>>> Fixes: de0df442ee49 ("cacheinfo: Check 'cache-unified' property to count cache leaves")
>>>> Reported-by: Alexandre Ghiti <[email protected]>
>>>> Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
>>>> Signed-off-by: Pierre Gondois <[email protected]>
>>>
>>> Humm, it would appear that the cache levels and topology is still provided,
>>> despite the lack of cache properties in the Device Tree which is intended by
>>> this patch set however we lost the size/ways/sets information, could we not
>>> complement the missing properties here?
>>>
>>
>> I am confused. How and from where the information was fetched before this
>> change ?
>
> I applied Pierre's patches to my tree and then did the following:
>
> - before means booting with the patches applied and the Device Tree
> providing cache information: {d,i}-cache-{size,line-size,sets} and
> next-level-cache
>
> - after means removing all of those properties still with the patches
> applied
>
> My expectation is that if we omit the properties in the Device Tree, we
> will fallback to reading that information out of clidr_el1. However as
> can be seen from the "before" and "after" outputs, there is loss of
> information, as we no longer have the cacheline size, number of
> sets/ways, the rest is valid though.
>
> So my question is whether this is expected and in scope of what is being
> done here, or not.
>
>>
>>> If this is out of the scope of what you are doing:
>>>
>>> Tested-by: Florian Fainelli <[email protected]>
>>>
>>
>> Just looking at the lscpu output before and after, it looks something is
>> broken. What am I missing here ?
>>
>
> What is broken in the "before" output? It contains the entire set of
> possible information we know about the caches. As for the "after", well
> yes there is information missing, the whole point of my email actually...

I think this is the expected behaviour. There are other registers containing
cache information, like CCSIDR_EL1 and CCSIDR2_EL1.
However the information contained in CCSIDR_EL1 cannot really be trusted, cf [1]:
| You cannot make any inference about the actual sizes of caches based
| on these parametersand Arm spec.

and for CCSIDR2_EL1 I assume that knowing the number of cache sets is not a
crucial information.

So if there is no cache information in DT/ACPI, the only information extracted
from registers is the level/type of caches coming from CLIDR_EL1.

Regards,
Pierre

[1] https://lore.kernel.org/all/[email protected]/

2023-04-14 08:20:58

by Pierre Gondois

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] cacheinfo: Check cache properties are present in DT

Hello Florian,

On 4/13/23 20:16, Florian Fainelli wrote:
> On 4/13/23 02:14, Pierre Gondois wrote:
>> If a Device Tree (DT) is used, the presence of cache properties is
>> assumed. Not finding any is not considered. For arm64 platforms,
>> cache information can be fetched from the clidr_el1 register.
>> Checking whether cache information is available in the DT
>> allows to switch to using clidr_el1.
>>
>> init_of_cache_level()
>> \-of_count_cache_leaves()
>> will assume there a 2 cache leaves (L1 data/instruction caches), which
>> can be different from clidr_el1 information.
>>
>> cache_setup_of_node() tries to read cache properties in the DT.
>> If there are none, this is considered a success. Knowing no
>> information was available would allow to switch to using clidr_el1.
>>
>> Fixes: de0df442ee49 ("cacheinfo: Check 'cache-unified' property to count cache leaves")
>> Reported-by: Alexandre Ghiti <[email protected]>
>> Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
>> Signed-off-by: Pierre Gondois <[email protected]>
>
> Humm, it would appear that the cache levels and topology is still
> provided, despite the lack of cache properties in the Device Tree which
> is intended by this patch set however we lost the size/ways/sets
> information, could we not complement the missing properties here?
>
> If this is out of the scope of what you are doing:
>
> Tested-by: Florian Fainelli <[email protected]>

I submitted a v4 at:
https://lore.kernel.org/all/[email protected]/

I haven't included your Tested-by as there were some small changes.
If you consider these changes are small enough to include your tag,
please let know,

Regards,
Pierre

2023-04-14 09:21:32

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] cacheinfo: Check cache properties are present in DT

On Thu, Apr 13, 2023 at 01:06:46PM -0700, Florian Fainelli wrote:
> On 4/13/23 12:50, Sudeep Holla wrote:
> > On Thu, Apr 13, 2023 at 11:16:37AM -0700, Florian Fainelli wrote:
> > > On 4/13/23 02:14, Pierre Gondois wrote:
> > > > If a Device Tree (DT) is used, the presence of cache properties is
> > > > assumed. Not finding any is not considered. For arm64 platforms,
> > > > cache information can be fetched from the clidr_el1 register.
> > > > Checking whether cache information is available in the DT
> > > > allows to switch to using clidr_el1.
> > > >
> > > > init_of_cache_level()
> > > > \-of_count_cache_leaves()
> > > > will assume there a 2 cache leaves (L1 data/instruction caches), which
> > > > can be different from clidr_el1 information.
> > > >
> > > > cache_setup_of_node() tries to read cache properties in the DT.
> > > > If there are none, this is considered a success. Knowing no
> > > > information was available would allow to switch to using clidr_el1.
> > > >
> > > > Fixes: de0df442ee49 ("cacheinfo: Check 'cache-unified' property to count cache leaves")
> > > > Reported-by: Alexandre Ghiti <[email protected]>
> > > > Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
> > > > Signed-off-by: Pierre Gondois <[email protected]>
> > >
> > > Humm, it would appear that the cache levels and topology is still provided,
> > > despite the lack of cache properties in the Device Tree which is intended by
> > > this patch set however we lost the size/ways/sets information, could we not
> > > complement the missing properties here?
> > >
> >
> > I am confused. How and from where the information was fetched before this
> > change ?
>
> I applied Pierre's patches to my tree and then did the following:
>
> - before means booting with the patches applied and the Device Tree
> providing cache information: {d,i}-cache-{size,line-size,sets} and
> next-level-cache
>
> - after means removing all of those properties still with the patches
> applied
>

Ah okay, I assumed something totally different and hence thought patches
broke something.

> My expectation is that if we omit the properties in the Device Tree, we will
> fallback to reading that information out of clidr_el1. However as can be
> seen from the "before" and "after" outputs, there is loss of information, as
> we no longer have the cacheline size, number of sets/ways, the rest is valid
> though.
>

Correct and that is expected. We dropped ccsidr_el1 support to fetch cache
geometry with the commit a8d4636f96ad ("arm64: cacheinfo: Remove CCSIDR-based
cache information probing") after Arm ARM added wordings not to infer the
information. However clidr_el1 info still holds except it may not include
transparent system level caches. Hope that clarifies.

> So my question is whether this is expected and in scope of what is being
> done here, or not.
>
> >
> > > If this is out of the scope of what you are doing:
> > >
> > > Tested-by: Florian Fainelli <[email protected]>
> > >
> >
> > Just looking at the lscpu output before and after, it looks something is
> > broken. What am I missing here ?
> >
>
> What is broken in the "before" output? It contains the entire set of
> possible information we know about the caches. As for the "after", well yes
> there is information missing, the whole point of my email actually...

Sorry, I was not referring to "before" output. I assumed "before" means
without patches and "after" means with patches, hence I thought patches
broke something but got confused why are you giving tested-by :D.

--
Regards,
Sudeep

2023-04-14 22:47:52

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] cacheinfo: Check cache properties are present in DT

On 4/14/23 02:05, Sudeep Holla wrote:
> On Thu, Apr 13, 2023 at 01:06:46PM -0700, Florian Fainelli wrote:
>> On 4/13/23 12:50, Sudeep Holla wrote:
>>> On Thu, Apr 13, 2023 at 11:16:37AM -0700, Florian Fainelli wrote:
>>>> On 4/13/23 02:14, Pierre Gondois wrote:
>>>>> If a Device Tree (DT) is used, the presence of cache properties is
>>>>> assumed. Not finding any is not considered. For arm64 platforms,
>>>>> cache information can be fetched from the clidr_el1 register.
>>>>> Checking whether cache information is available in the DT
>>>>> allows to switch to using clidr_el1.
>>>>>
>>>>> init_of_cache_level()
>>>>> \-of_count_cache_leaves()
>>>>> will assume there a 2 cache leaves (L1 data/instruction caches), which
>>>>> can be different from clidr_el1 information.
>>>>>
>>>>> cache_setup_of_node() tries to read cache properties in the DT.
>>>>> If there are none, this is considered a success. Knowing no
>>>>> information was available would allow to switch to using clidr_el1.
>>>>>
>>>>> Fixes: de0df442ee49 ("cacheinfo: Check 'cache-unified' property to count cache leaves")
>>>>> Reported-by: Alexandre Ghiti <[email protected]>
>>>>> Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
>>>>> Signed-off-by: Pierre Gondois <[email protected]>
>>>>
>>>> Humm, it would appear that the cache levels and topology is still provided,
>>>> despite the lack of cache properties in the Device Tree which is intended by
>>>> this patch set however we lost the size/ways/sets information, could we not
>>>> complement the missing properties here?
>>>>
>>>
>>> I am confused. How and from where the information was fetched before this
>>> change ?
>>
>> I applied Pierre's patches to my tree and then did the following:
>>
>> - before means booting with the patches applied and the Device Tree
>> providing cache information: {d,i}-cache-{size,line-size,sets} and
>> next-level-cache
>>
>> - after means removing all of those properties still with the patches
>> applied
>>
>
> Ah okay, I assumed something totally different and hence thought patches
> broke something.
>
>> My expectation is that if we omit the properties in the Device Tree, we will
>> fallback to reading that information out of clidr_el1. However as can be
>> seen from the "before" and "after" outputs, there is loss of information, as
>> we no longer have the cacheline size, number of sets/ways, the rest is valid
>> though.
>>
>
> Correct and that is expected. We dropped ccsidr_el1 support to fetch cache
> geometry with the commit a8d4636f96ad ("arm64: cacheinfo: Remove CCSIDR-based
> cache information probing") after Arm ARM added wordings not to infer the
> information. However clidr_el1 info still holds except it may not include
> transparent system level caches. Hope that clarifies.

Yes it definitively does, thanks a lot for the refresher.
--
Florian