2022-11-15 22:33:03

by Pierre Gondois

Subject: [PATCH -next] cacheinfo: Remove of_node_put() for fw_token

fw_token is used for DT/ACPI systems to identify CPUs sharing caches.
For DT based systems, fw_token is set to a pointer to a DT node.

commit ("cacheinfo: Decrement refcount in cache_setup_of_node()")
doesn't increment the refcount of fw_token anymore in
cache_setup_of_node(). fw_token is indeed used as a token and not
as a (struct device_node*), so no reference to fw_token should be
kept.

However, [1] is triggered when hotplugging a CPU multiple times
since cache_shared_cpu_map_remove() decrements the refcount to
fw_token at each CPU unplugging, eventually reaching 0.

Remove of_node_put() for fw_token in cache_shared_cpu_map_remove().

[1]
[ 53.651182] ------------[ cut here ]------------
[ 53.651186] refcount_t: saturated; leaking memory.
[ 53.651223] WARNING: CPU: 4 PID: 32 at lib/refcount.c:22 refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
[ 53.651241] Modules linked in:
[ 53.651249] CPU: 4 PID: 32 Comm: cpuhp/4 Tainted: G W 6.1.0-rc1-14091-g9fdf2ca7b9c8 #76
[ 53.651261] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Oct 31 2022
[ 53.651268] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 53.651279] pc : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
[ 53.651293] lr : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
[...]
[ 53.651513] Call trace:
[...]
[ 53.651735] of_node_release (drivers/of/dynamic.c:335)
[ 53.651750] kobject_put (lib/kobject.c:677 lib/kobject.c:704 ./include/linux/kref.h:65 lib/kobject.c:721)
[ 53.651762] of_node_put (drivers/of/dynamic.c:49)
[ 53.651776] free_cache_attributes.part.0 (drivers/base/cacheinfo.c:712)
[ 53.651792] cacheinfo_cpu_pre_down (drivers/base/cacheinfo.c:718)
[ 53.651807] cpuhp_invoke_callback (kernel/cpu.c:247 (discriminator 4))
[ 53.651819] cpuhp_thread_fun (kernel/cpu.c:785)
[ 53.651832] smpboot_thread_fn (kernel/smpboot.c:164 (discriminator 3))
[ 53.651847] kthread (kernel/kthread.c:376)
[ 53.651858] ret_from_fork (arch/arm64/kernel/entry.S:861)
[ 53.651869] ---[ end trace 0000000000000000 ]---

Reported-by: Geert Uytterhoeven <[email protected]>
Reported-by: Marek Szyprowski <[email protected]>
Signed-off-by: Pierre Gondois <[email protected]>
---
drivers/base/cacheinfo.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 04317cde800c..950b22cdb5f7 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -317,8 +317,6 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
 		}
-		if (of_have_populated_dt())
-			of_node_put(this_leaf->fw_token);
 	}
 }

--
2.25.1
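
For readers following the refcount argument in the commit message, here is a
minimal user-space sketch of the imbalance. It is not kernel code: struct
toy_node and every toy_* helper are hypothetical stand-ins for the device
node and its get/put calls, and unplug_cycles_until_release() is an invented
driver that just repeats setup and removal. It assumes setup takes and drops
one reference (so it is balanced), while the old removal path drops one more.

```c
#include <assert.h>

/* Hypothetical stand-in for a refcounted DT node (not struct device_node). */
struct toy_node {
	int refcount;
};

static void toy_node_get(struct toy_node *n)
{
	n->refcount++;
}

/* Returns 1 when this put drops the count to zero, i.e. releases the node. */
static int toy_node_put(struct toy_node *n)
{
	assert(n->refcount > 0);
	return --n->refcount == 0;
}

/* Setup: take a reference for the lookup, keep only the pointer as a
 * token, then drop the reference again. Net refcount change: zero. */
static const void *toy_setup(struct toy_node *n)
{
	toy_node_get(n);
	toy_node_put(n);
	return n;
}

/* Removal before the patch: one put with no matching get. */
static int toy_remove_old(struct toy_node *n)
{
	return toy_node_put(n);
}

/* Removal after the patch: the token holds no reference, nothing to drop. */
static int toy_remove_new(struct toy_node *n)
{
	(void)n;
	return 0;
}

/* Repeat setup + removal; return the cycle on which the node is released,
 * or -1 if it survives max_cycles. */
static int unplug_cycles_until_release(int initial,
				       int (*remove_fn)(struct toy_node *),
				       int max_cycles)
{
	struct toy_node n = { .refcount = initial };
	int i;

	for (i = 1; i <= max_cycles; i++) {
		toy_setup(&n);
		if (remove_fn(&n))
			return i;
	}
	return -1;
}
```

In this model a node starting with two references is released on the second
unplug when toy_remove_old() is used, and is never released with
toy_remove_new(), which mirrors why dropping the extra put is enough.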



2022-11-16 08:21:37

by Sudeep Holla

Subject: Re: [PATCH -next] cacheinfo: Remove of_node_put() for fw_token

On Tue, Nov 15, 2022 at 11:05:20PM +0100, Pierre Gondois wrote:
> fw_token is used for DT/ACPI systems to identify CPUs sharing caches.
> For DT based systems, fw_token is set to a pointer to a DT node.
>
> commit ("cacheinfo: Decrement refcount in cache_setup_of_node()")

Commit 3da72e18371c ("cacheinfo: Decrement refcount in cache_setup_of_node()")

> doesn't increment the refcount of fw_token anymore in
> cache_setup_of_node(). fw_token is indeed used as a token and not
> as a (struct device_node*), so no reference to fw_token should be
> kept.
>
> However, [1] is triggered when hotplugging a CPU multiple times
> since cache_shared_cpu_map_remove() decrements the refcount to
> fw_token at each CPU unplugging, eventually reaching 0.
>
> Remove of_node_put() for fw_token in cache_shared_cpu_map_remove().
>
> [1]
> [ 53.651182] ------------[ cut here ]------------
> [ 53.651186] refcount_t: saturated; leaking memory.
> [ 53.651223] WARNING: CPU: 4 PID: 32 at lib/refcount.c:22 refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
> [ 53.651241] Modules linked in:
> [ 53.651249] CPU: 4 PID: 32 Comm: cpuhp/4 Tainted: G W 6.1.0-rc1-14091-g9fdf2ca7b9c8 #76
> [ 53.651261] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Oct 31 2022
> [ 53.651268] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 53.651279] pc : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
> [ 53.651293] lr : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
> [...]
> [ 53.651513] Call trace:
> [...]
> [ 53.651735] of_node_release (drivers/of/dynamic.c:335)
> [ 53.651750] kobject_put (lib/kobject.c:677 lib/kobject.c:704 ./include/linux/kref.h:65 lib/kobject.c:721)
> [ 53.651762] of_node_put (drivers/of/dynamic.c:49)
> [ 53.651776] free_cache_attributes.part.0 (drivers/base/cacheinfo.c:712)
> [ 53.651792] cacheinfo_cpu_pre_down (drivers/base/cacheinfo.c:718)
> [ 53.651807] cpuhp_invoke_callback (kernel/cpu.c:247 (discriminator 4))
> [ 53.651819] cpuhp_thread_fun (kernel/cpu.c:785)
> [ 53.651832] smpboot_thread_fn (kernel/smpboot.c:164 (discriminator 3))
> [ 53.651847] kthread (kernel/kthread.c:376)
> [ 53.651858] ret_from_fork (arch/arm64/kernel/entry.S:861)
> [ 53.651869] ---[ end trace 0000000000000000 ]---
>

Please remove the timestamps, as they add no value to the commit log.
Also it is worth adding IMO:
Fixes: 3da72e18371c ("cacheinfo: Decrement refcount in cache_setup_of_node()")

I did a quick test and so,
Tested-by: Sudeep Holla <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>

Thanks for fixing this quickly, and sorry for not noticing the extra
of_node_put(), even though I had wondered earlier whether it was only
incrementing the refcount.

--
Regards,
Sudeep

2022-11-16 09:17:01

by Geert Uytterhoeven

Subject: Re: [PATCH -next] cacheinfo: Remove of_node_put() for fw_token

On Tue, Nov 15, 2022 at 11:06 PM Pierre Gondois <[email protected]> wrote:
> fw_token is used for DT/ACPI systems to identify CPUs sharing caches.
> For DT based systems, fw_token is set to a pointer to a DT node.
>
> commit ("cacheinfo: Decrement refcount in cache_setup_of_node()")
> doesn't increment the refcount of fw_token anymore in
> cache_setup_of_node(). fw_token is indeed used as a token and not
> as a (struct device_node*), so no reference to fw_token should be
> kept.
>
> However, [1] is triggered when hotplugging a CPU multiple times
> since cache_shared_cpu_map_remove() decrements the refcount to
> fw_token at each CPU unplugging, eventually reaching 0.
>
> Remove of_node_put() for fw_token in cache_shared_cpu_map_remove().
>
> [1]
> [ 53.651182] ------------[ cut here ]------------
> [ 53.651186] refcount_t: saturated; leaking memory.
> [ 53.651223] WARNING: CPU: 4 PID: 32 at lib/refcount.c:22 refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
> [ 53.651241] Modules linked in:
> [ 53.651249] CPU: 4 PID: 32 Comm: cpuhp/4 Tainted: G W 6.1.0-rc1-14091-g9fdf2ca7b9c8 #76
> [ 53.651261] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Oct 31 2022
> [ 53.651268] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 53.651279] pc : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
> [ 53.651293] lr : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
> [...]
> [ 53.651513] Call trace:
> [...]
> [ 53.651735] of_node_release (drivers/of/dynamic.c:335)
> [ 53.651750] kobject_put (lib/kobject.c:677 lib/kobject.c:704 ./include/linux/kref.h:65 lib/kobject.c:721)
> [ 53.651762] of_node_put (drivers/of/dynamic.c:49)
> [ 53.651776] free_cache_attributes.part.0 (drivers/base/cacheinfo.c:712)
> [ 53.651792] cacheinfo_cpu_pre_down (drivers/base/cacheinfo.c:718)
> [ 53.651807] cpuhp_invoke_callback (kernel/cpu.c:247 (discriminator 4))
> [ 53.651819] cpuhp_thread_fun (kernel/cpu.c:785)
> [ 53.651832] smpboot_thread_fn (kernel/smpboot.c:164 (discriminator 3))
> [ 53.651847] kthread (kernel/kthread.c:376)
> [ 53.651858] ret_from_fork (arch/arm64/kernel/entry.S:861)
> [ 53.651869] ---[ end trace 0000000000000000 ]---
>
> Reported-by: Geert Uytterhoeven <[email protected]>
> Reported-by: Marek Szyprowski <[email protected]>
> Signed-off-by: Pierre Gondois <[email protected]>

Thanks, this fixes the issue for me!

Tested-by: Geert Uytterhoeven <[email protected]>

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds