From: Shawn Guo <[email protected]>

When a genpd with GENPD_FLAG_IRQ_SAFE is removed, the following
sleep-in-atomic bug is triggered, because genpd_debug_remove() is called
with a spinlock held.

[ 0.029183] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1460
[ 0.029204] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1, name: swapper/0
[ 0.029219] preempt_count: 1, expected: 0
[ 0.029230] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc4+ #489
[ 0.029245] Hardware name: Thundercomm TurboX CM2290 (DT)
[ 0.029256] Call trace:
[ 0.029265] dump_backtrace.part.0+0xbc/0xd0
[ 0.029285] show_stack+0x3c/0xa0
[ 0.029298] dump_stack_lvl+0x7c/0xa0
[ 0.029311] dump_stack+0x18/0x34
[ 0.029323] __might_resched+0x10c/0x13c
[ 0.029338] __might_sleep+0x4c/0x80
[ 0.029351] down_read+0x24/0xd0
[ 0.029363] lookup_one_len_unlocked+0x9c/0xcc
[ 0.029379] lookup_positive_unlocked+0x10/0x50
[ 0.029392] debugfs_lookup+0x68/0xac
[ 0.029406] genpd_remove.part.0+0x12c/0x1b4
[ 0.029419] of_genpd_remove_last+0xa8/0xd4
[ 0.029434] psci_cpuidle_domain_probe+0x174/0x53c
[ 0.029449] platform_probe+0x68/0xe0
[ 0.029462] really_probe+0x190/0x430
[ 0.029473] __driver_probe_device+0x90/0x18c
[ 0.029485] driver_probe_device+0x40/0xe0
[ 0.029497] __driver_attach+0xf4/0x1d0
[ 0.029508] bus_for_each_dev+0x70/0xd0
[ 0.029523] driver_attach+0x24/0x30
[ 0.029534] bus_add_driver+0x164/0x22c
[ 0.029545] driver_register+0x78/0x130
[ 0.029556] __platform_driver_register+0x28/0x34
[ 0.029569] psci_idle_init_domains+0x1c/0x28
[ 0.029583] do_one_initcall+0x50/0x1b0
[ 0.029595] kernel_init_freeable+0x214/0x280
[ 0.029609] kernel_init+0x2c/0x13c
[ 0.029622] ret_from_fork+0x10/0x20

It doesn't seem necessary to call genpd_debug_remove() with the lock held,
so move it out of the locked section to fix the problem.

Fixes: 718072ceb211 ("PM: domains: create debugfs nodes when adding power domains")
Signed-off-by: Shawn Guo <[email protected]>
---
drivers/base/power/domain.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 5db704f02e71..7e8039d1884c 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -2058,9 +2058,9 @@ static int genpd_remove(struct generic_pm_domain *genpd)
 		kfree(link);
 	}
 
-	genpd_debug_remove(genpd);
 	list_del(&genpd->gpd_list_node);
 	genpd_unlock(genpd);
+	genpd_debug_remove(genpd);
 	cancel_work_sync(&genpd->power_off_work);
 	if (genpd_is_cpu_domain(genpd))
 		free_cpumask_var(genpd->cpus);
--
2.25.1
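
The reordering in the patch can be illustrated outside the kernel. What follows is only a userspace sketch, not kernel code: struct fake_domain, fake_lock(), fake_unlock(), fake_debug_remove(), remove_buggy() and remove_fixed() are hypothetical names invented for this illustration, and a boolean flag stands in for the spinlock that an IRQ-safe genpd takes. The counter plays the role of the might_sleep() check that produced the splat above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an IRQ-safe power domain (not the real genpd). */
struct fake_domain {
	bool lock_held;     /* models the spinlock taken for an IRQ-safe domain */
	bool on_list;       /* models membership in the global domain list */
	bool debugfs_alive; /* models the per-domain debugfs directory */
};

/* Counts how often a "sleeping" function ran while the lock was held. */
static int atomic_violations;

static void fake_lock(struct fake_domain *d)   { d->lock_held = true; }
static void fake_unlock(struct fake_domain *d) { d->lock_held = false; }

/* Models a teardown step that may sleep, like the debugfs lookup in the trace. */
static void fake_debug_remove(struct fake_domain *d)
{
	if (d->lock_held)
		atomic_violations++; /* the kernel would report a sleep-in-atomic bug */
	d->debugfs_alive = false;
}

/* Ordering before the patch: the sleeping teardown runs under the lock. */
static void remove_buggy(struct fake_domain *d)
{
	fake_lock(d);
	fake_debug_remove(d);
	d->on_list = false;
	fake_unlock(d);
}

/* Ordering after the patch: unlink under the lock, tear down after unlocking. */
static void remove_fixed(struct fake_domain *d)
{
	fake_lock(d);
	d->on_list = false;
	fake_unlock(d);
	fake_debug_remove(d);
}
```

Both variants leave the domain unlinked and its debugfs entry gone; only remove_buggy() increments the violation counter. The sketch only shows the ordering and does not model concurrency.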
On Tue, Mar 1, 2022 at 11:38 AM Ulf Hansson <[email protected]> wrote:
>
> On Fri, 25 Feb 2022 at 07:48, Shawn Guo <[email protected]> wrote:
> >
> > [...]
>
> Thanks for fixing this!
>
> Reviewed-by: Ulf Hansson <[email protected]>

Applied as 5.18 material.

> Rafael, I think we should tag this for stable kernels too.

Done.

Thanks!

On Fri, 25 Feb 2022 at 07:48, Shawn Guo <[email protected]> wrote:
>
> [...]

Thanks for fixing this!

Reviewed-by: Ulf Hansson <[email protected]>

Rafael, I think we should tag this for stable kernels too.

Kind regards
Uffe

> [...]