There is a recursive lock on the cpu_hotplug_lock.
In kernel/trace/trace_osnoise.c:<start/stop>_per_cpu_kthreads:
- start_per_cpu_kthreads calls cpus_read_lock() and if
start_kthreads returns a error it will call stop_per_cpu_kthreads.
- stop_per_cpu_kthreads then calls cpus_read_lock() again causing
deadlock.
Fix this by calling cpus_read_unlock() before calling
stop_per_cpu_kthreads. This behavior can also be seen in commit
f46b16520a08 ("trace/hwlat: Implement the per-cpu mode").
This error was noticed during the LTP ftrace-stress-test:
WARNING: possible recursive locking detected
--------------------------------------------
sh/275006 is trying to acquire lock:
ffffffffb02f5400 (cpu_hotplug_lock){++++}-{0:0}, at: stop_per_cpu_kthreads
but task is already holding lock:
ffffffffb02f5400 (cpu_hotplug_lock){++++}-{0:0}, at: start_per_cpu_kthreads
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(cpu_hotplug_lock);
lock(cpu_hotplug_lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by sh/275006:
#0: ffff8881023f0470 (sb_writers#24){.+.+}-{0:0}, at: ksys_write
#1: ffffffffb084f430 (trace_types_lock){+.+.}-{3:3}, at: rb_simple_write
#2: ffffffffb02f5400 (cpu_hotplug_lock){++++}-{0:0}, at: start_per_cpu_kthreads
Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations")
Signed-off-by: Nico Pache <[email protected]>
---
kernel/trace/trace_osnoise.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 313439920a8c..78d536d3ff3d 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -1786,8 +1786,9 @@ static int start_per_cpu_kthreads(void)
for_each_cpu(cpu, current_mask) {
retval = start_kthread(cpu);
if (retval) {
+ cpus_read_unlock();
stop_per_cpu_kthreads();
- break;
+ return retval;
}
}
--
2.37.2
Hi Nico!
On 9/19/22 16:49, Nico Pache wrote:
> There is a recursive lock on the cpu_hotplug_lock.
>
> In kernel/trace/trace_osnoise.c:<start/stop>_per_cpu_kthreads:
> - start_per_cpu_kthreads calls cpus_read_lock() and if
> start_kthreads returns a error it will call stop_per_cpu_kthreads.
> - stop_per_cpu_kthreads then calls cpus_read_lock() again causing
> deadlock.
Yep, I see the problem.
> Fix this by calling cpus_read_unlock() before calling
> stop_per_cpu_kthreads. This behavior can also be seen in commit
> f46b16520a08 ("trace/hwlat: Implement the per-cpu mode").
Correct, we have a similar function that *without* this problem.
> This error was noticed during the LTP ftrace-stress-test:
>
> WARNING: possible recursive locking detected
> --------------------------------------------
> sh/275006 is trying to acquire lock:
> ffffffffb02f5400 (cpu_hotplug_lock){++++}-{0:0}, at: stop_per_cpu_kthreads
>
> but task is already holding lock:
> ffffffffb02f5400 (cpu_hotplug_lock){++++}-{0:0}, at: start_per_cpu_kthreads
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(cpu_hotplug_lock);
> lock(cpu_hotplug_lock);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 3 locks held by sh/275006:
> #0: ffff8881023f0470 (sb_writers#24){.+.+}-{0:0}, at: ksys_write
> #1: ffffffffb084f430 (trace_types_lock){+.+.}-{3:3}, at: rb_simple_write
> #2: ffffffffb02f5400 (cpu_hotplug_lock){++++}-{0:0}, at: start_per_cpu_kthreads
>
> Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations")
> Signed-off-by: Nico Pache <[email protected]>
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Thanks!
-- Daniel