2024-03-21 08:29:13

by Zqiang

Subject: [PATCH] rcutorture: Make stall-tasks directly exit when rcutorture tests end

When the rcutorture tests start to exit, rcu_torture_cleanup() is
invoked to stop kthreads and release resources. If the stall-task
kthread exists, the CPU stall has already started, and
rcutorture.stall_cpu is set to a large value, rcu_torture_cleanup()
will block for a long time and a hung-task report may occur. This
commit therefore adds a kthread_should_stop() check to the CPU-stall
loop, so that when the rcutorture tests end the stall task exits
directly instead of waiting for the CPU stall to finish.

Signed-off-by: Zqiang <[email protected]>
---
kernel/rcu/rcutorture.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 3f9c3766f52b..6a3cd6ed8b25 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -2490,7 +2490,7 @@ static int rcu_torture_stall(void *args)
 		pr_alert("%s start on CPU %d.\n",
 			  __func__, raw_smp_processor_id());
 		while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(),
-				    stop_at))
+				    stop_at) && !kthread_should_stop())
 			if (stall_cpu_block) {
 #ifdef CONFIG_PREEMPTION
 				preempt_schedule();
--
2.17.1



2024-03-21 08:32:31

by Zqiang

Subject: Re: [PATCH] rcutorture: Make stall-tasks directly exit when rcutorture tests end

>
> When the rcutorture tests start to exit, rcu_torture_cleanup() is
> invoked to stop kthreads and release resources. If the stall-task
> kthread exists, the CPU stall has already started, and
> rcutorture.stall_cpu is set to a large value, rcu_torture_cleanup()
> will block for a long time and a hung-task report may occur. This
> commit therefore adds a kthread_should_stop() check to the CPU-stall
> loop, so that when the rcutorture tests end the stall task exits
> directly instead of waiting for the CPU stall to finish.
>
> Signed-off-by: Zqiang <[email protected]>
> ---


Use the following commands to test. Without this patch, rmmod blocks
and the hung-task report below is printed:

insmod rcutorture.ko torture_type=srcu fwd_progress=0 stat_interval=4 \
    stall_cpu_block=1 stall_cpu=200 stall_cpu_holdoff=10 read_exit_burst=0 \
    object_debug=1
rmmod rcutorture

[15361.918610] INFO: task rmmod:878 blocked for more than 122 seconds.
[15361.918613] Tainted: G W 6.8.0-rc2-yoctodev-standard+ #25
[15361.918615] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[15361.918616] task:rmmod state:D stack:0 pid:878 tgid:878 ppid:773 flags:0x00004002
[15361.918621] Call Trace:
[15361.918623] <TASK>
[15361.918626] __schedule+0xc0d/0x28f0
[15361.918631] ? __pfx___schedule+0x10/0x10
[15361.918635] ? rcu_is_watching+0x19/0xb0
[15361.918638] ? schedule+0x1f6/0x290
[15361.918642] ? __pfx_lock_release+0x10/0x10
[15361.918645] ? schedule+0xc9/0x290
[15361.918648] ? schedule+0xc9/0x290
[15361.918653] ? trace_preempt_off+0x54/0x100
[15361.918657] ? schedule+0xc9/0x290
[15361.918661] schedule+0xd0/0x290
[15361.918665] schedule_timeout+0x56d/0x7d0
[15361.918669] ? debug_smp_processor_id+0x1b/0x30
[15361.918672] ? rcu_is_watching+0x19/0xb0
[15361.918676] ? __pfx_schedule_timeout+0x10/0x10
[15361.918679] ? debug_smp_processor_id+0x1b/0x30
[15361.918683] ? rcu_is_watching+0x19/0xb0
[15361.918686] ? wait_for_completion+0x179/0x4c0
[15361.918690] ? __pfx_lock_release+0x10/0x10
[15361.918693] ? __kasan_check_write+0x18/0x20
[15361.918696] ? wait_for_completion+0x9d/0x4c0
[15361.918700] ? _raw_spin_unlock_irq+0x36/0x50
[15361.918703] ? wait_for_completion+0x179/0x4c0
[15361.918707] ? _raw_spin_unlock_irq+0x36/0x50
[15361.918710] ? wait_for_completion+0x179/0x4c0
[15361.918714] ? trace_preempt_on+0x54/0x100
[15361.918718] ? wait_for_completion+0x179/0x4c0
[15361.918723] wait_for_completion+0x181/0x4c0
[15361.918728] ? __pfx_wait_for_completion+0x10/0x10
[15361.918738] kthread_stop+0x152/0x470
[15361.918742] _torture_stop_kthread+0x44/0xc0 [torture 7af7f9cbba28271a10503b653f9e05d518fbc8c3]
[15361.918752] rcu_torture_cleanup+0x2ac/0xe90 [rcutorture f2cb1f556ee7956270927183c4c2c7749a336529]
[15361.918766] ? __pfx_rcu_torture_cleanup+0x10/0x10 [rcutorture f2cb1f556ee7956270927183c4c2c7749a336529]
[15361.918777] ? __kasan_check_write+0x18/0x20
[15361.918781] ? __mutex_unlock_slowpath+0x17c/0x670
[15361.918789] ? __might_fault+0xcd/0x180
[15361.918793] ? find_module_all+0x104/0x1d0
[15361.918799] __x64_sys_delete_module+0x2a4/0x3f0
[15361.918803] ? __pfx___x64_sys_delete_module+0x10/0x10
[15361.918807] ? syscall_exit_to_user_mode+0x149/0x280
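
The trace shows rmmod sleeping in kthread_stop() until the stall kthread
exits. For illustration only, the effect of the added check can be
sketched in userspace C. This is not the kernel code: the clock is
simulated, and simulate_stall(), stop_requested_at and check_should_stop
are hypothetical names standing in for the stall loop, the time at which
kthread_stop() is called, and the presence of the new
kthread_should_stop() test. ULONG_CMP_LT is written the same way as the
kernel's wrap-safe comparison macro.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Wrap-safe "a < b" for unsigned long, same form as the kernel macro. */
#define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b))

/*
 * Simulate the CPU-stall loop with a fake clock that advances one
 * second per iteration.
 *
 * now:               simulated current time in seconds
 * stall_secs:        plays the role of rcutorture.stall_cpu
 * stop_requested_at: hypothetical time at which kthread_stop() is
 *                    called, i.e. when kthread_should_stop() would
 *                    start returning true
 * check_should_stop: false models the old loop, true the patched one
 *
 * Returns the number of simulated seconds spent in the loop.
 */
static unsigned long simulate_stall(unsigned long now,
                                    unsigned long stall_secs,
                                    unsigned long stop_requested_at,
                                    bool check_should_stop)
{
    unsigned long start = now;
    unsigned long stop_at = now + stall_secs;

    /* The wrap-safe comparison only works for spans under half the range. */
    assert(stall_secs <= ULONG_MAX / 2);

    while (ULONG_CMP_LT(now, stop_at) &&
           !(check_should_stop && !ULONG_CMP_LT(now, stop_requested_at)))
        now++; /* stands in for preempt_schedule() or busy waiting */

    return now - start;
}
```

With stall_cpu=200 and a stop request assumed to arrive 15 simulated
seconds into the stall, simulate_stall(1000, 200, 1015, false) returns
200, while simulate_stall(1000, 200, 1015, true) returns 15. In the
second case kthread_stop() would no longer have to wait out the whole
stall.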

Thanks
Zqiang


> kernel/rcu/rcutorture.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index 3f9c3766f52b..6a3cd6ed8b25 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -2490,7 +2490,7 @@ static int rcu_torture_stall(void *args)
>  		pr_alert("%s start on CPU %d.\n",
>  			  __func__, raw_smp_processor_id());
>  		while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(),
> -				    stop_at))
> +				    stop_at) && !kthread_should_stop())
>  			if (stall_cpu_block) {
>  #ifdef CONFIG_PREEMPTION
>  				preempt_schedule();
> --
> 2.17.1
>

2024-03-26 18:03:14

by Paul E. McKenney

Subject: Re: [PATCH] rcutorture: Make stall-tasks directly exit when rcutorture tests end

On Thu, Mar 21, 2024 at 04:28:50PM +0800, Zqiang wrote:
> When the rcutorture tests start to exit, rcu_torture_cleanup() is
> invoked to stop kthreads and release resources. If the stall-task
> kthread exists, the CPU stall has already started, and
> rcutorture.stall_cpu is set to a large value, rcu_torture_cleanup()
> will block for a long time and a hung-task report may occur. This
> commit therefore adds a kthread_should_stop() check to the CPU-stall
> loop, so that when the rcutorture tests end the stall task exits
> directly instead of waiting for the CPU stall to finish.
>
> Signed-off-by: Zqiang <[email protected]>

Good eyes!

Queued for testing and further review, thank you!

Thanx, Paul

> ---
> kernel/rcu/rcutorture.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index 3f9c3766f52b..6a3cd6ed8b25 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -2490,7 +2490,7 @@ static int rcu_torture_stall(void *args)
>  		pr_alert("%s start on CPU %d.\n",
>  			  __func__, raw_smp_processor_id());
>  		while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(),
> -				    stop_at))
> +				    stop_at) && !kthread_should_stop())
>  			if (stall_cpu_block) {
>  #ifdef CONFIG_PREEMPTION
>  				preempt_schedule();
> --
> 2.17.1
>