For kernels built with PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP enabled,
running the RCU stall tests as follows triggers a scheduling-while-atomic BUG:
runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
rcutorture.stall_cpu_block=1" -d
[ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
[ 10.841073] rcu_torture_stall start on CPU 3.
[ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
....
[ 10.841108] Call Trace:
[ 10.841110] <TASK>
[ 10.841112] dump_stack_lvl+0x64/0xb0
[ 10.841118] dump_stack+0x10/0x20
[ 10.841121] __schedule_bug+0x8b/0xb0
[ 10.841126] __schedule+0x2172/0x2940
[ 10.841157] schedule+0x9b/0x150
[ 10.841160] schedule_timeout+0x2e8/0x4f0
[ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
[ 10.841195] rcu_torture_stall+0x2e8/0x300
[ 10.841199] kthread+0x175/0x1a0
[ 10.841206] ret_from_fork+0x2c/0x50
The above call trace occurs because rcu_torture_stall() calls
schedule_timeout() inside the local_irq_disable()/local_irq_enable()
critical section. Invoking schedule_timeout() also implies a quiescent
state, so it fails to trigger an RCU stall. This commit therefore uses
mdelay() instead of schedule_timeout() to trigger the RCU stall.
Suggested-by: Joel Fernandes <[email protected]>
Signed-off-by: Zqiang <[email protected]>
---
kernel/rcu/rcutorture.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index d06c2da04c34..a08a72bef5f1 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
#ifdef CONFIG_PREEMPTION
preempt_schedule();
#else
- schedule_timeout_uninterruptible(HZ);
+ mdelay(jiffies_to_msecs(HZ));
#endif
} else if (stall_no_softlockup) {
touch_softlockup_watchdog();
--
2.25.1
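The replacement argument jiffies_to_msecs(HZ) converts one second's worth of
jiffies into milliseconds, so the proposed mdelay() would busy-wait for about
one second, the same nominal duration as schedule_timeout_uninterruptible(HZ).
The following is a minimal user-space sketch of that arithmetic, not the kernel
implementation: model_jiffies_to_msecs() and its hz parameter are hypothetical
names, and only tick rates that divide 1000 or are a multiple of 1000 are modeled.

```c
#include <assert.h>

#define MSEC_PER_SEC 1000u

/*
 * User-space model of the simple cases of jiffies_to_msecs().
 * Assumption: hz is a run-time parameter here, whereas the kernel's HZ
 * is a compile-time constant.
 */
static unsigned int model_jiffies_to_msecs(unsigned long j, unsigned int hz)
{
	/* Only tick rates that divide 1000, or are a multiple of it, are modeled. */
	assert(hz > 0 && (MSEC_PER_SEC % hz == 0 || hz % MSEC_PER_SEC == 0));

	if (hz <= MSEC_PER_SEC)
		return (MSEC_PER_SEC / hz) * j;	/* e.g. 4 ms per jiffy at hz = 250 */

	/* More than one jiffy per millisecond: divide, rounding up. */
	return (j + (hz / MSEC_PER_SEC) - 1) / (hz / MSEC_PER_SEC);
}
```

Calling model_jiffies_to_msecs(hz, hz) for any modeled tick rate gives 1000,
which is why the two calls in the diff wait for the same nominal time and differ
only in whether the CPU blocks or spins.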
Hi Qiang,
> From: Zqiang <[email protected]>
> Sent: Monday, March 20, 2023 11:24 AM
> To: [email protected]; [email protected]; [email protected]
> Cc: [email protected]; [email protected]
> Subject: [PATCH v2] rcutorture: Convert schedule_timeout_uninterruptible()
> to mdelay() in rcu_torture_stall()
>
> For kernels built with enable PREEMPT_NONE and
s/enable/enabling/
> CONFIG_DEBUG_ATOMIC_SLEEP, running the RCU stall tests.
s/running/run
>
> runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1" -d
>
> [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> [ 10.841073] rcu_torture_stall start on CPU 3.
> [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> ....
> [ 10.841108] Call Trace:
> [ 10.841110] <TASK>
> [ 10.841112] dump_stack_lvl+0x64/0xb0
> [ 10.841118] dump_stack+0x10/0x20
> [ 10.841121] __schedule_bug+0x8b/0xb0
> [ 10.841126] __schedule+0x2172/0x2940
> [ 10.841157] schedule+0x9b/0x150
> [ 10.841160] schedule_timeout+0x2e8/0x4f0
> [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> [ 10.841195] rcu_torture_stall+0x2e8/0x300
> [ 10.841199] kthread+0x175/0x1a0
> [ 10.841206] ret_from_fork+0x2c/0x50
>
> The above calltrace occurs in the local_irq_disable/enable() critical section
> call schedule_timeout(), and invoke schedule_timeout() also implies a
> quiescent state, of course it also fails to trigger RCU stall, this commit
> therefore use mdelay() instead of schedule_timeout() to trigger RCU stall.
Tweak the commit description above to fix some grammar errors:
The above call trace occurred in the local_irq_disable/enable() critical section
when calling schedule_timeout() from rcu_torture_stall(). Invoking schedule_timeout()
also implies a quiescent state, so of course it also fails to trigger an RCU stall. This commit
therefore uses mdelay() instead of schedule_timeout() to trigger the RCU stall.
> Suggested-by: Joel Fernandes <[email protected]>
> Signed-off-by: Zqiang <[email protected]>
I could not reproduce the call trace after applying your patch.
So, with the above minor fixes:
Tested-by: Qiuxu Zhuo <[email protected]>
Thanks
-Qiuxu
> ---
> kernel/rcu/rcutorture.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index d06c2da04c34..a08a72bef5f1 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
> #ifdef CONFIG_PREEMPTION
> preempt_schedule();
> #else
> - schedule_timeout_uninterruptible(HZ);
> + mdelay(jiffies_to_msecs(HZ));
> #endif
> } else if (stall_no_softlockup) {
> touch_softlockup_watchdog();
> --
> 2.25.1
On Mon, Mar 20, 2023 at 11:24:22AM +0800, Zqiang wrote:
> For kernels built with enable PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP,
> running the RCU stall tests.
>
> runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1" -d
>
> [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> [ 10.841073] rcu_torture_stall start on CPU 3.
> [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> ....
> [ 10.841108] Call Trace:
> [ 10.841110] <TASK>
> [ 10.841112] dump_stack_lvl+0x64/0xb0
> [ 10.841118] dump_stack+0x10/0x20
> [ 10.841121] __schedule_bug+0x8b/0xb0
> [ 10.841126] __schedule+0x2172/0x2940
> [ 10.841157] schedule+0x9b/0x150
> [ 10.841160] schedule_timeout+0x2e8/0x4f0
> [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> [ 10.841195] rcu_torture_stall+0x2e8/0x300
> [ 10.841199] kthread+0x175/0x1a0
> [ 10.841206] ret_from_fork+0x2c/0x50
>
> The above calltrace occurs in the local_irq_disable/enable() critical
> section call schedule_timeout(), and invoke schedule_timeout() also
> implies a quiescent state, of course it also fails to trigger RCU stall,
> this commit therefore use mdelay() instead of schedule_timeout() to
> trigger RCU stall.
>
> Suggested-by: Joel Fernandes <[email protected]>
> Signed-off-by: Zqiang <[email protected]>
> ---
> kernel/rcu/rcutorture.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index d06c2da04c34..a08a72bef5f1 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
Right here there is:
if (stall_cpu_block) {
In other words, the rcutorture.stall_cpu_block module parameter says to
block, even if it is a bad thing to do. The point of this is to verify
the error messages that are supposed to be printed on the console when
this happens.
> #ifdef CONFIG_PREEMPTION
> preempt_schedule();
> #else
> - schedule_timeout_uninterruptible(HZ);
> + mdelay(jiffies_to_msecs(HZ));
So this really needs to stay schedule_timeout_uninterruptible(HZ).
So should there be a change to kernel-parameters.txt to make it
more clear that this is intended behavior?
Thanx, Paul
> #endif
> } else if (stall_no_softlockup) {
> touch_softlockup_watchdog();
> --
> 2.25.1
>
> For kernels built with enable PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP,
> running the RCU stall tests.
>
> runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1" -d
>
> [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> [ 10.841073] rcu_torture_stall start on CPU 3.
> [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> ....
> [ 10.841108] Call Trace:
> [ 10.841110] <TASK>
> [ 10.841112] dump_stack_lvl+0x64/0xb0
> [ 10.841118] dump_stack+0x10/0x20
> [ 10.841121] __schedule_bug+0x8b/0xb0
> [ 10.841126] __schedule+0x2172/0x2940
> [ 10.841157] schedule+0x9b/0x150
> [ 10.841160] schedule_timeout+0x2e8/0x4f0
> [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> [ 10.841195] rcu_torture_stall+0x2e8/0x300
> [ 10.841199] kthread+0x175/0x1a0
> [ 10.841206] ret_from_fork+0x2c/0x50
>
> The above calltrace occurs in the local_irq_disable/enable() critical
> section call schedule_timeout(), and invoke schedule_timeout() also
> implies a quiescent state, of course it also fails to trigger RCU stall,
> this commit therefore use mdelay() instead of schedule_timeout() to
> trigger RCU stall.
>
> Suggested-by: Joel Fernandes <[email protected]>
> Signed-off-by: Zqiang <[email protected]>
> ---
> kernel/rcu/rcutorture.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index d06c2da04c34..a08a72bef5f1 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
>
>Right here there is:
>
> if (stall_cpu_block) {
>
>In other words, the rcutorture.stall_cpu_block module parameter says to
>block, even if it is a bad thing to do. The point of this is to verify
>the error messages that are supposed to be printed on the console when
>this happens.
>
> #ifdef CONFIG_PREEMPTION
> preempt_schedule();
> #else
> - schedule_timeout_uninterruptible(HZ);
> + mdelay(jiffies_to_msecs(HZ));
>
>So this really needs to stay schedule_timeout_uninterruptible(HZ).
But invoking schedule_timeout_uninterruptible(HZ) implies a quiescent state, so it
will not cause an RCU stall, and it is called while still inside the RCU read-side
critical section (PREEMPT_COUNT=y).
No RCU stall occurred when I tested with the following parameters:
rcutorture.stall_cpu=30
rcutorture.stall_no_softlockup=1
rcutorture.stall_cpu_irqsoff=1
rcutorture.stall_cpu_block=1
Thanks
Zqiang
>
>So should there be a change to kernel-parameters.txt to make it
>more clear that this is intended behavior?
>
> Thanx, Paul
>
> #endif
> } else if (stall_no_softlockup) {
> touch_softlockup_watchdog();
> --
> 2.25.1
>
On Mon, Mar 20, 2023 at 11:05:17PM +0000, Zhang, Qiang1 wrote:
> > For kernels built with enable PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP,
> > running the RCU stall tests.
> >
> > runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> > bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> > rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> > rcutorture.stall_cpu_block=1" -d
> >
> > [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> > [ 10.841073] rcu_torture_stall start on CPU 3.
> > [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> > ....
> > [ 10.841108] Call Trace:
> > [ 10.841110] <TASK>
> > [ 10.841112] dump_stack_lvl+0x64/0xb0
> > [ 10.841118] dump_stack+0x10/0x20
> > [ 10.841121] __schedule_bug+0x8b/0xb0
> > [ 10.841126] __schedule+0x2172/0x2940
> > [ 10.841157] schedule+0x9b/0x150
> > [ 10.841160] schedule_timeout+0x2e8/0x4f0
> > [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> > [ 10.841195] rcu_torture_stall+0x2e8/0x300
> > [ 10.841199] kthread+0x175/0x1a0
> > [ 10.841206] ret_from_fork+0x2c/0x50
> >
> > The above calltrace occurs in the local_irq_disable/enable() critical
> > section call schedule_timeout(), and invoke schedule_timeout() also
> > implies a quiescent state, of course it also fails to trigger RCU stall,
> > this commit therefore use mdelay() instead of schedule_timeout() to
> > trigger RCU stall.
> >
> > Suggested-by: Joel Fernandes <[email protected]>
> > Signed-off-by: Zqiang <[email protected]>
> > ---
> > kernel/rcu/rcutorture.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index d06c2da04c34..a08a72bef5f1 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
> >
> >Right here there is:
> >
> > if (stall_cpu_block) {
> >
> >In other words, the rcutorture.stall_cpu_block module parameter says to
> >block, even if it is a bad thing to do. The point of this is to verify
> >the error messages that are supposed to be printed on the console when
> >this happens.
> >
> > #ifdef CONFIG_PREEMPTION
> > preempt_schedule();
> > #else
> > - schedule_timeout_uninterruptible(HZ);
> > + mdelay(jiffies_to_msecs(HZ));
> >
> >So this really needs to stay schedule_timeout_uninterruptible(HZ).
>
> But invoke schedule_timeout_uninterruptible(HZ) implies a quiescent state,
> this will not cause an RCU stall to occur, and still in the RCU read critical section(PREEMPT_COUNT=y).
>
> It didn't happen RCU stall when I tested with the following parameters for
> rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1
> rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1
Understood. If you want that RCU CPU stall in a CONFIG_PREEMPTION=n
kernel, you should not use rcutorture.stall_cpu_block=1.
In a CONFIG_PREEMPTION=y kernel, rcutorture.stall_cpu_block=1 forces
the grace period to be stalled on a task rather than a CPU, exercising
a different part of the RCU CPU stall warning code.
In a CONFIG_PREEMPTION=n kernel, using rcutorture.stall_cpu_block=1
forces the CPU to go through a quiescent state, as you say. It can
also cause lockdep and scheduling-while-atomic complaints, depending on
exactly what type of RCU reader is in effect.
So these are test-the-diagnostics parameters. The mdelay() instead
makes rcutorture.stall_cpu_block=1 do the same thing as does
rcutorture.stall_cpu_block=0 for CONFIG_PREEMPTION=n kernels, right?
Thanx, Paul
> Thanks
> Zqiang
>
> >
> >So should there be a change to kernel-parameters.txt to make it
> >more clear that this is intended behavior?
> >
> > Thanx, Paul
> >
> > #endif
> > } else if (stall_no_softlockup) {
> > touch_softlockup_watchdog();
> > --
> > 2.25.1
> >
> > For kernels built with enable PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP,
> > running the RCU stall tests.
> >
> > runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> > bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> > rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> > rcutorture.stall_cpu_block=1" -d
> >
> > [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> > [ 10.841073] rcu_torture_stall start on CPU 3.
> > [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> > ....
> > [ 10.841108] Call Trace:
> > [ 10.841110] <TASK>
> > [ 10.841112] dump_stack_lvl+0x64/0xb0
> > [ 10.841118] dump_stack+0x10/0x20
> > [ 10.841121] __schedule_bug+0x8b/0xb0
> > [ 10.841126] __schedule+0x2172/0x2940
> > [ 10.841157] schedule+0x9b/0x150
> > [ 10.841160] schedule_timeout+0x2e8/0x4f0
> > [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> > [ 10.841195] rcu_torture_stall+0x2e8/0x300
> > [ 10.841199] kthread+0x175/0x1a0
> > [ 10.841206] ret_from_fork+0x2c/0x50
> >
> > The above calltrace occurs in the local_irq_disable/enable() critical
> > section call schedule_timeout(), and invoke schedule_timeout() also
> > implies a quiescent state, of course it also fails to trigger RCU stall,
> > this commit therefore use mdelay() instead of schedule_timeout() to
> > trigger RCU stall.
> >
> > Suggested-by: Joel Fernandes <[email protected]>
> > Signed-off-by: Zqiang <[email protected]>
> > ---
> > kernel/rcu/rcutorture.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index d06c2da04c34..a08a72bef5f1 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
> >
> >Right here there is:
> >
> > if (stall_cpu_block) {
> >
> >In other words, the rcutorture.stall_cpu_block module parameter says to
> >block, even if it is a bad thing to do. The point of this is to verify
> >the error messages that are supposed to be printed on the console when
> >this happens.
> >
> > #ifdef CONFIG_PREEMPTION
> > preempt_schedule();
> > #else
> > - schedule_timeout_uninterruptible(HZ);
> > + mdelay(jiffies_to_msecs(HZ));
> >
> >So this really needs to stay schedule_timeout_uninterruptible(HZ).
>
> But invoke schedule_timeout_uninterruptible(HZ) implies a quiescent state,
> this will not cause an RCU stall to occur, and still in the RCU read critical section(PREEMPT_COUNT=y).
>
> It didn't happen RCU stall when I tested with the following parameters for
> rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1
> rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1
>
>Understood. If you want that RCU CPU stall in a CONFIG_PREEMPTION=n
>kernel, you should not use rcutorture.stall_cpu_block=1.
>
>In a CONFIG_PREEMPTION=y kernel, rcutorture.stall_cpu_block=1 forces
>the grace period to be stalled on a task rather than a CPU, exercising
>a different part of the RCU CPU stall warning code.
>
>In a CONFIG_PREEMPTION=n kernel, using rcutorture.stall_cpu_block=1
>forces the CPU to go through a quiescent state, as you say. It can
>also cause lockdep and scheduling-while-atomic complaints, depending on
>exactly what type of RCU reader is in effect.
>
>So these are test-the-diagnostics parameters. The mdelay() instead
>makes rcutorture.stall_cpu_block=1 do the same thing as does
>rcutorture.stall_cpu_block=0 for CONFIG_PREEMPTION=n kernels, right?
Yes, maybe we can expand the description of stall_cpu_block in kernel-parameters.txt.
>
> Thanx, Paul
>
> Thanks
> Zqiang
>
> >
> >So should there be a change to kernel-parameters.txt to make it
> >more clear that this is intended behavior?
Agreed.
Thanks
Zqiang
> >
> > Thanx, Paul
> >
> > #endif
> > } else if (stall_no_softlockup) {
> > touch_softlockup_watchdog();
> > --
> > 2.25.1
> >
> From: Paul E. McKenney <[email protected]>
> [...]
> > But invoke schedule_timeout_uninterruptible(HZ) implies a quiescent
> > state, this will not cause an RCU stall to occur, and still in the RCU read
> > critical section(PREEMPT_COUNT=y).
> >
> > It didn't happen RCU stall when I tested with the following parameters
> > for
> > rcutorture.stall_cpu=30
> > rcutorture.stall_no_softlockup=1
> > rcutorture.stall_cpu_irqsoff=1
> > rcutorture.stall_cpu_block=1
>
> Understood. If you want that RCU CPU stall in a CONFIG_PREEMPTION=n
> kernel, you should not use rcutorture.stall_cpu_block=1.
>
Verified.
With rcutorture.stall_cpu_block=0, the expected RCU CPU stall is triggered with either
torture_type=srcu or torture_type=rcu.
> In a CONFIG_PREEMPTION=y kernel, rcutorture.stall_cpu_block=1 forces the
> grace period to be stalled on a task rather than a CPU, exercising a different
> part of the RCU CPU stall warning code.
>
> In a CONFIG_PREEMPTION=n kernel, using rcutorture.stall_cpu_block=1
> forces the CPU to go through a quiescent state, as you say. It can also cause
> lockdep and scheduling-while-atomic complaints, depending on exactly what
> type of RCU reader is in effect.
>
Verified.
With rcutorture.stall_cpu_block=1:
There were lockdep and scheduling-while-atomic complaints for torture_type=rcu.
There were no lockdep or scheduling-while-atomic complaints for torture_type=srcu.
> So these are test-the-diagnostics parameters. The mdelay() instead makes
> rcutorture.stall_cpu_block=1 do the same thing as does
> rcutorture.stall_cpu_block=0 for CONFIG_PREEMPTION=n kernels, right?
Good to know that these are test-the-diagnostics parameters, and what their expected behaviors are. ;-)
Thanks!
-Qiuxu
> Thanx, Paul
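The behavior described in this thread can be summarized as a small decision
table. The sketch below is a user-space model only, assuming just the two inputs
discussed above (whether CONFIG_PREEMPTION is set and the value of
rcutorture.stall_cpu_block). The enum, the function names, and the self-check
are hypothetical and do not appear in rcutorture.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the outcomes described in the thread. */
enum stall_outcome {
	STALL_ON_CPU,		/* reader spins: RCU CPU stall warning expected */
	STALL_ON_TASK,		/* grace period stalls on a task rather than a CPU */
	QUIESCENT_NO_STALL,	/* CPU passes through a quiescent state: no stall;
				 * lockdep and scheduling-while-atomic complaints
				 * are possible, depending on the RCU reader type */
};

/* Model of what the stall test is expected to exercise. */
static enum stall_outcome model_stall_outcome(bool preemption, bool stall_cpu_block)
{
	if (!stall_cpu_block)
		return STALL_ON_CPU;
	return preemption ? STALL_ON_TASK : QUIESCENT_NO_STALL;
}

/* Encodes the observations reported in the thread. */
static void model_self_check(void)
{
	/* CONFIG_PREEMPTION=n, stall_cpu_block=0: expected stall was seen. */
	assert(model_stall_outcome(false, false) == STALL_ON_CPU);
	/* CONFIG_PREEMPTION=n, stall_cpu_block=1: no stall was seen. */
	assert(model_stall_outcome(false, true) == QUIESCENT_NO_STALL);
	/* CONFIG_PREEMPTION=y, stall_cpu_block=1: stall is on a task. */
	assert(model_stall_outcome(true, true) == STALL_ON_TASK);
}
```

Calling model_self_check() from a test program exercises all three cases. Under
this model, replacing the blocking call with mdelay() would collapse the
QUIESCENT_NO_STALL case into STALL_ON_CPU, which is the concern raised above:
rcutorture.stall_cpu_block=1 would then behave like
rcutorture.stall_cpu_block=0 on CONFIG_PREEMPTION=n kernels.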