When we were preparing the patch 6dcd5d7a7a29c1e, we made a mistake noticed
by Linus: schedule_timeout() was called without setting the task state to
anything particular. It calls the scheduler, but doesn't delay anything,
because the task stays runnable. That happens because sched_submit_work()
does nothing for tasks in TASK_RUNNING state.

Let's add a WARN_ONCE() under CONFIG_SCHED_DEBUG to detect such kernel
API misuse.

Signed-off-by: Alexander Popov <[email protected]>
---
kernel/time/timer.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 4820823515e9..52ad2d6ce352 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1887,6 +1887,11 @@ signed long __sched schedule_timeout(signed long timeout)
 		}
 	}
 
+#ifdef CONFIG_SCHED_DEBUG
+	WARN_ONCE(current->state == TASK_RUNNING,
+		  "schedule_timeout for TASK_RUNNING\n");
+#endif
+
 	expire = timeout + jiffies;
 
 	timer.task = current;
--
2.24.1
On Thu, 16 Jan 2020 17:02:18 +0300
Alexander Popov <[email protected]> wrote:
> When we were preparing the patch 6dcd5d7a7a29c1e, we made a mistake noticed
> by Linus: schedule_timeout() was called without setting the task state to
> anything particular. It calls the scheduler, but doesn't delay anything,
> because the task stays runnable. That happens because sched_submit_work()
> does nothing for tasks in TASK_RUNNING state.
>
> Let's add a WARN_ONCE() under CONFIG_SCHED_DEBUG to detect such kernel
> API misuse.
>
> Signed-off-by: Alexander Popov <[email protected]>
> ---
> kernel/time/timer.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 4820823515e9..52ad2d6ce352 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -1887,6 +1887,11 @@ signed long __sched schedule_timeout(signed long timeout)
>  		}
>  	}
>  
> +#ifdef CONFIG_SCHED_DEBUG
> +	WARN_ONCE(current->state == TASK_RUNNING,
> +		  "schedule_timeout for TASK_RUNNING\n");
> +#endif
> +

But this can trigger false warnings. For example, if we are waiting on
an event with a timeout:

	DEFINE_WAIT(wait);

	for (;;) {
		prepare_to_wait(&waitq, &wait, TASK_UNINTERRUPTIBLE);
		if (event)
			break;
		timeout = schedule_timeout(timeout);
		if (!timeout)
			break;
	}
	finish_wait(&waitq, &wait);

If the event happens after prepare_to_wait() but before
schedule_timeout() is called, the wakeup from the wait queue will set
this task's state back to TASK_RUNNING, which in turn triggers your
warning.

-- Steve
> expire = timeout + jiffies;
>
> timer.task = current;
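
The race described above can be walked through with a small userspace
model. This is only a sketch: every model_* name is a hypothetical
stand-in invented for illustration, not a kernel symbol, and the model
assumes that a runnable task gets its timeout back unchanged, which
simplifies what the real schedule_timeout() does.

```c
/* Userspace model only: the names mirror kernel concepts but are local stand-ins. */
enum model_state {
	MODEL_TASK_RUNNING,
	MODEL_TASK_UNINTERRUPTIBLE
};

struct model_task {
	enum model_state state;
	int warned;	/* set when the proposed WARN_ONCE() condition would be true */
};

/* Stand-in for prepare_to_wait(): mark the task as about to sleep. */
void model_prepare_to_wait(struct model_task *t)
{
	t->state = MODEL_TASK_UNINTERRUPTIBLE;
}

/* Stand-in for a wait-queue wakeup: the task becomes runnable again. */
void model_wake_up(struct model_task *t)
{
	t->state = MODEL_TASK_RUNNING;
}

/*
 * Stand-in for schedule_timeout() with the proposed check added.
 * Assumption of this model: a runnable task is not put to sleep and
 * gets the timeout back unchanged; a sleeping task sleeps for the
 * whole timeout and is then woken by the timer.
 */
long model_schedule_timeout(struct model_task *t, long timeout)
{
	if (t->state == MODEL_TASK_RUNNING) {
		t->warned = 1;
		return timeout;
	}
	t->state = MODEL_TASK_RUNNING;
	return 0;
}

/* The wakeup lands after the event check but before schedule_timeout(). */
int model_race_triggers_warning(void)
{
	struct model_task t = { MODEL_TASK_RUNNING, 0 };
	int event = 0;

	model_prepare_to_wait(&t);
	if (!event) {
		event = 1;		/* the event fires here ... */
		model_wake_up(&t);	/* ... and the waker makes the task runnable */
		model_schedule_timeout(&t, 100);
	}
	return t.warned;
}
```

In this model the caller did everything correctly, yet the check still
fires, which is the false positive being pointed out.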
On Thu, 16 Jan 2020 09:52:20 -0500
Steven Rostedt <[email protected]> wrote:
> > --- a/kernel/time/timer.c
> > +++ b/kernel/time/timer.c
> > @@ -1887,6 +1887,11 @@ signed long __sched schedule_timeout(signed long timeout)
> >  		}
> >  	}
> >  
> > +#ifdef CONFIG_SCHED_DEBUG
> > +	WARN_ONCE(current->state == TASK_RUNNING,
> > +		  "schedule_timeout for TASK_RUNNING\n");
> > +#endif
> > +
>
> But this can trigger false warnings. For example, if we are waiting on
> an event with a timeout:
Also, there are helpers here that you can use:

	schedule_timeout_interruptible(signed long timeout);
	schedule_timeout_uninterruptible(signed long timeout);

-- Steve
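
What makes these helpers useful is that they set the task state before
calling schedule_timeout(), so the caller cannot forget to do it. The
userspace sketch below models that idea only; the sketch_* names are
hypothetical stand-ins rather than kernel symbols, and the assumption
that a runnable task gets its timeout back unchanged is a
simplification.

```c
/* Userspace model only: the names mirror kernel concepts but are local stand-ins. */
enum sketch_state {
	SKETCH_TASK_RUNNING,
	SKETCH_TASK_UNINTERRUPTIBLE
};

/* Stand-in for current->state. */
enum sketch_state sketch_current_state = SKETCH_TASK_RUNNING;

/*
 * Stand-in for schedule_timeout(). Assumption of this model: a
 * runnable task does not sleep and gets the timeout back unchanged;
 * otherwise the task sleeps for the full timeout and is then woken.
 */
long sketch_schedule_timeout(long timeout)
{
	if (sketch_current_state == SKETCH_TASK_RUNNING)
		return timeout;
	sketch_current_state = SKETCH_TASK_RUNNING;
	return 0;
}

/* Stand-in for the helper: set the state first, then sleep. */
long sketch_schedule_timeout_uninterruptible(long timeout)
{
	sketch_current_state = SKETCH_TASK_UNINTERRUPTIBLE;
	return sketch_schedule_timeout(timeout);
}
```

In the model, calling sketch_schedule_timeout() directly from the
runnable state returns the timeout untouched, while the helper variant
sleeps for the full duration and returns 0.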
On 16.01.2020 17:52, Steven Rostedt wrote:
> On Thu, 16 Jan 2020 17:02:18 +0300
> Alexander Popov <[email protected]> wrote:
>
>> When we were preparing the patch 6dcd5d7a7a29c1e, we made a mistake noticed
>> by Linus: schedule_timeout() was called without setting the task state to
>> anything particular. It calls the scheduler, but doesn't delay anything,
>> because the task stays runnable. That happens because sched_submit_work()
>> does nothing for tasks in TASK_RUNNING state.
>>
>> Let's add a WARN_ONCE() under CONFIG_SCHED_DEBUG to detect such kernel
>> API misuse.
>>
>> Signed-off-by: Alexander Popov <[email protected]>
>> ---
>> kernel/time/timer.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
>> index 4820823515e9..52ad2d6ce352 100644
>> --- a/kernel/time/timer.c
>> +++ b/kernel/time/timer.c
>> @@ -1887,6 +1887,11 @@ signed long __sched schedule_timeout(signed long timeout)
>>  		}
>>  	}
>>  
>> +#ifdef CONFIG_SCHED_DEBUG
>> +	WARN_ONCE(current->state == TASK_RUNNING,
>> +		  "schedule_timeout for TASK_RUNNING\n");
>> +#endif
>> +
>
> But this can trigger false warnings. For example, if we are waiting on
> an event with a timeout:
>
>
> 	DEFINE_WAIT(wait);
>
> 	for (;;) {
> 		prepare_to_wait(&waitq, &wait, TASK_UNINTERRUPTIBLE);
> 		if (event)
> 			break;
> 		timeout = schedule_timeout(timeout);
> 		if (!timeout)
> 			break;
> 	}
> 	finish_wait(&waitq, &wait);
>
>
> If the event happens between "prepare_to_wait" and just before
> schedule_timeout(), the wait queue will set this task's state to
> TASK_RUNNING, which in turn triggers your warning.

Steven, thanks for the explanation.

If I understand you correctly, calling schedule_timeout() in
TASK_RUNNING state is, in some sense, intended behavior.

So the best thing I can do here is to add an explanatory comment to the
schedule_timeout() description.

Maybe that would help to avoid situations like this one:
https://lore.kernel.org/lkml/CAHk-=wgE-veRb7+mw9oMmsD97BLnL+q8Gxu0QRrK65S2yQfMdQ@mail.gmail.com/#t

I'll come back with a patch soon.

Best regards,
Alexander