2020-08-31 11:21:53

by Lucas Stach

Subject: [PATCH] sched/deadline: Fix stale throttling on de-/boosted tasks

When a boosted task gets throttled, what normally happens is that it's
immediately enqueued again with ENQUEUE_REPLENISH, which replenishes the
runtime and clears the dl_throttled flag. There is a special case however:
if the throttling happened on sched-out and the task has been deboosted in
the meantime, the replenish is skipped as the task will return to its
normal scheduling class. This leaves the task with the dl_throttled flag
set.

Now if the task gets boosted up to the deadline scheduling class again
while it is sleeping, it's still in the throttled state. The normal wakeup
however will enqueue the task with ENQUEUE_REPLENISH not set, so we don't
actually place it on the rq. Thus we end up with a task that is runnable,
but not actually on the rq and neither an immediate replenishment happens,
nor is the replenishment timer set up, so the task is stuck in
forever-throttled limbo.

Clear the dl_throttled flag before dropping back to the normal scheduling
class to fix this issue.

Signed-off-by: Lucas Stach <[email protected]>
---
This is the root cause and fix of the issue described at [1]. After working
on other stuff for the last few months, I finally was able to circle back
to this issue and gather the required data to pinpoint the failure mode.

[1] https://lkml.org/lkml/2020/3/20/765
---
kernel/sched/deadline.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 3862a28cd05d..c19c1883d695 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1527,12 +1527,15 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 		pi_se = &pi_task->dl;
 	} else if (!dl_prio(p->normal_prio)) {
 		/*
-		 * Special case in which we have a !SCHED_DEADLINE task
-		 * that is going to be deboosted, but exceeds its
-		 * runtime while doing so. No point in replenishing
-		 * it, as it's going to return back to its original
-		 * scheduling class after this.
+		 * Special case in which we have a !SCHED_DEADLINE task that is going
+		 * to be deboosted, but exceeds its runtime while doing so. No point in
+		 * replenishing it, as it's going to return back to its original
+		 * scheduling class after this. If it has been throttled, we need to
+		 * clear the flag, otherwise the task may wake up as throttled after
+		 * being boosted again with no means to replenish the runtime and clear
+		 * the throttle.
 		 */
+		p->dl.dl_throttled = 0;
 		BUG_ON(!p->dl.dl_boosted || flags != ENQUEUE_REPLENISH);
 		return;
 	}
--
2.20.1


2020-09-02 06:02:41

by Juri Lelli

Subject: Re: [PATCH] sched/deadline: Fix stale throttling on de-/boosted tasks

Hi,

On 31/08/20 13:07, Lucas Stach wrote:
> When a boosted task gets throttled, what normally happens is that it's
> immediately enqueued again with ENQUEUE_REPLENISH, which replenishes the
> runtime and clears the dl_throttled flag. There is a special case however:
> if the throttling happened on sched-out and the task has been deboosted in
> the meantime, the replenish is skipped as the task will return to its
> normal scheduling class. This leaves the task with the dl_throttled flag
> set.
>
> Now if the task gets boosted up to the deadline scheduling class again
> while it is sleeping, it's still in the throttled state. The normal wakeup
> however will enqueue the task with ENQUEUE_REPLENISH not set, so we don't
> actually place it on the rq. Thus we end up with a task that is runnable,
> but not actually on the rq and neither an immediate replenishment happens,
> nor is the replenishment timer set up, so the task is stuck in
> forever-throttled limbo.
>
> Clear the dl_throttled flag before dropping back to the normal scheduling
> class to fix this issue.
>
> Signed-off-by: Lucas Stach <[email protected]>
> ---
> This is the root cause and fix of the issue described at [1]. After working
> on other stuff for the last few months, I finally was able to circle back
> to this issue and gather the required data to pinpoint the failure mode.
>
> [1] https://lkml.org/lkml/2020/3/20/765
> ---
> kernel/sched/deadline.c | 13 ++++++++-----
> 1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 3862a28cd05d..c19c1883d695 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1527,12 +1527,15 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
>  		pi_se = &pi_task->dl;
>  	} else if (!dl_prio(p->normal_prio)) {
>  		/*
> -		 * Special case in which we have a !SCHED_DEADLINE task
> -		 * that is going to be deboosted, but exceeds its
> -		 * runtime while doing so. No point in replenishing
> -		 * it, as it's going to return back to its original
> -		 * scheduling class after this.
> +		 * Special case in which we have a !SCHED_DEADLINE task that is going
> +		 * to be deboosted, but exceeds its runtime while doing so. No point in
> +		 * replenishing it, as it's going to return back to its original
> +		 * scheduling class after this. If it has been throttled, we need to
> +		 * clear the flag, otherwise the task may wake up as throttled after
> +		 * being boosted again with no means to replenish the runtime and clear
> +		 * the throttle.
>  		 */
> +		p->dl.dl_throttled = 0;
>  		BUG_ON(!p->dl.dl_boosted || flags != ENQUEUE_REPLENISH);
>  		return;
>  	}

Ah, right, thanks for looking into this issue!

Wonder if we should be calling __dl_clear_params() instead of just
clearing dl_throttled, but what you propose makes sense to me.

Acked-by: Juri Lelli <[email protected]>

Best,

Juri

2020-09-02 09:47:47

by Peter Zijlstra

Subject: Re: [PATCH] sched/deadline: Fix stale throttling on de-/boosted tasks

On Wed, Sep 02, 2020 at 08:00:24AM +0200, Juri Lelli wrote:
> On 31/08/20 13:07, Lucas Stach wrote:
> > When a boosted task gets throttled, what normally happens is that it's
> > immediately enqueued again with ENQUEUE_REPLENISH, which replenishes the
> > runtime and clears the dl_throttled flag. There is a special case however:
> > if the throttling happened on sched-out and the task has been deboosted in
> > the meantime, the replenish is skipped as the task will return to its
> > normal scheduling class. This leaves the task with the dl_throttled flag
> > set.
> >
> > Now if the task gets boosted up to the deadline scheduling class again
> > while it is sleeping, it's still in the throttled state. The normal wakeup
> > however will enqueue the task with ENQUEUE_REPLENISH not set, so we don't
> > actually place it on the rq. Thus we end up with a task that is runnable,
> > but not actually on the rq and neither an immediate replenishment happens,
> > nor is the replenishment timer set up, so the task is stuck in
> > forever-throttled limbo.
> >
> > Clear the dl_throttled flag before dropping back to the normal scheduling
> > class to fix this issue.
> >
> > Signed-off-by: Lucas Stach <[email protected]>

> Acked-by: Juri Lelli <[email protected]>

Thanks!

2020-09-09 14:16:34

by Lucas Stach

Subject: Re: [PATCH] sched/deadline: Fix stale throttling on de-/boosted tasks

On Wed, 2020-09-02 at 11:43 +0200, [email protected] wrote:
> On Wed, Sep 02, 2020 at 08:00:24AM +0200, Juri Lelli wrote:
> > On 31/08/20 13:07, Lucas Stach wrote:
> > > When a boosted task gets throttled, what normally happens is that it's
> > > immediately enqueued again with ENQUEUE_REPLENISH, which replenishes the
> > > runtime and clears the dl_throttled flag. There is a special case however:
> > > if the throttling happened on sched-out and the task has been deboosted in
> > > the meantime, the replenish is skipped as the task will return to its
> > > normal scheduling class. This leaves the task with the dl_throttled flag
> > > set.
> > >
> > > Now if the task gets boosted up to the deadline scheduling class again
> > > while it is sleeping, it's still in the throttled state. The normal wakeup
> > > however will enqueue the task with ENQUEUE_REPLENISH not set, so we don't
> > > actually place it on the rq. Thus we end up with a task that is runnable,
> > > but not actually on the rq and neither an immediate replenishment happens,
> > > nor is the replenishment timer set up, so the task is stuck in
> > > forever-throttled limbo.
> > >
> > > Clear the dl_throttled flag before dropping back to the normal scheduling
> > > class to fix this issue.
> > >
> > > Signed-off-by: Lucas Stach <[email protected]>
> > Acked-by: Juri Lelli <[email protected]>
>
> Thanks!

Does this mean the patch will get picked up as-is, or are there any
changes required?

Regards,
Lucas

Subject: Re: [PATCH] sched/deadline: Fix stale throttling on de-/boosted tasks

On 9/2/20 11:43 AM, [email protected] wrote:
> On Wed, Sep 02, 2020 at 08:00:24AM +0200, Juri Lelli wrote:
>> On 31/08/20 13:07, Lucas Stach wrote:
>>> When a boosted task gets throttled, what normally happens is that it's
>>> immediately enqueued again with ENQUEUE_REPLENISH, which replenishes the
>>> runtime and clears the dl_throttled flag. There is a special case however:
>>> if the throttling happened on sched-out and the task has been deboosted in
>>> the meantime, the replenish is skipped as the task will return to its
>>> normal scheduling class. This leaves the task with the dl_throttled flag
>>> set.
>>>
>>> Now if the task gets boosted up to the deadline scheduling class again
>>> while it is sleeping, it's still in the throttled state. The normal wakeup
>>> however will enqueue the task with ENQUEUE_REPLENISH not set, so we don't
>>> actually place it on the rq. Thus we end up with a task that is runnable,
>>> but not actually on the rq and neither an immediate replenishment happens,
>>> nor is the replenishment timer set up, so the task is stuck in
>>> forever-throttled limbo.
>>>
>>> Clear the dl_throttled flag before dropping back to the normal scheduling
>>> class to fix this issue.
>>>
>>> Signed-off-by: Lucas Stach <[email protected]>
>
>> Acked-by: Juri Lelli <[email protected]>

I faced a similar issue, but involving DL tasks (not !DL):
https://lore.kernel.org/lkml/5076e003450835ec74e6fa5917d02c4fa41687e6.1600170294.git.bristot@redhat.com/

While debugging that problem, I reviewed and tested this patch,
and we need it. So:

Reviewed-by: Daniel Bristot de Oliveira <[email protected]>

Thanks!
-- Daniel


> Thanks!
>

Subject: [tip: sched/core] sched/deadline: Fix stale throttling on de-/boosted tasks

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 46fcc4b00c3cca8adb9b7c9afdd499f64e427135
Gitweb: https://git.kernel.org/tip/46fcc4b00c3cca8adb9b7c9afdd499f64e427135
Author: Lucas Stach <[email protected]>
AuthorDate: Mon, 31 Aug 2020 13:07:19 +02:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Fri, 25 Sep 2020 14:23:24 +02:00

sched/deadline: Fix stale throttling on de-/boosted tasks

When a boosted task gets throttled, what normally happens is that it's
immediately enqueued again with ENQUEUE_REPLENISH, which replenishes the
runtime and clears the dl_throttled flag. There is a special case however:
if the throttling happened on sched-out and the task has been deboosted in
the meantime, the replenish is skipped as the task will return to its
normal scheduling class. This leaves the task with the dl_throttled flag
set.

Now if the task gets boosted up to the deadline scheduling class again
while it is sleeping, it's still in the throttled state. The normal wakeup
however will enqueue the task with ENQUEUE_REPLENISH not set, so we don't
actually place it on the rq. Thus we end up with a task that is runnable,
but not actually on the rq and neither an immediate replenishment happens,
nor is the replenishment timer set up, so the task is stuck in
forever-throttled limbo.

Clear the dl_throttled flag before dropping back to the normal scheduling
class to fix this issue.

Signed-off-by: Lucas Stach <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Juri Lelli <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/sched/deadline.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 3862a28..c19c188 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1527,12 +1527,15 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 		pi_se = &pi_task->dl;
 	} else if (!dl_prio(p->normal_prio)) {
 		/*
-		 * Special case in which we have a !SCHED_DEADLINE task
-		 * that is going to be deboosted, but exceeds its
-		 * runtime while doing so. No point in replenishing
-		 * it, as it's going to return back to its original
-		 * scheduling class after this.
+		 * Special case in which we have a !SCHED_DEADLINE task that is going
+		 * to be deboosted, but exceeds its runtime while doing so. No point in
+		 * replenishing it, as it's going to return back to its original
+		 * scheduling class after this. If it has been throttled, we need to
+		 * clear the flag, otherwise the task may wake up as throttled after
+		 * being boosted again with no means to replenish the runtime and clear
+		 * the throttle.
 		 */
+		p->dl.dl_throttled = 0;
 		BUG_ON(!p->dl.dl_boosted || flags != ENQUEUE_REPLENISH);
 		return;
 	}
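
The failure mode described in the commit message can be sketched as a small userspace toy model. This is not kernel code: struct toy_dl_entity, the TOY_* flags and all toy_* functions are hypothetical names made up for illustration, and the single has_dl_donor field is an assumed simplification of the pi_task / dl_boosted checks in enqueue_task_dl(). The model only tracks the throttled flag and whether the task ends up on the rq.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for the kernel's sched_dl_entity. */
struct toy_dl_entity {
	bool dl_throttled;	/* runtime exhausted, waiting for a replenishment */
	bool has_dl_donor;	/* assumption: true while a deadline task boosts us */
	bool on_rq;		/* task is actually queued on the runqueue */
};

/* Toy flag values; they do not match the kernel's ENQUEUE_* constants. */
enum { TOY_ENQUEUE_WAKEUP = 0x1, TOY_ENQUEUE_REPLENISH = 0x2 };

/* Runtime overrun noticed at sched-out: the task is throttled and dequeued. */
static void toy_throttle(struct toy_dl_entity *se)
{
	se->dl_throttled = true;
	se->on_rq = false;
}

/* Simplified model of the enqueue path touched by the patch. */
static void toy_enqueue(struct toy_dl_entity *se, int flags, bool apply_fix)
{
	if (!se->has_dl_donor) {
		/* About to be deboosted: the replenish is skipped. */
		assert(flags == TOY_ENQUEUE_REPLENISH);
		if (apply_fix)
			se->dl_throttled = false;	/* the one-line fix */
		return;
	}

	if (flags & TOY_ENQUEUE_REPLENISH)
		se->dl_throttled = false;	/* replenishing clears the throttle */

	if (se->dl_throttled)
		return;		/* throttled and no replenish: not placed on the rq */

	se->on_rq = true;
}

/* Replays the sequence from the commit message; returns whether the task is queued. */
static bool toy_scenario_task_queued(bool apply_fix)
{
	struct toy_dl_entity se = { .has_dl_donor = true };

	toy_throttle(&se);				/* throttled on sched-out */
	se.has_dl_donor = false;			/* deboosted in the meantime */
	toy_enqueue(&se, TOY_ENQUEUE_REPLENISH, apply_fix);
	se.has_dl_donor = true;				/* boosted again while sleeping */
	toy_enqueue(&se, TOY_ENQUEUE_WAKEUP, apply_fix);	/* normal wakeup */

	return se.on_rq;
}
```

In this model, toy_scenario_task_queued(false) leaves the task off the rq with the stale throttled flag still set, while toy_scenario_task_queued(true) clears the flag on the deboost path so the later wakeup queues the task normally.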