2006-03-18 05:06:20

by Mike Galbraith

Subject: [2.6.16-rc6 patch] fix interactive task starvation

Greetings,

The patch below fixes a starvation problem that occurs when a stream of
highly interactive tasks delays an array switch for extended periods
despite EXPIRED_STARVING(rq) being true. AFAICT, the only choice is to
enqueue awakening tasks on the expired array in this case.

Without this patch, it can be nearly impossible to log in remotely to a
busy server, and interactive shell commands can starve for minutes.

This has not been verified by anyone. Comments?

-Mike

Signed-off-by: Mike Galbraith <[email protected]>

--- linux-2.6.16-rc6/kernel/sched.c.org 2006-03-17 14:48:35.000000000 +0100
+++ linux-2.6.16-rc6/kernel/sched.c 2006-03-17 17:41:25.000000000 +0100
@@ -662,11 +662,30 @@
}

/*
+ * We place interactive tasks back into the active array, if possible.
+ *
+ * To guarantee that this does not starve expired tasks we ignore the
+ * interactivity of a task if the first expired task had to wait more
+ * than a 'reasonable' amount of time. This deadline timeout is
+ * load-dependent, as the frequency of array switched decreases with
+ * increasing number of running tasks. We also ignore the interactivity
+ * if a better static_prio task has expired:
+ */
+#define EXPIRED_STARVING(rq) \
+ ((STARVATION_LIMIT && ((rq)->expired_timestamp && \
+ (jiffies - (rq)->expired_timestamp >= \
+ STARVATION_LIMIT * ((rq)->nr_running) + 1))) || \
+ ((rq)->curr->static_prio > (rq)->best_expired_prio))
+
+/*
* __activate_task - move a task to the runqueue.
*/
static inline void __activate_task(task_t *p, runqueue_t *rq)
{
- enqueue_task(p, rq->active);
+ prio_array_t *array = rq->active;
+ if (unlikely(EXPIRED_STARVING(rq)))
+ array = rq->expired;
+ enqueue_task(p, array);
rq->nr_running++;
}

@@ -2461,22 +2480,6 @@
}

/*
- * We place interactive tasks back into the active array, if possible.
- *
- * To guarantee that this does not starve expired tasks we ignore the
- * interactivity of a task if the first expired task had to wait more
- * than a 'reasonable' amount of time. This deadline timeout is
- * load-dependent, as the frequency of array switched decreases with
- * increasing number of running tasks. We also ignore the interactivity
- * if a better static_prio task has expired:
- */
-#define EXPIRED_STARVING(rq) \
- ((STARVATION_LIMIT && ((rq)->expired_timestamp && \
- (jiffies - (rq)->expired_timestamp >= \
- STARVATION_LIMIT * ((rq)->nr_running) + 1))) || \
- ((rq)->curr->static_prio > (rq)->best_expired_prio))
-
-/*
* Account user cpu time to a process.
* @p: the process that the cpu time gets accounted to
* @hardirq_offset: the offset to subtract from hardirq_count()



2006-03-18 05:18:25

by Andrew Morton

Subject: Re: [2.6.16-rc6 patch] fix interactive task starvation

Mike Galbraith <[email protected]> wrote:
>
> +#define EXPIRED_STARVING(rq) \
> + ((STARVATION_LIMIT && ((rq)->expired_timestamp && \
> + (jiffies - (rq)->expired_timestamp >= \
> + STARVATION_LIMIT * ((rq)->nr_running) + 1))) || \
> + ((rq)->curr->static_prio > (rq)->best_expired_prio))

Does this have to be a macro?

2006-03-18 05:48:50

by Mike Galbraith

Subject: Re: [2.6.16-rc6 patch] fix interactive task starvation

On Fri, 2006-03-17 at 21:15 -0800, Andrew Morton wrote:
> Mike Galbraith <[email protected]> wrote:
> >
> > +#define EXPIRED_STARVING(rq) \
> > + ((STARVATION_LIMIT && ((rq)->expired_timestamp && \
> > + (jiffies - (rq)->expired_timestamp >= \
> > + STARVATION_LIMIT * ((rq)->nr_running) + 1))) || \
> > + ((rq)->curr->static_prio > (rq)->best_expired_prio))
>
> Does this have to be a macro?
>

I suppose not, now inlined.

-Mike

Signed-off-by: Mike Galbraith <[email protected]>

--- linux-2.6.16-rc6/kernel/sched.c.org 2006-03-17 14:48:35.000000000 +0100
+++ linux-2.6.16-rc6/kernel/sched.c 2006-03-18 06:43:59.000000000 +0100
@@ -662,11 +662,36 @@
}

/*
+ * We place interactive tasks back into the active array, if possible.
+ *
+ * To guarantee that this does not starve expired tasks we ignore the
+ * interactivity of a task if the first expired task had to wait more
+ * than a 'reasonable' amount of time. This deadline timeout is
+ * load-dependent, as the frequency of array switched decreases with
+ * increasing number of running tasks. We also ignore the interactivity
+ * if a better static_prio task has expired:
+ */
+static inline int expired_starving(runqueue_t *rq)
+{
+ int limit = STARVATION_LIMIT * rq->nr_running, starving;
+
+ if (!limit || !rq->expired_timestamp)
+ return 0;
+ starving = jiffies - rq->expired_timestamp >= limit;
+ starving += rq->curr->static_prio > rq->best_expired_prio;
+
+ return starving;
+}
+
+/*
* __activate_task - move a task to the runqueue.
*/
static inline void __activate_task(task_t *p, runqueue_t *rq)
{
- enqueue_task(p, rq->active);
+ prio_array_t *array = rq->active;
+ if (unlikely(expired_starving(rq)))
+ array = rq->expired;
+ enqueue_task(p, array);
rq->nr_running++;
}

@@ -2461,22 +2486,6 @@
}

/*
- * We place interactive tasks back into the active array, if possible.
- *
- * To guarantee that this does not starve expired tasks we ignore the
- * interactivity of a task if the first expired task had to wait more
- * than a 'reasonable' amount of time. This deadline timeout is
- * load-dependent, as the frequency of array switched decreases with
- * increasing number of running tasks. We also ignore the interactivity
- * if a better static_prio task has expired:
- */
-#define EXPIRED_STARVING(rq) \
- ((STARVATION_LIMIT && ((rq)->expired_timestamp && \
- (jiffies - (rq)->expired_timestamp >= \
- STARVATION_LIMIT * ((rq)->nr_running) + 1))) || \
- ((rq)->curr->static_prio > (rq)->best_expired_prio))
-
-/*
* Account user cpu time to a process.
* @p: the process that the cpu time gets accounted to
* @hardirq_offset: the offset to subtract from hardirq_count()
@@ -2611,7 +2620,7 @@

if (!rq->expired_timestamp)
rq->expired_timestamp = jiffies;
- if (!TASK_INTERACTIVE(p) || EXPIRED_STARVING(rq)) {
+ if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
enqueue_task(p, rq->expired);
if (p->static_prio < rq->best_expired_prio)
rq->best_expired_prio = p->static_prio;


2006-03-18 06:25:00

by Andrew Morton

Subject: Re: [2.6.16-rc6 patch] fix interactive task starvation

Mike Galbraith <[email protected]> wrote:
>
> > Does this have to be a macro?
> >
>
> I suppose not, now inlined.
>

It would be nice to uninline the function and then to modify it in a
followup patch. That way, we get to see what changed, which is one of the
reasons to not use megamacros (sorry).

> +static inline int expired_starving(runqueue_t *rq)
> +{
> + int limit = STARVATION_LIMIT * rq->nr_running, starving;
> +
> + if (!limit || !rq->expired_timestamp)
> + return 0;
> + starving = jiffies - rq->expired_timestamp >= limit;
> + starving += rq->curr->static_prio > rq->best_expired_prio;
> +
> + return starving;
> +}

Ick. Is that really what that macro does??

The function returns a boolean, so we should short-circuit the evaluation
where possible.


static inline int expired_starving(runqueue_t *rq)
{
	int limit;

	/* Comment goes here */
	if (!rq->expired_timestamp)
		return 0;

	limit = STARVATION_LIMIT * rq->nr_running;

	/* Here too */
	if (!limit)
		return 0;

	/* And here */
	if (jiffies - rq->expired_timestamp >= limit)
		return 1;

	/* And here */
	if (rq->curr->static_prio > rq->best_expired_prio)
		return 1;

	/* And here */
	return 0;
}

This way

a) We get somewhere to put comments describing each step of the logic.

b) We get to select the order of the comparisons in decreasing
(probability*expensiveness) order.

See how you're performing an unneeded multiplication if
!rq->expired_timestamp?

c) See how the first test of `limit' comes after that multiplication?
STARVATION_LIMIT is a constant (isn't it?) If so, we need only test
rq->nr_running.

d) The next guy who comes along has to update the comments ;)
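
On the question above of whether the first inline conversion matches the macro: a minimal userspace sketch suggests the two can disagree in at least two cases. Everything prefixed `mock_` below, and the value chosen for `STARVATION_LIMIT`, is a hypothetical stand-in for illustration and not kernel code. The sketch only mirrors the boolean structure of the two versions quoted in this thread.

```c
#include <stdbool.h>

/* Assumed nonzero constant; the real value lives in kernel/sched.c. */
#define STARVATION_LIMIT 1000UL

/* Hypothetical stand-ins for task_t, runqueue_t and jiffies. */
struct mock_task {
	int static_prio;
};

struct mock_rq {
	unsigned long expired_timestamp;
	unsigned long nr_running;
	int best_expired_prio;
	struct mock_task *curr;
};

static unsigned long mock_jiffies;

/* Same boolean structure as the EXPIRED_STARVING() macro. */
static bool macro_version(const struct mock_rq *rq)
{
	return (STARVATION_LIMIT && (rq->expired_timestamp &&
		(mock_jiffies - rq->expired_timestamp >=
		 STARVATION_LIMIT * rq->nr_running + 1))) ||
	       (rq->curr->static_prio > rq->best_expired_prio);
}

/* Same structure as the first inline conversion ("now inlined"). */
static bool inline_version(const struct mock_rq *rq)
{
	unsigned long limit = STARVATION_LIMIT * rq->nr_running;
	int starving;

	/* Early exit: the priority test below is never reached here. */
	if (!limit || !rq->expired_timestamp)
		return false;
	/* No "+ 1" on the limit, unlike the macro. */
	starving = mock_jiffies - rq->expired_timestamp >= limit;
	starving += rq->curr->static_prio > rq->best_expired_prio;

	return starving != 0;
}
```

With `expired_timestamp == 0` and `curr->static_prio > best_expired_prio`, the macro form evaluates true while the inline form returns 0, because the macro ORs the priority test in unconditionally. When the elapsed time equals `STARVATION_LIMIT * nr_running` exactly, the results flip the other way because of the dropped `+ 1`. Whether either state can actually occur on a live runqueue is not something this sketch establishes.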


2006-03-18 07:28:04

by Mike Galbraith

Subject: Re: [2.6.16-rc6 patch] fix interactive task starvation

On Fri, 2006-03-17 at 22:22 -0800, Andrew Morton wrote:
> Mike Galbraith <[email protected]> wrote:
> >
> > > Does this have to be a macro?
> > >
> >
> > I suppose not, now inlined.
> >
>
> It would be nice to uninline the function and then to modify it in a
> followup patch. That way, we get to see what changed, which is one of the
> reasons to not use megamacros (sorry).

Ok, take 3 below, with updated main comment as well.


> The function returns a boolean, so we should short-circuit the evaluation
> where possible.

Done.

-Mike

Signed-off-by: Mike Galbraith <[email protected]>

--- linux-2.6.16-rc6/kernel/sched.c.org 2006-03-17 14:48:35.000000000 +0100
+++ linux-2.6.16-rc6/kernel/sched.c 2006-03-18 08:03:34.000000000 +0100
@@ -662,11 +662,56 @@
}

/*
+ * We place interactive tasks back into the active array, if possible.
+ *
+ * To guarantee that this does not starve expired tasks we ignore the
+ * interactivity of a task if the first expired task had to wait more
+ * than a 'reasonable' amount of time. This deadline timeout is
+ * load-dependent, as the frequency of array switched decreases with
+ * increasing number of running tasks. We also ignore the interactivity
+ * if a better static_prio task has expired, and switch periodically
+ * regardless, to ensure that highly interactive tasks do not starve
+ * the less fortunate for unreasonably long periods.
+ */
+static int expired_starving(runqueue_t *rq)
+{
+ int limit;
+
+ /*
+ * Arrays were recently switched, all is well.
+ */
+ if (!rq->expired_timestamp)
+ return 0;
+
+ limit = STARVATION_LIMIT * rq->nr_running;
+
+ /*
+ * It's time to switch arrays.
+ */
+ if (jiffies - rq->expired_timestamp >= limit)
+ return 1;
+
+ /*
+ * There's a better selection in the expired array.
+ */
+ if (rq->curr->static_prio > rq->best_expired_prio)
+ return 1;
+
+ /*
+ * All is well.
+ */
+ return 0;
+}
+
+/*
* __activate_task - move a task to the runqueue.
*/
static inline void __activate_task(task_t *p, runqueue_t *rq)
{
- enqueue_task(p, rq->active);
+ prio_array_t *array = rq->active;
+ if (unlikely(expired_starving(rq)))
+ array = rq->expired;
+ enqueue_task(p, array);
rq->nr_running++;
}

@@ -2461,22 +2506,6 @@
}

/*
- * We place interactive tasks back into the active array, if possible.
- *
- * To guarantee that this does not starve expired tasks we ignore the
- * interactivity of a task if the first expired task had to wait more
- * than a 'reasonable' amount of time. This deadline timeout is
- * load-dependent, as the frequency of array switched decreases with
- * increasing number of running tasks. We also ignore the interactivity
- * if a better static_prio task has expired:
- */
-#define EXPIRED_STARVING(rq) \
- ((STARVATION_LIMIT && ((rq)->expired_timestamp && \
- (jiffies - (rq)->expired_timestamp >= \
- STARVATION_LIMIT * ((rq)->nr_running) + 1))) || \
- ((rq)->curr->static_prio > (rq)->best_expired_prio))
-
-/*
* Account user cpu time to a process.
* @p: the process that the cpu time gets accounted to
* @hardirq_offset: the offset to subtract from hardirq_count()
@@ -2611,7 +2640,7 @@

if (!rq->expired_timestamp)
rq->expired_timestamp = jiffies;
- if (!TASK_INTERACTIVE(p) || EXPIRED_STARVING(rq)) {
+ if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
enqueue_task(p, rq->expired);
if (p->static_prio < rq->best_expired_prio)
rq->best_expired_prio = p->static_prio;


2006-03-18 07:36:24

by Andrew Morton

Subject: Re: [2.6.16-rc6 patch] fix interactive task starvation

Mike Galbraith <[email protected]> wrote:
>
> On Fri, 2006-03-17 at 22:22 -0800, Andrew Morton wrote:
> > Mike Galbraith <[email protected]> wrote:
> > >
> > > > Does this have to be a macro?
> > > >
> > >
> > > I suppose not, now inlined.
> > >
> >
> > It would be nice to uninline the function and then to modify it in a
> > followup patch. That way, we get to see what changed, which is one of the
> > reasons to not use megamacros (sorry).
>
> Ok, take 3 below, with updated main comment as well.
>

Thanks for doing that. Not really your job, but someone has to do these
things ;)

>
> Signed-off-by: Mike Galbraith <[email protected]>

You can't trick me that easily - I kept a copy of your changelog!

> /*
> + * We place interactive tasks back into the active array, if possible.
> + *
> + * To guarantee that this does not starve expired tasks we ignore the
> + * interactivity of a task if the first expired task had to wait more
> + * than a 'reasonable' amount of time. This deadline timeout is
> + * load-dependent, as the frequency of array switched decreases with
> + * increasing number of running tasks. We also ignore the interactivity
> + * if a better static_prio task has expired, and switch periodically
> + * regardless, to ensure that highly interactive tasks do not starve
> + * the less fortunate for unreasonably long periods.
> + */
> +static int expired_starving(runqueue_t *rq)

I'll make that inline..

> +{
> + int limit;
> +
> + /*
> + * Arrays were recently switched, all is well.
> + */
> + if (!rq->expired_timestamp)
> + return 0;
> +
> + limit = STARVATION_LIMIT * rq->nr_running;

In the previous iteration you had, effectively,

if (!limit)
return 0;

in here. But it's now gone. Deliberate?

> + /*
> + * It's time to switch arrays.
> + */
> + if (jiffies - rq->expired_timestamp >= limit)
> + return 1;
> +
> + /*
> + * There's a better selection in the expired array.
> + */
> + if (rq->curr->static_prio > rq->best_expired_prio)
> + return 1;
> +
> + /*
> + * All is well.
> + */
> + return 0;
> +}

2006-03-18 07:46:57

by Mike Galbraith

Subject: Re: [2.6.16-rc6 patch] fix interactive task starvation

On Fri, 2006-03-17 at 23:33 -0800, Andrew Morton wrote:
> > +static int expired_starving(runqueue_t *rq)
>
> I'll make that inline..
>

Oops, I understood you to want that uninlined.

> > +{
> > + int limit;
> > +
> > + /*
> > + * Arrays were recently switched, all is well.
> > + */
> > + if (!rq->expired_timestamp)
> > + return 0;
> > +
> > + limit = STARVATION_LIMIT * rq->nr_running;
>
> In the previous iteration you had, effectively,
>
> if (!limit)
> return 0;
>
> in here. But it's now gone. Deliberate?

Yes. I see no way for it to be zero. I think that was just a leftover.

-Mike
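
For reference, a small userspace sketch of what dropping the `!limit` test changes, assuming the product could ever be zero (the reply above argues it cannot). The struct, the function names and the constant value are hypothetical stand-ins, not kernel code. Because the elapsed time is an unsigned difference, `elapsed >= 0` always holds, so a zero limit would make the version without the check report starvation whenever `expired_timestamp` is set.

```c
#include <stdbool.h>

/* Hypothetical stand-in; the value is illustrative only. */
#define MOCK_STARVATION_LIMIT 1000UL

struct mock_rq {
	unsigned long expired_timestamp;
	unsigned long nr_running;
	int curr_static_prio;
	int best_expired_prio;
};

/* Mirrors the structure of the version without the !limit test. */
static bool starving_without_check(const struct mock_rq *rq, unsigned long now)
{
	unsigned long limit;

	if (!rq->expired_timestamp)
		return false;

	limit = MOCK_STARVATION_LIMIT * rq->nr_running;

	/* With limit == 0 this unsigned comparison is always true. */
	if (now - rq->expired_timestamp >= limit)
		return true;

	return rq->curr_static_prio > rq->best_expired_prio;
}

/* Same logic with the earlier "if (!limit) return 0;" kept. */
static bool starving_with_check(const struct mock_rq *rq, unsigned long now)
{
	unsigned long limit;

	if (!rq->expired_timestamp)
		return false;

	limit = MOCK_STARVATION_LIMIT * rq->nr_running;
	if (!limit)
		return false;

	if (now - rq->expired_timestamp >= limit)
		return true;

	return rq->curr_static_prio > rq->best_expired_prio;
}
```

For any nonzero `nr_running` the two functions agree, which is consistent with treating the removed test as a leftover.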

2006-03-18 07:55:06

by Andrew Morton

Subject: Re: [2.6.16-rc6 patch] fix interactive task starvation

Mike Galbraith <[email protected]> wrote:
>
> On Fri, 2006-03-17 at 23:33 -0800, Andrew Morton wrote:
> > > +static int expired_starving(runqueue_t *rq)
> >
> > I'll make that inline..
> >
>
> Oops, I understood you to want that uninlined.
>

It has just one callsite. Modern gcc should inline it anyway, but older
versions tend to need help.

> > > +{
> > > + int limit;
> > > +
> > > + /*
> > > + * Arrays were recently switched, all is well.
> > > + */
> > > + if (!rq->expired_timestamp)
> > > + return 0;
> > > +
> > > + limit = STARVATION_LIMIT * rq->nr_running;
> >
> > In the previous iteration you had, effectively,
> >
> > if (!limit)
> > return 0;
> >
> > in here. But it's now gone. Deliberate?
>
> Yes. I see no way for it to be zero. I think that was just a leftover.
>

ok..

2006-03-18 08:08:47

by Andrew Morton

Subject: Re: [2.6.16-rc6 patch] fix interactive task starvation

Mike Galbraith <[email protected]> wrote:
>
> The patch below fixes a starvation problem that occurs when a stream of
> highly interactive tasks delays an array switch for extended periods
> despite EXPIRED_STARVING(rq) being true. AFAICT, the only choice is to
> enqueue awakening tasks on the expired array in this case.
>
> Without this patch, it can be nearly impossible to log in remotely to a
> busy server, and interactive shell commands can starve for minutes.
>
> This has not been verified by anyone. Comments?

What does that question mean, btw?


-mm is looking like linux-2.6.38 at present so of course things got tangled
up - sched-activate-sched-batch-expired.patch modifies __activate_task().

I ended up with the below.

Which do we think is more likely to be true - batch_task(p) or
expired_starving(rq)? batch_task() looks cheaper to evaluate so I put that
first. But I guess it's less likely to be true. hmm.
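
The ordering question comes down to C's short-circuit rule for `||`: the right operand is evaluated only when the left one is false. A minimal sketch with call counters makes that visible. The names `cheap_check`, `costly_check` and `should_expire` are hypothetical and merely stand in for `batch_task(p)` and `expired_starving(rq)`.

```c
#include <stdbool.h>

/* Counters recording how often each operand is actually evaluated. */
static int cheap_calls;
static int costly_calls;

/* Stand-in for the cheaper test (batch_task(p) in the thread). */
static bool cheap_check(bool result)
{
	cheap_calls++;
	return result;
}

/* Stand-in for the more expensive test (expired_starving(rq)). */
static bool costly_check(bool result)
{
	costly_calls++;
	return result;
}

/* The costly test runs only when the cheap one is false. */
static bool should_expire(bool is_batch, bool is_starving)
{
	return cheap_check(is_batch) || costly_check(is_starving);
}
```

Putting the cheap test first saves the costly one only for tasks where the cheap test is true. If that is rare, the costly test runs on nearly every call regardless of order, so the choice matters little.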


diff -puN kernel/sched.c~sched-fix-interactive-task-starvation kernel/sched.c
--- devel/kernel/sched.c~sched-fix-interactive-task-starvation 2006-03-17 23:55:12.000000000 -0800
+++ devel-akpm/kernel/sched.c 2006-03-17 23:59:03.000000000 -0800
@@ -733,14 +733,56 @@ static inline void dec_nr_running(task_t
}

/*
+ * We place interactive tasks back into the active array, if possible.
+ *
+ * To guarantee that this does not starve expired tasks we ignore the
+ * interactivity of a task if the first expired task had to wait more
+ * than a 'reasonable' amount of time. This deadline timeout is
+ * load-dependent, as the frequency of array switched decreases with
+ * increasing number of running tasks. We also ignore the interactivity
+ * if a better static_prio task has expired, and switch periodically
+ * regardless, to ensure that highly interactive tasks do not starve
+ * the less fortunate for unreasonably long periods.
+ */
+static inline int expired_starving(runqueue_t *rq)
+{
+ int limit;
+
+ /*
+ * Arrays were recently switched, all is well
+ */
+ if (!rq->expired_timestamp)
+ return 0;
+
+ limit = STARVATION_LIMIT * rq->nr_running;
+
+ /*
+ * It's time to switch arrays
+ */
+ if (jiffies - rq->expired_timestamp >= limit)
+ return 1;
+
+ /*
+ * There's a better selection in the expired array
+ */
+ if (rq->curr->static_prio > rq->best_expired_prio)
+ return 1;
+
+ /*
+ * All is well
+ */
+ return 0;
+}
+
+/*
* __activate_task - move a task to the runqueue.
*/
static void __activate_task(task_t *p, runqueue_t *rq)
{
prio_array_t *target = rq->active;

- if (batch_task(p))
- target = rq->expired;
+ if (unlikely(batch_task(p) || expired_starving(rq)))
+ target = rq->expired;
enqueue_task(p, target);
inc_nr_running(p, rq);
}
@@ -2614,22 +2656,6 @@ unsigned long long current_sched_time(co
}

/*
- * We place interactive tasks back into the active array, if possible.
- *
- * To guarantee that this does not starve expired tasks we ignore the
- * interactivity of a task if the first expired task had to wait more
- * than a 'reasonable' amount of time. This deadline timeout is
- * load-dependent, as the frequency of array switched decreases with
- * increasing number of running tasks. We also ignore the interactivity
- * if a better static_prio task has expired:
- */
-#define EXPIRED_STARVING(rq) \
- ((STARVATION_LIMIT && ((rq)->expired_timestamp && \
- (jiffies - (rq)->expired_timestamp >= \
- STARVATION_LIMIT * ((rq)->nr_running) + 1))) || \
- ((rq)->curr->static_prio > (rq)->best_expired_prio))
-
-/*
* Account user cpu time to a process.
* @p: the process that the cpu time gets accounted to
* @hardirq_offset: the offset to subtract from hardirq_count()
@@ -2764,7 +2790,7 @@ void scheduler_tick(void)

if (!rq->expired_timestamp)
rq->expired_timestamp = jiffies;
- if (!TASK_INTERACTIVE(p) || EXPIRED_STARVING(rq)) {
+ if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
enqueue_task(p, rq->expired);
if (p->static_prio < rq->best_expired_prio)
rq->best_expired_prio = p->static_prio;
_

2006-03-18 08:16:08

by Con Kolivas

Subject: Re: [2.6.16-rc6 patch] fix interactive task starvation

On Saturday 18 March 2006 19:05, Andrew Morton wrote:
> Mike Galbraith <[email protected]> wrote:
> > The patch below fixes a starvation problem that occurs when a stream of
> > highly interactive tasks delays an array switch for extended periods
> > despite EXPIRED_STARVING(rq) being true. AFAICT, the only choice is to
> > enqueue awakening tasks on the expired array in this case.
> >
> > Without this patch, it can be nearly impossible to log in remotely to a
> > busy server, and interactive shell commands can starve for minutes.
> >
> > This has not been verified by anyone. Comments?
>
> What does that question mean, btw?

He's waiting for me to say I don't like it. But I do like it.

> -mm is looking like linux-2.6.38 at present so of course things got tangled
> up - sched-activate-sched-batch-expired.patch modifies __activate_task().
>
> I ended up with the below.
>
> Which do we think is more likely to be true - batch_task(p) or
> expired_starving(rq)? batch_task() looks cheaper to evaluate so I put that
> first. But I guess it's less likely to be true. hmm.

Depends entirely on workload so it's impossible to predict in advance. Any
order will do I suspect.

> + if (unlikely(batch_task(p) || expired_starving(rq)))

Looks good to me.

Acked-by: Con Kolivas <[email protected]>

Cheers,
Con

2006-03-18 08:53:28

by Ingo Molnar

Subject: Re: [2.6.16-rc6 patch] fix interactive task starvation


* Mike Galbraith <[email protected]> wrote:

> Signed-off-by: Mike Galbraith <[email protected]>

looks good to me.

Acked-by: Ingo Molnar <[email protected]>

Ingo

2006-03-18 08:55:11

by Ingo Molnar

Subject: Re: [2.6.16-rc6 patch] fix interactive task starvation


* Andrew Morton <[email protected]> wrote:

> I ended up with the below.
>
> Which do we think is more likely to be true - batch_task(p) or
> expired_starving(rq)? batch_task() looks cheaper to evaluate so I put
> that first. But I guess it's less likely to be true. hmm.

it doesn't really matter - scheduler_tick() is a slowpath. Putting
batch_task() first is ok.

Ingo

2006-03-18 09:42:14

by Mike Galbraith

Subject: Re: [2.6.16-rc6 patch] fix interactive task starvation

On Sat, 2006-03-18 at 19:15 +1100, Con Kolivas wrote:
> On Saturday 18 March 2006 19:05, Andrew Morton wrote:
> > Mike Galbraith <[email protected]> wrote:
> > > > The patch below fixes a starvation problem that occurs when a stream of
> > > > highly interactive tasks delays an array switch for extended periods
> > > > despite EXPIRED_STARVING(rq) being true. AFAICT, the only choice is to
> > > > enqueue awakening tasks on the expired array in this case.
> > > >
> > > > Without this patch, it can be nearly impossible to log in remotely to a
> > > > busy server, and interactive shell commands can starve for minutes.
> > >
> > > This has not been verified by anyone. Comments?
> >
> > What does that question mean, btw?
>
> He's waiting for me to say I don't like it. But I do like it.

<chuckle>

No. Actually, I'm waiting for Ingo to just yawn and nuke the problem
with a casual flick of his pinkie.

-Mike