2003-09-09 19:58:17

by John Yau

[permalink] [raw]
Subject: Priority Inversion in Scheduling

Hi folks,

I've noticed some very interesting priority inversion scenarios in the test4
scheduling.

For example let's say you have task dumps-a-lot, which is a CPU hog and
dumps a lot of data to stdout and task interactive/real-time (e.g. xmms,
Emacs, Mozilla). When stdout of dumps-a-lot is directed to a terminal under
X, X's priority is demoted to 25, because X spends a lot of time rendering
data from dumps-a-lot. In the mean while, because dumps-a-lot is not
actually doing much because it's sleeping quite a lot, it's priority is at
15-17 or so and continues to flood X whenever it gets a chance to. This
leaves the interactive/real-time, who's priorities are at 15-17 as well have
an effective priority of 25 because they have to wait for X to service them,
thus making them feel not so interactive anymore to the user. When the
stdout of dumps-a-lot is pointed to /dev/null, interactive/real-time
responds just fine.

To get around scenarios such as this and priority inversions in general, I
propose to have some sort of priority inheritance mechanism in futex and the
scheduler. If a task is blocked by something of lower priority, the higher
priority task "donates" its time to the lower priority task in hopes of
resolving the block. The time is charged to the higher priority task in
this situation so that it cannot do this forever without being penalized.
This way in the above scenario dumps-a-lot gets penalized for being a CPU
hog and interactive/real-time stays interactive though they get penalized
too.

I'd like some feedback on the above proposal, especially from folks working
heavily on the scheduler. If the consensus is that it'd be worthwhile to
have such a mechanism, I'll go ahead and code a patch. I haven't had a
chance to look at code outside of Linus' branch in detail, so Nick, Con,
Ingo, or Andrew have already dealt with this, let me know and point me to
the code.


John Yau


2003-09-10 00:23:58

by Nick Piggin

[permalink] [raw]
Subject: Re: Priority Inversion in Scheduling



John Yau wrote:

>Hi folks,
>
>I've noticed some very interesting priority inversion scenarios in the test4
>scheduling.
>
>For example let's say you have task dumps-a-lot, which is a CPU hog and
>dumps a lot of data to stdout and task interactive/real-time (e.g. xmms,
>Emacs, Mozilla). When stdout of dumps-a-lot is directed to a terminal under
>X, X's priority is demoted to 25, because X spends a lot of time rendering
>data from dumps-a-lot. In the mean while, because dumps-a-lot is not
>actually doing much because it's sleeping quite a lot, it's priority is at
>15-17 or so and continues to flood X whenever it gets a chance to. This
>leaves the interactive/real-time, who's priorities are at 15-17 as well have
>an effective priority of 25 because they have to wait for X to service them,
>thus making them feel not so interactive anymore to the user. When the
>stdout of dumps-a-lot is pointed to /dev/null, interactive/real-time
>responds just fine.
>
>To get around scenarios such as this and priority inversions in general, I
>propose to have some sort of priority inheritance mechanism in futex and the
>scheduler. If a task is blocked by something of lower priority, the higher
>priority task "donates" its time to the lower priority task in hopes of
>resolving the block. The time is charged to the higher priority task in
>this situation so that it cannot do this forever without being penalized.
>This way in the above scenario dumps-a-lot gets penalized for being a CPU
>hog and interactive/real-time stays interactive though they get penalized
>too.
>
>I'd like some feedback on the above proposal, especially from folks working
>heavily on the scheduler. If the consensus is that it'd be worthwhile to
>have such a mechanism, I'll go ahead and code a patch. I haven't had a
>chance to look at code outside of Linus' branch in detail, so Nick, Con,
>Ingo, or Andrew have already dealt with this, let me know and point me to
>the code.
>

Hi John,
Your mechanism is basically "backboost". Its how you get X to keep a
high piroirity, but quite unpredictable. Giving a boost to a process
holding a semaphore is an interesting idea, but it doesn't address the
X problem.

The scheduler in Linus' tree is basically obsolete now, so there isn't
any point testing it really. Test Con's or my patches, and let us know
if you're still having problems with sir dumps-a-lot.


2003-09-10 04:38:13

by Mike Galbraith

[permalink] [raw]
Subject: Re: Priority Inversion in Scheduling

At 02:23 AM 9/10/2003, Nick Piggin wrote:


>John Yau wrote:
>
>>Hi folks,
>>
>>I've noticed some very interesting priority inversion scenarios in the test4
>>scheduling.
>>
>>For example let's say you have task dumps-a-lot, which is a CPU hog and
>>dumps a lot of data to stdout and task interactive/real-time (e.g. xmms,
>>Emacs, Mozilla). When stdout of dumps-a-lot is directed to a terminal under
>>X, X's priority is demoted to 25, because X spends a lot of time rendering
>>data from dumps-a-lot. In the mean while, because dumps-a-lot is not
>>actually doing much because it's sleeping quite a lot, it's priority is at
>>15-17 or so and continues to flood X whenever it gets a chance to. This
>>leaves the interactive/real-time, who's priorities are at 15-17 as well have
>>an effective priority of 25 because they have to wait for X to service them,
>>thus making them feel not so interactive anymore to the user. When the
>>stdout of dumps-a-lot is pointed to /dev/null, interactive/real-time
>>responds just fine.
>>
>>To get around scenarios such as this and priority inversions in general, I
>>propose to have some sort of priority inheritance mechanism in futex and the
>>scheduler. If a task is blocked by something of lower priority, the higher
>>priority task "donates" its time to the lower priority task in hopes of
>>resolving the block. The time is charged to the higher priority task in
>>this situation so that it cannot do this forever without being penalized.
>>This way in the above scenario dumps-a-lot gets penalized for being a CPU
>>hog and interactive/real-time stays interactive though they get penalized
>>too.
>>
>>I'd like some feedback on the above proposal, especially from folks working
>>heavily on the scheduler. If the consensus is that it'd be worthwhile to
>>have such a mechanism, I'll go ahead and code a patch. I haven't had a
>>chance to look at code outside of Linus' branch in detail, so Nick, Con,
>>Ingo, or Andrew have already dealt with this, let me know and point me to
>>the code.
>
>Hi John,
>Your mechanism is basically "backboost". Its how you get X to keep a
>high piroirity, but quite unpredictable. Giving a boost to a process
>holding a semaphore is an interesting idea, but it doesn't address the
>X problem.

FWIW, I tried the hardware usage bonus thing, and it does cure the X
inversion problem (yeah, it's a pretty cheezy way to do it). It also
cures xmms skips if you can't get to the top without hw usage. I also
tried a cpu limited backboost from/to tasks associated with hardware, and
it hasn't run amok... yet ;-)

-Mike

2003-09-10 05:42:16

by John Yau

[permalink] [raw]
Subject: Re: Priority Inversion in Scheduling

>
> Well I haven't read the paper, but I'm guessing this is semaphore
> priority inheritance.
>

Yip...it outlines the basic idea of the priority inheritance where semphores
are the locking mechanism. However, though it buys you better prioritized
scheduling in certain situation, things can get rather hairy when you have
lots of semaphores nested inside each other. If you have a cyclic
dependency somewhere in those nested semaphores, you're headed for deadlock
or worse livelock. The Mars Rover had a classic case of priority inversion
in it's software that was later solved through this approach via an update
after it landed on Mars a while back.

>
> I _think_ communication with X will mostly be done with waitqueues.
> Someone has a priority inheritance futex patch around. I'm not sure
> that it is such an open and shut case as you think though. Even if you
> could use futexes in communication with X.
>

It's not an open and shut case...actually it'd be quite an undertaking I
suspect. Because a poorly designed and/or poorly implemented scheme can
easily lead to deadlocks and what not, any implementation would need massive
peer review.

2003-09-10 05:35:32

by Mike Fedyk

[permalink] [raw]
Subject: Re: Priority Inversion in Scheduling

On Wed, Sep 10, 2003 at 06:42:10AM +0200, Mike Galbraith wrote:
> At 02:23 AM 9/10/2003, Nick Piggin wrote:
> >Hi John,
> >Your mechanism is basically "backboost". Its how you get X to keep a
> >high piroirity, but quite unpredictable. Giving a boost to a process
> >holding a semaphore is an interesting idea, but it doesn't address the
> >X problem.
>
> FWIW, I tried the hardware usage bonus thing, and it does cure the X
> inversion problem (yeah, it's a pretty cheezy way to do it). It also
> cures xmms skips if you can't get to the top without hw usage. I also
> tried a cpu limited backboost from/to tasks associated with hardware, and
> it hasn't run amok... yet ;-)

Against which scheduler, and when are you going to post the patch?

2003-09-10 06:18:53

by Mike Galbraith

[permalink] [raw]
Subject: Re: Priority Inversion in Scheduling

At 07:35 AM 9/10/2003, Mike Fedyk wrote:
>On Wed, Sep 10, 2003 at 06:42:10AM +0200, Mike Galbraith wrote:
> > At 02:23 AM 9/10/2003, Nick Piggin wrote:
> > >Hi John,
> > >Your mechanism is basically "backboost". Its how you get X to keep a
> > >high piroirity, but quite unpredictable. Giving a boost to a process
> > >holding a semaphore is an interesting idea, but it doesn't address the
> > >X problem.
> >
> > FWIW, I tried the hardware usage bonus thing, and it does cure the X
> > inversion problem (yeah, it's a pretty cheezy way to do it). It also
> > cures xmms skips if you can't get to the top without hw usage. I also
> > tried a cpu limited backboost from/to tasks associated with hardware, and
> > it hasn't run amok... yet ;-)
>
>Against which scheduler, and when are you going to post the patch?

Against stock test-4, but I'm not going to post it. It's just an
experiment to verify that there is another simple way to defeat the X
inversion problem (while retaining active list requeue). Also, backboost
is a tricky little bugger, and I thought I'd let Nick know that I had some
success with this heavily restricted form. (global backboost can be down
right evil)

If anyone having inversion or concurrency troubles wants to give it a try
for grins, they can drop me a line. My tree tends to morph a lot though,
depending on what aspect of scheduling I'm tinkering with at the time. It
currently does well at defeating known starvation issues, but I don't like
it's priority distribution much (and it's not destined for inclusion, and
it's pretty darn ugly, and I'll likely break it all to pieces again soon,
and...;).

-Mike

2003-09-10 09:29:17

by Nick Piggin

[permalink] [raw]
Subject: Re: Priority Inversion in Scheduling



Mike Galbraith wrote:

> At 07:35 AM 9/10/2003, Mike Fedyk wrote:
>
>> On Wed, Sep 10, 2003 at 06:42:10AM +0200, Mike Galbraith wrote:
>> > At 02:23 AM 9/10/2003, Nick Piggin wrote:
>> > >Hi John,
>> > >Your mechanism is basically "backboost". Its how you get X to keep a
>> > >high piroirity, but quite unpredictable. Giving a boost to a process
>> > >holding a semaphore is an interesting idea, but it doesn't address
>> the
>> > >X problem.
>> >
>> > FWIW, I tried the hardware usage bonus thing, and it does cure the X
>> > inversion problem (yeah, it's a pretty cheezy way to do it). It also
>> > cures xmms skips if you can't get to the top without hw usage. I also
>> > tried a cpu limited backboost from/to tasks associated with
>> hardware, and
>> > it hasn't run amok... yet ;-)
>>
>> Against which scheduler, and when are you going to post the patch?
>
>
> Against stock test-4, but I'm not going to post it. It's just an
> experiment to verify that there is another simple way to defeat the X
> inversion problem (while retaining active list requeue). Also,
> backboost is a tricky little bugger, and I thought I'd let Nick know
> that I had some success with this heavily restricted form. (global
> backboost can be down right evil)
>
> If anyone having inversion or concurrency troubles wants to give it a
> try for grins, they can drop me a line. My tree tends to morph a lot
> though, depending on what aspect of scheduling I'm tinkering with at
> the time. It currently does well at defeating known starvation
> issues, but I don't like it's priority distribution much (and it's not
> destined for inclusion, and it's pretty darn ugly, and I'll likely
> break it all to pieces again soon, and...;).


Sounds interesting. I my scheduler doesn't have any inversion or
starvation issues that I know of without backboost though. I'd like to
know if you find any.


2003-09-10 10:43:39

by Mike Galbraith

[permalink] [raw]
Subject: Re: Priority Inversion in Scheduling

At 11:28 AM 9/10/2003, Nick Piggin wrote:

>Sounds interesting. I my scheduler doesn't have any inversion or
>starvation issues that I know of without backboost though. I'd like to
>know if you find any.

I haven't been able to stimulate inversion, or found any terminal
starvation with your mods. The only things I can see is array switch
latency making X choppy under load, and the cpu distribution differences
shown by contest (and both may have changed considerably since last version
tested).

-Mike