2003-06-18 07:39:59

by Felipe Alfaro Solana

[permalink] [raw]
Subject: O(1) scheduler starvation

Hi!

I've been poking around and found the following link on O(1) scheduler
starvation problems:

http://www.hpl.hp.com/research/linux/kernel/o1-starve.php

The web page contains a small test program which supposedly is able to
make two processes starvate. However, I've been unable to reproduce what
is described in the above link. In fact, the CPU usage ratio ranges
between 60-40% or 70-30% in the worst cases.

I'm running 2.5.72-mm1 with Mike Galbraith's scheduler patches and a
small patch I made myself to improve interactivity (mainly, to stop XMMS
from skipping by adjusting some scheduler parameters).

What's going on here? Is the previous article outdated, or have the
starvation problems been definitively fixed?

Thanks!


Attachments:
galbraith.patch (7.20 kB)
sched.patch (695.00 B)
Download all attachments

2003-06-18 11:46:44

by Mike Galbraith

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
>Hi!
>
>I've been poking around and found the following link on O(1) scheduler
>starvation problems:
>
>http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
>
>The web page contains a small test program which supposedly is able to
>make two processes starvate. However, I've been unable to reproduce what
>is described in the above link. In fact, the CPU usage ratio ranges
>between 60-40% or 70-30% in the worst cases.

(you're talking about with my monotinic_clock() diff in your kernel right?)

If you examine the priorities vs cpu usage, therein lies a big fat bug.

I think the fundamental problem is that you can only execute in series, but
can sleep in parallel, which makes for more sleep time existing than all
execution time combined. If you're running test-starve with my
monotonic_clock() diff, you should notice that one task is at maximum
priority and eating ~75% cpu, while the other is at minumum and getting the
rest minus what top gets. In a sane universe, that should be
impossible. In my current tree, this _is_ now impossible, but I haven't
worked out some nasty kinks.

I've got thud licked (without the restricting sleep to time_slice),
test-starve works right as well, and interactivity is up. Tasks waking
each other in a loop is a bitch and a half though, and need to be beaten
about the head and shoulders. Going to a synchronous wakeup for pipes
(talking stock kernel now) cures irman process_load's ability to starve...
IFF you're running it from a vt. If you're in an xterm, it'll still climb
up from the bottom (only place where it can't starve anybody) and starve
via pass-the-baton wakeup DoS. That will/does take the joy out of using
xmms. If xmms didn't use multiple threads, it'd be much worse... right
now, you'll lose eye-candy [cpu hungry visualization stuff] before you lose
sound [at next song].

>I'm running 2.5.72-mm1 with Mike Galbraith's scheduler patches and a
>small patch I made myself to improve interactivity (mainly, to stop XMMS
>from skipping by adjusting some scheduler parameters).

Please show me your xmms hack, and show me how you make xmms skip without
doing something that puts tons of stress on the cache. I built xmms here,
and the only time the audio thread gets starved is when starting a new
song. That's because of CHILD_PENALTY when starting a new copy of xmms
while something of prio < 20 is hogging cpu (process_load <grrrr>). Once
playing, it's rock solid here.

>What's going on here? Is the previous article outdated, or have the
>starvation problems been definitively fixed?

Nope, definitely not.

(if you like bean-counter diff, run contest/irman's process load:)

-Mike

2003-06-18 11:57:05

by Helge Hafting

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

Mike Galbraith wrote:
> At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
>
>> Hi!
>>
>> I've been poking around and found the following link on O(1) scheduler
>> starvation problems:
>>
>> http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
>>
>> The web page contains a small test program which supposedly is able to
>> make two processes starvate. However, I've been unable to reproduce what
>> is described in the above link. In fact, the CPU usage ratio ranges
>> between 60-40% or 70-30% in the worst cases.
>
>
> (you're talking about with my monotinic_clock() diff in your kernel right?)
>
> If you examine the priorities vs cpu usage, therein lies a big fat bug.
>
> I think the fundamental problem is that you can only execute in series,
> but can sleep in parallel, which makes for more sleep time existing than
> all execution time combined.

Would dividing the sleep time by the number of sleepers fix this?
Or is division a too heavy operation here?

Helge Hafting

2003-06-18 12:14:12

by Mike Galbraith

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

At 02:16 PM 6/18/2003 +0200, Helge Hafting wrote:
>Mike Galbraith wrote:
>>At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
>>
>>>Hi!
>>>
>>>I've been poking around and found the following link on O(1) scheduler
>>>starvation problems:
>>>
>>>http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
>>>
>>>The web page contains a small test program which supposedly is able to
>>>make two processes starvate. However, I've been unable to reproduce what
>>>is described in the above link. In fact, the CPU usage ratio ranges
>>>between 60-40% or 70-30% in the worst cases.
>>
>>(you're talking about with my monotinic_clock() diff in your kernel right?)
>>If you examine the priorities vs cpu usage, therein lies a big fat bug.
>>I think the fundamental problem is that you can only execute in series,
>>but can sleep in parallel, which makes for more sleep time existing than
>>all execution time combined.
>
>Would dividing the sleep time by the number of sleepers fix this?
>Or is division a too heavy operation here?

That won't work. You'd have to plunk it all into a pot, and divvy it up by
%cpu usage or something. I solved it the simple way. I keep a smoothed
run_avg (%cpu * 100), and use that as a sleep_avg limit... ie if you're
run_avg is 9000 (you're eating 10% cpu), no sleep will push you into
insanity land. I still have the problem that tasks will seek their
appropriate priority level and _stick_ there, so I probably need to add
some decay. (which will no doubt open the next can-O-worms)

-Mike

2003-06-18 14:11:56

by Felipe Alfaro Solana

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

On Wed, 2003-06-18 at 14:04, Mike Galbraith wrote:
> At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
> >Hi!
> >
> >I've been poking around and found the following link on O(1) scheduler
> >starvation problems:
> >
> >http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
> >
> >The web page contains a small test program which supposedly is able to
> >make two processes starvate. However, I've been unable to reproduce what
> >is described in the above link. In fact, the CPU usage ratio ranges
> >between 60-40% or 70-30% in the worst cases.
>
> (you're talking about with my monotinic_clock() diff in your kernel right?)

Don't know exactly its name. wli posted to LKLM a few days ago. In a few
words, the patch creates

+inline void __scheduler_tick(runqueue_t *rq, task_t *p)

and

+#define SCHED_NANOSECOND 1
+#define SCHED_SECOND (1000000000 * SCHED_NANOSECOND)
+#define SCHED_TICK (SCHED_SECOND / HZ)
+#define TICKS_PER_SECOND (SCHED_SECOND / SCHED_TICK)

Don't know if this is the patch you're talking about. It's not thud,
anyways.

> If you examine the priorities vs cpu usage, therein lies a big fat bug.
>
> I think the fundamental problem is that you can only execute in series, but
> can sleep in parallel, which makes for more sleep time existing than all
> execution time combined. If you're running test-starve with my
> monotonic_clock() diff, you should notice that one task is at maximum
> priority and eating ~75% cpu, while the other is at minumum and getting the
> rest minus what top gets. In a sane universe, that should be
> impossible. In my current tree, this _is_ now impossible, but I haven't
> worked out some nasty kinks.

Exactly. This is more or less what was happening, roughly a 70-30
balance of CPU usage.

> >I'm running 2.5.72-mm1 with Mike Galbraith's scheduler patches and a
> >small patch I made myself to improve interactivity (mainly, to stop XMMS
> >from skipping by adjusting some scheduler parameters).
>
> Please show me your xmms hack, and show me how you make xmms skip without
> doing something that puts tons of stress on the cache. I built xmms here,
> and the only time the audio thread gets starved is when starting a new
> song. That's because of CHILD_PENALTY when starting a new copy of xmms
> while something of prio < 20 is hogging cpu (process_load <grrrr>). Once
> playing, it's rock solid here.

No, I wasn't talking about an XMMS hack. I was talking about changing
scheduler defaults, as shown in the following patch:

--- old/kernel/sched.c 2003-06-17 21:04:21.240902000 +0200
+++ new/kernel/sched.c 2003-06-17 20:58:54.840902000 +0200
@@ -66,14 +66,14 @@
* they expire.
*/
#define MIN_TIMESLICE ( 10 * HZ / 1000)
-#define MAX_TIMESLICE (200 * HZ / 1000)
+#define MAX_TIMESLICE ( 20 * HZ / 1000)
#define CHILD_PENALTY 50
#define PARENT_PENALTY 100
#define EXIT_WEIGHT 3
-#define PRIO_BONUS_RATIO 25
+#define PRIO_BONUS_RATIO 20
#define INTERACTIVE_DELTA 2
-#define MAX_SLEEP_AVG (10*HZ)
-#define STARVATION_LIMIT (10*HZ)
+#define MAX_SLEEP_AVG (2*HZ)
+#define STARVATION_LIMIT (2*HZ)
#define NODE_THRESHOLD 125
#define SCHED_NANOSECOND 1
#define SCHED_SECOND (1000000000 * SCHED_NANOSECOND)

To make XMMS skip, just force the X server to do a lot of repainting,
for example, by dragging a big window slowly enough over another one
which requires a lot of painting (Evolution, for example, is a good
candidate as it requires a lot of CPU to repaint uncovered areas). It's
easy to reproduce just after launching XMMS. However, after a while, it
gets difficult to make XMMS to skip sound (it seems the scheduler
adjusts priorities well enough). This is on a PIII 700Mhz laptop with no
niced processes at all.

With your patch and my scheduler changes, I've been unable to make XMMS
skip audio, even when the X server is reniced to -20, or another CPU hog
is running.

2003-06-18 15:36:04

by Mike Galbraith

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

At 04:22 PM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
>On Wed, 2003-06-18 at 14:04, Mike Galbraith wrote:
> > At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
> > >Hi!
> > >
> > >I've been poking around and found the following link on O(1) scheduler
> > >starvation problems:
> > >
> > >http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
> > >
> > >The web page contains a small test program which supposedly is able to
> > >make two processes starvate. However, I've been unable to reproduce what
> > >is described in the above link. In fact, the CPU usage ratio ranges
> > >between 60-40% or 70-30% in the worst cases.
> >
> > (you're talking about with my monotinic_clock() diff in your kernel right?)
>
>Don't know exactly its name. wli posted to LKLM a few days ago. In a few
>words, the patch creates
>
>+inline void __scheduler_tick(runqueue_t *rq, task_t *p)
>
>and
>
>+#define SCHED_NANOSECOND 1
>+#define SCHED_SECOND (1000000000 * SCHED_NANOSECOND)
>+#define SCHED_TICK (SCHED_SECOND / HZ)
>+#define TICKS_PER_SECOND (SCHED_SECOND / SCHED_TICK)
>
>Don't know if this is the patch you're talking about. It's not thud,
>anyways.

Yeah, that's the diff. (thud is a starvation testcase posted a while back)

> > If you examine the priorities vs cpu usage, therein lies a big fat bug.
> >
> > I think the fundamental problem is that you can only execute in series,
> but
> > can sleep in parallel, which makes for more sleep time existing than all
> > execution time combined. If you're running test-starve with my
> > monotonic_clock() diff, you should notice that one task is at maximum
> > priority and eating ~75% cpu, while the other is at minumum and getting
> the
> > rest minus what top gets. In a sane universe, that should be
> > impossible. In my current tree, this _is_ now impossible, but I haven't
> > worked out some nasty kinks.
>
>Exactly. This is more or less what was happening, roughly a 70-30
>balance of CPU usage.
>
> > >I'm running 2.5.72-mm1 with Mike Galbraith's scheduler patches and a
> > >small patch I made myself to improve interactivity (mainly, to stop XMMS
> > >from skipping by adjusting some scheduler parameters).
> >
> > Please show me your xmms hack, and show me how you make xmms skip without
> > doing something that puts tons of stress on the cache. I built xmms here,
> > and the only time the audio thread gets starved is when starting a new
> > song. That's because of CHILD_PENALTY when starting a new copy of xmms
> > while something of prio < 20 is hogging cpu (process_load <grrrr>). Once
> > playing, it's rock solid here.
>
>No, I wasn't talking about an XMMS hack. I was talking about changing
>scheduler defaults, as shown in the following patch:

Ah, ok (darn, was hoping for fresh idea)

>--- old/kernel/sched.c 2003-06-17 21:04:21.240902000 +0200
>+++ new/kernel/sched.c 2003-06-17 20:58:54.840902000 +0200
>@@ -66,14 +66,14 @@
> * they expire.
> */
> #define MIN_TIMESLICE ( 10 * HZ / 1000)
>-#define MAX_TIMESLICE (200 * HZ / 1000)
>+#define MAX_TIMESLICE ( 20 * HZ / 1000)

(tiny slices make for much context switching, but [alone] doesn't help low
prio tasks get cpu)

> #define CHILD_PENALTY 50
> #define PARENT_PENALTY 100
> #define EXIT_WEIGHT 3
>-#define PRIO_BONUS_RATIO 25
>+#define PRIO_BONUS_RATIO 20
> #define INTERACTIVE_DELTA 2
>-#define MAX_SLEEP_AVG (10*HZ)
>-#define STARVATION_LIMIT (10*HZ)
>+#define MAX_SLEEP_AVG (2*HZ)

(this will increase cpu multiplexing drastically, but will also make the
desktop feel any background load pretty much instantly. btdt too, ok for
very light load, but bad for balls-to-the-wall throughput here)

>+#define STARVATION_LIMIT (2*HZ)
> #define NODE_THRESHOLD 125
> #define SCHED_NANOSECOND 1
> #define SCHED_SECOND (1000000000 * SCHED_NANOSECOND)
>
>To make XMMS skip, just force the X server to do a lot of repainting,
>for example, by dragging a big window slowly enough over another one
>which requires a lot of painting (Evolution, for example, is a good
>candidate as it requires a lot of CPU to repaint uncovered areas). It's
>easy to reproduce just after launching XMMS. However, after a while, it
>gets difficult to make XMMS to skip sound (it seems the scheduler
>adjusts priorities well enough). This is on a PIII 700Mhz laptop with no
>niced processes at all.

Thanks. I don't have evolution on my linux box (pine/vi/procmail
rules). ImageMagic ought to give X more than enough spurts of frenetic
activity though. Do you have that, and does image manipulation make xmms
stutter as well? Just moving windows around and changing backgrounds
doesn't do anything here. (500mhz piii/128mb ram btw)

>With your patch and my scheduler changes, I've been unable to make XMMS
>skip audio, even when the X server is reniced to -20, or another CPU hog
>is running.

<g> I can make it do a skip so big that SysRq-E or BRB are the only
recovery options.

-Mike

2003-06-18 15:45:18

by Con Kolivas

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

On Thu, 19 Jun 2003 01:54, Mike Galbraith wrote:
> At 04:22 PM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
> >On Wed, 2003-06-18 at 14:04, Mike Galbraith wrote:
> > > At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
> > > >Hi!
> > > >
> > > >I've been poking around and found the following link on O(1) scheduler
> > > >starvation problems:
> > > >
> > > >http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
<BIG SNIP>

Have you seen this email I just posted to lkml?

[PATCH] 2.5.72 O(1) interactivity bugfix

It may be affecting the scheduler in all sorts of ways.

Con

2003-06-18 16:11:03

by Mike Galbraith

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

At 01:59 AM 6/19/2003 +1000, Con Kolivas wrote:
>On Thu, 19 Jun 2003 01:54, Mike Galbraith wrote:
> > At 04:22 PM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
> > >On Wed, 2003-06-18 at 14:04, Mike Galbraith wrote:
> > > > At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
> > > > >Hi!
> > > > >
> > > > >I've been poking around and found the following link on O(1) scheduler
> > > > >starvation problems:
> > > > >
> > > > >http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
><BIG SNIP>
>
>Have you seen this email I just posted to lkml?

(yeah, replied off-line)

2003-06-18 16:37:42

by William Lee Irwin III

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

On Wed, Jun 18, 2003 at 09:53:28AM +0200, Felipe Alfaro Solana wrote:
> I'm running 2.5.72-mm1 with Mike Galbraith's scheduler patches and a
> small patch I made myself to improve interactivity (mainly, to stop XMMS
> from skipping by adjusting some scheduler parameters).
> What's going on here? Is the previous article outdated, or have the
> starvation problems been definitively fixed?

The MAX_TIMESLICE change is inappropriate.

There's papering over broken algorithms, and then there's getting really
blatant about it -- and MAX_TIMESLICE going from 200 -> 20 is blatant.


-- wli

2003-06-18 16:38:48

by William Lee Irwin III

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

On Wed, Jun 18, 2003 at 02:04:45PM +0200, Mike Galbraith wrote:
> I've got thud licked (without the restricting sleep to time_slice),
> test-starve works right as well, and interactivity is up. Tasks waking
> each other in a loop is a bitch and a half though, and need to be beaten
> about the head and shoulders. Going to a synchronous wakeup for pipes
> (talking stock kernel now) cures irman process_load's ability to starve...
> IFF you're running it from a vt. If you're in an xterm, it'll still climb
> up from the bottom (only place where it can't starve anybody) and starve
> via pass-the-baton wakeup DoS. That will/does take the joy out of using
> xmms. If xmms didn't use multiple threads, it'd be much worse... right
> now, you'll lose eye-candy [cpu hungry visualization stuff] before you lose
> sound [at next song].

That's great. I'd love to see the patch.


-- wli

2003-06-18 16:58:43

by Mike Galbraith

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

At 09:52 AM 6/18/2003 -0700, William Lee Irwin III wrote:
>On Wed, Jun 18, 2003 at 02:04:45PM +0200, Mike Galbraith wrote:
> > I've got thud licked (without the restricting sleep to time_slice),
> > test-starve works right as well, and interactivity is up. Tasks waking
> > each other in a loop is a bitch and a half though, and need to be beaten
> > about the head and shoulders. Going to a synchronous wakeup for pipes
> > (talking stock kernel now) cures irman process_load's ability to starve...
> > IFF you're running it from a vt. If you're in an xterm, it'll still climb
> > up from the bottom (only place where it can't starve anybody) and starve
> > via pass-the-baton wakeup DoS. That will/does take the joy out of using
> > xmms. If xmms didn't use multiple threads, it'd be much worse... right
> > now, you'll lose eye-candy [cpu hungry visualization stuff] before you
> lose
> > sound [at next song].
>
>That's great. I'd love to see the patch.

If I can get the kinks worked out and still have something left. I'm
gaining on the darn thing, but only a net millimeter at a time... with much
cha-cha-cha action in between ;-)

-Mike

2003-06-18 20:31:02

by Ricardo Galli

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

> Have you seen this email I just posted to lkml?
>
> [PATCH] 2.5.72 O(1) interactivity bugfix
>
> It may be affecting the scheduler in all sorts of ways.

I've applied the patch you posted in your web page for 2.4.21-ck1. It feels
more reactive but has an annoying collateral effect, it seems to slow down to
new processes forked from an interactive program.

I see it when selecting a PGP signed message in kmail, it takes up to two
seconds to show the message body. The lag wasn't so noticeable before.

Regards,

--
ricardo galli GPG id C8114D34

2003-06-18 20:35:04

by Marc-Christian Petersen

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

On Wednesday 18 June 2003 22:44, Ricardo Galli wrote:

Hi Ricardo,

> > Have you seen this email I just posted to lkml?
> > [PATCH] 2.5.72 O(1) interactivity bugfix
> > It may be affecting the scheduler in all sorts of ways.
>
> I've applied the patch you posted in your web page for 2.4.21-ck1. It feels
> more reactive but has an annoying collateral effect, it seems to slow down
> to new processes forked from an interactive program.
> I see it when selecting a PGP signed message in kmail, it takes up to two
> seconds to show the message body. The lag wasn't so noticeable before.
Robert already told so.

ciao, Marc

2003-06-18 21:16:56

by Felipe Alfaro Solana

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

On Wed, 2003-06-18 at 17:54, Mike Galbraith wrote:
> >To make XMMS skip, just force the X server to do a lot of repainting,
> >for example, by dragging a big window slowly enough over another one
> >which requires a lot of painting (Evolution, for example, is a good
> >candidate as it requires a lot of CPU to repaint uncovered areas). It's
> >easy to reproduce just after launching XMMS. However, after a while, it
> >gets difficult to make XMMS to skip sound (it seems the scheduler
> >adjusts priorities well enough). This is on a PIII 700Mhz laptop with no
> >niced processes at all.
>
> Thanks. I don't have evolution on my linux box (pine/vi/procmail
> rules). ImageMagic ought to give X more than enough spurts of frenetic
> activity though. Do you have that, and does image manipulation make xmms
> stutter as well? Just moving windows around and changing backgrounds
> doesn't do anything here. (500mhz piii/128mb ram btw)

In fact, I can only make XMMS skip sound for a very brief period, just
after starting it up. After a few seconds, it seems the dynamic
priorities are adjusted and I can't make XMMS skip sound anymore. To
reproduce it, open up a big window, then launch XMMS and make it play
some MP3 file. Then, start moving the big window around. For me, this
causes XMMS to skip sound for a brief period of time (~5 seconds,
approx).

2003-06-18 22:14:32

by Mike Galbraith

[permalink] [raw]
Subject: Re: O(1) scheduler starvation

At 11:30 PM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
>On Wed, 2003-06-18 at 17:54, Mike Galbraith wrote:
> > >To make XMMS skip, just force the X server to do a lot of repainting,
> > >for example, by dragging a big window slowly enough over another one
> > >which requires a lot of painting (Evolution, for example, is a good
> > >candidate as it requires a lot of CPU to repaint uncovered areas). It's
> > >easy to reproduce just after launching XMMS. However, after a while, it
> > >gets difficult to make XMMS to skip sound (it seems the scheduler
> > >adjusts priorities well enough). This is on a PIII 700Mhz laptop with no
> > >niced processes at all.
> >
> > Thanks. I don't have evolution on my linux box (pine/vi/procmail
> > rules). ImageMagic ought to give X more than enough spurts of frenetic
> > activity though. Do you have that, and does image manipulation make xmms
> > stutter as well? Just moving windows around and changing backgrounds
> > doesn't do anything here. (500mhz piii/128mb ram btw)
>
>In fact, I can only make XMMS skip sound for a very brief period, just
>after starting it up. After a few seconds, it seems the dynamic
>priorities are adjusted and I can't make XMMS skip sound anymore. To
>reproduce it, open up a big window, then launch XMMS and make it play
>some MP3 file. Then, start moving the big window around. For me, this
>causes XMMS to skip sound for a brief period of time (~5 seconds,
>approx).

If you bump CHILD_PENALTY up (to say 75) you should see a marked improvement.

-Mike