LinuxLists.cc - Ingo Molnar and Con Kolivas 2.6 scheduler patches

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

I'd really wish people would use examples other than a single player to
generalize the performance of the kernel. For all you know xmms's
decoder for the type of music you listen to could be written unoptimally
to put it as nice as possible. If all players and different types of
pcm audio decoders are skipping in the situations you experience then
that's different, but i hardly think that will be the case as it is not
with me nor, ever has been that i can remember (maybe in early 2.3.x).

Eugene Teo wrote:
> What I really want to see is the best of both worlds if possible.
> Well, some may be more keen to see responsiveness in work-related
> tasks, there are others who wants more responsiveness in their
> leisure-related work. I hope that Con do not stop developing his
> interactive improvements just because mingo is starting to work
> his too.
>
> Eugene
>
> <quote sender="Ismael Valladolid Torres">
>
>>Marc-Christian Petersen escribe el 26/07/03 11:46:
>>
>>>XMMS does not skip, but hey, I don't care about XMMS skipping at all.
>>
>>For those of us who'd like to use Linux as a serious musical production
>>environment in the near future, it is important to have the choice of a
>>system that does exactly that. This is, audio should not skip even on a
>>heavily loaded system. We do not care much about graphical
>>responsiveness. Think of something like Pro Tools LE running over Mac
>>OS, with up to 32 audio tracks being mixed without the help of a DSP
>>chip. Even when CPU usage gets higher than 80%, you don't get a single
>>audio glitch.
>>
>>Of course, for musical production, also the lowest latency achievable is
>>a must.
>>
>>This is only a humble opinion which I hope you find useful.
>>
>>Regards, Ismael
>>
>>-
>>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>the body of a message to [email protected]
>>More majordomo info at http://vger.kernel.org/majordomo-info.html
>>Please read the FAQ at http://www.tux.org/lkml/
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2003-07-26 11:54:19

by Ismael Valladolid Torres

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Eugene Teo escribe el 26/07/03 12:10:
> What I really want to see is the best of both worlds if possible.
> Well, some may be more keen to see responsiveness in work-related
> tasks, there are others who wants more responsiveness in their
> leisure-related work. I hope that Con do not stop developing his
> interactive improvements just because mingo is starting to work
> his too.

Of course! Let us have the choice between different kernel patches for
different latency and responsiveness needs, and let us build whichever
kernel we want, according to the use we intend to give to our system.

Regards, Ismael

2003-07-26 14:28:39

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sat, 26 Jul 2003 19:46, Marc-Christian Petersen wrote:
> On Saturday 26 July 2003 11:30, Felipe Alfaro Solana wrote:
>
> Hi Felipe,
>
> > I just only wanted to publicly invite Con Kolivas to keep on working
> > with the scheduler patches he has been doing and that have required a
> > constant and fair amount of time from him. I don't know if Con patches
> > do work as good for others in this list as for me, so I also invite
> > everyone who is/has been testing them to express their feelings so we
> > all can know what's the current status of the 2.6 scheduler.
>
> For me, all the Oxint Scheduler patches won't work well. Even for O8int I
> can say the same as for 01int to 07int, the system is dog slow when doing
> "make -j2 bzImage modules". XMMS does not skip, but hey, I don't care about
> XMMS skipping at all. I want a system which is responsive under heavy load,
> where opening an new xterm does not take 5-10 seconds, or writing an email
> in my MUA looks like a child is writing on a typewriter with one finger ;)
>
> Now that I've tested 2.6.0-test-1-wli (William Lee Irwin's Tree) for over a
> week, I thought about, that the problem might _not_ be only the O(1)
> Scheduler, because -wli has changed almost nothing to the scheduler stuff,
> it's almost 2.6.0-test1 code and running that kernel, my system is _alot_
> more responsive than 2.6.0-test1 or 2.6.0-test1-mm* or all the Oxint
> scheduler fixes have ever been.
>
> Strange no?

Actually this is not strange to me. It has become obvious that the problems
with interactivity that have evolved in 2.5 are not scheduler related. Just
try plugging in all the old 2.4 O(1) scheduler settings into the current
scheduler and you will see that it still performs badly. What exactly is the
cause is a mystery but seems to be more a combination of factors with a
careful look at the way the vm behaves being part of that. However, as has
been evident, evolving the interactivity estimation in the scheduler by
whatever means is able to help, which is why I started on this project in the
first place.

I make no apologies for the fact my changes so far have made it feel slower at
starting applications under load (your mileage may vary depending on hardware
or whatever as no doubt MCP's experience seems bad) when other very obvious
limitations abound in the performance in other areas. I have addressed each
and every issue I could find along the way and this issue is a tweaking one,
not an infrastructure change, unless other kernel areas are radically changed
(as in -wli) which appears increasingly unlikely as 2.6 final approaches.

Con

2003-07-26 14:30:05

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sat, 26 Jul 2003 20:02, Ismael Valladolid Torres wrote:
> Marc-Christian Petersen escribe el 26/07/03 11:46:
> > XMMS does not skip, but hey, I don't care about XMMS skipping at all.
>
> For those of us who'd like to use Linux as a serious musical production
> environment in the near future, it is important to have the choice of a
> system that does exactly that. This is, audio should not skip even on a
> heavily loaded system. We do not care much about graphical
> responsiveness. Think of something like Pro Tools LE running over Mac
> OS, with up to 32 audio tracks being mixed without the help of a DSP
> chip. Even when CPU usage gets higher than 80%, you don't get a single
> audio glitch.
>
> Of course, for musical production, also the lowest latency achievable is
> a must.
>
> This is only a humble opinion which I hope you find useful.

Everyone's opinion counts here as everyone uses the scheduler. As far as I'm
concerned it should perform well in as many settings as possible.

Con

2003-07-26 14:29:45

by Downing, Thomas

[permalink] [raw]

Subject: RE: Ingo Molnar and Con Kolivas 2.6 scheduler patches

> -----Original Message-----
> From: Felipe Alfaro Solana [mailto:[email protected]]
> Sent: Saturday, July 26, 2003 5:31 AM
>
> Hi, everyone,
>
> In first place, let me publicly thanks both of you (Info and Con) for
> your great work at fixing/tuning the 2.6 scheduler to its best.
>
> Now that Ingo seems to be working again on the scheduler, I feel that
> Con and Ingo work is starting to collide. I have been testing Con's
> interactivity changes to the scheduler for a very long time,
> since it's
> first O1int patch and I must say that, for my specific workloads, it
> gives me the best end-user experience with interactive usage.
[snip]

Second the thanks.

I don't see much subjective difference between test1-mm(x) and
test1-G2. I've never gotten an audio skip anyway. The only
skipping I can get is video only skips under xine, but the audio
doesn't skip.

I guess this may be in part due to how I load the machine. Any
meaningful comparison of the two bodies of work would have to
be made with (at a minimum) a standard set of loads.

The way I loaded my machine (dual Xeon HT) to > 9 load average
was: 1. continuous loop 'ps -ef', 2. KDE make -j8, 3. pov-ray
rendering, 3. continuous bitmap operations in X.

What I've left to date is (among others): 1. heavy disk i/o load,
2. heavy network load, 3. deliberate memory torture.

Operations such as new terminal window, new browser, new Konquerer
etc, are slower of course, and somewhat jerky, but given a load
of 9, even Mozilla and Konquerer loaded in < 15 seconds, a new
terminal loaded and accepted keyboard input in less than 3.

So I wonder if the seemingly disparate results are weirdness,
or are they a combination of basic machine variations coupled
with loading variations?

2003-07-26 14:44:30

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sat, 26 Jul 2003 22:08, Ismael Valladolid Torres wrote:
> Eugene Teo escribe el 26/07/03 12:10:
> > What I really want to see is the best of both worlds if possible.
> > Well, some may be more keen to see responsiveness in work-related
> > tasks, there are others who wants more responsiveness in their
> > leisure-related work. I hope that Con do not stop developing his
> > interactive improvements just because mingo is starting to work
> > his too.
>
> Of course! Let us have the choice between different kernel patches for
> different latency and responsiveness needs, and let us build whichever
> kernel we want, according to the use we intend to give to our system.

While this may sound like a solution, I still believe one scheduler should
perform well in as many settings as possible without a different kernel tree.
You can bet your bottom dollar the alternative 2.6 trees will be out as fast
as you can say Andrea Arcangeli anyway, but let's get the main tree as
versatile as possible.

Con

2003-07-26 15:34:44

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Saturday 26 July 2003 06:24, Ed Sweetman wrote:
> I'd really wish people would use examples other than a single player to
> generalize the performance of the kernel. For all you know xmms's
> decoder for the type of music you listen to could be written unoptimally
> to put it as nice as possible. If all players and different types of
> pcm audio decoders are skipping in the situations you experience then
> that's different, but i hardly think that will be the case as it is not
> with me nor, ever has been that i can remember (maybe in early 2.3.x).

Audio players fall into a special category of application, the kind where it's
not unreasonable to change the code around to take advantage of new kernel
features to make them work better. Remember this word: audiophile. An
audiophile will do whatever it takes to attain more perfect reproduction.
Furthermore, where goes the audophile, soon follows the regular user. Just
go into a stereo store if you need convincing about that.

Now to translate this into concrete terms. Audio reproduction is a realtime
task - there is no way to argue it's not. Ergo, perfect audo reproduction
requires a true realtime scheduler. Hence we require realtime scheduling.

The definition of a realtime scheduler is that the worst case latency is
bounded. The current crop of interactive tweaks do not do that. So we need
a scheduler with a bounded worst case. Davide Libenzi's recent patch that
implements a new SCHED_SOFTRR scheduler policy, usable by non-root users,
provides such a bound. Please don't lose sight of the fact that this is the
correct solution to the problem, and that interactive tweaking, while it may
produce good results for some or even most users in some or even most
situations, will never magically transform Linux into an operating system
that an audiophile could love.

Note: none of the above should be construed as discouragement for Con's work
on improving interactive performance. Just don't lump audio playback into
the "interactive" category, please, it's not. It's realtime. By keeping
this firmly in mind we will end up with better interactive performance,
excellent audio reproduction, and simpler, better code overall.

Another note: I have not tested Davide's patch, nor have I read it in detail,
or Ingo's scheduling code for that matter. For that I plead "road trip".
I'll do all of the above as soon as I get back to Berlin. I do know that the
Linux Audio guys I talked to at Linuxtag are excited about Davide's patch,
and think it's exactly the right way to go.

Regards,

Daniel

2003-07-26 16:27:30

by Lou Langholtz

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Con Kolivas wrote:

>. . .
>Actually this is not strange to me. It has become obvious that the problems
>with interactivity that have evolved in 2.5 are not scheduler related. Just
>try plugging in all the old 2.4 O(1) scheduler settings into the current
>scheduler and you will see that it still performs badly. What exactly is the
>cause is a mystery but seems to be more a combination of factors with a
>careful look at the way the vm behaves being part of that. . . .
>
Any chance that the problem may be due to the block layer system (and
block driver(s)) getting more cycles than it should? Particularly with
the out-of-band like work queue scheduling? That would at least explain
the scheduling oddities I'm seeing with 2.6.0-test1 after a minute of so
of intense I/O.

2003-07-26 16:29:11

by Rahul Karnik

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Ed Sweetman wrote:
>
> I'd really wish people would use examples other than a single player to
> generalize the performance of the kernel.

I must wholeheartedly agree. I have not tried any of the interactivity
work yet, but it is clear that your particular experience will depend on
the specific test application used. For instance, using net-rhythmbox to
play mp3s causes skips every time a web page is loaded, but it does not
happen with xmms. Perhaps we should write down the various interactivity
tests people have come up with, so that Con/Ingo/whoever else can test
their work to some extent.

Thanks,
Rahul

2003-07-26 16:37:45

by Diego Calleja

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

El Sun, 27 Jul 2003 10:46:30 -0500 Daniel Phillips <[email protected]> escribi?:

> Audio players fall into a special category of application, the kind where it's
> not unreasonable to change the code around to take advantage of new kernel
> features to make them work better. Remember this word: audiophile. An
> audiophile will do whatever it takes to attain more perfect reproduction.
> Furthermore, where goes the audophile, soon follows the regular user. Just
> go into a stereo store if you need convincing about that.

I wonder if X falls in this category too; the X behaviour should be as
good as the scheduler allows it; but for a "desktop user" X behaviour
should be reponsive *always* (ie: it should have priority even if it
doesn't need it) and i wonder if that should be changed too.

2003-07-26 16:39:10

by Jens Axboe

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sat, Jul 26 2003, Lou Langholtz wrote:
> Con Kolivas wrote:
>
> >. . .
> >Actually this is not strange to me. It has become obvious that the
> >problems with interactivity that have evolved in 2.5 are not scheduler
> >related. Just try plugging in all the old 2.4 O(1) scheduler settings into
> >the current scheduler and you will see that it still performs badly. What
> >exactly is the cause is a mystery but seems to be more a combination of
> >factors with a careful look at the way the vm behaves being part of that.
> >. . .
> >
> Any chance that the problem may be due to the block layer system (and
> block driver(s)) getting more cycles than it should? Particularly with

Not likely

> the out-of-band like work queue scheduling? That would at least explain

If anything, 2.6 will unplug sooner than 2.4. 2.4 will unplug only when
someone does a wait_on_buffer/page, 2.6 will unplug when:

- queued requests exceed unplug threshold, 4 requests
- unplug timeout, 3ms
- someone doing wait_on_page/buffer

Can't rule out a bug of course, the horrible audio skips I've seen in
2.6 are not io related though. Block layer will only do out of band
unplugs for the timer unplug, and even that could be superceded by a new
request entering the queue (or someone doing wait_bla)

> the scheduling oddities I'm seeing with 2.6.0-test1 after a minute of so
> of intense I/O.

CPU or disk scheduling oddities?

People should try booting with elevator=deadline to see if it changes
anything for these types of workloads, AS is really not optimal for that
at all.

--
Jens Axboe

2003-07-26 18:02:50

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sat, Jul 26, 2003 at 11:46:45AM +0200, Marc-Christian Petersen wrote:
> Now that I've tested 2.6.0-test-1-wli (William Lee Irwin's Tree) for over a
> week, I thought about, that the problem might _not_ be only the O(1)
> Scheduler, because -wli has changed almost nothing to the scheduler stuff,
> it's almost 2.6.0-test1 code and running that kernel, my system is _alot_
> more responsive than 2.6.0-test1 or 2.6.0-test1-mm* or all the Oxint
> scheduler fixes have ever been.
> Strange no?
> P.S.: I've not tested Ingo's G3 scheduler fix yet. More to come.

I've no plausible explanation for this; perhaps the only possible one
is that one of the algorithms that was sped up was behaving badly enough
to interfere with scheduling.

-- wli

2003-07-26 18:16:12

by Felipe Alfaro Solana

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sat, 2003-07-26 at 20:19, William Lee Irwin III wrote:
> On Sat, Jul 26, 2003 at 11:46:45AM +0200, Marc-Christian Petersen wrote:
> > Now that I've tested 2.6.0-test-1-wli (William Lee Irwin's Tree) for over a
> > week, I thought about, that the problem might _not_ be only the O(1)
> > Scheduler, because -wli has changed almost nothing to the scheduler stuff,
> > it's almost 2.6.0-test1 code and running that kernel, my system is _alot_
> > more responsive than 2.6.0-test1 or 2.6.0-test1-mm* or all the Oxint
> > scheduler fixes have ever been.
> > Strange no?
> > P.S.: I've not tested Ingo's G3 scheduler fix yet. More to come.
>
> I've no plausible explanation for this; perhaps the only possible one
> is that one of the algorithms that was sped up was behaving badly enough
> to interfere with scheduling.

I've also noticed that 2.6.0-test1-wl1A behaves pretty well, given that
no major changes to the CPU scheduler are included.

2003-07-26 18:21:07

by Andrew Morton

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Daniel Phillips <[email protected]> wrote:
>
> Audio players fall into a special category of application, the kind where it's
> not unreasonable to change the code around to take advantage of new kernel
> features to make them work better.

One shouldn't even need to modify the player application to start using a
new scheduler policy - policy is inherited, so a wrapper will suffice:

sudo /bin/run-something-as-softrr mplayer

> Remember this word: audiophile.

That is one problem space, and I guess if we fix that, we fix the X11
problems too.

Let us not lose sight of the other problem: particular sleep/run patterns
as demonstrated in irman are causing extremem starvation. Arguably we
should be addressing this as the higher priority problem.

It is interesting that Felipe says that stock 2.5.69 was the best CPU
scheduler of the 2.5 series. Do others agree with that?

And what about the O(1) backports? RH and UL and -aa kernels? Are people
complaining about those kernels? If not, why? What is different?

2003-07-26 19:04:06

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sat, 2003-07-26 at 20:19, William Lee Irwin III wrote:
>> I've no plausible explanation for this; perhaps the only possible one
>> is that one of the algorithms that was sped up was behaving badly enough
>> to interfere with scheduling.

On Sat, Jul 26, 2003 at 08:31:18PM +0200, Felipe Alfaro Solana wrote:
> I've also noticed that 2.6.0-test1-wl1A behaves pretty well, given that
> no major changes to the CPU scheduler are included.

Okay, now there's a question of narrowing down which piece of it did
it. It consists of 38 or so pieces so 6 boots should do it.

-- wli

2003-07-26 19:31:37

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sat, Jul 26, 2003 at 08:31:18PM +0200, Felipe Alfaro Solana wrote:
> I've also noticed that 2.6.0-test1-wl1A behaves pretty well, given that
> no major changes to the CPU scheduler are included.

Please roll back to 2.5.74-wli or apply the diffs from the tarball; I
rolled Con's stuff into the monolithic 2.6.0-test1 diff.

-- wli

2003-07-26 20:01:26

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Saturday 26 July 2003 13:35, Andrew Morton wrote:
> Daniel Phillips <[email protected]> wrote:
> > Audio players fall into a special category of application, the kind where
> > it's not unreasonable to change the code around to take advantage of new
> > kernel features to make them work better.
>
> One shouldn't even need to modify the player application to start using a
> new scheduler policy - policy is inherited, so a wrapper will suffice:
>
> sudo /bin/run-something-as-softrr mplayer

True, and that's roughly what I do now (except just with elevated priority as
opposed to a realtime scheduler policy). However, it's more friendly to the
system if the realtime priority is limited to just the thread that needs it,
and that's why the application itself needs to provide the hint.

Zinf already does try to provide such a hint by setting a higher priority for
its sound servicing thread. Unfortunately, this is ignored unless zinf is
running as root. Given the number of bugs in Zinf, I am uncomfortable
running the whole application as root. It's altogether more conservative to
limit the risk to a single, simple thread that can be easily audited.

> > Remember this word: audiophile.
>
> That is one problem space, and I guess if we fix that, we fix the X11
> problems too.
>
> Let us not lose sight of the other problem: particular sleep/run patterns
> as demonstrated in irman are causing extremem starvation. Arguably we
> should be addressing this as the higher priority problem.

I agree it's the more important problem, but there are also more people
already working on it, whereas over here in the audio corner, there are just
Davide and me (sorry if I left anyone out, please feel free to flame me so I
know who you are).

> It is interesting that Felipe says that stock 2.5.69 was the best CPU
> scheduler of the 2.5 series. Do others agree with that?

I never tried audio until 2.5.73. With Con's patches, life has been pretty
good for me from then on, from the non-starvation point of view.

> And what about the O(1) backports? RH and UL and -aa kernels? Are people
> complaining about those kernels? If not, why? What is different?

In case anybody wants to hearken back to the good old days of 2.4, forget it.
It is only good for sound if you are lucky enough to have a configuration it
likes. My unlucky wife on the other hand, who gets by with a 233 MHz K6
(because she can) is running 2.4 and says sound skips whenever she does
anything with the machine other that just letting it sit and play. Now that
I think of it, this will be an ideal machine for testing audio robustness,
and scheduler robustness in general.

Regards,

Daniel

2003-07-26 22:49:23

by Diego Calleja

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

El Sat, 26 Jul 2003 11:35:22 -0700 Andrew Morton <[email protected]> escribi?:

> It is interesting that Felipe says that stock 2.5.69 was the best CPU
> scheduler of the 2.5 series. Do others agree with that?

No.
For me, 2.5.63 was the best. Or perhaps it was .64 or .65?

What I know is that the best CPU scheduler was the one previous
to the "interactivity changes" from Linus. I mean, if Linus' changes
went in .65, then it's .64, etc.

Perhaps for other people it's .69....probably it also depends a lot
on the hardware side. For me .63 was "good"

2003-07-27 02:19:19

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sun, 27 Jul 2003 04:35, Andrew Morton wrote:
> It is interesting that Felipe says that stock 2.5.69 was the best CPU
> scheduler of the 2.5 series. Do others agree with that?

Well this had the original tuning settings of 2 seconds for max sleep avg and
starvation limit, and 95% for child penalty, which are the 2.4 O(1) settings.
Interestingly, they are also what Ingo has put into the G3 patch (except for
starvation limit), and account for a large part of the improvement in G3 as
well as the increased resolution.

> And what about the O(1) backports? RH and UL and -aa kernels? Are people
> complaining about those kernels? If not, why? What is different?

No, this is what I have been trying to figure out; why is it that if we put
all the settings the same as 2.4 that it doesn't perform as nicely. 2.5/6
with the old settings is certainly better than with the vanilla settings, but
not as good as 2.4 O(1). It does not appear to be scheduler alone, but the
architectural changes to 2.5 that have changed interactivity are here to
stay, and improving the interactivity estimator in the scheduler does help it
anyway. It also gives us a chance to address certain corner cases that have
always existed.

Con

2003-07-27 07:24:32

by Willy Tarreau

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Hi Con,

On Sun, Jul 27, 2003 at 12:38:37PM +1000, Con Kolivas wrote:

> No, this is what I have been trying to figure out; why is it that if we put
> all the settings the same as 2.4 that it doesn't perform as nicely. 2.5/6
> with the old settings is certainly better than with the vanilla settings, but
> not as good as 2.4 O(1). It does not appear to be scheduler alone, but the
> architectural changes to 2.5 that have changed interactivity are here to
> stay, and improving the interactivity estimator in the scheduler does help it
> anyway.

just a thought : have you tried to set the timer to 100Hz instead of 1kHz to
compare with 2.4 ? It might make a difference too.

Cheers,
Willy

2003-07-27 08:57:36

by Ingo Molnar

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sun, 27 Jul 2003, Willy Tarreau wrote:

> just a thought : have you tried to set the timer to 100Hz instead of
> 1kHz to compare with 2.4 ? It might make a difference too.

especially for X, a HZ of 1000 has caused performance problems before -
short-timeout select()s were looping 10 times faster, which can be
noticeable.

Ingo

2003-07-27 09:18:30

by Ingo Molnar

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sat, 26 Jul 2003, Felipe Alfaro Solana wrote:

> [...] I feel that Con and Ingo work is starting to collide.

they do collide only on the patch level - both change the same code.
Otherwise, most of Con's tunings/changes are still valid with my patches
applied - and i'd more than encourage Con's work to continue! Watching the
tuning work i got the impression that the problem areas are suffering from
a lack of infrastructure, not from a lack of tuning. So i introduced 3 new
items: accurate statistics, on-runqueue boosting and timeslice
granularity. The fact that these items improved certain characteristics
(and fixed a couple of corner cases like test-starve.c) prove that it's a
step in the right direction. It's definitely not the final step.

Ingo

2003-07-27 09:37:57

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sun, 27 Jul 2003 19:24, Ingo Molnar wrote:
> On Sat, 26 Jul 2003, Felipe Alfaro Solana wrote:
> > [...] I feel that Con and Ingo work is starting to collide.
>
> they do collide only on the patch level - both change the same code.
> Otherwise, most of Con's tunings/changes are still valid with my patches
> applied - and i'd more than encourage Con's work to continue! Watching the
> tuning work i got the impression that the problem areas are suffering from
> a lack of infrastructure, not from a lack of tuning. So i introduced 3 new
> items: accurate statistics, on-runqueue boosting and timeslice
> granularity. The fact that these items improved certain characteristics
> (and fixed a couple of corner cases like test-starve.c) prove that it's a
> step in the right direction. It's definitely not the final step.

Thanks Ingo. I will continue then and stepwise make use of the extra
infrastructure you've made available when I can decide how best to benefit
from it.

Con

2003-07-27 09:48:49

by Ingo Molnar

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Con,

would you mind to explain the reasoning behind the avg_start,
MIN_SLEEP_AVG and normalise_sleep() logic in your patch?

[for reference i've attached your patches in a single unified patch up to
O8int, against 2.6.0-test1.]

Ingo

--- linux/include/linux/sched.h.orig
+++ linux/include/linux/sched.h
@@ -341,6 +341,7 @@ struct task_struct {
prio_array_t *array;

unsigned long sleep_avg;
+ unsigned long avg_start;
unsigned long last_run;

unsigned long policy;
--- linux/kernel/sched.c.orig
+++ linux/kernel/sched.c
@@ -68,14 +68,16 @@
*/
#define MIN_TIMESLICE ( 10 * HZ / 1000)
#define MAX_TIMESLICE (200 * HZ / 1000)
-#define CHILD_PENALTY 50
+#define CHILD_PENALTY 95
#define PARENT_PENALTY 100
#define EXIT_WEIGHT 3
#define PRIO_BONUS_RATIO 25
#define INTERACTIVE_DELTA 2
+#define MIN_SLEEP_AVG (HZ)
#define MAX_SLEEP_AVG (10*HZ)
#define STARVATION_LIMIT (10*HZ)
#define NODE_THRESHOLD 125
+#define MAX_BONUS (MAX_USER_PRIO * PRIO_BONUS_RATIO / 100)

/*
* If a task is 'interactive' then we reinsert it in the active
@@ -297,6 +299,26 @@ static inline void enqueue_task(struct t
array->nr_active++;
p->array = array;
}
+/*
+ * normalise_sleep converts a task's sleep_avg to
+ * an appropriate proportion of MIN_SLEEP_AVG.
+ */
+static inline void normalise_sleep(task_t *p)
+{
+ unsigned long old_avg_time = jiffies - p->avg_start;
+
+ if (unlikely(old_avg_time < MIN_SLEEP_AVG))
+ return;
+
+ if (p->sleep_avg > MAX_SLEEP_AVG)
+ p->sleep_avg = MAX_SLEEP_AVG;
+
+ if (old_avg_time > MAX_SLEEP_AVG)
+ old_avg_time = MAX_SLEEP_AVG;
+
+ p->sleep_avg = p->sleep_avg * MIN_SLEEP_AVG / old_avg_time;
+ p->avg_start = jiffies - MIN_SLEEP_AVG;
+}

/*
* effective_prio - return the priority that is based on the static
@@ -315,11 +337,28 @@ static inline void enqueue_task(struct t
static int effective_prio(task_t *p)
{
int bonus, prio;
+ unsigned long sleep_period;

if (rt_task(p))
return p->prio;

- bonus = MAX_USER_PRIO*PRIO_BONUS_RATIO*p->sleep_avg/MAX_SLEEP_AVG/100 -
+ sleep_period = jiffies - p->avg_start;
+
+ if (unlikely(!sleep_period))
+ return p->static_prio;
+
+ if (sleep_period > MAX_SLEEP_AVG)
+ sleep_period = MAX_SLEEP_AVG;
+
+ if (p->sleep_avg > sleep_period)
+ sleep_period = p->sleep_avg;
+
+ /*
+ * The bonus is determined according to the accumulated
+ * sleep avg over the duration the task has been running
+ * until it reaches MAX_SLEEP_AVG. -ck
+ */
+ bonus = MAX_USER_PRIO*PRIO_BONUS_RATIO*p->sleep_avg/sleep_period/100 -
MAX_USER_PRIO*PRIO_BONUS_RATIO/100/2;

prio = p->static_prio - bonus;
@@ -350,31 +389,47 @@ static inline void activate_task(task_t
long sleep_time = jiffies - p->last_run - 1;

if (sleep_time > 0) {
- int sleep_avg;
-
/*
- * This code gives a bonus to interactive tasks.
- *
- * The boost works by updating the 'average sleep time'
- * value here, based on ->last_run. The more time a task
- * spends sleeping, the higher the average gets - and the
- * higher the priority boost gets as well.
+ * User tasks that sleep a long time are categorised as idle and
+ * will get just under interactive status with a small runtime
+ * to allow them to become interactive or non-interactive rapidly
*/
- sleep_avg = p->sleep_avg + sleep_time;
+ if (sleep_time > MIN_SLEEP_AVG && p->mm){
+ p->avg_start = jiffies - MIN_SLEEP_AVG;
+ p->sleep_avg = MIN_SLEEP_AVG * (MAX_BONUS - INTERACTIVE_DELTA - 2) /
+ MAX_BONUS;
+ } else {
+ unsigned long runtime = jiffies - p->avg_start;

- /*
- * 'Overflow' bonus ticks go to the waker as well, so the
- * ticks are not lost. This has the effect of further
- * boosting tasks that are related to maximum-interactive
- * tasks.
- */
- if (sleep_avg > MAX_SLEEP_AVG)
- sleep_avg = MAX_SLEEP_AVG;
- if (p->sleep_avg != sleep_avg) {
- p->sleep_avg = sleep_avg;
- p->prio = effective_prio(p);
+ if (runtime > MAX_SLEEP_AVG)
+ runtime = MAX_SLEEP_AVG;
+
+ /*
+ * This code gives a bonus to interactive tasks.
+ *
+ * The boost works by updating the 'average sleep time'
+ * value here, based on ->last_run. The more time a task
+ * spends sleeping, the higher the average gets - and the
+ * higher the priority boost gets as well.
+ */
+ p->sleep_avg += sleep_time;
+
+ /*
+ * Processes that sleep get pushed to a higher priority
+ * each time they sleep
+ */
+ p->sleep_avg = (p->sleep_avg * MAX_BONUS / runtime + 1) * runtime / MAX_BONUS;
+
+ if (p->sleep_avg > MAX_SLEEP_AVG)
+ p->sleep_avg = MAX_SLEEP_AVG;
+ }
+
+ if (unlikely(p->avg_start > jiffies)){
+ p->avg_start = jiffies;
+ p->sleep_avg = 0;
}
}
+ p->prio = effective_prio(p);
__activate_task(p, rq);
}

@@ -551,6 +606,7 @@ void wake_up_forked_process(task_t * p)
* from forking tasks that are max-interactive.
*/
current->sleep_avg = current->sleep_avg * PARENT_PENALTY / 100;
+ normalise_sleep(p);
p->sleep_avg = p->sleep_avg * CHILD_PENALTY / 100;
p->prio = effective_prio(p);
set_task_cpu(p, smp_processor_id());
@@ -591,6 +647,8 @@ void sched_exit(task_t * p)
* If the child was a (relative-) CPU hog then decrease
* the sleep_avg of the parent as well.
*/
+ normalise_sleep(p);
+ normalise_sleep(p->parent);
if (p->sleep_avg < p->parent->sleep_avg)
p->parent->sleep_avg = (p->parent->sleep_avg * EXIT_WEIGHT +
p->sleep_avg) / (EXIT_WEIGHT + 1);
@@ -1213,11 +1271,7 @@ void scheduler_tick(int user_ticks, int
spin_lock(&rq->lock);
/*
* The task was running during this tick - update the
- * time slice counter and the sleep average. Note: we
- * do not update a thread's priority until it either
- * goes to sleep or uses up its timeslice. This makes
- * it possible for interactive tasks to use up their
- * timeslices at their highest priority levels.
+ * time slice counter and the sleep average.
*/
if (p->sleep_avg)
p->sleep_avg--;
@@ -1250,6 +1304,17 @@ void scheduler_tick(int user_ticks, int
enqueue_task(p, rq->expired);
} else
enqueue_task(p, rq->active);
+ } else if (p->mm && !((task_timeslice(p) - p->time_slice) %
+ (MIN_TIMESLICE * (MAX_BONUS + 1 - p->sleep_avg * MAX_BONUS / MAX_SLEEP_AVG)))){
+ /*
+ * Running user tasks get requeued with their remaining timeslice
+ * after a period proportional to how cpu intensive they are to
+ * minimise the duration one interactive task can starve another
+ */
+ dequeue_task(p, rq->active);
+ set_tsk_need_resched(p);
+ p->prio = effective_prio(p);
+ enqueue_task(p, rq->active);
}
out_unlock:
spin_unlock(&rq->lock);

2003-07-27 10:00:37

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sun, 27 Jul 2003 20:02, Ingo Molnar wrote:
> Con,
>
> would you mind to explain the reasoning behind the avg_start,
> MIN_SLEEP_AVG and normalise_sleep() logic in your patch?
>
> [for reference i've attached your patches in a single unified patch up to
> O8int, against 2.6.0-test1.]

Unfortunately that was my older approach so none of that work is valid, and
MIN_SLEEP_AVG and normalise_sleep() have been killed off. In essence
the most essential difference is the way the sleep avg is incremented.

p->sleep_avg = (p->sleep_avg * MAX_BONUS /
MAX_SLEEP_AVG + 1) *
MAX_SLEEP_AVG / MAX_BONUS;

where

#define MAX_BONUS (MAX_USER_PRIO * PRIO_BONUS_RATIO / 100)

...
basically any task that sleeps >= 2 milliseconds gets elevated by
one bonus each time, and the tuning to go with that to prevent starvation
and maintain fairness. I have some more tuning in the works as well.

Here is a full current O9int against 2.6.0-test1 for clarity.

--- linux-2.6.0-test1/kernel/sched.c 2003-07-23 21:03:43.000000000 +1000
+++ linux-2.6.0-test1-O9/kernel/sched.c 2003-07-27 12:19:30.000000000 +1000
@@ -68,14 +68,16 @@
*/
#define MIN_TIMESLICE ( 10 * HZ / 1000)
#define MAX_TIMESLICE (200 * HZ / 1000)
-#define CHILD_PENALTY 50
+#define TIMESLICE_GRANULARITY (HZ / 20 ?: 1)
+#define CHILD_PENALTY 90
#define PARENT_PENALTY 100
#define EXIT_WEIGHT 3
#define PRIO_BONUS_RATIO 25
#define INTERACTIVE_DELTA 2
-#define MAX_SLEEP_AVG (10*HZ)
-#define STARVATION_LIMIT (10*HZ)
+#define MAX_SLEEP_AVG (HZ)
+#define STARVATION_LIMIT (HZ)
#define NODE_THRESHOLD 125
+#define MAX_BONUS (MAX_USER_PRIO * PRIO_BONUS_RATIO / 100)

/*
* If a task is 'interactive' then we reinsert it in the active
@@ -347,34 +349,38 @@ static inline void __activate_task(task_
*/
static inline void activate_task(task_t *p, runqueue_t *rq)
{
- long sleep_time = jiffies - p->last_run - 1;
+ if (likely(p->last_run)){
+ long sleep_time = jiffies - p->last_run - 1;

- if (sleep_time > 0) {
- int sleep_avg;
+ if (sleep_time > 0) {
+ /*
+ * User tasks that sleep a long time are categorised as
+ * idle and will get just under interactive status to
+ * prevent them suddenly becoming cpu hogs and starving
+ * other processes.
+ */
+ if (p->mm && sleep_time > HZ)
+ p->sleep_avg = MAX_SLEEP_AVG *
+ (MAX_BONUS - 1) / MAX_BONUS - 1;
+ else {

- /*
- * This code gives a bonus to interactive tasks.
- *
- * The boost works by updating the 'average sleep time'
- * value here, based on ->last_run. The more time a task
- * spends sleeping, the higher the average gets - and the
- * higher the priority boost gets as well.
- */
- sleep_avg = p->sleep_avg + sleep_time;
+ /*
+ * Processes that sleep get pushed to one higher
+ * priority each time they sleep greater than
+ * one tick. -ck
+ */
+ p->sleep_avg = (p->sleep_avg * MAX_BONUS /
+ MAX_SLEEP_AVG + 1) *
+ MAX_SLEEP_AVG / MAX_BONUS;

- /*
- * 'Overflow' bonus ticks go to the waker as well, so the
- * ticks are not lost. This has the effect of further
- * boosting tasks that are related to maximum-interactive
- * tasks.
- */
- if (sleep_avg > MAX_SLEEP_AVG)
- sleep_avg = MAX_SLEEP_AVG;
- if (p->sleep_avg != sleep_avg) {
- p->sleep_avg = sleep_avg;
- p->prio = effective_prio(p);
+ if (p->sleep_avg > MAX_SLEEP_AVG)
+ p->sleep_avg = MAX_SLEEP_AVG;
+ }
}
- }
+ } else
+ p->last_run = jiffies;
+
+ p->prio = effective_prio(p);
__activate_task(p, rq);
}

@@ -553,6 +559,7 @@ void wake_up_forked_process(task_t * p)
current->sleep_avg = current->sleep_avg * PARENT_PENALTY / 100;
p->sleep_avg = p->sleep_avg * CHILD_PENALTY / 100;
p->prio = effective_prio(p);
+ p->last_run = 0;
set_task_cpu(p, smp_processor_id());

if (unlikely(!current->array))
@@ -1244,6 +1251,16 @@ void scheduler_tick(int user_ticks, int
enqueue_task(p, rq->expired);
} else
enqueue_task(p, rq->active);
+ } else if (!((task_timeslice(p) - p->time_slice) %
+ TIMESLICE_GRANULARITY) && (p->time_slice > MIN_TIMESLICE)) {
+ /*
+ * Running user tasks get requeued with their remaining
+ * timeslice after TIMESLICE_GRANULARITY provided they have at
+ * least MIN_TIMESLICE to go.
+ */
+ dequeue_task(p, rq->active);
+ set_tsk_need_resched(p);
+ enqueue_task(p, rq->active);
}
out_unlock:
spin_unlock(&rq->lock);

2003-07-27 11:35:21

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sun, 27 Jul 2003 19:12, Ingo Molnar wrote:
> On Sun, 27 Jul 2003, Willy Tarreau wrote:
> > just a thought : have you tried to set the timer to 100Hz instead of
> > 1kHz to compare with 2.4 ? It might make a difference too.
>
> especially for X, a HZ of 1000 has caused performance problems before -
> short-timeout select()s were looping 10 times faster, which can be
> noticeable.

No doubt X was a bit smoother at 100Hz in 2.5, but not remarkably so. In 2.4
O(1) there was a slight X flutter (jerkiness) at 1000Hz not evident at 100Hz,
but very consistent in the frequency/duration of that jerkiness. The
difference is in 2.5, when X is not smooth it's almost like there's jitter in
the jerkiness.

Con

2003-07-28 09:22:49

by Felipe Alfaro Solana

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sun, 2003-07-27 at 01:01, Diego Calleja Garc?a wrote:

> > It is interesting that Felipe says that stock 2.5.69 was the best CPU
> > scheduler of the 2.5 series. Do others agree with that?

> No.
> For me, 2.5.63 was the best. Or perhaps it was .64 or .65?

Well, for me now, the best scheduler is 2.6.0-test2 plus Con's O10
patch.

2003-07-28 10:27:10

by MånsRullgård

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Felipe Alfaro Solana <[email protected]> writes:

>> > It is interesting that Felipe says that stock 2.5.69 was the best CPU
>> > scheduler of the 2.5 series. Do others agree with that?
>
>> No.
>> For me, 2.5.63 was the best. Or perhaps it was .64 or .65?
>
> Well, for me now, the best scheduler is 2.6.0-test2 plus Con's O10
> patch.

Could someone give a brief comparison between Con's O10int and Ingo's
G7 patches?

--
M?ns Rullg?rd
[email protected]

2003-07-28 13:30:52

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

(this was sent again because it didn't seem to get sent the first time)

My comment was towards the fact that although playing audio is a
realtime priority, decoding audio is not, and is not coded that way in
many programs, in fact, many programs will sleep() in order to decode
only when the output buffer is getting low. This means it can have cpu
suddenly a lot less than it needs if load is added during this sleep()
because it was written assuming you could decode the necessary amount of
audio within the time it restarts the decode thread to the time the
output buffer can run out.
Simply put, try a few wav playing programs just playing wav files and
see how that performs skip-wise as the benchmark for realtime scheduling
performance instead of this xmms/whatever mp3 crap. I can cause
scheduling starvation-like performance by using certain decoders but not
others, this means using them at all is not acceptable to measure
anything kernel related. Just because what these decoders do worked
before doesn't mean that they are correct and the kernel is now wrong.
Stick with simple wav's since it's decode process is negliable, and no
gui. Too many ways for the user app to be causing the problems and
we'll never get any worthwhile results.

if the kernel is having a schedular issue with realtime processes, you
should be able to get aplay to skip at a niceness of -20 by loading up
the machine with normal niced processes. Technically, you shouldn't be
able to get aplay to skip at all as long as no other processes are at an
equal or higher nice value.

Daniel Phillips wrote:
> On Saturday 26 July 2003 13:35, Andrew Morton wrote:
>
>>Daniel Phillips <[email protected]> wrote:
>>
>>>Audio players fall into a special category of application, the kind where
>>>it's not unreasonable to change the code around to take advantage of new
>>>kernel features to make them work better.
>>
>>One shouldn't even need to modify the player application to start using a
>>new scheduler policy - policy is inherited, so a wrapper will suffice:
>>
>> sudo /bin/run-something-as-softrr mplayer
>
>
> True, and that's roughly what I do now (except just with elevated priority as
> opposed to a realtime scheduler policy). However, it's more friendly to the
> system if the realtime priority is limited to just the thread that needs it,
> and that's why the application itself needs to provide the hint.
>
> Zinf already does try to provide such a hint by setting a higher priority for
> its sound servicing thread. Unfortunately, this is ignored unless zinf is
> running as root. Given the number of bugs in Zinf, I am uncomfortable
> running the whole application as root. It's altogether more conservative to
> limit the risk to a single, simple thread that can be easily audited.
>
>
>>>Remember this word: audiophile.
>>
>>That is one problem space, and I guess if we fix that, we fix the X11
>>problems too.
>>
>>Let us not lose sight of the other problem: particular sleep/run patterns
>>as demonstrated in irman are causing extremem starvation. Arguably we
>>should be addressing this as the higher priority problem.
>
>
> I agree it's the more important problem, but there are also more people
> already working on it, whereas over here in the audio corner, there are just
> Davide and me (sorry if I left anyone out, please feel free to flame me so I
> know who you are).
>
>
>>It is interesting that Felipe says that stock 2.5.69 was the best CPU
>>scheduler of the 2.5 series. Do others agree with that?
>
>
> I never tried audio until 2.5.73. With Con's patches, life has been pretty
> good for me from then on, from the non-starvation point of view.
>
>
>>And what about the O(1) backports? RH and UL and -aa kernels? Are people
>>complaining about those kernels? If not, why? What is different?
>
>
> In case anybody wants to hearken back to the good old days of 2.4, forget it.
> It is only good for sound if you are lucky enough to have a configuration it
> likes. My unlucky wife on the other hand, who gets by with a 233 MHz K6
> (because she can) is running 2.4 and says sound skips whenever she does
> anything with the machine other that just letting it sit and play. Now that
> I think of it, this will be an ideal machine for testing audio robustness,
> and scheduler robustness in general.
>
> Regards,
>
> Daniel
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2003-07-28 13:50:24

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

My comment was towards the fact that although playing audio is a
realtime priority, decoding audio is not, and is not coded that way in
many programs, in fact, many programs will sleep() in order to decode
only when the output buffer is getting low. This means it can have cpu
suddenly a lot less than it needs if load is added during this sleep()
because it was written assuming you could decode the necessary amount of
audio within the time it restarts the decode thread to the time the
output buffer can run out.
Simply put, try a few wav playing programs just playing wav files and
see how that performs skip-wise as the benchmark for realtime scheduling
performance instead of this xmms/whatever mp3 crap. I can cause
scheduling starvation-like performance by using certain decoders but not
others, this means using them at all is not acceptable to measure
anything kernel related. Just because what these decoders do worked
before doesn't mean that they are correct and the kernel is now wrong.
Stick with simple wav's since it's decode process is negliable, and no
gui. Too many ways for the user app to be causing the problems and
we'll never get any worthwhile results.

if the kernel is having a schedular issue with realtime processes, you
should be able to get aplay to skip at a niceness of -20 by loading up
the machine with normal niced processes. Technically, you shouldn't be
able to get aplay to skip at all as long as no other processes are at an
equal or higher nice value.

Daniel Phillips wrote:
> On Saturday 26 July 2003 13:35, Andrew Morton wrote:
>
>>Daniel Phillips <[email protected]> wrote:
>>
>>>Audio players fall into a special category of application, the kind where
>>>it's not unreasonable to change the code around to take advantage of new
>>>kernel features to make them work better.
>>
>>One shouldn't even need to modify the player application to start using a
>>new scheduler policy - policy is inherited, so a wrapper will suffice:
>>
>> sudo /bin/run-something-as-softrr mplayer
>
>
> True, and that's roughly what I do now (except just with elevated priority as
> opposed to a realtime scheduler policy). However, it's more friendly to the
> system if the realtime priority is limited to just the thread that needs it,
> and that's why the application itself needs to provide the hint.
>
> Zinf already does try to provide such a hint by setting a higher priority for
> its sound servicing thread. Unfortunately, this is ignored unless zinf is
> running as root. Given the number of bugs in Zinf, I am uncomfortable
> running the whole application as root. It's altogether more conservative to
> limit the risk to a single, simple thread that can be easily audited.
>
>
>>>Remember this word: audiophile.
>>
>>That is one problem space, and I guess if we fix that, we fix the X11
>>problems too.
>>
>>Let us not lose sight of the other problem: particular sleep/run patterns
>>as demonstrated in irman are causing extremem starvation. Arguably we
>>should be addressing this as the higher priority problem.
>
>
> I agree it's the more important problem, but there are also more people
> already working on it, whereas over here in the audio corner, there are just
> Davide and me (sorry if I left anyone out, please feel free to flame me so I
> know who you are).
>
>
>>It is interesting that Felipe says that stock 2.5.69 was the best CPU
>>scheduler of the 2.5 series. Do others agree with that?
>
>
> I never tried audio until 2.5.73. With Con's patches, life has been pretty
> good for me from then on, from the non-starvation point of view.
>
>
>>And what about the O(1) backports? RH and UL and -aa kernels? Are people
>>complaining about those kernels? If not, why? What is different?
>
>
> In case anybody wants to hearken back to the good old days of 2.4, forget it.
> It is only good for sound if you are lucky enough to have a configuration it
> likes. My unlucky wife on the other hand, who gets by with a 233 MHz K6
> (because she can) is running 2.4 and says sound skips whenever she does
> anything with the machine other that just letting it sit and play. Now that
> I think of it, this will be an ideal machine for testing audio robustness,
> and scheduler robustness in general.
>
> Regards,
>
> Daniel
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2003-07-28 16:50:51

by Diego Calleja

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

El Mon, 28 Jul 2003 11:38:01 +0200 Felipe Alfaro Solana <[email protected]> escribi?:

>
> Well, for me now, the best scheduler is 2.6.0-test2 plus Con's O10
> patch.
>

I'm testing now test2-mm1 (which has O10) and yes, I must say that this
one feels GREAT

/me forgets old kernels forever.

2003-07-28 22:34:37

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Con Kolivas wrote:
>
>
> Thanks Ingo. I will continue then and stepwise make use of the extra
> infrastructure you've made available when I can decide how best to benefit
> from it.
>
> Con

To see you and Ingo cooperating on the scheduler like this makes me feel
all warm an fuzzy inside. Seriously! :)

That's one of the the greatest things about Linux and other free
software. I really get a kick out of seeing people working together well.

2003-07-29 13:45:35

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

I have a couple of questions about the interactive scheduling.

First, since we're dealing with real-time and audio issues, is there any
way we can do this: When the interrupt arrives from the sound card so
that the driver needs to set up DMA for the next block or whatever it
does, move any processes which talk to an audio device to the head of
the process queue? Can this idea be applied to other things, such as
moving X to the head of the queue when the DRI driver gets a "there is
free space in the command queue" interrupt from the graphics engine?

Second, we're dealing with lots of different CPUs here, and so results
are going to vary. Is this being taken into account? For any given
interactive load, different systems will be able to carry that load only
to the point where one has a CPU slow enough that it can't complete all
interactive processing in the desired time. I don't think we should be
making scheduler tweaks to fix this corner case because it's impossible
to fix, no?

2003-07-29 13:53:07

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Tue, 29 Jul 2003 23:56, Timothy Miller wrote:
> First, since we're dealing with real-time and audio issues, is there any

Actually this is only a tiny part of this work and adequate improvement in
many different scheduler tweaks have already addressed this. This is now more
about maintaining good all round interactivity and fairness. Improving audio
beyond ordinary scheduling tweaks is another issue which may lead to some
form of soft user RR task. su tasks already can be reniced or made RR to
help.

> interactive processing in the desired time. I don't think we should be
> making scheduler tweaks to fix this corner case because it's impossible
> to fix, no?

Your concerns are well founded. However neither Ingo nor I (and all the other
contributors) are trying to make an audio app scheduler. At some stage a
modification will be made to the mainline kernel which will have adequate
audio performance in many (but not all) settings, and more importantly be
fair and interactive.

Con

2003-07-29 14:44:24

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Tuesday 29 July 2003 08:56, Timothy Miller wrote:
> ...since we're dealing with real-time and audio issues, is there any
> way we can do this: When the interrupt arrives from the sound card so
> that the driver needs to set up DMA for the next block or whatever it
> does, move any processes which talk to an audio device to the head of
> the process queue?

That would be a good thing. To clarify, there are typically two buffers
involved:

- A short DMA buffer
- A longer buffer into which the audio process generates samples

There are several cycles through the short buffer for each cycle through the
long buffer. On one of these cycles, the contents of the long buffer will
drop below some threshold and the refill process should be scheduled,
according to your suggestion. Developing a sane API for that seems a little
challenging. Somebody should just hack this and demonstrate the benefit.

In the meantime, the SCHED_SOFTRR proposal provides a way of closely
approximating the above behaviour without being intrusive or
application-specific.

Regards,

Daniel

2003-07-29 15:19:33

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Con Kolivas wrote:
> On Tue, 29 Jul 2003 23:56, Timothy Miller wrote:
>

>
>>interactive processing in the desired time. I don't think we should be
>>making scheduler tweaks to fix this corner case because it's impossible
>>to fix, no?
>
>
> Your concerns are well founded. However neither Ingo nor I (and all the other
> contributors) are trying to make an audio app scheduler. At some stage a
> modification will be made to the mainline kernel which will have adequate
> audio performance in many (but not all) settings, and more importantly be
> fair and interactive.
>

For this case, I wasn't concerned with audio but with any combination of
interactive tasks. That is, if you have enough interactive tasks going
on on a slow machine, you're going to get audio skips and other
interactivity problems and just have to live with it.

2003-07-29 15:30:14

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Daniel Phillips wrote:

>
> In the meantime, the SCHED_SOFTRR proposal provides a way of closely
> approximating the above behaviour without being intrusive or
> application-specific.
>

And there are obvious benefits to keeping things application-general.

IF it's possible to intelligently determine interactivity and other such
things, and lots of impressive progress is being made in that area, then
that is definately preferable. But there may be some circumstances
where we simply cannot determine need from application behavior.

It might help to have an API for real-time processes that is accessible
by non-root tasks. If a task sets itself to real-time, its scheduling
is more predictable, but it gets a shorter timeslice (perhaps) so that
being real-time doesn't adversely impact the system when abused.

The nice thing about the smart schedulers is that (a) no one has to
change their apps (although they can tweak to cooperate better), and (b)
future apps will behave well without us having to anticipate anything.

2003-07-29 15:45:58

by Helge Hafting

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Tue, Jul 29, 2003 at 09:56:09AM -0400, Timothy Miller wrote:
> I have a couple of questions about the interactive scheduling.
>
>
> First, since we're dealing with real-time and audio issues, is there any
> way we can do this: When the interrupt arrives from the sound card so
> that the driver needs to set up DMA for the next block or whatever it
> does, move any processes which talk to an audio device to the head of
> the process queue? Can this idea be applied to other things, such as
> moving X to the head of the queue when the DRI driver gets a "there is
> free space in the command queue" interrupt from the graphics engine?
>
This is sort of what interactivity bonus is all about, although
on a more general level. I.e. app wait for io, io
happens, app wakes up with priority bonus which it may use for processing
the io or start another one.

This goes for sound, graphichs, and ordinary disk file io.

There is no explicit connection between sound drivers and
processes - the processes get the bonus because they waited for
the io (in this case sound) to happen. Explicit connections
of this kind is hard to set up, because there may be several
processes using the sound device, or even several sound devices.
Even more so for graphichs - lots of processes use the same display.

If you care more about sound than anything else, run your
sound apps at higher priority than other processes.

Helge Hafting

2003-07-29 16:14:51

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Tuesday 29 July 2003 10:40, Timothy Miller wrote:
> IF it's possible to intelligently determine interactivity and other such
> things, and lots of impressive progress is being made in that area, then
> that is definately preferable.

But it's not possible to determine realtimeness automatically, as far as I
know.

> ...It might help to have an API for real-time processes that is accessible
> by non-root tasks. If a task sets itself to real-time, its scheduling
> is more predictable, but it gets a shorter timeslice (perhaps) so that
> being real-time doesn't adversely impact the system when abused.

That's precisely what Davide's SCHED_SOFTRR is and does.

> The nice thing about the smart schedulers is that (a) no one has to
> change their apps (although they can tweak to cooperate better), and (b)
> future apps will behave well without us having to anticipate anything.

On the other hand, you want to avoid messing up the kernel just because some
app is broken. While it's not always possible to avoid changing apps to fix
them, in the case of audio apps on Linux at this point in time, it most
certainly is.

Regards,

Daniel

2003-07-29 16:17:38

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Helge Hafting wrote:

>
> If you care more about sound than anything else, run your
> sound apps at higher priority than other processes.
>

Heh. My new Linux box I ran for a week without speakers. :)

2003-07-31 15:38:13

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Saturday 26 July 2003 17:05, Ed Sweetman wrote:
> My comment was towards the fact that although playing audio is a
> realtime priority, decoding audio is not, and is not coded that way in
> many programs, in fact, many programs will sleep() in order to decode
> only when the output buffer is getting low.

Decoding is realtime too, though the bounds are more relaxed than for the DMA
refill process. In fact, it's the decoding task that causes skipping,
because the DMA refill normally runs in interrupt context (so should mixing
and equalizing, but that's another story) where essentially everything is
realtime, modulo handwaving.

To convince yourself of this, note that when DMA refill fails to meet its
deadline you will hear repeats, not skipping, because the DMA hardware on the
sound card has been set up to automatically restart the DMA each time the
buffer expires. Try running the kernel under kgdb and breaking to the
monitor while sound is playing.

So you need to reevaluate your thinking re the realtime nature of audio
decoders.

To be sure, for perfect audio reproduction, any file IO involved has to be
realtime as well, as does the block layer. We're not really in position to
take care of all that detail at this point, mainly because no Linux
filesystem has realtime IO support. (I believe Irix XFS has realtime IO and
that part didn't get ported because of missing infrastructure in Linux.) But
the block layer isn't really that far away from being able to make realtime
guarantees. Mainly, that work translates into plugging in a different IO
scheduler.

Beyond that, there's priority inversion to worry about, which is a hard
problem from a theoretical point of view. However, once we get to the point
where priority inversion is the worst thing about Linux audio, we will be
lightyears ahead of where we now stand.

> Technically, you shouldn't be
> able to get aplay to skip at all as long as no other processes are at an
> equal or higher nice value.

This got fuzzier with the interactivity hacks, which effectively allow the
nice values to vary within some informally defined range.

Regards,

Daniel

2003-07-31 23:03:44

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

What i consider decoding is everything from the file to the audio going
to the audio device. This means moving the audio from your decoder into
the buffer you use to play sound. The process may be set as realtime,
but whether or not it's able to accomplish that is up to the code
itself, usually you have more audio experienced programmers working on
the output library/device. Just because the decoder may be able to get
cpu and resources when it asks for it within a bounded range doesn't
mean it's able to accomplish what it's supposed to within that range all
the time. Otherwise explain why substituting different decoders results
in different performances regarding audio skipping, that includes
different decoders for different codecs and even the same ones. The
audio programs themselves are responsible for the poor performances they
get with current kernels much more than the kernel is, that point was my
original subject.

yea, it's just as distinct per cycle as outputting audio is, but the
players are not at a realtime quality, which they should be before
people start using them to grade the kernel's "realtimeness".

Daniel Phillips wrote:
> On Saturday 26 July 2003 17:05, Ed Sweetman wrote:
>
>>My comment was towards the fact that although playing audio is a
>>realtime priority, decoding audio is not, and is not coded that way in
>>many programs, in fact, many programs will sleep() in order to decode
>>only when the output buffer is getting low.
>
>
> Decoding is realtime too, though the bounds are more relaxed than for the DMA
> refill process. In fact, it's the decoding task that causes skipping,
> because the DMA refill normally runs in interrupt context (so should mixing
> and equalizing, but that's another story) where essentially everything is
> realtime, modulo handwaving.
>
> To convince yourself of this, note that when DMA refill fails to meet its
> deadline you will hear repeats, not skipping, because the DMA hardware on the
> sound card has been set up to automatically restart the DMA each time the
> buffer expires. Try running the kernel under kgdb and breaking to the
> monitor while sound is playing.
>
> So you need to reevaluate your thinking re the realtime nature of audio
> decoders.
>
> To be sure, for perfect audio reproduction, any file IO involved has to be
> realtime as well, as does the block layer. We're not really in position to
> take care of all that detail at this point, mainly because no Linux
> filesystem has realtime IO support. (I believe Irix XFS has realtime IO and
> that part didn't get ported because of missing infrastructure in Linux.) But
> the block layer isn't really that far away from being able to make realtime
> guarantees. Mainly, that work translates into plugging in a different IO
> scheduler.
>
> Beyond that, there's priority inversion to worry about, which is a hard
> problem from a theoretical point of view. However, once we get to the point
> where priority inversion is the worst thing about Linux audio, we will be
> lightyears ahead of where we now stand.
>
>
>>Technically, you shouldn't be
>>able to get aplay to skip at all as long as no other processes are at an
>>equal or higher nice value.
>
>
> This got fuzzier with the interactivity hacks, which effectively allow the
> nice values to vary within some informally defined range.
>
> Regards,
>
> Daniel
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2003-08-07 04:52:36

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Sunday 27 July 2003 11:46, Daniel Phillips wrote:

> The definition of a realtime scheduler is that the worst case latency is
> bounded. The current crop of interactive tweaks do not do that. So we
> need a scheduler with a bounded worst case. Davide Libenzi's recent patch
> that implements a new SCHED_SOFTRR scheduler policy, usable by non-root
> users, provides such a bound. Please don't lose sight of the fact that
> this is the correct solution to the problem, and that interactive tweaking,
> while it may produce good results for some or even most users in some or
> even most situations, will never magically transform Linux into an
> operating system that an audiophile could love.

Thinking out loud for a bit, please tell me if I'm wrong about SCHED_SOFTRR...

Whatever the policy is, there's only so many ticks to go around and there is
an overload for which it will fail. No resource allocation scheme can
prevent starvation if there simply isn't enough of the resource to go around.

So, how does SCHED_SOFTRR fail? Theoretically there is a minimum timeslice
you can hand out, yes? And an upper bound on scheduling latency. So
logically, there is some maximum number "N" of SCHED_SOFTRR tasks running at
once where you wind up round-robining with minimal timeslices and the system
is saturated. At N+1, you fall over. (And in reality, there are interrupts
and kernel threads and other things going on that get kind of cramped
somewhere below N.)

In theory, the real benefit of SCHED_SOFTRR is that an attempt to switch to it
can fail with -EMYBRAINISMELTING up front, so you know when it won't work at
the start, rather than having it glitch halfway through the run. At which
point half the fun becomes policy decisions about how to allocate the finite
number of SCHED_SOFTRR slots between however many users are trying to use the
system, which gets into Alan Cox's accounting work...

Sorry if this is old hat; I'm still a week and change behind on the list, but
catching up... :)

Rob

2003-08-07 09:27:30

by Helge Hafting

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Rob Landley wrote:
[...]
>
> Thinking out loud for a bit, please tell me if I'm wrong about SCHED_SOFTRR...
>
> Whatever the policy is, there's only so many ticks to go around and there is
> an overload for which it will fail. No resource allocation scheme can
> prevent starvation if there simply isn't enough of the resource to go around.
>
> So, how does SCHED_SOFTRR fail? Theoretically there is a minimum timeslice
> you can hand out, yes? And an upper bound on scheduling latency. So
> logically, there is some maximum number "N" of SCHED_SOFTRR tasks running at
> once where you wind up round-robining with minimal timeslices and the system
> is saturated. At N+1, you fall over. (And in reality, there are interrupts
> and kernel threads and other things going on that get kind of cramped
> somewhere below N.)

I don't know how this particular scheduler fail, but the problem
exists for any real-time system. Nobody can run "N+1" guaranteed
low-latency
tasks where N is the max, that is obvious.

A generic os scheduler can run almost any amount of tasks, with latencies
proportional to the amount of tasks.

A good RT scheduler won't even _try_ to run "N+1" RT tasks. The
last task will either fail to start, or fail the attempt to
increase its priority into RT. You may then kill (or un-prioritize)
some other RT task and try again.

Helge Hafting

2003-08-07 15:45:31

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Wednesday 06 August 2003 22:28, Rob Landley wrote:
> So, how does SCHED_SOFTRR fail? Theoretically there is a minimum timeslice
> you can hand out, yes? And an upper bound on scheduling latency. So
> logically, there is some maximum number "N" of SCHED_SOFTRR tasks running
> at once where you wind up round-robining with minimal timeslices and the
> system is saturated. At N+1, you fall over. (And in reality, there are
> interrupts and kernel threads and other things going on that get kind of
> cramped somewhere below N.)

The upper bound for softrr realtime scheduling isn't based on number of tasks,
it's a global slice of cpu time: so long as the sum of running times of all
softrr tasks in the system lies below limit, softrr tasks will be scheduled
as SCHED_RR, otherwise they will be SCHED_NORMAL.

> In theory, the real benefit of SCHED_SOFTRR is that an attempt to switch to
> it can fail with -EMYBRAINISMELTING up front, so you know when it won't
> work at the start, rather than having it glitch halfway through the run.

Not as implemented. Anyway, from the user's point of view, that would be an
unpleasant way for a sound player to fail. What we want is something more
like a little red light that comes on (in the form of error statistics, say)
any time a softrr process gets demoted. Granted, there may be situations
where what you want is the right behavior, but it's (as you say) a separate
issue of resource allocation.

Regards,

Daniel

2003-08-07 20:45:23

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Wednesday 06 August 2003 22:28, Rob Landley wrote:
>> So, how does SCHED_SOFTRR fail? Theoretically there is a minimum timeslice
>> you can hand out, yes? And an upper bound on scheduling latency. So
>> logically, there is some maximum number "N" of SCHED_SOFTRR tasks running
>> at once where you wind up round-robining with minimal timeslices and the
>> system is saturated. At N+1, you fall over. (And in reality, there are
>> interrupts and kernel threads and other things going on that get kind of
>> cramped somewhere below N.)

On Thu, Aug 07, 2003 at 04:42:55PM +0100, Daniel Phillips wrote:
> The upper bound for softrr realtime scheduling isn't based on number
> of tasks, it's a global slice of cpu time: so long as the sum of
> running times of all softrr tasks in the system lies below limit,
> softrr tasks will be scheduled as SCHED_RR, otherwise they will be
> SCHED_NORMAL.

You're thinking of Little's law, which is describes the mean number of
waiters on a queue as the mean service time divided by the number of
servers times the mean inter-arrival time.

-- wli

2003-08-07 20:49:05

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Thursday 07 August 2003 11:42, Daniel Phillips wrote:
> On Wednesday 06 August 2003 22:28, Rob Landley wrote:
> > So, how does SCHED_SOFTRR fail? Theoretically there is a minimum
> > timeslice you can hand out, yes? And an upper bound on scheduling
> > latency. So logically, there is some maximum number "N" of SCHED_SOFTRR
> > tasks running at once where you wind up round-robining with minimal
> > timeslices and the system is saturated. At N+1, you fall over. (And in
> > reality, there are interrupts and kernel threads and other things going
> > on that get kind of cramped somewhere below N.)
>
> The upper bound for softrr realtime scheduling isn't based on number of
> tasks, it's a global slice of cpu time: so long as the sum of running times
> of all softrr tasks in the system lies below limit, softrr tasks will be
> scheduled as SCHED_RR, otherwise they will be SCHED_NORMAL.

I thought one of the advantages here was that a userspace program could give
hints about whether the scheduler should optimize it for latency or for
throughput, without having to be root.

XFree86 and Konqueror Xterm and Kmail could all say "latency in me is end-user
visible, so I care more about latency than throughput". And stuff like the
nightly cron job that exists just to screw up my desktop because I AM awake
at 4 am a noticeable percentage of the time... Anyway, it cares about
throughput and not at all about latency. Same with just about any invocation
of gcc, so they'd never set the flags.

If Bash really wanted to get fancy, it could set the flag depending on whether
the process on the other end of its input PTY had the flag or not, but let's
worry about that later... :)

> > In theory, the real benefit of SCHED_SOFTRR is that an attempt to switch
> > to it can fail with -EMYBRAINISMELTING up front, so you know when it
> > won't work at the start, rather than having it glitch halfway through the
> > run.
>
> Not as implemented. Anyway, from the user's point of view, that would be
> an unpleasant way for a sound player to fail. What we want is something
> more like a little red light that comes on (in the form of error
> statistics, say) any time a softrr process gets demoted. Granted, there
> may be situations where what you want is the right behavior, but it's (as
> you say) a separate issue of resource allocation.

Uh-huh.

So with SCHED_SOFTRR, if the system gets heavily loaded enough later on then
the SOFTRR tasks can get demoted and start skipping. So we're back to having
a system where cron had better not start up while you're mixing sound because
it might put you over the edge.

I fail to see how this is an improvement on Con's "carpet bomb the problem
with heuristics out the wazoo" approach? (I like heuristcs. They're like
Duct Tape. I like Duct Tape.)

> Regards,
>
> Daniel

Rob

2003-08-07 21:40:47

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Rob Landley wrote:
> On Thursday 07 August 2003 11:42, Daniel Phillips wrote:
>
>>On Wednesday 06 August 2003 22:28, Rob Landley wrote:
>>
>>>So, how does SCHED_SOFTRR fail? Theoretically there is a minimum
>>>timeslice you can hand out, yes? And an upper bound on scheduling
>>>latency. So logically, there is some maximum number "N" of SCHED_SOFTRR
>>>tasks running at once where you wind up round-robining with minimal
>>>timeslices and the system is saturated. At N+1, you fall over. (And in
>>>reality, there are interrupts and kernel threads and other things going
>>>on that get kind of cramped somewhere below N.)
>>
>>The upper bound for softrr realtime scheduling isn't based on number of
>>tasks, it's a global slice of cpu time: so long as the sum of running times
>>of all softrr tasks in the system lies below limit, softrr tasks will be
>>scheduled as SCHED_RR, otherwise they will be SCHED_NORMAL.
>
>
> I thought one of the advantages here was that a userspace program could give
> hints about whether the scheduler should optimize it for latency or for
> throughput, without having to be root.
>
> XFree86 and Konqueror Xterm and Kmail could all say "latency in me is end-user
> visible, so I care more about latency than throughput". And stuff like the
> nightly cron job that exists just to screw up my desktop because I AM awake
> at 4 am a noticeable percentage of the time... Anyway, it cares about
> throughput and not at all about latency. Same with just about any invocation
> of gcc, so they'd never set the flags.

cron is user setable. Just set it to work at a time you aren't there.

> If Bash really wanted to get fancy, it could set the flag depending on whether
> the process on the other end of its input PTY had the flag or not, but let's
> worry about that later... :
>
>
>>>In theory, the real benefit of SCHED_SOFTRR is that an attempt to switch
>>>to it can fail with -EMYBRAINISMELTING up front, so you know when it
>>>won't work at the start, rather than having it glitch halfway through the
>>>run.
>>
>>Not as implemented. Anyway, from the user's point of view, that would be
>>an unpleasant way for a sound player to fail. What we want is something
>>more like a little red light that comes on (in the form of error
>>statistics, say) any time a softrr process gets demoted. Granted, there
>>may be situations where what you want is the right behavior, but it's (as
>>you say) a separate issue of resource allocation.
>
>
> Uh-huh.
>
> So with SCHED_SOFTRR, if the system gets heavily loaded enough later on then
> the SOFTRR tasks can get demoted and start skipping. So we're back to having
> a system where cron had better not start up while you're mixing sound because
> it might put you over the edge.

Again, cron is not something inevitable that you cant control. If you're
mixing sound, dont run cron at times where it can interfere with your
work. Cron is a throughput intensive process. Complaining about
processes like cron is like complaining about how your audio is skipping
while running hdparm -t on the drive or dbench.

> I fail to see how this is an improvement on Con's "carpet bomb the problem
> with heuristics out the wazoo" approach? (I like heuristcs. They're like
> Duct Tape. I like Duct Tape.

>
>>Regards,
>>
>>Daniel
>
>
> Rob
>

the problem is you want a process that works like it was run on a single
tasking OS on an operating system that is built from the ground up to be
a multi-user multi-tasking OS and you want both to work perfectly at
peak performance and you want it to know when you want which to work at
peak performance automatically.

Duct tape cant do that, because just about nothing can. You're gonna
have to make some effort as a user to do the job because short of
artificial intelligence, the schedular is never going to be good enough
for everyone to always be happy with heuristics or not.

Tune and optimize the schedular to handle problems with latency within
like-processes and throughput within like-processes and allow priority
levels to take care of how they work together. There is always room to
optimize the code without changing what it eventually does too, in that
way the schedular can be improved without exchanging it for something
else whenever a problem occurs or allowing it to be directed by a
specific group of loud users and set of userspace programs.

I'd just like to see less complication because less is faster and faster
means less overhead in kernel time. If i have to do some of the work
that a bloated artificially intelligent schedular will do then i'm more
than happy to because that system is going to be able to scale much
better than something with complicated scheduling as the number of
processes increases.

2003-08-07 22:16:38

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Thu, Aug 07, 2003 at 05:40:34PM -0400, Ed Sweetman wrote:
> I'd just like to see less complication because less is faster and faster
> means less overhead in kernel time. If i have to do some of the work
> that a bloated artificially intelligent schedular will do then i'm more
> than happy to because that system is going to be able to scale much
> better than something with complicated scheduling as the number of
> processes increases.

... which only works so long as you've got root.

-- wli

2003-08-07 23:59:00

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Thursday 07 August 2003 17:40, Ed Sweetman wrote:

> > XFree86 and Konqueror Xterm and Kmail could all say "latency in me is
> > end-user visible, so I care more about latency than throughput". And
> > stuff like the nightly cron job that exists just to screw up my desktop
> > because I AM awake at 4 am a noticeable percentage of the time...
> > Anyway, it cares about throughput and not at all about latency. Same
> > with just about any invocation of gcc, so they'd never set the flags.
>
> cron is user setable. Just set it to work at a time you aren't there.

I would using it as an example. It doesn't do anything I want to have done on
a regular basis. (If I want to find stuff, I use fine. Generally if I don't
know where it is, it's moved in the past hour or two anyway.) So the first
thing I do customizing my own system is rip cron out by the roots and burn it
ceremonially.

> > So with SCHED_SOFTRR, if the system gets heavily loaded enough later on
> > then the SOFTRR tasks can get demoted and start skipping. So we're back
> > to having a system where cron had better not start up while you're mixing
> > sound because it might put you over the edge.
>
> Again, cron is not something inevitable that you cant control. If you're
> mixing sound, dont run cron at times where it can interfere with your
> work. Cron is a throughput intensive process. Complaining about
> processes like cron is like complaining about how your audio is skipping
> while running hdparm -t on the drive or dbench.

I am using cron as an example of an unrelated asynchronous background load.

Are you suggesting that the scheduler is fundamentally incapable of even
addressing any asynchronous background load in the case of latency-critical
tasks, and that the only way Linux can be made to deal with this sort of
thing is via a hand-configured embedded system that might as well run the
latency critical task in place of "init" so the scheduler never actually has
anything non-trivial to do?

If not, then cron may make a good example, even if I don't personally use it.

> > I fail to see how this is an improvement on Con's "carpet bomb the
> > problem with heuristics out the wazoo" approach? (I like heuristcs.
> > They're like Duct Tape. I like Duct Tape.
> >
> >>Regards,
> >>
> >>Daniel
> >
> > Rob
>
> the problem is you want a process that works like it was run on a single
> tasking OS on an operating system that is built from the ground up to be
> a multi-user multi-tasking OS and you want both to work perfectly at
> peak performance and you want it to know when you want which to work at
> peak performance automatically.

And world peace, sure.

I suggested that applications could potentially provide an "I am interested in
latency" hint, which could kill their throughput (make their timeslices
really small) if necessary in the name of giving them good latency with
approximately the same amount of scheduler resources. And that the attempt
to supply the hint could fail if the system has too many processes interested
in latency, so they know up front that it ain't gonna work rather than having
it fail halfway through, and so that attempts to spawn new tasks don't
interfere with the existing PVR thread that's halfway through recording a 4
hour live teleconference or some such.

If SCHED_SOFTRR doesn't do that, then at best it's just one more heuristic.

> Duct tape cant do that, because just about nothing can. You're gonna
> have to make some effort as a user to do the job because short of
> artificial intelligence, the schedular is never going to be good enough
> for everyone to always be happy with heuristics or not.

Hence the "I care about latency at the expense of throughput" scheduler hint.
(It's entirely possible that Con is coming up with his own heuristics to
handle this without such a hint, but what he's basically doing is trying to
figure out which tasks care about latency and which tasks care about
throughput, and in many cases the author of the tasks knows. There's
probably never a time when X or XMMS does NOT care about latency over
throughput, for example.)

The traditional behavior (2.4 and earlier) was to optimize for throughput at
the expense of latency, at least until the task became a CPU hog, and I'm not
suggesting changing that default behavior.

We're starting to get this kind of role-based hint thing now, by the way.
There was some kind of SCHED_BATCH tweak a while back when people wanted
jumbo timeslices to keep the cache hot for processor-intensive background
stuff. That got merged into the priority stuff if I recall, but low latency
is not quite the same thing as high priority, because high priority implies
you want a bigger percentage of the CPU time and the truth may be the
opposite.

> Tune and optimize the schedular to handle problems with latency within
> like-processes and throughput within like-processes and allow priority
> levels to take care of how they work together. There is always room to
> optimize the code without changing what it eventually does too, in that
> way the schedular can be improved without exchanging it for something
> else whenever a problem occurs or allowing it to be directed by a
> specific group of loud users and set of userspace programs.

Actually, in MY specific case of having 6 desktops full of open windows (and
now Konqueror's ability to open a zillion websites in tabbed windows), I
usually drive my system deeply enough into swap that what the scheduler is
doing is a sideshow at best. You're mouse is GOING to skip if code mouse
movement is blocked on has been sent to the swap file. :)

> I'd just like to see less complication because less is faster and faster
> means less overhead in kernel time. If i have to do some of the work
> that a bloated artificially intelligent schedular will do then i'm more
> than happy to because that system is going to be able to scale much
> better than something with complicated scheduling as the number of
> processes increases.

The user supplying a hint so the scheduler doesn't have to figure out whether
a particular process cares more about latency or throughput is hopefully a
simplifying suggestion. Otherwise, to provide good low-latency behavior for
interactive tasks, you have to figure out what's an interactive task, as
Con's patches seem to be doing a fairly good (if not necessarily perfect) job
of.

I also think it's a bit late in the game in 2.5 to be adding that sort of
thing. I thought it might already be part of SCHED_SOFTRR, hence I was
asking. Since it isn't, I'll wait until 2.7...

Rob

2003-08-08 06:05:52

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Thursday 07 August 2003 21:51, Rob Landley wrote:
> Uh-huh.
>
> So with SCHED_SOFTRR, if the system gets heavily loaded enough later on
> then the SOFTRR tasks can get demoted and start skipping.

No. A SOFTRR task only becomes SCHED_NORMAL if the total load of *realtime*
tasks exceeds a threshold.

> I fail to see how this is an improvement on Con's "carpet bomb the problem
> with heuristics out the wazoo" approach? ... (I like heuristcs. They're
> like Duct Tape. I like Duct Tape.)

Danger, danger! Man with duct tape loose in kernel! Seal off the bulwarks
and flood the lower compartments!

Regards,

Daniel

2003-08-08 21:10:07

by Bill Huey

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Wed, Jul 30, 2003 at 09:46:41AM -0500, Daniel Phillips wrote:
> On Tuesday 29 July 2003 08:56, Timothy Miller wrote:
> > ...since we're dealing with real-time and audio issues, is there any
> > way we can do this: When the interrupt arrives from the sound card so
> > that the driver needs to set up DMA for the next block or whatever it
> > does, move any processes which talk to an audio device to the head of
> > the process queue?
>
> That would be a good thing. To clarify, there are typically two buffers
> involved:
>
> - A short DMA buffer
> - A longer buffer into which the audio process generates samples
>
> There are several cycles through the short buffer for each cycle through the
> long buffer. On one of these cycles, the contents of the long buffer will
> drop below some threshold and the refill process should be scheduled,
> according to your suggestion. Developing a sane API for that seems a little
> challenging. Somebody should just hack this and demonstrate the benefit.
>
> In the meantime, the SCHED_SOFTRR proposal provides a way of closely
> approximating the above behaviour without being intrusive or
> application-specific.

You might also like to think about driving the scheduler with an
interrupt from the DMA device, if it's regular, or VBL (vertical
retrace) for video/graphics applications. IRIX's REACT/pro RT system
does stuff like this.

http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=0650&db=bks&srch=&fname=/SGI_Developer/REACT_PG/sgi_html/pr02.html

It's in their frame scheduler section.

bill

2003-08-13 03:39:56

by George Anzinger

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Ed Sweetman wrote:
> >>
>
> the problem is you want a process that works like it was run on a single
> tasking OS on an operating system that is built from the ground up to be
> a multi-user multi-tasking OS and you want both to work perfectly at
> peak performance and you want it to know when you want which to work at
> peak performance automatically.

Well said :)

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2003-08-13 19:23:54

by George Anzinger

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

2003-08-14 00:22:30

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Saturday 09 August 2003 16:52, George Anzinger wrote:
> Ed Sweetman wrote:
> > the problem is you want a process that works like it was run on a single
> > tasking OS on an operating system that is built from the ground up to be
> > a multi-user multi-tasking OS

Considering the multi-tasking OS has 1000 times the CPU power, memory, and
disk space as the single-tasking OS did when it debuted, yet still loses to
it in some areas, isn't it at least worth looking at?

> > and you want both to work perfectly at peak performance

We're pondering various heuristics with which to to improve the situation and
you say we're persuing perfection. From heuristics.

Do you say these sort of things to the virtual memory people? (Since you
can't do it perfectly, why bother to swap at all? The perfect being the
enemy of the good, and all that.)

> > and you want it to know when you want which to work at
> > peak performance automatically.

I know for a fact that automatic determination of interactivity is possible.
In OS/2 you could speed up a compile by moving the mouse pointer over its
window repeatedly to give it extra clock ticks. (So far we've managed to
avoid anything quite so disgusting in Linux, but there exist OSes where it
was done. Having the keyboard and mouse and display be local devices is
actually the common case. It took X about ten years to finally start
optimizing for the common case on the output side with MIT shared memory
extensions and such...)

The scheduler actually has a lot of information to work with. Ingo's patches
strive to give it more information, and and Con's patches make much better
use of that information. This is a good thing.

> Well said :)

Actually, I didn't really consider that list of straw man arguments to be
worth commenting on the first time around. (I thought he was being
sarcastic...)

Rob

2003-08-14 08:01:22

by George Anzinger

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

Rob Landley wrote:
> On Saturday 09 August 2003 16:52, George Anzinger wrote:
>
>>Ed Sweetman wrote:
>>
>>>the problem is you want a process that works like it was run on a single
>>>tasking OS on an operating system that is built from the ground up to be
>>> a multi-user multi-tasking OS
>
>
> Considering the multi-tasking OS has 1000 times the CPU power, memory, and
> disk space as the single-tasking OS did when it debuted, yet still loses to
> it in some areas, isn't it at least worth looking at?
>
>
>>>and you want both to work perfectly at peak performance
>
>
> We're pondering various heuristics with which to to improve the situation and
> you say we're persuing perfection. From heuristics.
>
> Do you say these sort of things to the virtual memory people? (Since you
> can't do it perfectly, why bother to swap at all? The perfect being the
> enemy of the good, and all that.)
>
>
>>>and you want it to know when you want which to work at
>>>peak performance automatically.
>
>
> I know for a fact that automatic determination of interactivity is possible.
> In OS/2 you could speed up a compile by moving the mouse pointer over its
> window repeatedly to give it extra clock ticks. (So far we've managed to
> avoid anything quite so disgusting in Linux, but there exist OSes where it
> was done. Having the keyboard and mouse and display be local devices is
> actually the common case. It took X about ten years to finally start
> optimizing for the common case on the output side with MIT shared memory
> extensions and such...)
>
> The scheduler actually has a lot of information to work with. Ingo's patches
> strive to give it more information, and and Con's patches make much better
> use of that information. This is a good thing.
>
>
>>Well said :)
>
>
> Actually, I didn't really consider that list of straw man arguments to be
> worth commenting on the first time around. (I thought he was being
> sarcastic...)

Well, I think he was too, but I am trying to say (as I think you are
too) that it is not far from being a realistic goal.

As to timing, I just changed ISPs and was off line for a few days...
>

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

2003-08-16 09:10:53

[permalink] [raw]

Subject: Re: Ingo Molnar and Con Kolivas 2.6 scheduler patches

On Thursday 14 August 2003 04:01, George Anzinger wrote:

> >>Well said :)
> >
> > Actually, I didn't really consider that list of straw man arguments to be
> > worth commenting on the first time around. (I thought he was being
> > sarcastic...)
>
> Well, I think he was too, but I am trying to say (as I think you are
> too) that it is not far from being a realistic goal.

2.5 already seems to be scheduling better for me, although I'm still mostly
running 2.4 on my new laptop until I figure out how to properly configure all
the new hardware. (APM suspends, and then never comes back until you yank
the #*%(&# battery. Great. Trying it with the real mode bios calls next
reboot...)

> As to timing, I just changed ISPs and was off line for a few days...

I got caught in a downpour with my laptop in my backpack, and didn't realise
the water resistant coating on my backpack had worn away until I turned the
thing on and the display shorted out. After two days of drying out, the
display was still screwed up, so I just bought a used thinkpad (iseries 1300,
quite a nice little machine) and swapped the hard drive and some ram from my
old toshiba. (Kudzu actually did something useful for once. :)

I think I've figured out why X is giving me an 800x600 window on a 1024x768
display screen (with a big black border). Why XFree86 freezes the box solid
for ten seconds at a time while KDE is probing the hardware devices, that I
don't know. (Google suggests the USB is funky. I'll see if 2.5 fixes it,
all I know of 2.5 on this box is that it booted to text mode and then shut
down again ok...)

I'm likely to be a bit distracted for a while yet, not counting catching up,
being out of town for a weekend, and of course the fall semester starting in
two weeks. Luckily, they make caffeine for just such occasions...

Rob

2003-08-16 14:31:10

by Jamie Lokier

[permalink] [raw]

Subject: APM and 2.5.75 not resuming properly

Rob Landley wrote:
> (APM suspends, and then never comes back until you yank the #*%(&#
> battery. Great. Trying it with the real mode bios calls next
> reboot...)

Similar here. Using 2.5.75. APM with no local APIC (kernel is unable
to enable it anyway).

It suspends. On resume, the screen is blank and the keyboard doesn't
respond (no Caps Lock or SysRq). Occasionally when it resumes the
keyboard does respond, but the screen stays blank. At least it is
possible to do SysRq-S SysRq-B in this state. Sometimes, if I'm
lucky, I can make it reboot by holding down the power key for 5 seconds.

2.4 APM works great. ACPI doesn't do anything useful except give me
more control over the screen brightness.

--- Jamie

2003-08-16 15:03:35

by Stephen Rothwell

[permalink] [raw]

Subject: Re: APM and 2.5.75 not resuming properly

On Sat, 16 Aug 2003 15:29:33 +0100 Jamie Lokier <[email protected]> wrote:
>
> Rob Landley wrote:
> > (APM suspends, and then never comes back until you yank the #*%(&#
> > battery. Great. Trying it with the real mode bios calls next
> > reboot...)
>
> Similar here. Using 2.5.75. APM with no local APIC (kernel is unable
> to enable it anyway).
>
> It suspends. On resume, the screen is blank and the keyboard doesn't
> respond (no Caps Lock or SysRq). Occasionally when it resumes the
> keyboard does respond, but the screen stays blank. At least it is
> possible to do SysRq-S SysRq-B in this state. Sometimes, if I'm
> lucky, I can make it reboot by holding down the power key for 5 seconds.

I may have missed somthing, but let me ask anyway: What laptop? Have you
tried switching to a text console before suspending? Have you tried
Options "NoPM" "True"
in the ServerFlags section of your XF86Config?

--
Cheers,
Stephen Rothwell [email protected]
http://www.canb.auug.org.au/~sfr/

2003-08-16 16:13:45

by Jamie Lokier

[permalink] [raw]

Subject: Re: APM and 2.5.75 not resuming properly

Stephen Rothwell wrote:
> On Sat, 16 Aug 2003 15:29:33 +0100 Jamie Lokier <[email protected]> wrote:
> > Rob Landley wrote:
> > > (APM suspends, and then never comes back until you yank the #*%(&#
> > > battery. Great. Trying it with the real mode bios calls next
> > > reboot...)
> >
> > Similar here. Using 2.5.75. APM with no local APIC (kernel is unable
> > to enable it anyway).
> >
> > It suspends. On resume, the screen is blank and the keyboard doesn't
> > respond (no Caps Lock or SysRq). Occasionally when it resumes the
> > keyboard does respond, but the screen stays blank. At least it is
> > possible to do SysRq-S SysRq-B in this state. Sometimes, if I'm
> > lucky, I can make it reboot by holding down the power key for 5 seconds.
>
> I may have missed somthing, but let me ask anyway: What laptop? Have you
> tried switching to a text console before suspending? Have you tried
> Options "NoPM" "True"
> in the ServerFlags section of your XF86Config?

Toshiba Satellite 4070CDT. APM has worked without any problems, in
2.4 and earlier kernels, both Red Hat kernels and vanilla ones.

I'm not using X :)

I am using vesafb, as my text console. Same with 2.4.

I've just noticed a notable change: the Toshiba SMM driver. That is
now configured in, whereas before it was a module and I never loaded
it. Although I don't use it, when it initialises it does some funky
SMM stuff - might that be breaking resume?

-- Jamie

2003-08-16 20:44:53