my latest scheduler patchset can be found at:
redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G6
this version takes a shot at more scheduling fairness - i'd be interested
how it works out for others.
Changes since -G3:
- fix the timeslice granularity inconsistency found by Con
- further increase timeslice granularity
- decrease sleep average interval to 1 second
- fix starvation detection, increase fairness
Reports, testing feedback and comments are welcome,
Ingo
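
For readers unfamiliar with the knob being tuned in the changelog above: "timeslice
granularity" here means a task is round-robined with runnable tasks of the same
priority every TIMESLICE_GRANULARITY worth of CPU it uses, instead of only when its
whole timeslice runs out. A minimal userspace model of that decision follows; the
constants are illustrative, not the actual -G6 values.

/* Simplified, userspace model of the timeslice-granularity idea discussed in
 * this thread: a task is round-robined with same-priority peers every
 * TIMESLICE_GRANULARITY ticks instead of only when its whole timeslice is
 * used up.  Constants and field names are illustrative, not the -G6 code. */
#include <stdio.h>

#define HZ                     1000
#define MIN_TIMESLICE          (10 * HZ / 1000)    /* 10 ms  */
#define DEF_TIMESLICE          (100 * HZ / 1000)   /* 100 ms */
#define TIMESLICE_GRANULARITY  (25 * HZ / 1000)    /* 25 ms, the value under discussion */

struct task {
    int time_slice;   /* ticks left in the current timeslice */
    int timeslice;    /* the full timeslice for this task    */
};

/* Should the tick handler round-robin this task behind its same-priority
 * peers now, even though its timeslice is not used up yet? */
static int granularity_requeue(const struct task *p)
{
    int used = p->timeslice - p->time_slice;

    return p->time_slice > MIN_TIMESLICE &&   /* not worth it for a nearly spent slice */
           used > 0 &&
           used % TIMESLICE_GRANULARITY == 0; /* every 25 ms of CPU actually used */
}

int main(void)
{
    struct task p = { .time_slice = DEF_TIMESLICE, .timeslice = DEF_TIMESLICE };

    for (int tick = 1; tick <= DEF_TIMESLICE; tick++) {
        p.time_slice--;
        if (granularity_requeue(&p))
            printf("tick %3d: round-robin requeue\n", tick);
    }
    return 0;
}

Con's question and Ingo's answer below are about how small that granularity figure
can reasonably go before throughput suffers.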
On Sun, 27 Jul 2003 23:40, Ingo Molnar wrote:
> - further increase timeslice granularity
For a while now I've been running a 1000Hz 2.4 O(1) kernel tree that uses
timeslice granularity set to MIN_TIMESLICE, which gives stark smoothness
improvements in X. I've avoided promoting this idea because of the
theoretical drop in throughput this might cause. I've not been able to see
any detriment in my basic testing of this small granularity, so I was curious
to see what you thought was a reasonable lower limit?
Con
On Sun, 2003-07-27 at 15:40, Ingo Molnar wrote:
> my latest scheduler patchset can be found at:
>
> redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G6
>
> this version takes a shot at more scheduling fairness - i'd be interested
> how it works out for others.
This -G6 patch is fantastic, even without nicing the X server. I didn't
even need to tweak any kernel scheduler knob to adjust for maximum
smoothness on my desktop. Response times are impressive, even under
heavy load. Great!
At 09:18 PM 7/27/2003 +0200, Felipe Alfaro Solana wrote:
>On Sun, 2003-07-27 at 15:40, Ingo Molnar wrote:
> > my latest scheduler patchset can be found at:
> >
> > redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G6
> >
> > this version takes a shot at more scheduling fairness - i'd be interested
> > how it works out for others.
>
>This -G6 patch is fantastic, even without nicing the X server. I didn't
>even need to tweak any kernel scheduler knob to adjust for maximum
>smoothness on my desktop. Response times are impressive, even under
>heavy load. Great!
Can you try the following please?
This one I just noticed:
1. start top.
2. start dd if=/dev/zero | dd of=/dev/null
3. wiggle a window very briefly.
Here, X becomes extremely jerky, and I think this is due to two
things. One, X uses its sleep_avg very quickly, and expires. Two, the
piped dd now is highly interactive due to the ns resolution clock (uhoh).
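
To make the mechanism Mike is pointing at concrete: in this scheduler family a
task's dynamic priority is its static priority minus a bonus derived from its sleep
average, so anything credited with a lot of recent sleep is treated as interactive.
A rough userspace model of why a piped dd can end up looking interactive while a
window-dragging X does not; the constants and numbers are illustrative, not the
-G6 source.

/* Simplified model (userspace, illustrative constants) of how a sleep
 * average turns into an interactivity bonus.  A piped dd sleeps briefly on
 * every transfer, and with nanosecond resolution each short sleep is
 * credited, so its sleep_avg - and hence its bonus - stays high even though
 * it is essentially a CPU hog. */
#include <stdio.h>

#define MAX_SLEEP_AVG_NS  1000000000ULL          /* 1 second, as in the changelog */
#define MAX_BONUS         10
#define NICE_TO_PRIO(n)   (120 + (n))            /* static prio of a nice-n task */

static int effective_prio(int static_prio, unsigned long long sleep_avg_ns)
{
    /* bonus ranges from -MAX_BONUS/2 (CPU hog) to +MAX_BONUS/2 (interactive) */
    int bonus = (int)(sleep_avg_ns * MAX_BONUS / MAX_SLEEP_AVG_NS) - MAX_BONUS / 2;

    return static_prio - bonus;
}

int main(void)
{
    /* A task credited with sleeping ~80% of the time (e.g. a piped dd
     * waiting on its partner) looks highly interactive... */
    printf("piped dd : prio %d\n", effective_prio(NICE_TO_PRIO(0), 800000000ULL));
    /* ...while X, having burned its sleep_avg dragging a window, does not. */
    printf("busy X   : prio %d\n", effective_prio(NICE_TO_PRIO(0), 100000000ULL));
    return 0;
}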
This one, I've seen before, but it just became easily repeatable:
1. start top.
2. start make -j2 bzImage. minimize window.
3. start xmms, and enable its gl visualization.
4. grab top window and wiggle it until you see/feel X expire. (don't blink)
What you should notice:
1. your desktop response is now horrible.
2. X is sitting at prio 25, and is not getting better as minutes
pass.
3. the gl thread (oink) is suddenly at interactive priority, and
stays that way.
5. minimize gl window
What you should notice:
1. your desktop recovers, because..
2. X now recovers its interactive priority, and the gl thread
becomes non-interactive.
6. scratch head ;-)
My conclusions:
1. evil voodoo. (then a while later..;)
2. while X is running, but is in the expired array, the wakeups it is
supposed to be sending to the gl thread are delayed by the amount of cpu
used by the two cc1's (also at prio 25) it is round-robining with. Ergo,
the gl thread receives that quantity of additional boost. X is not
sleeping, he's trying to get the cpu, ergo he cannot receive sleep
boost. He can't get enough cpu to do his job and go to sleep long enough
to recharge his sleep_avg. He stays low priority. Kobayashi Maru: X can't
keep the cpu long enough to catch up. Either it expires, or it wakes the
gl thread, is preempted, and _then_ expires, waits for cc1's, wakes gl
thread (+big boost), gets preempted again... repeat forever, or until you
reduce X's backlog, by minimizing, covering or killing the gl thread.
Conclusion accuracy/inaccuracy aside, I'd like to see if anyone else can
reproduce that second scenario.
-Mike
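
A sketch of the feedback loop Mike describes (a simplified userspace model, not the
-G6 source): the time between a task going to sleep and its wakeup finally being
delivered is credited to its sleep average, so a waker that is itself stuck behind
CPU hogs turns its own scheduling delay into an interactivity boost for the sleeper.

/* Userspace sketch of the effect described above: when X is stuck
 * round-robining with the cc1's, the gl thread's wakeup is delayed, and at
 * wakeup the whole time since it went to sleep is credited to its sleep_avg,
 * so the waker's scheduling delay becomes the sleeper's boost. */
#include <stdio.h>

#define MAX_SLEEP_AVG_NS 1000000000ULL   /* 1 second cap, as in the changelog */

struct task {
    unsigned long long sleep_avg;        /* ns of credited sleep */
    unsigned long long sleep_timestamp;  /* when it went to sleep */
};

static void wake_up_task(struct task *p, unsigned long long now)
{
    unsigned long long slept = now - p->sleep_timestamp;

    /* The task cannot tell "waiting for work" from "waiting because the
     * waker could not get the CPU": both are credited as sleep. */
    p->sleep_avg += slept;
    if (p->sleep_avg > MAX_SLEEP_AVG_NS)
        p->sleep_avg = MAX_SLEEP_AVG_NS;
}

int main(void)
{
    struct task gl = { .sleep_avg = 0, .sleep_timestamp = 0 };

    /* X intended to wake the gl thread almost immediately, but was itself
     * queued behind two cc1's for ~300 ms before it could issue the wakeup. */
    wake_up_task(&gl, 300000000ULL);
    printf("gl thread sleep_avg after delayed wakeup: %llu ms\n",
           gl.sleep_avg / 1000000ULL);
    return 0;
}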
On Mon, 28 Jul 2003, Mike Galbraith wrote:
> At 09:18 PM 7/27/2003 +0200, Felipe Alfaro Solana wrote:
> >On Sun, 2003-07-27 at 15:40, Ingo Molnar wrote:
> > > my latest scheduler patchset can be found at:
> > >
> > > redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G6
> > >
> > > this version takes a shot at more scheduling fairness - i'd be interested
> > > how it works out for others.
> >
> >This -G6 patch is fantastic, even without nicing the X server. I didn't
> >even need to tweak any kernel scheduler knob to adjust for maximum
> >smoothness on my desktop. Response times are impressive, even under
> >heavy load. Great!
>
> Can you try the following please?
>
> This one I just noticed:
> 1. start top.
> 2. start dd if=/dev/zero | dd of=/dev/null
> 3. wiggle a window very briefly.
> Here, X becomes extremely jerky, and I think this is due to two
> things. One, X uses it's sleep_avg very quickly, and expires. Two, the
> piped dd now is highly interactive due to the ns resolution clock (uhoh).
What kind of LAME test is this? If "X becomes extremely jerky" ?
Sheesh, somebody come up with a build class solution.
CONFIG_SERVER
CONFIG_WORKSTATION
CONGIG_IAMAGEEKWHOPLAYSGAMES
CONFIG_GENERIC_LAMER
Determining quality of the scheduler based on how a mouse responds is ...
Sorry, but this is just laughable: empirical, subjective determination
based on random hardware combinations as QA/QC for a test?
Don't bother replying cause last thing I want to know is why.
-a
On Mon, 28 Jul 2003 16:04, Mike Galbraith wrote:
> to recharge his sleep_avg. He stays low priority. Kobiashi-maru: X can't
> Conclusion accuracy/inaccuracy aside, I'd like to see if anyone else can
> reproduce that second scenario.
Yes I can reproduce it, but we need the Kirk approach and cheat. Some
workaround for tasks that have fallen onto the expired array but shouldn't be
there needs to be created. But first we need to think of one before we can
create one...
Con
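
One family of "Kirk" cheats along the lines Con is asking for: at timeslice expiry,
let a task that still looks interactive go back onto the active array instead of the
expired array, unless the expired array has already been starving for too long. A
userspace sketch of that decision follows; the names echo the scheduler's, but this
is an illustration, not a patch.

/* Sketch of one possible cheat: keep an expiring-but-interactive task on the
 * active array so it does not have to wait for a full array switch, unless
 * the expired array has been starving too long. */
#include <stdio.h>

#define HZ                1000
#define STARVATION_LIMIT  HZ            /* ~1 second */

struct runqueue {
    unsigned long jiffies;              /* current time, in ticks */
    unsigned long expired_timestamp;    /* when the expired array got its first task */
    unsigned int  nr_running;
};

static int expired_starving(const struct runqueue *rq)
{
    return rq->expired_timestamp &&
           rq->jiffies - rq->expired_timestamp > STARVATION_LIMIT * rq->nr_running;
}

/* Returns 1 if an expiring-but-interactive task may be requeued on the
 * active array, 0 if it must go to the expired array. */
static int may_cheat(const struct runqueue *rq, int task_is_interactive)
{
    return task_is_interactive && !expired_starving(rq);
}

int main(void)
{
    struct runqueue rq = { .jiffies = 10000, .expired_timestamp = 5000, .nr_running = 3 };

    printf("interactive task stays active: %s\n",
           may_cheat(&rq, 1) ? "yes" : "no (expired array is starving)");
    return 0;
}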
At 11:45 PM 7/27/2003 -0700, Andre Hedrick wrote:
>On Mon, 28 Jul 2003, Mike Galbraith wrote:
>
> > At 09:18 PM 7/27/2003 +0200, Felipe Alfaro Solana wrote:
> > >On Sun, 2003-07-27 at 15:40, Ingo Molnar wrote:
> > > > my latest scheduler patchset can be found at:
> > > >
> > > > redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G6
> > > >
> > > > this version takes a shot at more scheduling fairness - i'd be
> interested
> > > > how it works out for others.
> > >
> > >This -G6 patch is fantastic, even without nicing the X server. I didn't
> > >even need to tweak any kernel scheduler knob to adjust for maximum
> > >smoothness on my desktop. Response times are impressive, even under
> > >heavy load. Great!
> >
> > Can you try the following please?
> >
> > This one I just noticed:
> > 1. start top.
> > 2. start dd if=/dev/zero | dd of=/dev/null
> > 3. wiggle a window very briefly.
> > Here, X becomes extremely jerky, and I think this is due to two
> > things. One, X uses it's sleep_avg very quickly, and expires. Two, the
> > piped dd now is highly interactive due to the ns resolution clock (uhoh).
>
>What kind of LAME test is this? If "X becomes extremely jerky" ?
Huh? The point is that piped cpu hogs just became high priority.
>Sheesh, somebody come up with a build class solution.
>
>CONFIG_SERVER
>CONFIG_WORKSTATION
>CONGIG_IAMAGEEKWHOPLAYSGAMES
>CONFIG_GENERIC_LAMER
>
>Determining quality of the scheduler based on how a mouse responds is ...
>
>Sorry but this is just laughable, emperical subjective determination
>based on a random hardware combinations for QA/QC for a test?
>
>Don't bother replying cause last thing I want to know is why.
Oh, I see, you just felt like doing some mindless flaming.
-Mike
On Mon, 28 Jul 2003, Con Kolivas wrote:
> On Sun, 27 Jul 2003 23:40, Ingo Molnar wrote:
> > - further increase timeslice granularity
>
> For a while now I've been running a 1000Hz 2.4 O(1) kernel tree that
> uses timeslice granularity set to MIN_TIMESLICE which has stark
> smoothness improvements in X. I've avoided promoting this idea because
> of the theoretical drop in throughput this might cause. I've not been
> able to see any detriment in my basic testing of this small granularity,
> so I was curious to see what you throught was a reasonable lower limit?
it's a hard question. The 25 msecs in -G6 is probably too low.
Ingo
At 05:05 PM 7/28/2003 +1000, Con Kolivas wrote:
>On Mon, 28 Jul 2003 16:04, Mike Galbraith wrote:
> > to recharge his sleep_avg. He stays low priority. Kobiashi-maru: X can't
>
> > Conclusion accuracy/inaccuracy aside, I'd like to see if anyone else can
> > reproduce that second scenario.
>
>Yes I can reproduce it, but we need the Kirk approach and cheat. Some
>workaround for tasks that have fallen onto the expired array but shouldn't be
>there needs to be created. But first we need to think of one before we can
>create one...
Oh good, it's not my poor little box. My experimental tree already has a
"Kirk" ;-)
-Mike
On Mon, 28 Jul 2003, Mike Galbraith wrote:
> >Yes I can reproduce it, but we need the Kirk approach and cheat. Some
> >workaround for tasks that have fallen onto the expired array but shouldn't be
> >there needs to be created. But first we need to think of one before we can
> >create one...
>
> Oh good, it's not my poor little box. My experimental tree already has
> a "Kirk" ;-)
could you give -G7 a try:
redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G7
Mr. Kirk was busy fixing the IDE code (a subsystem he loves to contribute
to) but i managed to get some code from Mr. Spock: it introduces
ON_RUNQUEUE_WEIGHT, set to 30% currently. Wakeups that come from IRQ
contexts get a 100% sleep average - most hw interrupts are of interactive
nature.
this method should result in process-context wakeups giving a limited but
load-proportional boost - which boost is enough to prevent such tasks from
getting starved by max CPU hogs, but not enough to make them permanently
interactive.
Ingo
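
A userspace sketch of the idea as described above, not the actual -G7 code. It reads
ON_RUNQUEUE_WEIGHT as a weight on the time a freshly woken task spends waiting on the
runqueue before it actually runs - full credit for IRQ-context wakeups, 30% for
process-context ones. That reading is an interpretation based on the wording of the
mail ("limited but load-proportional"), not on the patch itself.

/* Sketch of the -G7 idea: the wait between wakeup and actually getting the
 * CPU is credited to the sleep average, weighted by where the wakeup came
 * from. */
#include <stdio.h>

#define ON_RUNQUEUE_WEIGHT  30   /* percent, as quoted in the mail */

static unsigned long long runqueue_wait_credit(unsigned long long waited_ns,
                                               int woken_from_irq)
{
    if (woken_from_irq)
        return waited_ns;                            /* full credit */
    return waited_ns * ON_RUNQUEUE_WEIGHT / 100;     /* limited credit */
}

int main(void)
{
    unsigned long long waited = 200000000ULL;        /* 200 ms stuck behind hogs */

    printf("woken by a hw interrupt : +%llu ms sleep_avg\n",
           runqueue_wait_credit(waited, 1) / 1000000ULL);
    printf("woken by a pipe partner : +%llu ms sleep_avg\n",
           runqueue_wait_credit(waited, 0) / 1000000ULL);
    return 0;
}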
At 09:44 AM 7/28/2003 +0200, Ingo Molnar wrote:
>On Mon, 28 Jul 2003, Mike Galbraith wrote:
>
> > >Yes I can reproduce it, but we need the Kirk approach and cheat. Some
> > >workaround for tasks that have fallen onto the expired array but
> shouldn't be
> > >there needs to be created. But first we need to think of one before we can
> > >create one...
> >
> > Oh good, it's not my poor little box. My experimental tree already has
> > a "Kirk" ;-)
>
>could you give -G7 a try:
>
> redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G7
The dd case is improved. The dd if=/dev/zero is now prio 25, but its
of=/dev/null partner remains at 16. No change with the xmms gl thread.
-Mike
Quoting Mike Galbraith <[email protected]>:
> At 09:44 AM 7/28/2003 +0200, Ingo Molnar wrote:
>
> >On Mon, 28 Jul 2003, Mike Galbraith wrote:
> >
> > > >Yes I can reproduce it, but we need the Kirk approach and cheat. Some
> > > >workaround for tasks that have fallen onto the expired array but
> > shouldn't be
> > > >there needs to be created. But first we need to think of one before we
> can
> > > >create one...
> > >
> > > Oh good, it's not my poor little box. My experimental tree already has
> > > a "Kirk" ;-)
> >
> >could you give -G7 a try:
> >
> > redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G7
>
> The dd case is improved. The dd if=/dev/zero is now prio 25, but it's
> of=/dev/null partner remains at 16. No change with the xmms gl thread.
Well O10 is not prone to the dd/of problem (obviously since it doesn't use
nanosecond timing [yet?]) but I can exhibit your second weird one if I try hard
enough.
Con
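
A small illustration of the resolution point (illustrative numbers, not kernel
code): each dd in the pipe blocks for far less than a tick per transfer, so with
nanosecond timestamps those micro-sleeps add up to a large sleep credit, while with
jiffy resolution they mostly round down to nothing.

/* Why the clock resolution matters for the piped dd case. */
#include <stdio.h>

#define NSEC_PER_JIFFY  1000000ULL    /* 1 ms tick, HZ=1000 */

int main(void)
{
    unsigned long long micro_sleep = 50000ULL;   /* ~50 us per pipe transfer */
    unsigned long long transfers   = 20000ULL;   /* over one second of dd'ing */

    unsigned long long ns_credit    = micro_sleep * transfers;
    unsigned long long jiffy_credit = (micro_sleep / NSEC_PER_JIFFY) * NSEC_PER_JIFFY
                                      * transfers;   /* each 50 us sleep rounds to 0 jiffies */

    printf("credited sleep, ns clock   : %llu ms\n", ns_credit / 1000000ULL);
    printf("credited sleep, jiffy clock: %llu ms\n", jiffy_credit / 1000000ULL);
    return 0;
}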
At 06:42 PM 7/28/2003 +1000, Con Kolivas wrote:
>Quoting Mike Galbraith <[email protected]>:
>
> > At 09:44 AM 7/28/2003 +0200, Ingo Molnar wrote:
> >
> > >On Mon, 28 Jul 2003, Mike Galbraith wrote:
> > >
> > > > >Yes I can reproduce it, but we need the Kirk approach and cheat. Some
> > > > >workaround for tasks that have fallen onto the expired array but
> > > shouldn't be
> > > > >there needs to be created. But first we need to think of one before we
> > can
> > > > >create one...
> > > >
> > > > Oh good, it's not my poor little box. My experimental tree already has
> > > > a "Kirk" ;-)
> > >
> > >could you give -G7 a try:
> > >
> > > redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G7
> >
> > The dd case is improved. The dd if=/dev/zero is now prio 25, but it's
> > of=/dev/null partner remains at 16. No change with the xmms gl thread.
>
>Well O10 is not prone to the dd/of problem (obviously since it doesn't use
>nanosecond timing [yet?]) but I can exhibit your second weird one if I try
>hard
>enough.
Try setting the gl thread to SCHED_RR. That causes X to lose priority here
too.
-Mike
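
For anyone wanting to repeat that experiment, one way to flip an already-running
thread to SCHED_RR from userspace is the standard sched_setscheduler() call; the pid
and priority below are just placeholders (point it at the gl thread's pid).

/* Put a running task into SCHED_RR.  Needs root; "chrt -r -p 1 <pid>" does
 * roughly the same thing from the shell. */
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

int main(int argc, char **argv)
{
    struct sched_param sp = { .sched_priority = 1 };   /* lowest RT priority */
    pid_t pid = argc > 1 ? (pid_t)atoi(argv[1]) : 0;   /* 0 = calling process */

    if (sched_setscheduler(pid, SCHED_RR, &sp) == -1) {
        perror("sched_setscheduler");
        return 1;
    }
    printf("pid %d is now SCHED_RR\n", (int)pid);
    return 0;
}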
Quoting Ingo Molnar <[email protected]>:
>
> On Mon, 28 Jul 2003, Con Kolivas wrote:
>
> > On Sun, 27 Jul 2003 23:40, Ingo Molnar wrote:
> > > - further increase timeslice granularity
> >
> > For a while now I've been running a 1000Hz 2.4 O(1) kernel tree that
> > uses timeslice granularity set to MIN_TIMESLICE which has stark
> > smoothness improvements in X. I've avoided promoting this idea because
> > of the theoretical drop in throughput this might cause. I've not been
> > able to see any detriment in my basic testing of this small granularity,
> > so I was curious to see what you throught was a reasonable lower limit?
>
> it's a hard question. The 25 msecs in -G6 is probably too low.
Just another thought on that: make sure they don't get requeued only to start
again with just 2 ticks left - which would happen to all nice-0 tasks running
their full timeslice. Here is what I'm doing in O10:
+ } else if (!((task_timeslice(p) - p->time_slice) %
+ TIMESLICE_GRANULARITY) && (p->time_slice > MIN_TIMESLICE) &&
+ (p->array == rq->active)) {
Con
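
The arithmetic behind the "2 ticks left" remark, with illustrative values (a ~102 ms
nice-0 timeslice and 25 ms granularity):

/* Why a nice-0 task running its full timeslice would otherwise be requeued
 * one last time with almost nothing left - hence the extra check against
 * MIN_TIMESLICE in the O10 fragment above. */
#include <stdio.h>

#define HZ                     1000
#define DEF_TIMESLICE          (102 * HZ / 1000)   /* ~102 ms for nice 0 */
#define TIMESLICE_GRANULARITY  (25 * HZ / 1000)
#define MIN_TIMESLICE          (10 * HZ / 1000)

int main(void)
{
    int leftover = DEF_TIMESLICE % TIMESLICE_GRANULARITY;

    printf("ticks left after the last granularity requeue: %d\n", leftover);
    printf("worth requeueing for that? %s\n",
           leftover > MIN_TIMESLICE ? "yes" : "no (skip the requeue)");
    return 0;
}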
On Sunday, 27 July 2003, at 15:40:42 +0200,
Ingo Molnar wrote:
> my latest scheduler patchset can be found at:
>
> redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G6
>
> this version takes a shot at more scheduling fairness - i'd be interested
> how it works out for others.
>
There are a couple of now-famous interactivity tests where -G6 succeeds and
2.6.0-test2 fails.
First, with -G6 I can't make XMMS MP3s skip simply by scrolling a web
page loaded in Mozilla 1.4, no matter how hard I try.
Second, moving a window like mad (with "show window contents while moving" set
to ON) won't freeze X, while with 2.6.0-test2 this could be achieved within
several seconds of moving the window.
Still, OpenOffice is dog slow when anything else is getting CPU cycles,
but Andrew Morton pointed out on another thread that this seems to be a
problem with OpenOffice, not with the scheduler.
Regards,
--
Jose Luis Domingo Lopez
Linux Registered User #189436 Debian Linux Sid (Linux 2.6.0-test1-bk3)
No, it was an attempt to get you to explain in detail, for people to
understand, why jitter responses in X have anything to do with scheduling. Now,
a pipe IPC that makes loading go south is interesting.
> >Don't bother replying cause last thing I want to know is why.
Meaning: don't tell me about "X becomes extremely jerky"; disclose what is
underneath it that is creating the observed effects.
-a
On Mon, 28 Jul 2003, Mike Galbraith wrote:
> At 11:45 PM 7/27/2003 -0700, Andre Hedrick wrote:
>
> >On Mon, 28 Jul 2003, Mike Galbraith wrote:
> >
> > > At 09:18 PM 7/27/2003 +0200, Felipe Alfaro Solana wrote:
> > > >On Sun, 2003-07-27 at 15:40, Ingo Molnar wrote:
> > > > > my latest scheduler patchset can be found at:
> > > > >
> > > > > redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G6
> > > > >
> > > > > this version takes a shot at more scheduling fairness - i'd be
> > interested
> > > > > how it works out for others.
> > > >
> > > >This -G6 patch is fantastic, even without nicing the X server. I didn't
> > > >even need to tweak any kernel scheduler knob to adjust for maximum
> > > >smoothness on my desktop. Response times are impressive, even under
> > > >heavy load. Great!
> > >
> > > Can you try the following please?
> > >
> > > This one I just noticed:
> > > 1. start top.
> > > 2. start dd if=/dev/zero | dd of=/dev/null
> > > 3. wiggle a window very briefly.
> > > Here, X becomes extremely jerky, and I think this is due to two
> > > things. One, X uses it's sleep_avg very quickly, and expires. Two, the
> > > piped dd now is highly interactive due to the ns resolution clock (uhoh).
> >
> >What kind of LAME test is this? If "X becomes extremely jerky" ?
>
> Huh? The point is that piped cpu hogs just became high priority.
>
> >Sheesh, somebody come up with a build class solution.
> >
> >CONFIG_SERVER
> >CONFIG_WORKSTATION
> >CONGIG_IAMAGEEKWHOPLAYSGAMES
> >CONFIG_GENERIC_LAMER
> >
> >Determining quality of the scheduler based on how a mouse responds is ...
> >
> >Sorry but this is just laughable, emperical subjective determination
> >based on a random hardware combinations for QA/QC for a test?
> >
> >Don't bother replying cause last thing I want to know is why.
>
> Oh, I see, you just felt like doing some mindless flaming.
>
> -Mike
>
On Mon, 28 Jul 2003, Ingo Molnar wrote:
>
> On Mon, 28 Jul 2003, Con Kolivas wrote:
>
> > On Sun, 27 Jul 2003 23:40, Ingo Molnar wrote:
> > > - further increase timeslice granularity
> >
> > For a while now I've been running a 1000Hz 2.4 O(1) kernel tree that
> > uses timeslice granularity set to MIN_TIMESLICE which has stark
> > smoothness improvements in X. I've avoided promoting this idea because
> > of the theoretical drop in throughput this might cause. I've not been
> > able to see any detriment in my basic testing of this small granularity,
> > so I was curious to see what you throught was a reasonable lower limit?
>
> it's a hard question. The 25 msecs in -G6 is probably too low.
It would seem to me that the lower limit for a given CPU is a function of
CPU speed and cache size. One reason for longer slices is to preserve the
cache, but the real time to get good use from the cache is not a constant,
and you just can't pick any one number which won't be too short on a slow
cpu or unproductively long on a fast CPU. Hyperthreading shrinks the
effective cache size as well, but certainly not by 2:1 or anything nice.
Perhaps this should be a tunable, set by a bit of hardware discovery at
boot and diddled at your own risk. It is surely one factor in why people can't
agree on HZ and the like to get the best results.
--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
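
Purely as an illustration of Bill's suggestion - nothing like this exists in the
patches under discussion - a hypothetical boot-time heuristic might derive a minimum
useful timeslice from cache size and memory bandwidth and leave it tunable afterwards.
The refill model and constants below are made up.

/* Hypothetical boot-time heuristic: pick a minimum useful timeslice from a
 * crude "time to refill the cache" estimate, clamped to sane bounds. */
#include <stdio.h>

/* Very rough time (ms) to refill a cache of cache_kb kilobytes on a machine
 * that streams about mem_bw_mb_s megabytes per second from main memory. */
static unsigned int min_slice_ms(unsigned int cache_kb, unsigned int mem_bw_mb_s)
{
    unsigned int refill_ms = cache_kb / mem_bw_mb_s;
    unsigned int slice;

    if (refill_ms == 0)
        refill_ms = 1;
    slice = 4 * refill_ms;          /* aim for several cache refills per slice */

    if (slice < 5)
        slice = 5;                  /* floor: 5 ms     */
    if (slice > 100)
        slice = 100;                /* ceiling: 100 ms */
    return slice;
}

int main(void)
{
    printf("old box, 512 KB cache, ~100 MB/s : %u ms\n", min_slice_ms(512, 100));
    printf("P4 2.53, 512 KB cache, ~3 GB/s   : %u ms\n", min_slice_ms(512, 3000));
    return 0;
}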
On Tue, 29 Jul 2003 07:38, Bill Davidsen wrote:
> On Mon, 28 Jul 2003, Ingo Molnar wrote:
> > On Mon, 28 Jul 2003, Con Kolivas wrote:
> > > On Sun, 27 Jul 2003 23:40, Ingo Molnar wrote:
> > > > - further increase timeslice granularity
> > >
> > > For a while now I've been running a 1000Hz 2.4 O(1) kernel tree that
> > > uses timeslice granularity set to MIN_TIMESLICE which has stark
> > > smoothness improvements in X. I've avoided promoting this idea because
> > > of the theoretical drop in throughput this might cause. I've not been
> > > able to see any detriment in my basic testing of this small
> > > granularity, so I was curious to see what you throught was a reasonable
> > > lower limit?
> >
> > it's a hard question. The 25 msecs in -G6 is probably too low.
>
> It would seem to me that the lower limit for a given CPU is a function of
> CPU speed and cache size. One reason for longer slices is to preserve the
> cache, but the real time to get good use from the cache is not a constant,
> and you just can't pick any one number which won't be too short on a slow
> cpu or unproductively long on a fast CPU. Hyperthreading shrinks the
> effective cache size as well, but certainly not by 2:1 or anything nice.
>
> Perhaps this should be a tunable set by a bit of hardware discovery at
> boot and diddled at your own risk. Sure one factor in why people can't
> agree on HZ and all to get best results.
Agreed, and no doubt the smaller the timeslice the worse it is. I did a little
experimenting with my P4 2.53 here and found that basically no matter how
much longer the timeslice was, there was continued benefit. However, the
benefit was diminishing the higher you got. If you graphed it out, it was a
nasty exponential curve up to 7ms, then there was a knee in the curve and
it was virtually linear from that point on, with only tiny improvements. A P3
933 behaved surprisingly similarly. That's why on 2.4.21-ck3 it was running
with timeslice_granularity set to 10ms. However, the round robin isn't as bad
as pure timeslice limiting because, if they're still on the active array, I am
led to believe there is less cache thrashing.
There was no answer in that, but I just thought I'd add what I know so far.
Con
At 10:33 AM 7/28/2003 -0700, Andre Hedrick wrote:
>No, it was an attempt to get you to explain in detail for people to
>understand why jitter responses in X have anything with scheduling. Now
>the a pipe ipc that makings loading go south is interesting.
I was going to ignore this, but ok, brief explanation follows. Do me a
favor though, if you reply to this, please do so in straightforward English...
> > >Don't bother replying cause last thing I want to know is why.
>
>Means, don't tell me about "X becomes extremely jerky", disclose that is
>below creating the observed effects.
... you didn't have to goad me into explaining why I tested the way I did,
a simple question would have sufficed.
Now, why would I test Ingo's scheduler changes by wiggling a window while
some other load is running? Simple. I saw the MAX_SLEEP_AVG change, and
knew that this meant that a pure cpu burner, such as X is while you're
moving a window, will expire in less than half a second. That's not very
much time for someone to drag a window or whatever before their decidedly
important interactive task expires. Why is expiration time
important? Because of the highly variable amount of time it takes to
return the expired task to the active array...
If there is no other load present, there will be an instantaneous array
switch, and you won't even notice that X has expired. If there is a
non-interactive cpu burner with a full timeslice present in the active
array at expiration time, you will probably notice that you took a 102ms
latency hit. If there are several present, you will very definitely
notice. If, as in the first case I reported, and asked other testers to
verify, there happens to be a couple of cpu hogs present which the
scheduler is classifying as interactive because they are sleeping slightly
more than they are running, and feeding each other sleep_avg via wakeups,
your time to return to the active array becomes STARVATION_LIMIT unless all
active tasks happen to sleep at the same time, because these tasks will
never decay in priority, the active array will never become empty, so there
will never be an array switch until one is forced by the starvation
timeout. Not only will it take 1 second for X to return to the active
array, when it arrives, the cpu hogs will be above X's priority, and X will
only receive cpu if both hogs happen to sleep at the same time. X can
only recover its interactive priority if it can get enough cpu to
accomplish the work it was awakened to do, and go back to sleep. The
turn-around time/recovery time is what I was testing, knowing full well
that there is a price to pay for increased fairness, and knowing from
experience that the cost, especially when fairness is achieved via increased
array-switch frequency, can be quite high. What I reported was two repeatable
cases where X suffers from starvation due to the mechanics of the
scheduler, and the increased fairness changes.
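
A back-of-the-envelope model of the three cases Mike walks through above
(illustrative userspace arithmetic, not scheduler code): the expired X waits for the
active array to drain, and when interactive-classified hogs keep the active array
from ever draining, only the starvation timeout forces the switch.

/* Roughly how long an expired task waits for the array switch in the three
 * situations described above.  Numbers are illustrative. */
#include <stdio.h>

#define HZ                1000
#define STARVATION_LIMIT  HZ     /* forced array switch after ~1 second */

/* How long until the arrays switch and an expired task can run again? */
static unsigned long time_to_array_switch(const unsigned long *remaining_ms,
                                          int ntasks, int active_never_drains)
{
    unsigned long total = 0;

    if (active_never_drains)
        return STARVATION_LIMIT;     /* only the starvation timeout helps */

    for (int i = 0; i < ntasks; i++)
        total += remaining_ms[i];    /* wait for the active array to drain */
    return total;
}

int main(void)
{
    unsigned long one_hog[]  = { 102 };        /* one full default timeslice */
    unsigned long two_hogs[] = { 102, 102 };   /* e.g. the two cc1's         */

    printf("idle box                    : ~0 ms\n");
    printf("one cpu burner in the way   : ~%lu ms\n",
           time_to_array_switch(one_hog, 1, 0));
    printf("interactive-classified hogs : ~%lu ms\n",
           time_to_array_switch(two_hogs, 2, 1));
    return 0;
}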
"Jitter responses in X" has everything in the world to do with the
scheduler. "jitter responses in X" is a direct output of the
scheduler. The decidedly bad reaction to the "jitter test" was a direct
result of the changes I was testing, and therefore reportable.
Clear now?
-Mike
On Tue, 29 Jul 2003, Con Kolivas wrote:
> On Tue, 29 Jul 2003 07:38, Bill Davidsen wrote:
> > It would seem to me that the lower limit for a given CPU is a function of
> > CPU speed and cache size. One reason for longer slices is to preserve the
> > cache, but the real time to get good use from the cache is not a constant,
> > and you just can't pick any one number which won't be too short on a slow
> > cpu or unproductively long on a fast CPU. Hyperthreading shrinks the
> > effective cache size as well, but certainly not by 2:1 or anything nice.
> >
> > Perhaps this should be a tunable set by a bit of hardware discovery at
> > boot and diddled at your own risk. Sure one factor in why people can't
> > agree on HZ and all to get best results.
>
> Agreed, and no doubt the smaller the timeslice the worse it is. I did a little
> experimenting with my P4 2.53 here and found that basically no matter how
> much longer the timeslice was there was continued benefit. However the
> benefit was diminishing the higher you got. If you graphed it out it was a
> nasty exponential curve up to 7ms and then there was a knee in the curve and
> it was virtually linear from that point on with only tiny improvements. A p3
> 933 behaved surprisingly similarly. That's why on 2.4.21-ck3 it was running
> with timeslice_granularity set to 10ms. However the round robin isn't as bad
> as pure timeslice limiting because if they're still on the active array I am
> led to believe there is less cache trashing.
>
> There was no answer in that but just thought I'd add what I know so far.
I think your agreement that one size doesn't fit all at least indicates
that hardware performance does enter into the settings. I'm disappointed
that you see no nice sharp "best value," but real data is better than
theory.
--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
Ingo Molnar said:
>
> could you give -G7 a try:
>
> redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G7
>
For me it's worse than G2, but still good.
With a vanilla kernel patched with your patch (G7) on top of 2.6.0-test2,
I can watch a movie while doing make -j 5 bzImage in the background.
With Con Kolivas' O10 (on top of 2.6.0-test2-mm1) I can't watch a
movie while doing make bzImage.
The setup is the same: a minimal window manager (fvwm),
a couple of processes sleeping, and only make and mplayer running.
>
> Ingo
Calin
--
# fortune
fortune: write error on /dev/null --- please empty the bit bucket
On Monday 28 July 2003 18:00, Con Kolivas wrote:
> Agreed, and no doubt the smaller the timeslice the worse it is. I did a
> little experimenting with my P4 2.53 here and found that basically no
> matter how much longer the timeslice was there was continued benefit.
> However the benefit was diminishing the higher you got. If you graphed it
> out it was a nasty exponential curve up to 7ms and then there was a knee in
> the curve and it was virtually linear from that point on with only tiny
> improvements. A p3 933 behaved surprisingly similarly. That's why on
> 2.4.21-ck3 it was running with timeslice_granularity set to 10ms. However
> the round robin isn't as bad as pure timeslice limiting because if they're
> still on the active array I am led to believe there is less cache trashing.
>
> There was no answer in that but just thought I'd add what I know so far.
>
> Con
Fun.
Have you read the excellent DRAM series on ars technica?
http://www.arstechnica.com/paedia/r/ram_guide/ram_guide.part1-2.html
http://www.arstechnica.com/paedia/r/ram_guide/ram_guide.part2-1.html
http://www.arstechnica.com/paedia/r/ram_guide/ram_guide.part3-1.html
Sounds like you're thwacking into memory latency and bank switching and such.
(Yes, you can thrash DRAM. It's not as noticeable as with disk, but it can
be done.)
The memory bus speed will affect this a little bit, but it's not going to do
too much for request turnaround time except make it proportionally even worse.
Same for DDR. :)
Rob