2003-08-08 15:44:16

by Con Kolivas

Subject: [PATCH]O14int

More duck tape interactivity tweaks

Changes

Put some bounds on the interactive_credit and specified a size beyond which a
task is considered highly interactive or not.
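
A minimal sketch of the bounding, assuming only what the hunks below show:
both the increment and the decrement of interactive_credit are gated on
VARYING_CREDIT, so the counter stops moving once it passes either bound.
The value of MAX_SLEEP_AVG here is a hypothetical stand-in, and the helper
names are illustrative, not kernel functions.

```c
#include <assert.h>

/* Hypothetical stand-in; the kernel's MAX_SLEEP_AVG depends on HZ and tunables. */
#define MAX_SLEEP_AVG 1000L

struct task {
	long interactive_credit;
};

static int high_credit(const struct task *p)
{
	return p->interactive_credit > MAX_SLEEP_AVG;
}

static int low_credit(const struct task *p)
{
	return p->interactive_credit < -MAX_SLEEP_AVG;
}

static int varying_credit(const struct task *p)
{
	return !(high_credit(p) || low_credit(p));
}

/* Mirrors "p->interactive_credit += VARYING_CREDIT(p)" from the patch. */
static void credit_up(struct task *p)
{
	p->interactive_credit += varying_credit(p);
}

/* Mirrors "if (... && VARYING_CREDIT(p)) p->interactive_credit--". */
static void credit_down(struct task *p)
{
	if (varying_credit(p))
		p->interactive_credit--;
}

/* Apply n credit changes in one direction and return the final credit. */
static long run_credit(struct task *p, long n, int up)
{
	assert(n >= 0);
	while (n-- > 0) {
		if (up)
			credit_up(p);
		else
			credit_down(p);
	}
	return p->interactive_credit;
}
```

In this sketch a task that keeps earning credit settles at
MAX_SLEEP_AVG + 1 and stays classified as high credit afterwards; the
mirror case settles at -(MAX_SLEEP_AVG + 1).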

Moved the uninterruptible sleep limiting to within recalc_task_prio removing
one call of the sched_clock() and being more careful about the limits.

Wli pointed out an error in the nanosecond to jiffy conversion which may have
been causing too easy to migrate tasks on smp (? performance change).
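
A rough illustration of the conversion problem, assuming NS_TO_JIFFIES
divides by the number of nanoseconds per jiffy (1e9 / HZ). The old
CAN_MIGRATE_TASK compared (ns >> 10), which is roughly microseconds,
against cache_decay_ticks, which is in jiffies. The function names and the
decay value used below are hypothetical.

```c
#include <assert.h>

#define NSEC_PER_SEC 1000000000ULL

/* Assumed form of the kernel macro: nanoseconds divided by ns-per-jiffy. */
static unsigned long long ns_to_jiffies(unsigned long long ns, unsigned int hz)
{
	assert(hz > 0 && hz <= NSEC_PER_SEC);
	return ns / (NSEC_PER_SEC / hz);
}

/* Old test: a shift by 10 divides by 1024, giving about microseconds. */
static int cache_cold_old(unsigned long long age_ns, unsigned long decay_ticks)
{
	return (age_ns >> 10) > decay_ticks;
}

/* New test: the age is converted to jiffies before the comparison. */
static int cache_cold_new(unsigned long long age_ns, unsigned int hz,
			  unsigned long decay_ticks)
{
	return ns_to_jiffies(age_ns, hz) > decay_ticks;
}
```

With HZ=1000 and a hypothetical cache_decay_ticks of 10, a task that last
ran 50 microseconds ago gives 50000 >> 10 = 48, so the old test treats it
as cache-cold and migratable; the jiffy conversion gives 0, so the new
test still treats it as cache-hot.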

Put greater limitation on the requeuing code; now it only requeues interactive
tasks thereby letting cpu hogs run their full timeslice unabated which should
improve cpu intensive task performance.
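
The requeue test can be sketched as a plain predicate. This is a
simplified model of the condition in the scheduler_tick() hunk below: the
struct fields are stand-ins for p->mm, TASK_INTERACTIVE(p),
task_timeslice(p), p->time_slice and p->array == rq->active, and the
numbers in the usage note are hypothetical.

```c
#include <assert.h>

struct task {
	int has_mm;          /* user task, stands in for p->mm != NULL */
	int interactive;     /* stands in for TASK_INTERACTIVE(p) */
	int timeslice;       /* full slice in ticks, task_timeslice(p) */
	int time_slice;      /* ticks remaining in the current slice */
	int on_active_array; /* stands in for p->array == rq->active */
};

/* Returns nonzero when the task would be moved to the end of its priority list. */
static int should_requeue(const struct task *p, int granularity,
			  int min_timeslice)
{
	int used = p->timeslice - p->time_slice;

	assert(granularity > 0);
	return p->has_mm && p->interactive &&
	       (used % granularity) == 0 &&
	       p->time_slice > min_timeslice &&
	       p->on_active_array;
}
```

For example, with a granularity of 25 ticks and a minimum of 10, an
interactive user task that has used 50 of its 100 ticks is requeued, while
a non-interactive task in the same position keeps running.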

Made highly interactive tasks earn all their waiting time on the runqueue
during requeuing as sleep_avg.

Patch applies onto 2.6.0-test2-mm5

http://kernel.kolivas.org/2.5
has the patch and a full patch against 2.6.0-test2

Patch O13.1-O14int:

--- linux-2.6.0-test2-mm5-O13.1/kernel/sched.c 2003-08-08 22:47:08.000000000 +1000
+++ linux-2.6.0-test2-mm5/kernel/sched.c 2003-08-09 01:27:47.000000000 +1000
@@ -130,6 +130,15 @@
#define JUST_INTERACTIVE_SLEEP(p) \
(MAX_SLEEP_AVG - (DELTA(p) * AVG_TIMESLICE))

+#define HIGH_CREDIT(p) \
+ ((p)->interactive_credit > MAX_SLEEP_AVG)
+
+#define LOW_CREDIT(p) \
+ ((p)->interactive_credit < -MAX_SLEEP_AVG)
+
+#define VARYING_CREDIT(p) \
+ (!(HIGH_CREDIT(p) || LOW_CREDIT(p)))
+
#define TASK_PREEMPTS_CURR(p, rq) \
((p)->prio < (rq)->curr->prio || \
((p)->prio == (rq)->curr->prio && \
@@ -364,7 +373,7 @@ static void recalc_task_prio(task_t *p,
unsigned long long __sleep_time = now - p->timestamp;
unsigned long sleep_time;

- if (!p->sleep_avg)
+ if (!p->sleep_avg && VARYING_CREDIT(p))
p->interactive_credit--;

if (__sleep_time > NS_MAX_SLEEP_AVG)
@@ -373,7 +382,6 @@ static void recalc_task_prio(task_t *p,
sleep_time = (unsigned long)__sleep_time;

if (likely(sleep_time > 0)) {
-
/*
* User tasks that sleep a long time are categorised as
* idle and will get just interactive status to stay active &
@@ -387,20 +395,38 @@ static void recalc_task_prio(task_t *p,
else {
/*
* The lower the sleep avg a task has the more
- * rapidly it will rise with sleep time. Tasks
- * without interactive_credit are limited to
- * one timeslice worth of sleep avg bonus.
+ * rapidly it will rise with sleep time.
*/
sleep_time *= (MAX_BONUS + 1 -
(NS_TO_JIFFIES(p->sleep_avg) *
MAX_BONUS / MAX_SLEEP_AVG));

- if (p->interactive_credit < 0 &&
+ /*
+ * Tasks with low interactive_credit are limited to
+ * one timeslice worth of sleep avg bonus.
+ */
+ if (LOW_CREDIT(p) &&
sleep_time > JIFFIES_TO_NS(task_timeslice(p)))
sleep_time =
JIFFIES_TO_NS(task_timeslice(p));

/*
+ * Non high_credit tasks waking from uninterruptible
+ * sleep are limited in their sleep_avg rise
+ */
+ if (!HIGH_CREDIT(p) && p->activated == -1){
+ if (p->sleep_avg >=
+ JIFFIES_TO_NS(JUST_INTERACTIVE_SLEEP(p)))
+ sleep_time = 0;
+ else if (p->sleep_avg + sleep_time >=
+ JIFFIES_TO_NS(JUST_INTERACTIVE_SLEEP(p))){
+ p->sleep_avg =
+ JIFFIES_TO_NS(JUST_INTERACTIVE_SLEEP(p));
+ sleep_time = 0;
+ }
+ }
+
+ /*
* This code gives a bonus to interactive tasks.
*
* The boost works by updating the 'average sleep time'
@@ -418,7 +444,7 @@ static void recalc_task_prio(task_t *p,
*/
if (p->sleep_avg > NS_MAX_SLEEP_AVG){
p->sleep_avg = NS_MAX_SLEEP_AVG;
- p->interactive_credit++;
+ p->interactive_credit += VARYING_CREDIT(p);
}
}
}
@@ -588,11 +614,7 @@ repeat_lock_task:
* Tasks on involuntary sleep don't earn
* sleep_avg beyond just interactive state.
*/
- if (NS_TO_JIFFIES(p->sleep_avg) >=
- JUST_INTERACTIVE_SLEEP(p)){
- p->timestamp = sched_clock();
- p->activated = -1;
- }
+ p->activated = -1;
}
if (sync)
__activate_task(p, rq);
@@ -1156,9 +1178,9 @@ skip_queue:
* 3) are cache-hot on their current CPU.
*/

-#define CAN_MIGRATE_TASK(p,rq,this_cpu) \
- ((!idle || (((now - (p)->timestamp)>>10) > cache_decay_ticks)) &&\
- !task_running(rq, p) && \
+#define CAN_MIGRATE_TASK(p,rq,this_cpu) \
+ ((!idle || (NS_TO_JIFFIES(now - (p)->timestamp) > \
+ cache_decay_ticks)) && !task_running(rq, p) && \
cpu_isset(this_cpu, (p)->cpus_allowed))

curr = curr->prev;
@@ -1366,13 +1388,23 @@ void scheduler_tick(int user_ticks, int
* requeue this task to the end of the list on this priority
* level, which is in essence a round-robin of tasks with
* equal priority.
+ *
+ * This only applies to user tasks in the interactive
+ * delta range with at least MIN_TIMESLICE left.
*/
- if (p->mm && !((task_timeslice(p) - p->time_slice) %
- TIMESLICE_GRANULARITY) && (p->time_slice > MIN_TIMESLICE) &&
+ if (p->mm && TASK_INTERACTIVE(p) && !((task_timeslice(p) -
+ p->time_slice) % TIMESLICE_GRANULARITY) &&
+ (p->time_slice > MIN_TIMESLICE) &&
(p->array == rq->active)) {

dequeue_task(p, rq->active);
set_tsk_need_resched(p);
+ /*
+ * Tasks with interactive credit get all their
+ * time waiting on the run queue credited as sleep
+ */
+ if (HIGH_CREDIT(p))
+ p->activated = 2;
enqueue_task(p, rq->active);
}
}
@@ -1426,7 +1458,7 @@ need_resched:
* as their sleep_avg decreases to slow them losing their
* priority bonus
*/
- if (prev->interactive_credit > 0)
+ if (HIGH_CREDIT(prev))
run_time /= (MAX_BONUS + 1 -
(NS_TO_JIFFIES(prev->sleep_avg) * MAX_BONUS /
MAX_SLEEP_AVG));
@@ -1484,12 +1516,12 @@ pick_next_task:
if (next->activated == 1)
delta = delta * (ON_RUNQUEUE_WEIGHT * 128 / 100) / 128;

- next->activated = 0;
array = next->array;
dequeue_task(next, array);
recalc_task_prio(next, next->timestamp + delta);
enqueue_task(next, array);
}
+ next->activated = 0;
switch_tasks:
prefetch(next);
clear_tsk_need_resched(prev);


2003-08-08 17:43:51

by Timothy Miller

Subject: Re: [PATCH]O14int



Con Kolivas wrote:

> Made highly interactive tasks earn all their waiting time on the runqueue
> during requeuing as sleep_avg.


There are some mechanics of this that I am not familiar with, so please
excuse the naive question.

Someone had suggested that a task's sleep time should be determined
exclusively from the time it spends blocked by what it's waiting on, and
not based on any OTHER time it sleeps. That is, the time between the
I/O request being satisfied and the task actually getting the CPU
doesn't count.

Is your statement above a reflection of that suggestion?


2003-08-08 19:32:01

by Felipe Alfaro Solana

Subject: Re: [PATCH]O14int

On Fri, 2003-08-08 at 17:49, Con Kolivas wrote:
> More duck tape interactivity tweaks
>
> Changes
>
> Put some bounds on the interactive_credit and specified a size beyond which a
> task is considered highly interactive or not.
>
> Moved the uninterruptible sleep limiting to within recalc_task_prio removing
> one call of the sched_clock() and being more careful about the limits.
>
> Wli pointed out an error in the nanosecond to jiffy conversion which may have
> been causing too easy to migrate tasks on smp (? performance change).
>
> Put greater limitation on the requeuing code; now it only requeues interactive
> tasks thereby letting cpu hogs run their full timeslice unabated which should
> improve cpu intensive task performance.
>
> Made highly interactive tasks earn all their waiting time on the runqueue
> during requeuing as sleep_avg.

I have been testing this jewel on top of 2.6.0-test2-mm5 for a couple of
hours. It behaves much like O13*int, well maybe a little bit better:
Renicing X to -20 is a total disaster (sluggish window movement, Juk
skipping like mad, etc), but with X at +0 the system feels pretty good.

Evolution still feels like a mammoth when moving windows over it. XMMS
doesn't skip, and neither does Juk when X is at +0. This is good :-)
All in all, this one feels good!

PS: May I say O10int is still a little bit smoother than this one? ;-)

2003-08-08 20:06:50

by Voluspa

Subject: Re: [PATCH]O14int


On 2003-08-08 15:49:25 Con Kolivas wrote:

> More duck tape interactivity tweaks

Do you have a premonition... Game-test goes down in flames. Volatile to
the extent where I can't catch head or tail. It can behave like in
A3-O12.2 or as an unpatched 2.6.0-test2. Trigger badness by switching to
a text console. Sometimes it recovers, sometimes not. Sometimes fast,
sometimes slowly (when it does recover).

I'll withdraw under my rock now. Won't come forth until everything
smells of roses. Getting stressed by being a bringer of bad news only.
Please speak up, all you other testers. Divide the burden. Even out the
scores.

Greetings,
Mats Johannesson

2003-08-09 00:31:01

by Con Kolivas

Subject: Re: [PATCH]O14int

On Sat, 9 Aug 2003 06:08, Voluspa wrote:
> On 2003-08-08 15:49:25 Con Kolivas wrote:
> > More duck tape interactivity tweaks
>
> Do you have a premonition... Game-test goes down in flames. Volatile to
> the extent where I can't catch head or tail. It can behave like in
> A3-O12.2 or as an unpatched 2.6.0-test2. Trigger badness by switching to
> a text console.

Ah. There's the answer. You've totally changed the behaviour of the
application in question by moving to the text console. No longer is it the
sizable cpu hog that it is when it's in the foreground on X, so you've
totally changed its behaviour and how it is treated.

> Sometimes it recovers, sometimes not. Sometimes fast,
> sometimes slowly (when it does recover).

Depends on whether the scheduler has decided firmly "you're interactive or
not".

Your question of course is can this be changed? Well of course everything
_can_ be... It may be simple tuning. In the meantime the answer is don't
switch to the text console. (Doc it hurts when I do this... Well don't do
that). Might be useful for you to see how long it has run when it recovers,
and how long when it no longer recovers.

> I'll withdraw under my rock now. Won't come forth until everything
> smells of roses. Getting stressed by being a bringer of bad news only.
> Please speak up, all you other testers. Divide the burden. Even out the
> scores.

Wine, women and song^H^H^H games and scheduling are not a good mix. It's not
your fault. Please do not hold back any reports.

Con

2003-08-09 00:39:43

by Con Kolivas

Subject: Re: [PATCH]O14int

On Sat, 9 Aug 2003 03:57, Timothy Miller wrote:
> Con Kolivas wrote:
> > Made highly interactive tasks earn all their waiting time on the runqueue
> > during requeuing as sleep_avg.
>
> There are some mechanics of this that I am not familiar with, so please
> excuse the naive question.
>
> Someone had suggested that a task's sleep time should be determined
> exclusively from the time it spends blocked by what it's waiting on, and
> not based on any OTHER time it sleeps. That is, the time between the
> I/O request being satisfied and the task actually getting the CPU
> doesn't count.
>
> Is your statement above a reflection of that suggestion?

I have already been doing this in previous iterations to the extent that it
was capable of being done. There's only so much information feeding into the
scheduler at the moment. All the dependent subsections would have to have
more code to have specific feedback to the scheduler.

Con

2003-08-09 08:58:56

by Con Kolivas

Subject: Re: [PATCH]O14int

On Sat, 9 Aug 2003 01:49, Con Kolivas wrote:
> More duck tape interactivity tweaks

s/duck/duct

> Wli pointed out an error in the nanosecond to jiffy conversion which may
> have been causing too easy to migrate tasks on smp (? performance change).

Looks like I broke SMP build with this. Will fix soon; don't bother trying
this on SMP yet.

Con

2003-08-10 08:48:43

by Simon Kirby

Subject: Re: [PATCH]O14int

On Sat, Aug 09, 2003 at 10:36:17AM +1000, Con Kolivas wrote:

> On Sat, 9 Aug 2003 06:08, Voluspa wrote:
> > On 2003-08-08 15:49:25 Con Kolivas wrote:
> > > More duck tape interactivity tweaks
> >
> > Do you have a premonition... Game-test goes down in flames. Volatile to
> > the extent where I can't catch head or tail. It can behave like in
> > A3-O12.2 or as an unpatched 2.6.0-test2. Trigger badness by switching to
> > a text console.
>
> Ah. There's the answer. You've totally changed the behaviour of the
> application in question by moving to the text console. No longer is it the
> sizable cpu hog that it is when it's in the foreground on X, so you've
> totally changed its behaviour and how it is treated.

I haven't been following this as closely as I would have liked to
(recent vacation and all), but I am definitely seeing issues with the
recent 2.5.x, 2.6.x-testx scheduler code and have been looking over
these threads.

I don't really understand why these changes were made at all to the
scheduler. As I understand it, the 2.2.x and older 2.4.x scheduler was
simple in that it allowed any process to wake up if it had available
ticks, and would switch to that process if any new event occurred and
woke it up. The rest was just limiting the ticks based on nice value
and remembering to switch when the ticks run out.

It seems that newer schedulers are now temporarily postponing the
waking up of other processes when the running process is running with
"preemptive" ticks, and that there's all sorts of hacks involved in
trying to hide the bad effects of this decision.

If this is indeed what is going on, what is the reasoning behind it?
I didn't really see any problems before with the simple scheduler, so
it seems to me like this may just be a hack to make poorly-written
applications seem to be a bit "faster" by starving other processes of
CPU when the poorly-written applications decide they want to do
something (such as rendering a page with a large table in Mozilla
-- grr). Is this really making a large enough difference to be worth
all of this trouble?

To me it would seem the best algorithm would be what we had before all
of this started. Isn't it best to switch to a task as soon as an event
(such as disk I/O finishing or a mouse move waking up X to read mouse
input) occurs for both latency and cache reasons (queued in LIFO
order)? DMA may make some of this more complicated, I don't know.

I am seeing similar starvation problems that others are seeing in these
threads. At first it was whenever I clicked a link in Mozilla -- xmms
would stop, sometimes for a second or so, on a Celeron 466 MHz machine.
More recently I found that loading a web page consisting of several
large animated gif images (a security camera web page) caused
absolutely horrible jerking of mouse and keyboard input in all other
windows, even when the browser window was minimized or hidden. What's
worse is the jerking tends to subside if I do a lot of typing or move
the mouse a lot, probably because I'm changing the scheduler's idea of
what "kind" of processes are running (which makes this stuff even
harder to debug).

Simon-

2003-08-10 09:01:15

by Con Kolivas

Subject: Re: [PATCH]O14int

On Sun, 10 Aug 2003 18:48, Simon Kirby wrote:
> On Sat, Aug 09, 2003 at 10:36:17AM +1000, Con Kolivas wrote:
> > On Sat, 9 Aug 2003 06:08, Voluspa wrote:
> > > On 2003-08-08 15:49:25 Con Kolivas wrote:
> > > > More duck tape interactivity tweaks
> > >
> > > Do you have a premonition... Game-test goes down in flames. Volatile to
> > > the extent where I can't catch head or tail. It can behave like in
> > > A3-O12.2 or as an unpatched 2.6.0-test2. Trigger badness by switching
> > > to a text console.
> >
> > Ah. There's the answer. You've totally changed the behaviour of the
> > application in question by moving to the text console. No longer is it
> > the sizable cpu hog that it is when it's in the foreground on X, so
> > you've totally changed its behaviour and how it is treated.
>
> I haven't been following this as closely as I would have liked to
> (recent vacation and all), but I am definitely seeing issues with the
> recent 2.5.x, 2.6.x-testx scheduler code and have been looking over
> these threads.
>
> I don't really understand why these changes were made at all to the
> scheduler. As I understand it, the 2.2.x and older 2.4.x scheduler was
> simple in that it allowed any process to wake up if it had available
> ticks, and would switch to that process if any new event occurred and
> woke it up. The rest was just limiting the ticks based on nice value
> and remembering to switch when the ticks run out.
>
> It seems that newer schedulers are now temporarily postponing the
> waking up of other processes when the running process is running with
> "preemptive" ticks, and that there's all sorts of hacks involved in
> trying to hide the bad effects of this decision.
>
> If this is indeed what is going on, what is the reasoning behind it?
> I didn't really see any problems before with the simple scheduler, so
> it seems to me like this may just be a hack to make poorly-written
> applications seem to be a bit "faster" by starving other processes of
> CPU when the poorly-written applications decide they want to do
> something (such as rendering a page with a large table in Mozilla
> -- grr). Is this really making a large enough difference to be worth
> all of this trouble?
>
> To me it would seem the best algorithm would be what we had before all
> of this started. Isn't it best to switch to a task as soon as an event
> (such as disk I/O finishing or a mouse move waking up X to read mouse
> input) occurs for both latency and cache reasons (queued in LIFO
> order)? DMA may make some of this more complicated, I don't know.
>
> I am seeing similar starvation problems that others are seeing in these
> threads. At first it was whenever I clicked a link in Mozilla -- xmms
> would stop, sometimes for a second or so, on a Celeron 466 MHz machine.
> More recently I found that loading a web page consisting of several
> large animated gif images (a security camera web page) caused
> absolutely horrible jerking of mouse and keyboard input in all other
> windows, even when the browser window was minimized or hidden. What's
> worse is the jerking tends to subside if I do a lot of typing or move
> the mouse a lot, probably because I'm changing the scheduler's idea of
> what "kind" of processes are running (which makes this stuff even
> harder to debug).

Is this with or without my changes? The old scheduler was not very scalable;
that's why we moved. The new one has other intrinsic issues that I (and
others) have been trying to address, but is much much more scalable. It was
not possible to make the old one more scalable, but it is possible to make
this one more interactive.

Con

2003-08-10 10:07:28

by William Lee Irwin III

Subject: Re: [PATCH]O14int

On Sun, Aug 10, 2003 at 01:48:27AM -0700, Simon Kirby wrote:
> I haven't been following this as closely as I would have liked to
> (recent vacation and all), but I am definitely seeing issues with the
> recent 2.5.x, 2.6.x-testx scheduler code and have been looking over
> these threads.
> I don't really understand why these changes were made at all to the
> scheduler. As I understand it, the 2.2.x and older 2.4.x scheduler was
> simple in that it allowed any process to wake up if it had available
> ticks, and would switch to that process if any new event occurred and
> woke it up. The rest was just limiting the ticks based on nice value
> and remembering to switch when the ticks run out.

Most of this isn't of much concern; most of the 2.4.x semantics have
largely been carried over to 2.6.x with algorithmic improvements, apart
from the same-mm heuristic (which was of dubious value anyway). Even
epochs are still there in the form of the duelling arrays, which
renders the thing vaguely timeout-based like 2.4.x.


On Sun, Aug 10, 2003 at 01:48:27AM -0700, Simon Kirby wrote:
> It seems that newer schedulers are now temporarily postponing the
> waking up of other processes when the running process is running with
> "preemptive" ticks, and that there's all sorts of hacks involved in
> trying to hide the bad effects of this decision.

If this were deliberate it would be a "selfish" scheduling algorithm,
where the delay in preemptively capturing the cpu is a number of ticks
equal to whatever the value of beta/alpha was chosen to be, and some
raw scheduling algorithm is used otherwise unaltered for those tasks in
the service box. I see no evidence of such an organization (it'd be
really obvious, as a queue box and service box would need to exist),
hence this is probably just something in need of a performance tweak
if it's a real problem.


On Sun, Aug 10, 2003 at 01:48:27AM -0700, Simon Kirby wrote:
> If this is indeed what is going on, what is the reasoning behind it?
> I didn't really see any problems before with the simple scheduler, so
> it seems to me like this may just be a hack to make poorly-written
> applications seem to be a bit "faster" by starving other processes of
> CPU when the poorly-written applications decide they want to do
> something (such as rendering a page with a large table in Mozilla
> -- grr). Is this really making a large enough difference to be worth
> all of this trouble?

Yes. The SMP issues addressed by the algorithmic improvements in the
scheduler are performance issues so severe, they may safely be called
functional issues.


On Sun, Aug 10, 2003 at 01:48:27AM -0700, Simon Kirby wrote:
> To me it would seem the best algorithm would be what we had before all
> of this started. Isn't it best to switch to a task as soon as an event
> (such as disk I/O finishing or a mouse move waking up X to read mouse
> input) occurs for both latency and cache reasons (queued in LIFO
> order)? DMA may make some of this more complicated, I don't know.

This sounds like either LCFS or FB. FB's not usable out of the box for
long-running tasks, as its context switch rates are excessive there.
LCFS has some rather undesirable properties that render it unsuitable
for general purpose operating systems. Something like multilevel
processor sharing would be a much better alternative, as long-running
tasks can be classified and scheduled according to a more appropriate
discipline with a lower context switch rate while maintaining the
(essentially infinitely) strong preference for short-running tasks.


On Sun, Aug 10, 2003 at 01:48:27AM -0700, Simon Kirby wrote:
> I am seeing similar starvation problems that others are seeing in these
> threads. At first it was whenever I clicked a link in Mozilla -- xmms
> would stop, sometimes for a second or so, on a Celeron 466 MHz machine.
> More recently I found that loading a web page consisting of several
> large animated gif images (a security camera web page) caused
> absolutely horrible jerking of mouse and keyboard input in all other
> windows, even when the browser window was minimized or hidden. What's
> worse is the jerking tends to subside if I do a lot of typing or move
> the mouse a lot, probably because I'm changing the scheduler's idea of
> what "kind" of processes are running (which makes this stuff even
> harder to debug).

One problem with these kinds of reports is that they aren't coming with
enough information to determine if the scheduler truly is the cause of
the problem, and worse yet, assuming the scheduler did cause these
problems, this isn't enough actual information to address it. We're
going to need proper instrumentation at some point here.

Until then, when you deliver these reports, could you do the following:

(a) vmstat 1 | cat -n | tee -a vmstat.log

(b) run top under script

(c) regularly snapshot profiles with
	n=1
	while true
	do
		readprofile -n -m /boot/System.map-`uname -r` \
			| sort -k 2,2 > prof.$n
		n=`expr $n + 1`
		sleep 1
	done

while running interactivity tests?

(a) will give some moderately useful information about how much io is
going on and interrupt and context switch rates.

(b) will report dynamic priorities and other general conditions so the
scheduler's decisions can be examined.

(c) will determine if the issue is due to in-kernel algorithms consuming
excessive amounts of cpu and causing application-level latency
issues via cpu burn

Also, send in bootlogs (dmesg), so that general information about the
system can be communicated.


-- wli

2003-08-10 11:15:34

by Mike Galbraith

Subject: Re: [PATCH]O14int

At 01:48 AM 8/10/2003 -0700, Simon Kirby wrote:
>On Sat, Aug 09, 2003 at 10:36:17AM +1000, Con Kolivas wrote:
>
> > On Sat, 9 Aug 2003 06:08, Voluspa wrote:
> > > On 2003-08-08 15:49:25 Con Kolivas wrote:
> > > > More duck tape interactivity tweaks
> > >
> > > Do you have a premonition... Game-test goes down in flames. Volatile to
> > > the extent where I can't catch head or tail. It can behave like in
> > > A3-O12.2 or as an unpatched 2.6.0-test2. Trigger badness by switching to
> > > a text console.
> >
> > Ah. There's the answer. You've totally changed the behaviour of the
> > application in question by moving to the text console. No longer is it the
> > sizable cpu hog that it is when it's in the foreground on X, so you've
> > totally changed its behaviour and how it is treated.
>
>I haven't been following this as closely as I would have liked to
>(recent vacation and all), but I am definitely seeing issues with the
>recent 2.5.x, 2.6.x-testx scheduler code and have been looking over
>these threads.
>
>I don't really understand why these changes were made at all to the
>scheduler. As I understand it, the 2.2.x and older 2.4.x scheduler was
>simple in that it allowed any process to wake up if it had available
>ticks, and would switch to that process if any new event occurred and
>woke it up. The rest was just limiting the ticks based on nice value
>and remembering to switch when the ticks run out.
>
>It seems that newer schedulers are now temporarily postponing the
>waking up of other processes when the running process is running with
>"preemptive" ticks, and that there's all sorts of hacks involved in
>trying to hide the bad effects of this decision.

I don't see this as a bad decision at all, it's just that there are some
annoying cases where the deliberate starvation which works nicely in my
favor for both interactivity and throughput in most cases can and does kick
my ass in others. This is nothing new. I have no memory of the scheduler
ever being perfect (0.96->today). This scheduler is very nice to me; it's
very simple, it's generally highly effective, and it's easily
tweakable. It just has some irritating rough edges.

>If this is indeed what is going on, what is the reasoning behind it?
>I didn't really see any problems before with the simple scheduler, so
>it seems to me like this may just be a hack to make poorly-written
>applications seem to be a bit "faster" by starving other processes of
>CPU when the poorly-written applications decide they want to do
>something (such as rendering a page with a large table in Mozilla
>-- grr). Is this really making a large enough difference to be worth
>all of this trouble?
>
>To me it would seem the best algorithm would be what we had before all
>of this started. Isn't it best to switch to a task as soon as an event
>(such as disk I/O finishing or a mouse move waking up X to read mouse
>input) occurs for both latency and cache reasons (queued in LIFO
>order)? DMA may make some of this more complicated, I don't know.

Hmm. If a mouse event happened to be queued but not yet run when a slew of
disk events arrived, LIFO would immediately suck. LIFO may be good for the
cache, but it doesn't seem like it could be good for average
latency. Other than that, what you describe is generally what
happens. Tasks which are waiting for hardware a lot rapidly attain a very
high priority, and preempt whoever happened to service the interrupt
(waker) almost instantly. I'd have to look closer at the old scheduler to
be sure, but I don't think there's anything much different between old/new
handling.
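
The mouse-event case above can be made concrete with a toy model. This is
only a sketch under simple assumptions: all events are already queued, each
costs the same number of ticks, and event 0 is the mouse event that was
queued first. The function name is hypothetical.

```c
#include <assert.h>

/*
 * Serve n queued events, each costing `cost` ticks, in FIFO or LIFO order.
 * Returns how long event 0 (queued first) waits before it starts running.
 */
static long mouse_wait(int n, long cost, int lifo)
{
	long clock = 0;
	int served;

	assert(n >= 1 && cost >= 0);
	for (served = 0; served < n; served++) {
		/* FIFO serves ids 0..n-1; LIFO serves ids n-1..0. */
		int id = lifo ? (n - 1 - served) : served;

		if (id == 0)
			return clock;
		clock += cost;
	}
	return clock; /* not reached: id 0 is always served */
}
```

With one mouse event followed by ten disk events of 2 ticks each, FIFO
runs the mouse event immediately, while LIFO makes it wait 20 ticks. In
this static, equal-cost setup the mean wait over all events is the same in
either order; what changes is which event pays for it.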

>I am seeing similar starvation problems that others are seeing in these
>threads. At first it was whenever I clicked a link in Mozilla -- xmms
>would stop, sometimes for a second or so, on a Celeron 466 MHz machine.

Do you see this with test-X and Ingo's latest changes too? I can only
imagine one scenario off the top of my head where this could happen; if
xmms exhausted a slice while STARVATION_LIMIT is exceeded, it could land in
the expired array and remain unserviced for the period of time it takes for
all tasks remaining in the active array to exhaust their slices. Seems
like that should be pretty rare though.

-Mike

2003-08-11 05:50:27

by Martin Schlemmer

Subject: Re: [PATCH]O14int

On Sat, 2003-08-09 at 11:04, Con Kolivas wrote:
> On Sat, 9 Aug 2003 01:49, Con Kolivas wrote:
> > More duck tape interactivity tweaks
>
> s/duck/duct
>
> > Wli pointed out an error in the nanosecond to jiffy conversion which may
> > have been causing too easy to migrate tasks on smp (? performance change).
>
> Looks like I broke SMP build with this. Will fix soon; don't bother trying
> this on SMP yet.
>

Not to be nasty or such, but all these patches have taken
a very responsive HT box to one that has issues with multiple
make -j10's running and random jerkiness.

I am not so sure I for one want changes to the scheduler for
SMP (not UP interactivity ones anyhow).


Cheers,

--
Martin Schlemmer


2003-08-11 06:02:54

by Con Kolivas

Subject: Re: [PATCH]O14int

On Mon, 11 Aug 2003 15:44, Martin Schlemmer wrote:
> On Sat, 2003-08-09 at 11:04, Con Kolivas wrote:
> > On Sat, 9 Aug 2003 01:49, Con Kolivas wrote:
> > > More duck tape interactivity tweaks
> >
> > s/duck/duct
> >
> > > Wli pointed out an error in the nanosecond to jiffy conversion which
> > > may have been causing too easy to migrate tasks on smp (? performance
> > > change).
> >
> > Looks like I broke SMP build with this. Will fix soon; don't bother
> > trying this on SMP yet.
>
> Not to be nasty or such, but all these patches have taken
> a very responsive HT box to one that has issues with multiple
> make -j10's running and random jerkiness.

A UP HT box you mean? That shouldn't be capable of running multiple make -j10s
without some noticeable effect. Apart from looking impressive, there is no
point in having 30 cpu heavy things running with only 1 and a bit processor
and the machine being smooth as silk; the cpu heavy things will just be
unfairly starved in the interest of appearance (I can do that easily enough).
Please give details if there is a specific issue you think I've broken or
else I won't know about it.

> I am not so sure I for one want changes to the scheduler for
> SMP (not UP interactivity ones anyhow).

They're not; the improvements should affect fairness on SMP as well and
although interactivity is what I'm addressing on the surface, fairness is the
real issue.

Con.

2003-08-11 08:40:37

by Martin Schlemmer

Subject: Re: [PATCH]O14int

On Mon, 2003-08-11 at 08:08, Con Kolivas wrote:
> On Mon, 11 Aug 2003 15:44, Martin Schlemmer wrote:
> > On Sat, 2003-08-09 at 11:04, Con Kolivas wrote:
> > > On Sat, 9 Aug 2003 01:49, Con Kolivas wrote:
> > > > More duck tape interactivity tweaks
> > >
> > > s/duck/duct
> > >
> > > > Wli pointed out an error in the nanosecond to jiffy conversion which
> > > > may have been causing too easy to migrate tasks on smp (? performance
> > > > change).
> > >
> > > Looks like I broke SMP build with this. Will fix soon; don't bother
> > > trying this on SMP yet.
> >
> > Not to be nasty or such, but all these patches have taken
> > a very responsive HT box to one that has issues with multiple
> > make -j10's running and random jerkiness.
>
> A UP HT box you mean?

Given :D

> That shouldn't be capable of running multiple make -j10s
> without some noticeable effect. Apart from looking impressive, there is no
> point in having 30 cpu heavy things running with only 1 and a bit processor
> and the machine being smooth as silk; the cpu heavy things will just be
> unfairly starved in the interest of appearance (I can do that easily enough).
> Please give details if there is a specific issue you think I've broken or
> else I wont know about it.
>

My opinion when the first 3.06 with HT came out was also sceptical.
They did not perform that well. Things did change though - with
the 8[67]5p chipsets and dual channel ddr400 it is a vast
improvement, even if only a P4 2.4C running HT.

To give a good example (right, not linux based :P), my brother
has pretty much the same system running XP. With the P4T533-c
(rambus) running a 2.4B, he could not do video encoding at highest
priority (or even high) and be able to do much else. With 875p
and 2.4C he does encoding at highest priority, gets frame rates
of 28-35+ (with the dual pass, etc) which is higher than the old
2.4B, while playing C&C Generals, watching a movie, etc.

My system runs two make -j24's (yes, just testing), while MP3's
play smooth and general moving between desktops and windows is
still smooth with the default scheduler.

After all - most of what I do is compile too many things while
trying to read email, browse, irc and listen to MP3's. I do not
mind the obvious skip or two if really pushing the box, and after
all, interactivity is 50% in the mind (ok, maybe not that much),
but I do notice a degradation in 'interactivity' with your patches
and HT enabled on this box.

Another question - should bogomips (or some other type of general
system performance measurement) not have a bigger role in how
the scheduler works? Maybe I am on crack, but I assume a process
gets more 'slices' per second/minute on a 3GHz machine than on a
300MHz one? It may already, have not checked =)
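
A rough sketch of the arithmetic behind this question, assuming (as in the
O(1) scheduler of this era) that timeslices are defined in units of time
rather than in CPU cycles. Under that assumption the number of slices per
second does not depend on clock speed; only the amount of work that fits
into one slice does. The function names and the 100 ms slice length are
hypothetical, chosen only for illustration.

```c
#include <assert.h>

/* Slices handed out per second when every slice is slice_ms long.
 * Note that the CPU clock does not appear anywhere in this formula. */
long slices_per_second(long slice_ms)
{
    assert(slice_ms > 0);
    return 1000 / slice_ms;
}

/* Cycles that fit into one slice on a CPU clocked at mhz megahertz.
 * 1 MHz is 1000 cycles per millisecond. */
long long cycles_per_slice(long mhz, long slice_ms)
{
    assert(mhz > 0 && slice_ms > 0);
    return (long long)mhz * 1000 * slice_ms;
}
```

With a 100 ms slice, both a 300MHz and a 3GHz machine hand out 10 slices
per second; the faster machine simply gets ten times as many cycles done
inside each one.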

> > I am not so sure I for one want changes to the scheduler for
> > SMP (not UP interactivity ones anyhow).
>
> They're not; the improvements should affect fairness on SMP as well and
> although interactivity is what I'm addressing on the surface, fairness is the
> real issue.
>

Yes, but 'fairness' and 'interactivity' are not the same thing (IMHO).

'fairness' in my books means give each process enough time when it
is needed so that everything gets whatever it does done without
being 'forgotten' for too long, or having disk/whatever hold up too
badly, or starving a smaller process to death.

'interactivity' on the other hand means (my books again) starve
everything that would not be noticed by the user so that he can
'wiggle around windows' (sorry, seen this a few times, and could
not help myself =) to his heart's content and still not get XMMS
to skip (ok, so maybe this may be once again too far fetched :-).

Just me.

NB: any chance to get your patches against vanilla/bk as well,
as I in general like rolling my own kernels more than using
mm, jc, etc (no offence guys).


Thanks,

--
Martin Schlemmer


2003-08-11 08:49:16

by Zwane Mwaikambo

Subject: Re: [PATCH]O14int

On Mon, 11 Aug 2003, Martin Schlemmer wrote:

> NB: any chance to get you patches against vanilla/bk as well,
> as I in general like rolling my own kernels more than using
> mm, jc, etc (no offence guys).

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test3/2.6.0-test3-mm1/broken-out/

2003-08-11 09:02:03

by Con Kolivas

Subject: Re: [PATCH]O14int

On Mon, 11 Aug 2003 18:37, Zwane Mwaikambo wrote:
> On Mon, 11 Aug 2003, Martin Schlemmer wrote:
> > NB: any chance to get you patches against vanilla/bk as well,
> > as I in general like rolling my own kernels more than using
> > mm, jc, etc (no offence guys).
>
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test3/2.6.0-test3-mm1/broken-out/

http://kernel.kolivas.org/2.5
has full sets against vanilla

Con

2003-08-11 09:15:56

by Nick Piggin

Subject: Re: [PATCH]O14int



Con Kolivas wrote:

>On Mon, 11 Aug 2003 15:44, Martin Schlemmer wrote:
>
>>On Sat, 2003-08-09 at 11:04, Con Kolivas wrote:
>>
>>>On Sat, 9 Aug 2003 01:49, Con Kolivas wrote:
>>>
>>>>More duck tape interactivity tweaks
>>>>
>>>s/duck/duct
>>>
>>>
>>>>Wli pointed out an error in the nanosecond to jiffy conversion which
>>>>may have been causing too easy to migrate tasks on smp (? performance
>>>>change).
>>>>
>>>Looks like I broke SMP build with this. Will fix soon; don't bother
>>>trying this on SMP yet.
>>>
>>Not to be nasty or such, but all these patches have taken
>>a very responsive HT box to one that have issues with multiple
>>make -j10's running and random jerkyness.
>>
>
>A UP HT box you mean? That shouldn't be capable of running multiple make -j10s
>without some noticable effect. Apart from looking impressive, there is no
>point in having 30 cpu heavy things running with only 1 and a bit processor
>and the machine being smooth as silk; the cpu heavy things will just be
>unfairly starved in the interest of appearance (I can do that easily enough).
>Please give details if there is a specific issue you think I've broken or
>else I wont know about it.
>

Yeah make -j10s won't be without impact, but I think for a lot of
interactive stuff they don't need a lot of CPU, just to get it
in a timely manner. And Martin did say it had been responsive.
Sounds like in this case your changes are causing the interactive
stuff to get less CPU or higher scheduling latency?


2003-08-11 09:38:25

by Con Kolivas

Subject: Re: [PATCH]O14int

On Mon, 11 Aug 2003 19:15, Nick Piggin wrote:
> Con Kolivas wrote:
> >On Mon, 11 Aug 2003 15:44, Martin Schlemmer wrote:
> >>On Sat, 2003-08-09 at 11:04, Con Kolivas wrote:
> >>>On Sat, 9 Aug 2003 01:49, Con Kolivas wrote:
> >>>>More duck tape interactivity tweaks
> >>>
> >>>s/duck/duct
> >>>
> >>>>Wli pointed out an error in the nanosecond to jiffy conversion which
> >>>>may have been causing too easy to migrate tasks on smp (? performance
> >>>>change).
> >>>
> >>>Looks like I broke SMP build with this. Will fix soon; don't bother
> >>>trying this on SMP yet.
> >>
> >>Not to be nasty or such, but all these patches have taken
> >>a very responsive HT box to one that have issues with multiple
> >>make -j10's running and random jerkyness.
> >
> >A UP HT box you mean? That shouldn't be capable of running multiple make
> > -j10s without some noticable effect. Apart from looking impressive, there
> > is no point in having 30 cpu heavy things running with only 1 and a bit
> > processor and the machine being smooth as silk; the cpu heavy things will
> > just be unfairly starved in the interest of appearance (I can do that
> > easily enough). Please give details if there is a specific issue you
> > think I've broken or else I wont know about it.
>
> Yeah make -j10s won't be without impact, but I think for a lot of
> interactive stuff they don't need a lot of CPU, just to get it
> in a timely manner. And Martin did say it had been responsive.
> Sounds like in this case your changes are causing the interactive
> stuff to get less CPU or higher scheduling latency?

Sigh..,

No, it sounds to me like things are expiring faster than on default. He didn't
say make -j10, it was multiple -j10s. This is one where you simply cannot let
the scheduler keep starving the make -j10s indefinitely for X; on a server or
multiuser box X will simply cause unfair starvation. I'm trying to find a
workaround for this without rewriting whole sections of the scheduler code,
but I'm just not sure I should be trying to optimise for a desktop that runs
loads >16 per cpu. (I'll keep trying though, but if there is no workaround
that remains fair it won't happen)

Con

2003-08-11 09:45:17

by Nick Piggin

Subject: Re: [PATCH]O14int



Con Kolivas wrote:

>On Mon, 11 Aug 2003 19:15, Nick Piggin wrote:
>
>>Con Kolivas wrote:
>>
>>>On Mon, 11 Aug 2003 15:44, Martin Schlemmer wrote:
>>>
>>>>On Sat, 2003-08-09 at 11:04, Con Kolivas wrote:
>>>>
>>>>>On Sat, 9 Aug 2003 01:49, Con Kolivas wrote:
>>>>>
>>>>>>More duck tape interactivity tweaks
>>>>>>
>>>>>s/duck/duct
>>>>>
>>>>>
>>>>>>Wli pointed out an error in the nanosecond to jiffy conversion which
>>>>>>may have been causing too easy to migrate tasks on smp (? performance
>>>>>>change).
>>>>>>
>>>>>Looks like I broke SMP build with this. Will fix soon; don't bother
>>>>>trying this on SMP yet.
>>>>>
>>>>Not to be nasty or such, but all these patches have taken
>>>>a very responsive HT box to one that have issues with multiple
>>>>make -j10's running and random jerkyness.
>>>>
>>>A UP HT box you mean? That shouldn't be capable of running multiple make
>>>-j10s without some noticable effect. Apart from looking impressive, there
>>>is no point in having 30 cpu heavy things running with only 1 and a bit
>>>processor and the machine being smooth as silk; the cpu heavy things will
>>>just be unfairly starved in the interest of appearance (I can do that
>>>easily enough). Please give details if there is a specific issue you
>>>think I've broken or else I wont know about it.
>>>
>>Yeah make -j10s won't be without impact, but I think for a lot of
>>interactive stuff they don't need a lot of CPU, just to get it
>>in a timely manner. And Martin did say it had been responsive.
>>Sounds like in this case your changes are causing the interactive
>>stuff to get less CPU or higher scheduling latency?
>>
>
>Sigh..,
>
>No, it sounds to me like things are expiring faster than on default. He didn't
>say make -j10, it was multiple -j10s. This is one where you simply cannot let
>the scheduler keep starving the make -j10s indefinitely for X; on a server or
>multiuser box X will simply cause unfair starvation. I'm trying to find a
>workaround for this without rewriting whole sections of the scheduler code,
>but I'm just not sure I should be trying to optimise for a desktop that runs
>loads >16 per cpu. (I'll keep trying though, but if there is no workaround
>that remains fair it wont happen)
>
>

Yep, I did see the multiple j10s ;)
I wasn't aware that there was longer term starvation of gccs by X. I
thought the scheduler had always been quite good at evening up the
total CPU time used and a change you made had recently introduced a
latency or interactiveness problem.

But Martin didn't give a very detailed description of the problem,
and no I definitely don't think you should be aiming at fixing
his problem if it causes starvation or harms more common loads.


2003-08-11 14:09:25

by Martin Schlemmer

Subject: Re: [PATCH]O14int

On Mon, 2003-08-11 at 11:43, Con Kolivas wrote:
> On Mon, 11 Aug 2003 19:15, Nick Piggin wrote:
> > Con Kolivas wrote:
> > >On Mon, 11 Aug 2003 15:44, Martin Schlemmer wrote:
> > >>On Sat, 2003-08-09 at 11:04, Con Kolivas wrote:
> > >>>On Sat, 9 Aug 2003 01:49, Con Kolivas wrote:
> > >>>>More duck tape interactivity tweaks
> > >>>
> > >>>s/duck/duct
> > >>>
> > >>>>Wli pointed out an error in the nanosecond to jiffy conversion which
> > >>>>may have been causing too easy to migrate tasks on smp (? performance
> > >>>>change).
> > >>>
> > >>>Looks like I broke SMP build with this. Will fix soon; don't bother
> > >>>trying this on SMP yet.
> > >>
> > >>Not to be nasty or such, but all these patches have taken
> > >>a very responsive HT box to one that have issues with multiple
> > >>make -j10's running and random jerkyness.
> > >
> > >A UP HT box you mean? That shouldn't be capable of running multiple make
> > > -j10s without some noticable effect. Apart from looking impressive, there
> > > is no point in having 30 cpu heavy things running with only 1 and a bit
> > > processor and the machine being smooth as silk; the cpu heavy things will
> > > just be unfairly starved in the interest of appearance (I can do that
> > > easily enough). Please give details if there is a specific issue you
> > > think I've broken or else I wont know about it.
> >
> > Yeah make -j10s won't be without impact, but I think for a lot of
> > interactive stuff they don't need a lot of CPU, just to get it
> > in a timely manner. And Martin did say it had been responsive.
> > Sounds like in this case your changes are causing the interactive
> > stuff to get less CPU or higher scheduling latency?
>
> Sigh..,
>
> No, it sounds to me like things are expiring faster than on default. He didn't
> say make -j10, it was multiple -j10s. This is one where you simply cannot let
> the scheduler keep starving the make -j10s indefinitely for X; on a server or
> multiuser box X will simply cause unfair starvation. I'm trying to find a
> workaround for this without rewriting whole sections of the scheduler code,
> but I'm just not sure I should be trying to optimise for a desktop that runs
> loads >16 per cpu. (I'll keep trying though, but if there is no workaround
> that remains fair it wont happen)
>

Con, you are doing great work for UP desktop systems.
All I am saying is I do not think that there will be
a golden middle way. If I disable SMP, it works much
as expected for the short time I tested. I guess I am just
voicing what a few people have said - maybe there should
be a choice of scheduler - UP or SMP.


Cheers,

--
Martin Schlemmer


2003-08-11 14:13:00

by Martin Schlemmer

Subject: Re: [PATCH]O14int

On Mon, 2003-08-11 at 11:44, Nick Piggin wrote:

> >
> >Sigh..,
> >
> >No, it sounds to me like things are expiring faster than on default. He didn't
> >say make -j10, it was multiple -j10s. This is one where you simply cannot let
> >the scheduler keep starving the make -j10s indefinitely for X; on a server or
> >multiuser box X will simply cause unfair starvation. I'm trying to find a
> >workaround for this without rewriting whole sections of the scheduler code,
> >but I'm just not sure I should be trying to optimise for a desktop that runs
> >loads >16 per cpu. (I'll keep trying though, but if there is no workaround
> >that remains fair it wont happen)
> >
> >
>
> Yep, I did see the multiple j10s ;)
> I wasn't aware that there was longer term starvation of gccs by X. I
> thought the scheduler had always been quite good at evening up the
> total CPU time used and a change you made had recently introduced a
> latency or interactiveness problem.
>

I did not say the 'make -j10s' starved. I am saying that mouse
is laggish, as well as window/desktop switching.

Also, I am not saying Con should fix it - I am asking if we really
want one scheduler that should try to do the right thing for SMP
*and* UP.


--
Martin Schlemmer


2003-08-11 14:30:07

by Con Kolivas

Subject: Re: [PATCH]O14int

On Tue, 12 Aug 2003 00:04, Martin Schlemmer wrote:
> I did not say the 'make -j10s' starved. I am saying that mouse
> is laggish, as well as window/desktop switching.

No, but I did. With the vanilla scheduler, a normal user can turn X into a cpu
hog and starve cc1s for 3 seconds, and the more X like applications there are
on the machine, the longer they can all sit around starving something else.
Feeling ok on one machine is not enough to say it's fine.

> Also, I am not saying Con should fix it - I am asking if we really
> want one scheduler that should try to do the right thing for SMP
> *and* UP.

No, the same issues that apply to fairness, interactivity, throughput and
latency are there regardless of SMP or UP. I've had good reports from SMP in
the past; your HT report is the first that it was bad, and I've said that
some fairness issues have been addressed which cause those.

The current scheduler (with or without some tweak or other) will be in 2.6 and
should work well as much of the time, and in as many settings, as possible. Since
I'm trying to work on it I hope you can report exactly what your issue is and
I'll try and address it. Do you really compile jobs make -j10 each time while
using your machine? (rhetorical question of course since there is absolutely no
advantage to doing that without lots of cpus). If not, how does it perform
under your real world conditions?

Con

2003-08-11 15:25:26

by Martin Schlemmer

Subject: Re: [PATCH]O14int

On Mon, 2003-08-11 at 16:33, Con Kolivas wrote:

> > Also, I am not saying Con should fix it - I am asking if we really
> > want one scheduler that should try to do the right thing for SMP
> > *and* UP.
>
> No, the same issues that apply to fairness, interactivity, throughput and
> latency are there regardless of SMP or UP. I've had good reports from SMP in
> the past; your HT report is the first that it was bad, and I've said that
> some fairness issues have been addressed which cause those.
>
> The current scheduler (with or without some tweak or other) will be in 2.6 and
> should work as much of the time, in as many settings as possible, well. Since
> I'm trying to work on it I hope you can report exactly what your issue is and
> I'll try and address it. Do you really compile jobs make -j10 each time while
> using your machine? (rhetoric question of course since there is absolutely no
> advantage to doing that without lots of cpus). If not, how does it perform
> under your real world conditions?
>

In the normal run of things there are often 1-3 'make -j6s' running.
Yes, sure, for one of them you prob should use -j4, but hey it's
in the head, right =). No, it is not kernels, it is a variety
of stuff - goes with the distro I guess. Yes, I have tested
runs of 'make -j12' and 'make -j24' (sorry, should have been more
precise, but -j{6,12,24} was used as testing, with -j6 default)
running dual makes.

With vanilla the mouse pointer, XMMS, switching desktops or
windows is smooth. If I really hammer the system, it does
'slow down' the general navigation of X a little, but not
so that the mouse pointer is jerky, etc. With the O??int
patches things start to 'stutter' under loads that are well
below those that vanilla handles fine. The mouse gets jerky,
switching desktops is notably lagging.

Note that I am not talking about starving XMMS/the make's.
I am just talking general navigation of X. Yes, even with
vanilla things do start a bit slower, but the mouse goes
where it should, and it's not as if the vga struggles to
redraw the screen on desktop switch. I do not expect the
system to behave for 'interactive' processes and xmms/whatever
as if there is no load - the signs of load are just way more
than with vanilla.


Regards,

--
Martin Schlemmer


2003-08-11 16:29:13

by Mike Galbraith

Subject: Re: [PATCH]O14int

At 12:33 AM 8/12/2003 +1000, Con Kolivas wrote:

>I'm trying to work on it I hope you can report exactly what your issue is and
>I'll try and address it. Do you really compile jobs make -j10 each time while
>using your machine? (rhetoric question of course since there is absolutely no
>advantage to doing that without lots of cpus). If not, how does it perform
>under your real world conditions?

Um, slight ~objection.

What's the difference between one tester running a make -j10 and 10
students compiling their assignments in a multiuser box? I test throughput
with make -j30 on a 128MB 500MHz PIII, because I know for a very very many
times measured fact that the box can handle this (heavy but _not_ extreme)
load. It's not the only load in the world, but it's such a dead simple
load that the kernel dare not have difficulty with it IMHO.

-Mike

2003-08-11 17:55:27

by William Lee Irwin III

Subject: Re: [PATCH]O14int

On Sat, 9 Aug 2003 01:49, Con Kolivas wrote:
>>> Wli pointed out an error in the nanosecond to jiffy conversion which may
>>> have been causing too easy to migrate tasks on smp (? performance change).

On Sat, 2003-08-09 at 11:04, Con Kolivas wrote:
>> Looks like I broke SMP build with this. Will fix soon; don't bother trying
>> this on SMP yet.

On Mon, Aug 11, 2003 at 07:44:52AM +0200, Martin Schlemmer wrote:
> Not to be nasty or such, but all these patches have taken
> a very responsive HT box to one that have issues with multiple
> make -j10's running and random jerkyness.
> I am not so sure I for one want changes to the scheduler for
> SMP (not UP interactivity ones anyhow).

Please try this again with the suggested correction to the load balancer.


-- wli

2003-08-11 18:18:40

by Roger Larsson

Subject: Re: [PATCH]O14int [SCHED_SOFTRR please]

On Sunday 10 August 2003 13.17, Mike Galbraith wrote:
> At 01:48 AM 8/10/2003 -0700, Simon Kirby wrote:
> >I am seeing similar starvation problems that others are seeing in these
> >threads. At first it was whenever I clicked a link in Mozilla -- xmms
> >would stop, sometimes for a second or so, on a Celeron 466 MHz machine.
>
> Do you see this with test-X and Ingo's latest changes too? I can only
> imagine one scenario off the top of my head where this could happen; if
> xmms exhausted a slice while STARVATION_LIMIT is exceeded, it could land in
> the expired array and remain unserviced for the period of time it takes for
> all tasks remaining in the active array to exhaust their slices. Seems
> like that should be pretty rare though.
>

xmms is a RT process - it does not really have interactivity problems...
It will be extremely hard to fix this in a generic scheduler, instead
let xmms be the RT process it is with SCHED_SOFTRR (or whatever
it will be named).
Do this for arts, and other audio/video path applications.

Then start the race for interactivity tuning
(X, X applications, console, login, etc)

interactivity = two-way
http://www.m-w.com/cgi-bin/dictionary?va=interactive

Listening to music is not interactive.

Changing equalization on a media playback needs to be interactive in
two ways.
1) The slider should move in the GUI.
2) The volume should change, but the big buffers needed in today's audio path
will delay the audible changes...
Note: audio path starvation is not one of them...

/RogerL

--
Roger Larsson
Skellefteå
Sweden

2003-08-11 19:43:38

by Mike Galbraith

Subject: Re: [PATCH]O14int [SCHED_SOFTRR please]

At 08:19 PM 8/11/2003 +0200, Roger Larsson wrote:
>On Sunday 10 August 2003 13.17, Mike Galbraith wrote:
> > At 01:48 AM 8/10/2003 -0700, Simon Kirby wrote:
> > >I am seeing similar starvation problems that others are seeing in these
> > >threads. At first it was whenever I clicked a link in Mozilla -- xmms
> > >would stop, sometimes for a second or so, on a Celeron 466 MHz machine.
> >
> > Do you see this with test-X and Ingo's latest changes too? I can only
> > imagine one scenario off the top of my head where this could happen; if
> > xmms exhausted a slice while STARVATION_LIMIT is exceeded, it could land in
> > the expired array and remain unserviced for the period of time it takes for
> > all tasks remaining in the active array to exhaust their slices. Seems
> > like that should be pretty rare though.
> >
>
>xmms is a RT process - it does not really have interactivity problems...
>It will be extremely hard to fix this in a generic scheduler, instead
>let xmms be the RT process it is with SCHED_SOFTRR (or whatever
>it will be named).
>Do this for arts, and other audio/video path applications.

(For the scenario described, it doesn't matter what scheduler policy is used)

>Then start the race for interactivity tuning
> (X, X applications, console, login, etc)
>
>interactivity = two-way
> http://www.m-w.com/cgi-bin/dictionary?va=interactive
>
>Listening to music is not interactive.

?!? <tilt> What makes you say that? What in the world am I doing when I
fire up xmms?
(can't be the two way thing... that's happening until I stop listening)

-Mike

2003-08-11 21:47:58

by Con Kolivas

Subject: Re: [PATCH]O14int [SCHED_SOFTRR please]

On Tue, 12 Aug 2003 04:19, Roger Larsson wrote:
> xmms is a RT process - it does not really have interactivity problems...
> It will be extremely hard to fix this in a generic scheduler, instead
> let xmms be the RT process it is with SCHED_SOFTRR (or whatever
> it will be named).

Have you actually _tried_ the tweaked generic scheduler before this big claim?

Con

2003-08-11 23:40:16

by Timothy Miller

Subject: Re: [PATCH]O14int



Martin Schlemmer wrote:

>
> I did not say the 'make -j10s' starved. I am saying that mouse
> is laggish, as well as window/desktop switching.
>
> Also, I am not saying Con should fix it - I am asking if we really
> want one scheduler that should try to do the right thing for SMP
> *and* UP.


Putting aside the load balancer, isn't the SMP case little more than
multiple UP schedulers running in parallel?

I think that was supposed to be one of the great things about the O(1)
scheduler: It unlocked the CPUs from each other so there would be far
fewer spinlocks.
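
A minimal sketch of that idea, assuming one runqueue per CPU with each
queue protected by its own lock. All names here (rq_sketch, enqueue_on,
pick_next) are hypothetical, a plain counter stands in for the spinlock,
and the load balancer is left out entirely; this is an illustration, not
the kernel's code.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 2   /* hypothetical CPU count */
#define MAX_TASKS      8   /* hypothetical per-queue capacity */

/* One runqueue per CPU; the counter stands in for a per-queue spinlock. */
struct rq_sketch {
    int lock_acquisitions;
    int nr_running;
    int tasks[MAX_TASKS];  /* FIFO of task ids */
};

struct rq_sketch runqueues[NR_CPUS_SKETCH];

/* Queue a task on one CPU; only that CPU's lock is taken. */
void enqueue_on(int cpu, int pid)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    struct rq_sketch *rq = &runqueues[cpu];
    assert(rq->nr_running < MAX_TASKS);
    rq->lock_acquisitions++;
    rq->tasks[rq->nr_running++] = pid;
}

/* Pick the oldest queued task on this CPU, or -1 if its queue is empty. */
int pick_next(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    struct rq_sketch *rq = &runqueues[cpu];
    rq->lock_acquisitions++;
    if (rq->nr_running == 0)
        return -1;
    int pid = rq->tasks[0];
    for (int i = 1; i < rq->nr_running; i++)
        rq->tasks[i - 1] = rq->tasks[i];
    rq->nr_running--;
    return pid;
}
```

Scheduling work on CPU 0 never touches CPU 1's lock in this sketch, which
is the "multiple UP schedulers running in parallel" picture; only a load
balancer moving tasks between queues would need to look at both.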


2003-08-12 00:24:13

by Roger Larsson

Subject: What is interactivity? Re: [PATCH]O14int [SCHED_SOFTRR please]

On Monday 11 August 2003 21.46, Mike Galbraith wrote:
> At 08:19 PM 8/11/2003 +0200, Roger Larsson wrote:
> >On Sunday 10 August 2003 13.17, Mike Galbraith wrote:
> > > At 01:48 AM 8/10/2003 -0700, Simon Kirby wrote:
> > > >I am seeing similar starvation problems that others are seeing in
> > > > these threads. At first it was whenever I clicked a link in Mozilla
> > > > -- xmms would stop, sometimes for a second or so, on a Celeron 466
> > > > MHz machine.
> > >
> > > Do you see this with test-X and Ingo's latest changes too? I can only
> > > imagine one scenario off the top of my head where this could happen; if
> > > xmms exhausted a slice while STARVATION_LIMIT is exceeded, it could
> > > land in the expired array and remain unserviced for the period of time
> > > it takes for all tasks remaining in the active array to exhaust their
> > > slices. Seems like that should be pretty rare though.
> >
> >xmms is a RT process - it does not really have interactivity problems...
> >It will be extremely hard to fix this in a generic scheduler, instead
> >let xmms be the RT process it is with SCHED_SOFTRR (or whatever
> >it will be named).
> >Do this for arts, and other audio/video path applications.
>
> (For the scenario described, it doesn't matter what scheduler policy is
> used)

It matters: if the SOFTRR processes are well behaved, they will get their share
as long as _they_ do not overuse CPU.

Suppose you have xmms running SOFTRR. Whatever you do that is not SOFTRR
(or higher: SCHED_FIFO, SCHED_RR) can't touch it scheduler-wise.
It will remain SOFTRR and will not run out of its timeslice unless it uses too
much CPU - its timeslice is refilled immediately whenever it gets empty (it
is put last on the SOFTRR run queue - not in the expired array...)
But if the SOFTRR processes have used too much CPU there are no guarantees.
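
The difference described above can be sketched roughly as follows. This
assumes a simplified model in which a normal task whose slice runs out
always goes to the expired array, while a SOFTRR task is refilled and put
at the tail of its own queue. SCHED_SOFTRR is only a proposed policy, so
every name below is hypothetical, and the "used too much CPU" check is
deliberately left out.

```c
#include <assert.h>

enum policy_sketch  { POLICY_NORMAL, POLICY_SOFTRR };
enum outcome_sketch { STILL_RUNNING, REQUEUED_AT_TAIL, MOVED_TO_EXPIRED };

struct task_sketch {
    enum policy_sketch policy;
    int time_slice;   /* ticks left in the current slice */
    int full_slice;   /* ticks granted on every refill */
};

/* Account one timer tick to the running task and report what happens to it. */
enum outcome_sketch timer_tick(struct task_sketch *t)
{
    assert(t->time_slice > 0 && t->full_slice > 0);
    if (--t->time_slice > 0)
        return STILL_RUNNING;

    /* Slice used up: both kinds of task get a fresh slice ... */
    t->time_slice = t->full_slice;

    /* ... but only the normal task has to wait for the array switch. */
    if (t->policy == POLICY_SOFTRR)
        return REQUEUED_AT_TAIL;
    return MOVED_TO_EXPIRED;
}
```

In this model a SOFTRR xmms never waits for every other task to drain its
slice, which is the wait that makes the expired array painful for a task
that needs the CPU at regular intervals.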

>
> >Then start the race for interactivity tuning
> > (X, X applications, console, login, etc)
> >
> >interactivity = two-way
> > http://www.m-w.com/cgi-bin/dictionary?va=interactive
> >
> >Listening to music is not interactive.
>
> ?!? <tilt> What makes you say that? What in the world am I doing when I
> fire up xmms?
> --- snip ---

You expect sound to start soon - that is the interactive behaviour.

Suppose xmms starts after four seconds and then won't miss a beat.
Compare with if it starts after ten seconds and then won't miss a beat.
If you relate each frame to the start action then you will see that _every_
frame in the first case is four seconds late, and in the second case ten
seconds late. (Best possible interactivity would be an immediate start - don't
you agree?)
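
The arithmetic can be sketched like this, assuming frames are played back
at a fixed period once playback has started and never skip. The function
name and the 23 ms period used in the note below are hypothetical.

```c
#include <assert.h>

/* Lateness of frame k, measured against the moment the user pressed play. */
long frame_lateness_ms(long start_delay_ms, long period_ms, long k)
{
    assert(start_delay_ms >= 0 && period_ms > 0 && k >= 0);
    long expected = k * period_ms;                   /* ideal: immediate start */
    long actual   = start_delay_ms + k * period_ms;  /* what really happened */
    return actual - expected;
}
```

The period cancels out, so every frame is late by exactly the start delay:
frame_lateness_ms(4000, 23, k) is 4000 and frame_lateness_ms(10000, 23, k)
is 10000 for any k.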

xmms is interactive if you see the audioboard as the second party.
But I think that if we could concentrate on human users the problem will
become easier. If I leave home while compiling KDE and playing audio with xmms
- is xmms still interactive? (this will be hard to fix but it is not
impossible, someone (on a Mac I think) has done an application that logged in
when you arrived with your bluetooth device and logged off when you left)


"make all" - interactive? It depends on my expectations, my expectations
depend on how big the _total task_ is.
* If it is run from a shell script - like the kde-build I have in the
background right now. No way!
* If it is my kdeveloper test project ("Hello world" for remote debugging).
Yes it is! I am waiting for it and expect it to be ready NOW.

make bzImage - total rebuild, Not interactive - I expect to be able to get a
cup of coffee while waiting.
make bzImage - one .c file changed, interactive

I think that the work done so far is great. It is great that the scheduler
can almost handle xmms under all kinds of loads - but enough is enough.

/RogerL

--
Roger Larsson
Skellefteå
Sweden

2003-08-12 05:36:46

by Mike Galbraith

Subject: Re: What is interactivity? Re: [PATCH]O14int [SCHED_SOFTRR please]

At 02:26 AM 8/12/2003 +0200, Roger Larsson wrote:
>On Monday 11 August 2003 21.46, Mike Galbraith wrote:
> > At 08:19 PM 8/11/2003 +0200, Roger Larsson wrote:
> > >On Sunday 10 August 2003 13.17, Mike Galbraith wrote:
> > > > At 01:48 AM 8/10/2003 -0700, Simon Kirby wrote:
> > > > >I am seeing similar starvation problems that others are seeing in
> > > > > these threads. At first it was whenever I clicked a link in Mozilla
> > > > > -- xmms would stop, sometimes for a second or so, on a Celeron 466
> > > > > MHz machine.
> > > >
> > > > Do you see this with test-X and Ingo's latest changes too? I can only
> > > > imagine one scenario off the top of my head where this could happen; if
> > > > xmms exhausted a slice while STARVATION_LIMIT is exceeded, it could
> > > > land in the expired array and remain unserviced for the period of time
> > > > it takes for all tasks remaining in the active array to exhaust their
> > > > slices. Seems like that should be pretty rare though.
> > >
> > >xmms is a RT process - it does not really have interactivity problems...
> > >It will be extremely hard to fix this in a generic scheduler, instead
> > >let xmms be the RT process it is with SCHED_SOFTRR (or whatever
> > >it will be named).
> > >Do this for arts, and other audio/video path applications.
> >
> > (For the scenario described, it doesn't matter what scheduler policy is
> > used)
>
>It matters if the SOFTRR processes are well behaved, they will get their share
>as long as _they_ do not overuse CPU.
>
>Suppose you have xmms running SOFTRR. Whatever you do that is not SOFTRR
>(or higher SCHED_FIFO, SCHED_RR) can't touch is scheduler wice.
>It will remain SOFTRR and will not run out of its timeslice unless it uses
>too much CPU - its timeslice is refilled immediately whenever it gets empty (it
>is put last on the SOFTRR run queue - not in the expired array...)

Yup, brainfart on my part. Realtime tasks are immune.

>But if it SOFTRR processes has used too much CPU there are no guarantees.
>
> >
> > >Then start the race for interactivity tuning
> > > (X, X applications, console, login, etc)
> > >
> > >interactivity = two-way
> > > http://www.m-w.com/cgi-bin/dictionary?va=interactive
> > >
> > >Listening to music is not interactive.
> >
> > ?!? <tilt> What makes you say that? What in the world am I doing when I
> > fire up xmms?
> > --- snip ---
>
>You expect sound to start soon - that is the interactive behaviour.
>
>Suppose xmms starts after four seconds and then won't miss a beat.
>Compare with if it starts after ten seconds and then won't miss a beat.
>If you relate each frame to the start action then you will see that _every_
>frame in the first case is one second late, and in the second case ten
>seconds late. (Best possible interactivity would be an immediate start -
>don't you agree?)
>
>xmms is interactive if you see the audioboard as the second part.
>But I think that if we could concentrate on human users the problem will
>become easier. If I leave home while compiling KDE and playing audio with
>xmms - is xmms still interactive? (this will be hard to fix but it is not
>impossible, someone (on a MAC I think) have done a application that logged in
>when you arrived with your bluetooth device and logged off when you left)

If I leave the room, or even become distracted enough, xmms ceases to be
interactive.

>"make all" - interactive? It depends on my expectations, my expectations
>depends on how big the _total task_ is.

If you're watching it, I'd call it interactive. I see no difference
between watching a movie and watching compiler output scroll by.

>* If it is run from a shell script - like the kde-build I have in the
> background right now. No way!

Agreed. If you're not watching the output scroll by, it's not interactive.

>* If it is my kdeveloper test project ("Hello world" for remote debugging).
> Yes it is! I am waiting for it and expect it to be ready NOW.
>
>make bzImage - total rebuild, Not interactive - I expect to be able to get a
>cup of coffee while waiting.
>make bzImage - one .c file changed, interactive

Well, interactivity can certainly be viewed like one of those tricky
philosophy questions (bears farting in the woods, trees falling over etc;),
but I consider any task which is connected to a human via any of our senses
to be interactive. Perhaps it's not a 100% accurate use of the term, but
for lack of a better term...

>I think that the work done this far is great. It is great that the scheduler
>almost can handle xmms under all kinds of loads - but enough is enough.

I don't care if xmms skips or my mouse pointer stalls while I'm testing at
the heavy end of the load scale, you flat can't have low latency and max
throughput at the same time. If xmms skips and the mouse becomes sticky at
less than "heavy" though, something is wrong (defining heavy is one of
those tricky judgement calls). It's the mozilla loading a webpage type of
reports that I worry about.

-Mike

2003-08-12 17:56:53

by Simon Kirby

Subject: Re: [PATCH]O14int

On Sun, Aug 10, 2003 at 07:06:34PM +1000, Con Kolivas wrote:

> Is this with or without my changes? The old scheduler was not very scalable;
> that's why we moved. The new one has other intrinsic issues that I (and
> others) have been trying to address, but is much much more scalable. It was
> not possible to make the old one more scalable, but it is possible to make
> this one more interactive.

Without your changes. Are you changing the design or just tuning certain
cases? I was talking more about the theory behind the scheduling
decisions and not about particular cases.

The O(1) scheduler changes definitely help scalability and I don't have
any problem with that change (unless it introduced the behavior I'm
talking about).

Simon-

2003-08-12 18:36:17

by Simon Kirby

Subject: Re: [PATCH]O14int

On Sun, Aug 10, 2003 at 03:08:36AM -0700, William Lee Irwin III wrote:

> Most of this isn't of much concern; most of the 2.4.x semantics have
> largely been carried over to 2.6.x with algorithmic improvements, apart
> from the same-mm heuristic (which was of dubious value anyway). Even
> epochs are still there in the form of the duelling arrays, which
> renders the thing vaguely timeout-based like 2.4.x.

Hmm. I admit I haven't read the code enough to understand really what is
going on -- I'm just guessing how it is working (and how it did work)
based on experiences I've had with it over the years.

> On Sun, Aug 10, 2003 at 01:48:27AM -0700, Simon Kirby wrote:
> > It seems that newer schedulers are now temporarily postponing the
> > waking up of other processes when the running process is running with
> > "preemptive" ticks, and that there's all sorts of hacks involved in
> > trying to hide the bad effects of this decision.
>
> If this were deliberate it would be a "selfish" scheduling algorithm,
> where the delay in preemptively capturing the cpu is a number of ticks
> equal to whatever the value of beta/alpha was chosen to be, and some
> raw scheduling algorithm is used otherwise unaltered for those tasks in
> the service box. I see no evidence of such an organization (it'd be
> really obvious, as a queue box and service box would need to exist),
> hence this is probably just something in need of a performance tweak
> if it's a real problem.

Perhaps I should read the code to see what is actually going on (though
it is now fairly complex), but it definitely feels like this is
happening. Why else would my keystrokes to an otherwise-idle rxvt be
delayed while my browser is rendering a page? I suppose there may be
interactions with X. This never used to happen, however.

The simple question: Does the scheduler ever intend to delay a context
switch to a process (which has been idle long enough to rebuild its
maximum timeslice) when a wake up event occurs? If so, what is the
reasoning for this?
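
For reference, the wakeup decision being asked about rests on a priority comparison (the TASK_PREEMPTS_CURR macro visible in the patch at the top of this thread). The following is only a minimal sketch of that idea: the struct and function names are hypothetical, and the macro's additional equal-priority clause is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down task: only the field the check needs. */
struct toy_task {
    int prio;   /* lower value = higher priority, as in kernel/sched.c */
};

/*
 * Simplified form of the TASK_PREEMPTS_CURR() idea: a freshly woken task
 * takes the CPU at once only if its priority is strictly better than the
 * running task's. The real macro has a further equal-priority clause that
 * is not modelled here.
 */
static bool wakeup_preempts(const struct toy_task *woken,
                            const struct toy_task *curr)
{
    assert(woken != NULL && curr != NULL);
    return woken->prio < curr->prio;
}
```

Under this simplified rule a woken task whose dynamic priority is no better than the runner's has to wait its turn, so any delay would come from how the priorities were assigned rather than from a deliberate postponement.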

> > If this is indeed what is going on, what is the reasoning behind it?
> > I didn't really see any problems before with the simple scheduler, so
> > it seems to me like this may just be a hack to make poorly-written
> > applications seem to be a bit "faster" by starving other processes of
> > CPU when the poorly-written applications decide they want to do
> > something (such as rendering a page with a large table in Mozilla
> > -- grr). Is this really making a large enough difference to be worth
> > all of this trouble?
>
> Yes. The SMP issues addressed by the algorithmic improvements in the
> scheduler are performance issues so severe, they may safely be called
> functional issues.

Obviously the scheduler O(1) changes and other scalability improvements
are worthwhile, but I don't think (unless I'm missing something) they
explain the problem I'm seeing.

> On Sun, Aug 10, 2003 at 01:48:27AM -0700, Simon Kirby wrote:
> > To me it would seem the best algorithm would be what we had before all
> > of this started. Isn't it best to switch to a task as soon as an event
> > (such as disk I/O finishing or a mouse move waking up X to read mouse
> > input) occurs for both latency and cache reasons (queued in LIFO
> > order)? DMA may make some of this more complicated, I don't know.
>
> This sounds like either LCFS or FB. FB's not usable out of the box for
> long-running tasks, as its context switch rates are excessive there.
> LCFS has some rather undesirable properties that render it unsuitable
> for general purpose operating systems. Something like multilevel
> processor sharing would be a much better alternative, as long-running
> tasks can be classified and scheduled according to a more appropriate
> discipline with a lower context switch rate while maintaining the
> (essentially infinitely) strong preference for short-running tasks.

What makes the context switches excessive? As far as I can see, the
only things that can initiate a context switch are a process sleeping or
finishing, a timer tick and the scheduler deciding to switch, or a device
causing a wake up event. I was also wondering: Isn't it best to always
switch to the process which has just had an event for cache coherency?

> > I am seeing similar starvation problems that others are seeing in these
> > threads. At first it was whenever I clicked a link in Mozilla -- xmms
> > would stop, sometimes for a second or so, on a Celeron 466 MHz machine.
> > More recently I found that loading a web page consisting of several
> > large animated gif images (a security camera web page) caused
> > absolutely horrible jerking of mouse and keyboard input in all other
> > windows, even when the browser window was minimized or hidden. What's
> > worse is the jerking tends to subside if I do a lot of typing or move
> > the mouse a lot, probably because I'm changing the scheduler's idea of
> > what "kind" of processes are running (which makes this stuff even
> > harder to debug).
>
> One problem with these kinds of reports is that they aren't coming with
> enough information to determine if the scheduler truly is the cause of
> the problem, and worse yet, assuming the scheduler did cause these
> problems, this isn't enough actual information to address it. We're
> going to need proper instrumentation at some point here.

I can do this, but I'm not seeing inefficiency, I'm seeing large decision
problems. If the context switches were up in the hundreds of thousands
or higher, I would understand, but they're in the low hundreds. Isn't
top far too slow to figure out what is actually going on? Also, kernel
time is less than 10 percent, so I don't think kernel profiles will help.

Maybe I'm dreaming, but shouldn't the scheduler be simple enough so that
it can be considered "obviously correct"? ...Or close to that? :)

Simon-

2003-08-12 21:17:50

by Con Kolivas

Subject: Re: [PATCH]O14int

On Wed, 13 Aug 2003 03:56, Simon Kirby wrote:
> On Sun, Aug 10, 2003 at 07:06:34PM +1000, Con Kolivas wrote:
> > Is this with or without my changes? The old scheduler was not very
> > scalable; that's why we moved. The new one has other intrinsic issues
> > that I (and others) have been trying to address, but is much much more
> > scalable. It was not possible to make the old one more scalable, but it
> > is possible to make this one more interactive.
>
> Without your changes. Are you changing the design or just tuning certain
> cases? I was talking more about the theory behind the scheduling
> decisions and not about particular cases.

I'm just changing the algorithm that gives priority boost or penalty, and
creating code to further feedback into that algorithm.
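
One concrete piece of that feedback is the interactive_credit bounding added in the O14int patch. The sketch below reuses the three macros from the patch; the struct, the helper name and the MAX_SLEEP_AVG value are assumptions made only so the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration only; the kernel derives its own. */
#define MAX_SLEEP_AVG 10000L

/* Hypothetical cut-down task with just the two fields used below. */
struct toy_task {
    long interactive_credit;
    unsigned long sleep_avg;
};

/* The three classification macros as they appear in the O14int patch. */
#define HIGH_CREDIT(p)    ((p)->interactive_credit > MAX_SLEEP_AVG)
#define LOW_CREDIT(p)     ((p)->interactive_credit < -MAX_SLEEP_AVG)
#define VARYING_CREDIT(p) (!(HIGH_CREDIT(p) || LOW_CREDIT(p)))

/*
 * Mirrors the patched test at the top of recalc_task_prio(): a task that
 * wakes with no sleep_avg left loses one credit, but only while it is
 * still in the "varying" band, so the credit stops falling once the task
 * is classified as low credit.
 */
static void charge_credit_on_wakeup(struct toy_task *p)
{
    assert(p != NULL);
    if (!p->sleep_avg && VARYING_CREDIT(p))
        p->interactive_credit--;
}
```

Once a task crosses either bound it stops drifting further, which is what makes HIGH_CREDIT and LOW_CREDIT usable as stable labels.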

> The O(1) scheduler changes definitely help scalability and I don't have
> any problem with that change (unless it introduced the behavior I'm
> talking about).

2003-08-13 06:42:41

by Con Kolivas

Subject: Re: [PATCH]O14int

[Resent because the lkml spam filter thought this was spam originally]

Thanks for the detailed description.

On Tue, 12 Aug 2003 01:19, Martin Schlemmer wrote:
> Normal run of things there is many times 1-3 'make -j6s' running.
> Yes, sure, for one of them you prob should use -j4, but hey it's
> in the head, right =). No, it is not kernels, it is a variety

Actually in benchmarking I've found no increase in speed with more than one
job per cpu but it's up to you of course.

> of stuff - goes with the distro i guess. Yes, I have tested
> runs of 'make -j12' and 'make -j24' (sorry, should have been more
> precise, but -j{6,12,24} was used for testing, with -j6 default)
> running dual makes.
>
> With vanilla the mouse pointer, XMMS, switching desktops or
> windows is smooth. If I really hammer the system, it does
> 'slow down' the general navigation of X a little, but not
> so that that mouse pointer is jerky, etc. With the O??int
> patches things start to 'stutter' under loads that are well
> under those that vanilla handles fine. The mouse gets jerky,
> switching desktops is notably lagging.

Yes what you're describing is the expiring X issue. At this moment in time
with my patches with very high loads, it is easy for interactive tasks to
expire if used for sustained periods. This means that they will be smoother
than vanilla for a burst, then have a nasty blip when falling on the expired
array. This doesn't happen at lower loads, and is representative of the hard
to optimise for case - the interactive task that behaves occasionally like a
cpu hog (ie X). Lots of suggestions for ways around this have been offered,
but none address the fact that these will cause starvation of other loads.
Note that the vanilla scheduler already causes starvation in the wrong
circumstances, in case you offer going back to it as a solution. Suffice
to say I'm still working on it as my highest priority.
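
The "falling on the expired array" behaviour can be pictured with a toy model of the two-array design. This is a simplified sketch, not the code in kernel/sched.c: the types and the function are hypothetical, and the "interactive" flag stands in for whatever the priority estimator currently believes about the task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_array { TOY_ACTIVE, TOY_EXPIRED };

/* Hypothetical cut-down task for this illustration. */
struct toy_task {
    bool interactive;      /* the scheduler's current guess about the task */
    enum toy_array array;  /* which array the task sits on */
};

/*
 * Decide where a task goes once its timeslice is used up. Tasks judged
 * interactive are put back on the active array; everything else, and
 * interactive tasks while the expired array is starving, must wait on
 * the expired array until the two arrays are swapped.
 */
static void timeslice_expired(struct toy_task *p, bool expired_starving)
{
    assert(p != NULL);
    if (p->interactive && !expired_starving)
        p->array = TOY_ACTIVE;
    else
        p->array = TOY_EXPIRED;
}
```

In this model a task like X that burns CPU for long enough to lose its interactive rating lands on the expired array at its next timeslice end, and stays there until every active task has run, which is the blip described above.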

Note that tasks that are never cpu hogs (eg xmms) will never stutter or falter
under these circumstances; which is why I stopped mentioning audio ages ago:
audio works without problems under normal and extreme circumstances with my
patches unless you renice a cpu hog to better priority than your audio app.
Note that despite all this, since people are so excited by the idea of
soft RR scheduling, I actually wrote a patch that will work with my tweaks a
while ago. I've not optimised or improved it at all because that is lower
priority to me than getting the general interactivity correct. For those
interested, it's in my experimental directory in kernel.kolivas.org/2.5

> Note that I am not talking about starving XMMS/the make's.
> I am just talking general navigation of X. Yes, even with
> vanilla things do start a bit slower, but the mouse goes
> where it should, and its not as if the vga struggles to
> redraw the screen on desktop switch. I do not expect the
> system to behave for 'interactive' processes and xmms/whatever
> as if there is no load - the signs of load is just way more
> than with vanilla.

As discussed above. Thanks for your report.

Con

2003-08-13 06:35:30

by Rob Landley

Subject: Re: What is interactivity? Re: [PATCH]O14int [SCHED_SOFTRR please]

On Tuesday 12 August 2003 01:40, Mike Galbraith wrote:

> Well, interactivity can certainly be viewed like one of those tricky
> philosophy questions (bears farting in the woods, trees falling over etc;),
> but I consider any task which is connected to a human via any of our senses
> to be interactive. Perhaps it's not a 100% accurate use of the term, but
> for lack of a better term...

"Interactivity" is being used as a proxy for at least two different
conditions: smooth spooling and snappy response to (possibly repeated)
asynchronous wakeups.

The smooth spooler problem is where you're trying to input or output stuff at
a constant rate, somewhere below your theoretical maximum capacity. Sound
output is like this. Whether you're listening or not, the tree in the forest
still falls. A skip is a skip, the output could be being recorded to tape or
who knows what. Correctness here is empirical; if it skips something went
wrong.

Sound is just one example, and a relatively easy one since the CPU
requirements are so low on modern machines. Personal Video Recorders ala
Tivo are a more demanding application (often coming perilously close to your
memory or disk bandwidth capacity), and skips or dropouts are saved for
posterity there. A human doesn't even have to be in the room, that task is
still "interactive".

Repeated asynchronous wakeups come from typing on the keyboard and wiggling
the mouse. If your mouse is dragging a window, the asynchronous wakeups
could provoke a lot of CPU activity.

The difference between these two is that they are different types of waits.
Smooth spooling involves waiting for a known period of time, and being woken
up by a timer. Asynchronous wakeups come out of the blue, the application
has no way of knowing the mouse is about to move or a key is about
to be pressed until it happens.

(Some things combine these behaviors, e.g. first person shooters (30 frames
per second, plus responding to the joystick NOW), but that kind of thing could
also collapse into the smooth spooler case if the frame rate's high enough
and polling for input is cheap...)

True CPU hogs do block, but they only block when they're requesting more work.
Any read or write to a block device is a "request more work" type of block,
for example. If the block device gets faster, the app runs faster.

With a CPU hog, there is no system so powerful that this thing won't try to
speed to completion as fast as it can. With an "interactive" task, the speed
of the system is not the limiting factor (or at least shouldn't be).

Now there's a lot of fuzzy bits where you can't tell what kind of block you're
doing. Blocking on the network, blocking on pipes, etc. Could be anything.
But I think it's pretty safe to say that a timer is always an interactive
wait, and a block device never is. (And considering that the I/O scheduler
and the CPU scheduler may have to work together in the future to make things
like the anticipatory scheduler work properly, it shouldn't be TOO much of a
stretch to distinguish between waiting on a block device and waiting on
something else...)
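
The rule of thumb above (a timer wait is always interactive, a block device wait never is, everything else is unclear) can be written down as a small classifier. This is a sketch only: every name in it is hypothetical, and nothing like it exists in the scheduler under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* What the task blocked on (hypothetical labels). */
enum wait_source { WAIT_TIMER, WAIT_BLOCK_DEVICE, WAIT_NETWORK, WAIT_PIPE };

/* How the rule of thumb rates that wait. */
enum wait_class { WAIT_INTERACTIVE, WAIT_MORE_WORK, WAIT_UNKNOWN };

/* Timer waits are interactive, block device waits are requests for more
 * work, and the fuzzy cases (network, pipes) are left undecided. */
static enum wait_class classify_wait(enum wait_source src)
{
    switch (src) {
    case WAIT_TIMER:
        return WAIT_INTERACTIVE;
    case WAIT_BLOCK_DEVICE:
        return WAIT_MORE_WORK;
    default:
        return WAIT_UNKNOWN;
    }
}

/* Count how many waits in a sample the rule would call interactive. */
static size_t count_interactive(const enum wait_source *waits, size_t n)
{
    size_t hits = 0;

    assert(waits != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (classify_wait(waits[i]) == WAIT_INTERACTIVE)
            hits++;
    return hits;
}
```

Keeping an explicit "unknown" class avoids pretending that network and pipe waits can be sorted either way.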

> >I think that the work done this far is great. It is great that the
> > scheduler almost can handle xmms under all kinds of loads - but enough is
> > enough.
>
> I don't care if xmms skips or my mouse pointer stalls while I'm testing at
> the heavy end of the load scale,

I do. I believe you're in the minority here.

> you flat can't have low latency and max
> throughput at the same time.

If you're talking about keeping your cache hot, I agree. But a lot of times,
minimizing latency DOES help throughput. (Anticipatory scheduler, case in
point. :)

What you're saying is that you want your CPU hog loads to complete as quickly
as possible at the expense of smooth mouse movement. This is what "nice" is
for, isn't it? (If you've got a dedicated, throughput-optimized server
running X in the first place, you have more fundamental problems.)

And your uber-optimized configuration is still going to lose out to an
unoptimized configuration running on hardware that's three months newer... :)

The linux-kernel gurus focused their optimizations almost exclusively on
throughput for almost the first full decade of kernel development.
Interactive latency started explicitly showing up as a concern in 2.4, and
has only really become a priority in 2.5. There are a few tradeoffs, but
some of them are a bit overdue if you ask me.

If you can document a throughput degradation and give a repeatable benchmark,
I'm sure Con and Ingo will be thrilled to address it. A lot of contest is
about throughput, you know. They're trying very hard to avoid regressions...

> If xmms skips and the mouse becomes sticky at
> less than "heavy" though, something is wrong (defining heavy is one of
> those tricky judgement calls).

You know, I used to beat OS/2 to DEATH, and the mouse never went funky on me.
(Of course the mouse was updated directly from an interrupt routine in kernel
memory that never swapped out. But still... :)

> It's the mozilla loading a webpage type of reports that I worry about.

It could be worse. It could be OpenOffice. :)

> -Mike

Rob


2003-08-14 06:18:49

by William Lee Irwin III

Subject: Re: [PATCH]O14int

On Tue, 12 Aug 2003 01:19, Martin Schlemmer wrote:
>> Normal run of things there is many times 1-3 'make -j6s' running.
>> Yes, sure, for one of them you prob should use -j4, but hey it's
>> in the head, right =). No, it is not kernels, it is a variety

On Wed, Aug 13, 2003 at 04:48:18PM +1000, Con Kolivas wrote:
> Actually in benchmarking I've found no increase in speed with more than one
> job per cpu but it's up to you of course.

I found some strange SMP artifacts that seemed to show a dromedary-like
throughput curve with respect to tasks, with one peak at 4 tasks/cpu and
another peak at 16 tasks/cpu on a 16x box (for kernel compiles).

But I don't consider that evidence of anything to do something about.


-- wli

2003-08-15 23:40:18

by Paul Dickson

Subject: Re: [PATCH]O14int

On Wed, 13 Aug 2003 23:19:53 -0700, William Lee Irwin III wrote:

> I found some strange SMP artifacts that seemed to show a dromedary-like
> throughput curve with respect to tasks, with one peak at 4 tasks/cpu and
> another peak at 16 tasks/cpu on a 16x box (for kernel compiles).

"Dromedary-like" is a bell-shaped curve. Perhaps you meant "bactrian-like".

Sorry. I couldn't resist posting this. :-)

-Paul

2003-08-17 02:19:43

by William Lee Irwin III

Subject: Re: [PATCH]O14int

On Wed, 13 Aug 2003 23:19:53 -0700, William Lee Irwin III wrote:
>> I found some strange SMP artifacts that seemed to show a dromedary-like
>> throughput curve with respect to tasks, with one peak at 4 tasks/cpu and
>> another peak at 16 tasks/cpu on a 16x box (for kernel compiles).

On Fri, Aug 15, 2003 at 04:40:10PM -0700, Paul Dickson wrote:
> "Dromedary-like" is a bell-shaped curve. Perhaps you meant "bactrian-like".
> Sorry. I couldn't resist posting this. :-)

Doh. Yes.


-- wli