2003-08-04 16:02:17

by Con Kolivas

Subject: [PATCH] O13int for interactivity

Changes:

Reverted the child penalty to 95, as the new changes keep this from hurting

Changed the logic behind loss of interactive credits: they are now lost only
by tasks that burn off all their sleep_avg

Now all tasks get proportionately more sleep as their relative bonus drops
off. This has the effect of detecting a change from a cpu burner to an
interactive task more rapidly, as in O10.
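
For illustration, the scaling can be sketched in plain userspace C. This is a
simplified sketch, not the kernel code: it works in jiffies rather than
nanoseconds, sleep_multiplier() is a hypothetical helper name, and the two
constant values are assumed for the example rather than taken from the patch.

```c
#include <assert.h>

/* Assumed values for illustration; the real ones live in kernel/sched.c. */
#define SKETCH_MAX_BONUS      10UL
#define SKETCH_MAX_SLEEP_AVG  1000UL   /* jiffies */

/*
 * Factor by which sleep_time is scaled: MAX_BONUS + 1 for a task with
 * no sleep_avg, falling to 1 for a task that is already at the maximum.
 */
static unsigned long sleep_multiplier(unsigned long sleep_avg)
{
    assert(sleep_avg <= SKETCH_MAX_SLEEP_AVG);
    return SKETCH_MAX_BONUS + 1 -
           (sleep_avg * SKETCH_MAX_BONUS / SKETCH_MAX_SLEEP_AVG);
}
```

With these assumed constants, a task that has burnt off all its sleep_avg has
its sleep scaled by 11, while a task at the maximum sleep_avg gets a factor
of 1.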

The _major_ change in this patch is that tasks on uninterruptible sleep do not
earn any sleep avg during that sleep; it is not voluntary sleep so they should
not get it. This has the effect of stopping cpu hogs from gaining dynamic
priority during periods of heavy I/O. Very good for the jerks you may
see in X or audio skips when you start a whole swag of disk intensive cpu hogs
(eg make -j large number). I've simply dropped all their sleep_avg, but
weighting it may be more appropriate. This has the side effect that pure
disk tasks (eg cp) have relatively low priority which is why weighting may
be better. We shall see.

Please test this one extensively. It should _not_ affect I/O throughput per
se, but I'd like to see some of the I/O benchmarks on this. I do not want to
have detrimental effects elsewhere.

patch-O12.3-O13int applies on top of 2.6.0-test2-mm4 that has been
patched with O12.3int and is available on my site, and a full patch
against 2.6.0-test2 called patch-test2-O13int is here:

http://kernel.kolivas.org/2.5

patch-O12.3-O13int:

--- linux-2.6.0-test2-mm4-O12.3/kernel/sched.c 2003-08-05 01:30:27.000000000 +1000
+++ linux-2.6.0-test2-mm4-O13/kernel/sched.c 2003-08-05 01:36:20.000000000 +1000
@@ -78,7 +78,7 @@
#define MAX_TIMESLICE (200 * HZ / 1000)
#define TIMESLICE_GRANULARITY (HZ/40 ?: 1)
#define ON_RUNQUEUE_WEIGHT 30
-#define CHILD_PENALTY 90
+#define CHILD_PENALTY 95
#define PARENT_PENALTY 100
#define EXIT_WEIGHT 3
#define PRIO_BONUS_RATIO 25
@@ -365,6 +365,9 @@ static void recalc_task_prio(task_t *p,
unsigned long long __sleep_time = now - p->timestamp;
unsigned long sleep_time;

+ if (!p->sleep_avg)
+ p->interactive_credit--;
+
if (__sleep_time > NS_MAX_SLEEP_AVG)
sleep_time = NS_MAX_SLEEP_AVG;
else
@@ -384,17 +387,19 @@ static void recalc_task_prio(task_t *p,
JIFFIES_TO_NS(JUST_INTERACTIVE_SLEEP(p));
else {
/*
- * Tasks with interactive credits get boosted more
- * rapidly if their bonus has dropped off. Other
- * tasks are limited to one timeslice worth of
- * sleep avg.
+ * The lower the sleep avg a task has the more
+ * rapidly it will rise with sleep time. Tasks
+ * without interactive_credit are limited to
+ * one timeslice worth of sleep avg bonus.
*/
- if (p->interactive_credit > 0)
- sleep_time *= (MAX_BONUS + 1 -
+ sleep_time *= (MAX_BONUS + 1 -
(NS_TO_JIFFIES(p->sleep_avg) *
MAX_BONUS / MAX_SLEEP_AVG));
- else if (sleep_time > JIFFIES_TO_NS(task_timeslice(p)))
- sleep_time = JIFFIES_TO_NS(task_timeslice(p));
+
+ if (p->interactive_credit < 0 &&
+ sleep_time > JIFFIES_TO_NS(task_timeslice(p)))
+ sleep_time =
+ JIFFIES_TO_NS(task_timeslice(p));

/*
* This code gives a bonus to interactive tasks.
@@ -435,20 +440,26 @@ static inline void activate_task(task_t
recalc_task_prio(p, now);

/*
- * Tasks which were woken up by interrupts (ie. hw events)
- * are most likely of interactive nature. So we give them
- * the credit of extending their sleep time to the period
- * of time they spend on the runqueue, waiting for execution
- * on a CPU, first time around:
+ * This checks to make sure it's not an uninterruptible task
+ * that is now waking up.
*/
- if (in_interrupt())
- p->activated = 2;
- else
- /*
- * Normal first-time wakeups get a credit too for on-runqueue time,
- * but it will be weighted down:
- */
- p->activated = 1;
+ if (!p->activated){
+ /*
+ * Tasks which were woken up by interrupts (ie. hw events)
+ * are most likely of interactive nature. So we give them
+ * the credit of extending their sleep time to the period
+ * of time they spend on the runqueue, waiting for execution
+ * on a CPU, first time around:
+ */
+ if (in_interrupt())
+ p->activated = 2;
+ else
+ /*
+ * Normal first-time wakeups get a credit too for on-runqueue
+ * time, but it will be weighted down:
+ */
+ p->activated = 1;
+ }

p->timestamp = now;

@@ -572,8 +583,15 @@ repeat_lock_task:
task_rq_unlock(rq, &flags);
goto repeat_lock_task;
}
- if (old_state == TASK_UNINTERRUPTIBLE)
+ if (old_state == TASK_UNINTERRUPTIBLE){
+ /*
+ * Tasks on involuntary sleep don't earn
+ * sleep_avg
+ */
rq->nr_uninterruptible--;
+ p->timestamp = sched_clock();
+ p->activated = -1;
+ }
if (sync)
__activate_task(p, rq);
else {
@@ -1326,7 +1344,6 @@ void scheduler_tick(int user_ticks, int
p->prio = effective_prio(p);
p->time_slice = task_timeslice(p);
p->first_time_slice = 0;
- p->interactive_credit--;

if (!rq->expired_timestamp)
rq->expired_timestamp = jiffies;
@@ -1459,7 +1476,7 @@ pick_next_task:
queue = array->queue + idx;
next = list_entry(queue->next, task_t, run_list);

- if (next->activated && next->interactive_credit > 0) {
+ if (next->activated > 0) {
unsigned long long delta = now - next->timestamp;

if (next->activated == 1)


2003-08-04 17:13:16

by Antonio Vargas

Subject: Re: [PATCH] O13int for interactivity

On Tue, Aug 05, 2003 at 02:07:18AM +1000, Con Kolivas wrote:
> Changes:
>
> Reverted the child penalty to 95 as new changes help this from hurting
>
> Changed the logic behind loss of interactive credits to those that burn off
> all their sleep_avg
>
> Now all tasks get proportionately more sleep as their relative bonus drops
> off. This has the effect of detecting a change from a cpu burner to an
> interactive task more rapidly as in O10.
>
> The _major_ change in this patch is that tasks on uninterruptible sleep do not
> earn any sleep avg during that sleep; it is not voluntary sleep so they should
> not get it. This has the effect of stopping cpu hogs from gaining dynamic
> priority during periods of heavy I/O. Very good for the jerks you may
> see in X or audio skips when you start a whole swag of disk intensive cpu hogs
> (eg make -j large number). I've simply dropped all their sleep_avg, but
> weighting it may be more appropriate. This has the side effect that pure
> disk tasks (eg cp) have relatively low priority which is why weighting may
> be better. We shall see.
>
> Please test this one extensively. It should _not_ affect I/O throughput per
> se, but I'd like to see some of the I/O benchmarks on this. I do not want to
> have detrimental effects elsewhere.
>
> patch-O12.3-O13int applies on top of 2.6.0-test2-mm4 that has been
> patched with O12.3int and is available on my site, and a full patch
> against 2.6.0-test2 called patch-test2-O13int is here:
>
> http://kernel.kolivas.org/2.5
>
> patch-O12.3-O13int:
>
> --- linux-2.6.0-test2-mm4-O12.3/kernel/sched.c 2003-08-05 01:30:27.000000000 +1000
> +++ linux-2.6.0-test2-mm4-O13/kernel/sched.c 2003-08-05 01:36:20.000000000 +1000
> @@ -78,7 +78,7 @@
> #define MAX_TIMESLICE (200 * HZ / 1000)
> #define TIMESLICE_GRANULARITY (HZ/40 ?: 1)
> #define ON_RUNQUEUE_WEIGHT 30
> -#define CHILD_PENALTY 90
> +#define CHILD_PENALTY 95
> #define PARENT_PENALTY 100
> #define EXIT_WEIGHT 3
> #define PRIO_BONUS_RATIO 25
> @@ -365,6 +365,9 @@ static void recalc_task_prio(task_t *p,
> unsigned long long __sleep_time = now - p->timestamp;
> unsigned long sleep_time;
>
> + if (!p->sleep_avg)
> + p->interactive_credit--;
> +
> if (__sleep_time > NS_MAX_SLEEP_AVG)
> sleep_time = NS_MAX_SLEEP_AVG;
> else
> @@ -384,17 +387,19 @@ static void recalc_task_prio(task_t *p,
> JIFFIES_TO_NS(JUST_INTERACTIVE_SLEEP(p));
> else {
> /*
> - * Tasks with interactive credits get boosted more
> - * rapidly if their bonus has dropped off. Other
> - * tasks are limited to one timeslice worth of
> - * sleep avg.
> + * The lower the sleep avg a task has the more
> + * rapidly it will rise with sleep time. Tasks
> + * without interactive_credit are limited to
> + * one timeslice worth of sleep avg bonus.
> */
> - if (p->interactive_credit > 0)
> - sleep_time *= (MAX_BONUS + 1 -
> + sleep_time *= (MAX_BONUS + 1 -
> (NS_TO_JIFFIES(p->sleep_avg) *
> MAX_BONUS / MAX_SLEEP_AVG));
> - else if (sleep_time > JIFFIES_TO_NS(task_timeslice(p)))
> - sleep_time = JIFFIES_TO_NS(task_timeslice(p));
> +
> + if (p->interactive_credit < 0 &&
> + sleep_time > JIFFIES_TO_NS(task_timeslice(p)))
> + sleep_time =
> + JIFFIES_TO_NS(task_timeslice(p));
>
> /*
> * This code gives a bonus to interactive tasks.
> @@ -435,20 +440,26 @@ static inline void activate_task(task_t
> recalc_task_prio(p, now);
>
> /*
> - * Tasks which were woken up by interrupts (ie. hw events)
> - * are most likely of interactive nature. So we give them
> - * the credit of extending their sleep time to the period
> - * of time they spend on the runqueue, waiting for execution
> - * on a CPU, first time around:
> + * This checks to make sure it's not an uninterruptible task
> + * that is now waking up.
> */
> - if (in_interrupt())
> - p->activated = 2;
> - else
> - /*
> - * Normal first-time wakeups get a credit too for on-runqueue time,
> - * but it will be weighted down:
> - */
> - p->activated = 1;
> + if (!p->activated){

[1]

> + /*
> + * Tasks which were woken up by interrupts (ie. hw events)
> + * are most likely of interactive nature. So we give them
> + * the credit of extending their sleep time to the period
> + * of time they spend on the runqueue, waiting for execution
> + * on a CPU, first time around:
> + */
> + if (in_interrupt())
> + p->activated = 2;
> + else
> + /*
> + * Normal first-time wakeups get a credit too for on-runqueue
> + * time, but it will be weighted down:
> + */
> + p->activated = 1;

[3]

> + }
>
> p->timestamp = now;
>
> @@ -572,8 +583,15 @@ repeat_lock_task:
> task_rq_unlock(rq, &flags);
> goto repeat_lock_task;
> }
> - if (old_state == TASK_UNINTERRUPTIBLE)
> + if (old_state == TASK_UNINTERRUPTIBLE){
> + /*
> + * Tasks on involuntary sleep don't earn
> + * sleep_avg
> + */
> rq->nr_uninterruptible--;
> + p->timestamp = sched_clock();
> + p->activated = -1;

[2]

> + }
> if (sync)
> __activate_task(p, rq);
> else {
> @@ -1326,7 +1344,6 @@ void scheduler_tick(int user_ticks, int
> p->prio = effective_prio(p);
> p->time_slice = task_timeslice(p);
> p->first_time_slice = 0;
> - p->interactive_credit--;
>
> if (!rq->expired_timestamp)
> rq->expired_timestamp = jiffies;
> @@ -1459,7 +1476,7 @@ pick_next_task:
> queue = array->queue + idx;
> next = list_entry(queue->next, task_t, run_list);
>
> - if (next->activated && next->interactive_credit > 0) {
> + if (next->activated > 0) {
> unsigned long long delta = now - next->timestamp;
>
> if (next->activated == 1)
>

Con, I will probably be wrong, but in [1] you are testing
"activated != 0" and [2] is setting "activated = -1", which
_is_ != 0 and thus would enter the "if->else" branch and
do "activated = 1" in [3].

Perhaps you meant to set "activated = 0" in [2]???

Note I've not read the rest of the scheduler code,
so perhaps the "activated = 0" is in another place...
just in case, I prefer asking.

Greets for your hard work, Antonio.

--

1. Given a program, it always has at least one bug.
2. Given several lines of code, they can always be shortened to fewer lines.
3. By induction, all programs can be
reduced to a single line that does not work.

2003-08-04 18:26:05

by Felipe Alfaro Solana

Subject: Re: [PATCH] O13int for interactivity

On Mon, 2003-08-04 at 18:07, Con Kolivas wrote:
> Changes:
>
> Reverted the child penalty to 95 as new changes help this from hurting
>
> Changed the logic behind loss of interactive credits to those that burn off
> all their sleep_avg
>
> Now all tasks get proportionately more sleep as their relative bonus drops
> off. This has the effect of detecting a change from a cpu burner to an
> interactive task more rapidly as in O10.

Oh, yeah! This is damn good! I've had only a little bit time to try it,
but I think it rocks. Good work :-)

2003-08-04 19:10:48

by Voluspa

Subject: Re: [PATCH] O13int for interactivity


On 2003-08-04 18:24:15 Felipe Alfaro Solana wrote:

> Oh, yeah! This is damn good! I've had only a little bit time to try
> it, but I think it rocks. Good work :-)

Can only agree. Wine/wineserver now has the same PRI as in pure A3 on my
game-test, which can be felt and seen. Playability = 8.75 due to there
being slightly more bumps and audio glitches than in A3.

But that (A3) scheduler had other issues. Came home from work and began
typing in an xterm that had been sitting on the screen for more than 8
hours. It woke up... slowly. Took about 3 or 4 seconds before it
accepted my keystrokes without delays. It was not swapped out, but more
like it slowly accelerated from standstill.

Will report faults if I find them. Otherwise will stay silent. No news
is good news, you know.

Regards
Mats Johannesson

2003-08-04 20:08:06

by Mike Galbraith

Subject: Re: [PATCH] O13int for interactivity

At 02:07 AM 8/5/2003 +1000, Con Kolivas wrote:

>The _major_ change in this patch is that tasks on uninterruptible sleep do
>not
>earn any sleep avg during that sleep; it is not voluntary sleep so they
>should
>not get it. This has the effect of stopping cpu hogs from gaining dynamic
>priority during periods of heavy I/O. Very good for the jerks you may
>see in X or audio skips when you start a whole swag of disk intensive cpu
>hogs
>(eg make -j large number). I've simply dropped all their sleep_avg, but
>weighting it may be more appropriate. This has the side effect that pure
>disk tasks (eg cp) have relatively low priority which is why weighting may
>be better. We shall see.

IMHO, absolute cut off is a very bad idea (btdt, and it _sucked rocks_).

The last thing in the world you want to do is to remove differentiation
between tasks... try to classify them and make them all the same within
their class. For grins, take away all remaining differentiation, and run a
hefty parallel make.

-Mike

2003-08-04 21:27:29

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 05:15, Antonio Vargas wrote:
> On Tue, Aug 05, 2003 at 02:07:18AM +1000, Con Kolivas wrote:
> > + if (!p->activated){

This is the relevant branch.

> Con, I will probably be wrong, but in [1] you are testing
> "activated != 0" and [2] is setting "activated = -1", which
> _is_ != 0 and thus would enter the "if->else" branch and
> do "activated = 1" in [3].
>
> Perhaps you meant to set "activated = 0" in [2]???

It looks for activated == 0 to decide if it should enter the branch and set it
to either 1 or 2.
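
The gate can be checked in isolation. The following is a userspace sketch
only: activated_after_gate() is a hypothetical helper, and it assumes that
activated only ever holds -1, 0, 1 or 2, as in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Value of "activated" after the gate in activate_task().
 * The in_interrupt() result is passed in as a plain flag here.
 */
static int activated_after_gate(int activated, bool woken_by_interrupt)
{
    assert(activated >= -1 && activated <= 2);
    if (!activated)                        /* true only when activated == 0 */
        return woken_by_interrupt ? 2 : 1;
    return activated;                      /* -1 is nonzero, so it is left alone */
}
```

Because !(-1) evaluates to 0 in C, a task marked with -1 skips the branch and
keeps that value.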

> Note I've not read the rest of the scheduler code,
> so perhaps the "activated = 0" is in another place...
> just in case, I prefer asking.

Never hurts.

Con

2003-08-04 22:06:07

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 06:11, Mike Galbraith wrote:

> IMHO, absolute cut off is a very bad idea (btdt, and it _sucked rocks_).
>
> The last thing in the world you want to do is to remove differentiation
> between tasks... try to classify them and make them all the same within
> their class. For grins, take away all remaining differentiation, and run a
> hefty parallel make.

I didn't fully understand you but I get your drift. I have a better solution
in the works anyway but I wanted this tested.

Con

2003-08-05 02:11:26

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Con Kolivas wrote:

>Changes:
>
>Reverted the child penalty to 95 as new changes help this from hurting
>
>Changed the logic behind loss of interactive credits to those that burn off
>all their sleep_avg
>
>Now all tasks get proportionately more sleep as their relative bonus drops
>off. This has the effect of detecting a change from a cpu burner to an
>interactive task more rapidly as in O10.
>
>The _major_ change in this patch is that tasks on uninterruptible sleep do not
>earn any sleep avg during that sleep; it is not voluntary sleep so they should
>not get it. This has the effect of stopping cpu hogs from gaining dynamic
>priority during periods of heavy I/O. Very good for the jerks you may
>see in X or audio skips when you start a whole swag of disk intensive cpu hogs
>(eg make -j large number). I've simply dropped all their sleep_avg, but
>weighting it may be more appropriate. This has the side effect that pure
>disk tasks (eg cp) have relatively low priority which is why weighting may
>be better. We shall see.
>

I don't think this is a good idea. Uninterruptible does not mean it's
not a voluntary sleep. It's more to do with how a syscall is implemented.
I don't think it should be treated any differently to any other type of
sleep.

Any task which calls schedule in kernel context is sleeping voluntarily
- if implicitly due to having called a blocking syscall.

>
>Please test this one extensively. It should _not_ affect I/O throughput per
>se, but I'd like to see some of the I/O benchmarks on this. I do not want to
>have detrimental effects elsewhere.
>

Well the reason it can affect IO throughput is for example when there is
an IO bound process and a CPU hog on the same processor: the longer the
IO process has to wait (after being woken) before being run, the more
chance the disk will fall idle for a longer period. And of course the
CPU uncontended case is somewhat uninteresting when it comes to a CPU
scheduler.



2003-08-05 02:15:00

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 12:11, Nick Piggin wrote:
> Con Kolivas wrote:
> >Changes:
> >
> >Reverted the child penalty to 95 as new changes help this from hurting
> >
> >Changed the logic behind loss of interactive credits to those that burn
> > off all their sleep_avg
> >
> >Now all tasks get proportionately more sleep as their relative bonus drops
> >off. This has the effect of detecting a change from a cpu burner to an
> >interactive task more rapidly as in O10.
> >
> >The _major_ change in this patch is that tasks on uninterruptible sleep do
> > not earn any sleep avg during that sleep; it is not voluntary sleep so
> > they should not get it. This has the effect of stopping cpu hogs from
> > gaining dynamic priority during periods of heavy I/O. Very good for the
> > jerks you may see in X or audio skips when you start a whole swag of disk
> > intensive cpu hogs (eg make -j large number). I've simply dropped all
> > their sleep_avg, but weighting it may be more appropriate. This has the
> > side effect that pure disk tasks (eg cp) have relatively low priority
> > which is why weighting may be better. We shall see.
>
> I don't think this is a good idea. Uninterruptible does not mean its
> not a voluntary sleep. Its more to do with how a syscall is implemented.
> I don't think it should be treated any differently to any other type of
> sleep.
>
> Any task which calls schedule in kernel context is sleeping volintarily
> - if implicity due to having called a blocking syscall.
>
> >Please test this one extensively. It should _not_ affect I/O throughput
> > per se, but I'd like to see some of the I/O benchmarks on this. I do not
> > want to have detrimental effects elsewhere.
>
> Well the reason it can affect IO thoughput is for example when there is
> an IO bound process and a CPU hog on the same processor: the longer the
> IO process has to wait (after being woken) before being run, the more
> chance the disk will fall idle for a longer period. And of course the
> CPU uncontended case is somewhat uninteresting when it comes to a CPU
> scheduler.

I've already posted a better solution in O13.1

Con

2003-08-05 02:21:28

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Con Kolivas wrote:

>On Tue, 5 Aug 2003 12:11, Nick Piggin wrote:
>
>>Con Kolivas wrote:
>>
>>>Changes:
>>>
>>>Reverted the child penalty to 95 as new changes help this from hurting
>>>
>>>Changed the logic behind loss of interactive credits to those that burn
>>>off all their sleep_avg
>>>
>>>Now all tasks get proportionately more sleep as their relative bonus drops
>>>off. This has the effect of detecting a change from a cpu burner to an
>>>interactive task more rapidly as in O10.
>>>
>>>The _major_ change in this patch is that tasks on uninterruptible sleep do
>>>not earn any sleep avg during that sleep; it is not voluntary sleep so
>>>they should not get it. This has the effect of stopping cpu hogs from
>>>gaining dynamic priority during periods of heavy I/O. Very good for the
>>>jerks you may see in X or audio skips when you start a whole swag of disk
>>>intensive cpu hogs (eg make -j large number). I've simply dropped all
>>>their sleep_avg, but weighting it may be more appropriate. This has the
>>>side effect that pure disk tasks (eg cp) have relatively low priority
>>>which is why weighting may be better. We shall see.
>>>
>>I don't think this is a good idea. Uninterruptible does not mean its
>>not a voluntary sleep. Its more to do with how a syscall is implemented.
>>I don't think it should be treated any differently to any other type of
>>sleep.
>>
>>Any task which calls schedule in kernel context is sleeping volintarily
>>- if implicity due to having called a blocking syscall.
>>
>>
>>>Please test this one extensively. It should _not_ affect I/O throughput
>>>per se, but I'd like to see some of the I/O benchmarks on this. I do not
>>>want to have detrimental effects elsewhere.
>>>
>>Well the reason it can affect IO thoughput is for example when there is
>>an IO bound process and a CPU hog on the same processor: the longer the
>>IO process has to wait (after being woken) before being run, the more
>>chance the disk will fall idle for a longer period. And of course the
>>CPU uncontended case is somewhat uninteresting when it comes to a CPU
>>scheduler.
>>
>
>I've already posted a better solution in O13.1
>
>

No, this still special-cases the uninterruptible sleep. Why is this
needed? What is being worked around? There is probably a way to
attack the cause of the problem.



2003-08-05 03:01:39

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 12:21, Nick Piggin wrote:
> >I've already posted a better solution in O13.1
>
> No, this still special-cases the uninterruptible sleep. Why is this
> needed? What is being worked around? There is probably a way to
> attack the cause of the problem.

Sure I'm open to any and all ideas. Cpu hogs occasionally do significant I/O.
Up until that time they have been only losing sleep_avg as they have spent no
time sleeping; and this is what gives them a lower dynamic priority. During
uninterruptible sleep all of a sudden they are seen as sleeping even though
they are cpu hogs waiting on I/O. Witness the old standard, a kernel compile.
The very first time you launch a make -j something, the higher the something,
the longer all the jobs wait on I/O, the better the dynamic priority they
get, which they shouldn't.
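
The mechanism in the O13 patch amounts to resetting the task's timestamp at
wakeup when the sleep was uninterruptible, so the sleep time measured
afterwards is zero. A minimal userspace sketch of that idea follows;
credited_sleep_ns() is a hypothetical helper, and the real code goes on to
clamp and scale the value in recalc_task_prio().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sleep time that would be fed into the sleep_avg calculation.
 * For an uninterruptible sleep the timestamp is moved to "now",
 * mirroring p->timestamp = sched_clock() in the patch.
 */
static unsigned long long credited_sleep_ns(unsigned long long timestamp,
                                            unsigned long long now,
                                            bool was_uninterruptible)
{
    assert(now >= timestamp);
    if (was_uninterruptible)
        timestamp = now;
    return now - timestamp;
}
```

A cpu hog that waited 500 ns worth of clock on disk is credited nothing,
whereas the same wait in interruptible sleep is credited in full.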

No, this is not just a "fix the scheduler so you don't feel -j kernel
compiles" as it happens with any cpu hog starving other tasks, and the longer
the cpu hogs wait on I/O the worse it is. This change causes a _massive_
improvement for that test case which usually brings the machine to a
standstill the size of which is dependent on the number of cpu hogs and the
size of their I/O wait. I don't think the latest incarnation should be a
problem. In my limited testing I've not found any difference in throughput
but I don't have a major testbed at my disposal, nor time to use one if it
was offered which is why I requested more testing.

Thoughts?

Con

2003-08-05 03:15:25

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 12:21, Nick Piggin wrote:
> No, this still special-cases the uninterruptible sleep. Why is this
> needed? What is being worked around? There is probably a way to
> attack the cause of the problem.

Footnote: I was thinking of using this to also _elevate_ the dynamic priority
of tasks waking from interruptible sleep as well which may help throughput.

Con

2003-08-05 03:18:23

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Con Kolivas wrote:

>On Tue, 5 Aug 2003 12:21, Nick Piggin wrote:
>
>>>I've already posted a better solution in O13.1
>>>
>>No, this still special-cases the uninterruptible sleep. Why is this
>>needed? What is being worked around? There is probably a way to
>>attack the cause of the problem.
>>
>
>Sure I'm open to any and all ideas. Cpu hogs occasionally do significant I/O.
>Up until that time they have been only losing sleep_avg as they have spent no
>time sleeping; and this is what gives them a lower dynamic priority. During
>uninterruptible sleep all of a sudden they are seen as sleeping even though
>they are cpu hogs waiting on I/O. Witness the old standard, a kernel compile.
>The very first time you launch a make -j something, the higher the something,
>the longer all the jobs wait on I/O, the better the dynamic priority they
>get, which they shouldn't.
>

Well I don't think the scheduler should really care about a process waiting
1 second vs a process waiting 10 seconds. The point of the dynamic priority
here is that 1 you want the process to wake up soon to respond to the IO,
and 2 you want to give it a bit of an advantage vs a non sleeping CPU hog,
right? I think a very rapidly decaying benefit vs sleep time is in order.
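
One possible shape for such a boost, purely as a hypothetical sketch and not
something taken from any posted patch, is a saturating curve: it rises
quickly for short sleeps and then flattens, so that a 1 second wait and a
10 second wait end up with the same boost. The function name, the units and
both constants are assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical tuning values, chosen only to make the shape visible. */
#define SKETCH_MAX_BOOST  10UL
#define SKETCH_HALF_MS    100UL   /* sleep that yields half the maximum */

/* Boost that approaches SKETCH_MAX_BOOST but never exceeds it. */
static unsigned long saturating_boost(unsigned long sleep_ms)
{
    assert(sleep_ms <= 1000000UL);   /* keep the multiplication in range */
    return SKETCH_MAX_BOOST * sleep_ms / (sleep_ms + SKETCH_HALF_MS);
}
```

With these values a 100 ms sleep earns a boost of 5, and both a 1000 ms and a
10000 ms sleep earn 9 after integer truncation.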

>
>No, this is not just a "fix the scheduler so you don't feel -j kernel
>compiles" as it happens with any cpu hog starving other tasks, and the longer
>the cpu hogs wait on I/O the worse it is. This change causes a _massive_
>improvement for that test case which usually brings the machine to a
>standstill the size of which is dependent on the number of cpu hogs and the
>size of their I/O wait. I don't think the latest incarnation should be a
>problem. In my limited testing I've not found any difference in throughput
>but I don't have a major testbed at my disposal, nor time to use one if it
>was offered which is why I requested more testing.
>
>Thoughts?
>

Well if it really is the right thing to do, it should be done with _any_
type of sleep, not just uninterruptible. But you may have just answered
your question there: "the longer cpu hogs wait on I/O the worse it is".
Change the dynamic priority boost so this is no longer the case.

I understand this is essentially what you have done, but you did it in a
way that does not allow a task to become "interactive". Try changing the
formula used to derive the priority boost?


2003-08-05 03:31:42

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Con Kolivas wrote:

>On Tue, 5 Aug 2003 12:21, Nick Piggin wrote:
>
>>No, this still special-cases the uninterruptible sleep. Why is this
>>needed? What is being worked around? There is probably a way to
>>attack the cause of the problem.
>>
>
>Footnote: I was thinking of using this to also _elevate_ the dynamic priority
>of tasks waking from interruptible sleep as well which may help throughput.
>

Con, an uninterruptible sleep is one which is not woken by a signal,
an interruptible sleep is one which is. There is no other connotation.
What happens when read/write syscalls are changed to be interruptible?
I'm not saying this will happen... but come to think of it, NFS probably
has interruptible read/write.

In short: make the same policy for an interruptible and an uninterruptible
sleep.




2003-08-05 05:04:14

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

Quoting Nick Piggin:

>
>
> Con Kolivas wrote:
>
> >On Tue, 5 Aug 2003 12:21, Nick Piggin wrote:
> >
> >>No, this still special-cases the uninterruptible sleep. Why is this
> >>needed? What is being worked around? There is probably a way to
> >>attack the cause of the problem.
> >>
> >
> >Footnote: I was thinking of using this to also _elevate_ the dynamic
> priority
> >of tasks waking from interruptible sleep as well which may help throughput.
> >
>
> Con, an uninterruptible sleep is one which is not be woken by a signal,
> an interruptible sleep is one which is. There is no other connotation.
> What happens when read/write syscalls are changed to be interruptible?
> I'm not saying this will happen... but come to think of it, NFS probably
> has interruptible read/write.
>
> In short: make the same policy for an interruptible and an uninterruptible
> sleep.

That's the policy that has always existed...

Interesting that I have only seen the desired effect and haven't noticed any
side effect from this change so far. I'll keep experimenting as much as
possible (as if I wasn't going to) and see what the testers find as well.

Con

2003-08-05 05:12:59

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity

Con Kolivas wrote:

>Quoting Nick Piggin:
>
>
>>
>>Con Kolivas wrote:
>>
>>
>>>On Tue, 5 Aug 2003 12:21, Nick Piggin wrote:
>>>
>>>
>>>>No, this still special-cases the uninterruptible sleep. Why is this
>>>>needed? What is being worked around? There is probably a way to
>>>>attack the cause of the problem.
>>>>
>>>>
>>>Footnote: I was thinking of using this to also _elevate_ the dynamic
>>>
>>priority
>>
>>>of tasks waking from interruptible sleep as well which may help throughput.
>>>
>>>
>>Con, an uninterruptible sleep is one which is not be woken by a signal,
>>an interruptible sleep is one which is. There is no other connotation.
>>What happens when read/write syscalls are changed to be interruptible?
>>I'm not saying this will happen... but come to think of it, NFS probably
>>has interruptible read/write.
>>
>>In short: make the same policy for an interruptible and an uninterruptible
>>sleep.
>>
>
>That's the policy that has always existed...
>
>Interesting that I have only seen the desired effect and haven't noticed any
>side effect from this change so far. I'll keep experimenting as much as
>possible (as if I wasn't going to) and see what the testers find as well.
>

Oh, I'm not saying that your change is outright wrong; on the contrary, I'd
say you have a better feel for what is needed than I do. But if you are
finding that the uninterruptible sleep case needs some tweaking, then the
same tweak should be applied to all sleep cases. If there really is a
difference, then it's just a fluke that the sleep paths in question use the
type of sleep you are testing for, and nothing more profound than that.


2003-08-05 05:16:21

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

Quoting Nick Piggin:

> Con Kolivas wrote:
>
> >Quoting Nick Piggin:
> >
> >
> >>
> >>Con Kolivas wrote:
> >>
> >>
> >>>On Tue, 5 Aug 2003 12:21, Nick Piggin wrote:
> >>>
> >>>
> >>>>No, this still special-cases the uninterruptible sleep. Why is this
> >>>>needed? What is being worked around? There is probably a way to
> >>>>attack the cause of the problem.
> >>>>
> >>>>
> >>>Footnote: I was thinking of using this to also _elevate_ the dynamic
> >>>
> >>priority
> >>
> >>>of tasks waking from interruptible sleep as well which may help
> throughput.
> >>>
> >>>
> >>Con, an uninterruptible sleep is one which is not be woken by a signal,
> >>an interruptible sleep is one which is. There is no other connotation.
> >>What happens when read/write syscalls are changed to be interruptible?
> >>I'm not saying this will happen... but come to think of it, NFS probably
> >>has interruptible read/write.
> >>
> >>In short: make the same policy for an interruptible and an uninterruptible
> >>sleep.
> >>
> >
> >That's the policy that has always existed...
> >
> >Interesting that I have only seen the desired effect and haven't noticed any
>
> >side effect from this change so far. I'll keep experimenting as much as
> >possible (as if I wasn't going to) and see what the testers find as well.
> >
>
> Oh, I'm not saying that your change is outright wrong, on the contrary I'd
> say you have a better feel for what is needed than I do, but if you are
> finding that the uninterruptible sleep case needs some tweaking then the
> same tweak should be applied to all sleep cases. If there really is a
> difference, then it's just a fluke that the sleep paths in question use the
> type of sleep you are testing for, and nothing more profound than that.

Ah I see. It was from my observations of the behaviour of tasks in D that I
found it was the period spent in D that was leading to unfairness. The same
tweak can't be applied to the rest of the sleeps because that inactivates
everything. So it is a fluke that the thing I'm trying to penalise is what
tasks in uninterruptible sleep do, but it is by backward observation of D
tasks, not random chance.

Con

2003-08-05 06:02:15

by Andrew Morton

Subject: Re: [PATCH] O13int for interactivity

Con Kolivas <[email protected]> wrote:
>
> > In short: make the same policy for an interruptible and an uninterruptible
> > sleep.
>
> That's the policy that has always existed...
>
> Interesting that I have only seen the desired effect and haven't noticed any
> side effect from this change so far. I'll keep experimenting as much as
> possible (as if I wasn't going to) and see what the testers find as well.

We do prefer that TASK_UNINTERRUPTIBLE processes are woken promptly so they
can submit more IO and go back to sleep. Remember that we are artificially
leaving the disk head idle in the expectation that the task will submit
more I/O. It's pretty sad if the CPU scheduler leaves the anticipated task
in the doldrums for five milliseconds.

Very early on in AS development I was playing with adding "extra boost" to
the anticipated-upon task, but it did appear that the stock scheduler was
sufficiently doing the right thing anyway.


2003-08-05 06:11:09

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity

Con Kolivas wrote:

>Quoting Nick Piggin <[email protected]>:
>

snip

>
>>Oh, I'm not saying that your change is outright wrong, on the contrary I'd
>>say you have a better feel for what is needed than I do, but if you are
>>finding that the uninterruptible sleep case needs some tweaking then the
>>same tweak should be applied to all sleep cases. If there really is a
>>difference, then it's just a fluke that the sleep paths in question use the
>>type of sleep you are testing for, and nothing more profound than that.
>>
>
>Ah I see. It was from my observations of the behaviour of tasks in D that I
>found it was the period spent in D that was leading to unfairness. The same
>tweak can't be applied to the rest of the sleeps because that inactivates
>everything. So it is a fluke that the thing I'm trying to penalise is what
>tasks in uninterruptible sleep do, but it is by backward observation of D
>tasks, not random chance.
>

Yes yes, but we come to the same conclusion no matter why you have decided
to make the change ;) namely that you're only papering over a flaw in the
scheduler!

What happens in the same sort of workload that is using interruptible
sleeps? Say the same make -j, NFS mounted interruptible (I think?).

I didn't really understand your answer a few emails ago... please just
reiterate: if the problem is that processes sleeping too long on IO get
too high a priority, then give all processes the same boost after they
have slept for half a second?

Also, why is this a problem exactly? Is there a difference between a
process that would be a CPU hog but for its limited disk bandwidth, and
a process that isn't a CPU hog? Disk IO aside, they are exactly the same
thing to the CPU scheduler, aren't they?

_wants_ to be a CPU hog, but can't due to disk

2003-08-05 07:06:06

by Mike Galbraith

Subject: Re: [PATCH] O13int for interactivity

At 08:11 AM 8/5/2003 +1000, Con Kolivas wrote:
>On Tue, 5 Aug 2003 06:11, Mike Galbraith wrote:
>
> > IMHO, absolute cut off is a very bad idea (btdt, and it _sucked rocks_).
> >
> > The last thing in the world you want to do is to remove differentiation
> > between tasks... try to classify them and make them all the same within
> > their class. For grins, take away all remaining differentiation, and run a
> > hefty parallel make.
>
>I didn't fully understand you but I get your drift. I have a better solution
>in the works anyway but I wanted this tested.

WRT disk wait: The pure disk load turns into a problem here only while
it's writing heavily, which means to me that either write throttling needs
a bit of work, or async write boost needs a bit of throttling (same
thing). To me, async operations using sync wakeups makes more sense than
trying to discriminate based upon task state. Writes are generally async,
and reads are generally sync. If a task is in D state waiting for a write
request to become available, it doesn't seem to me to be a good idea to
boost his priority, that allows the guy who is overloading your I/O system
and beating up your pagecache to preempt other pagecache users as soon as a
request becomes available. OTOH, a reader is almost always (except for
readahead?) sync, and _needs_ to be able to preempt to use the data it
waited on. Take for example swapin; remove the sleep credit for swapin,
and watch the thrash-fest begin... the light to moderate swap load that
your box formerly handled easily suddenly becomes a horrible mess.
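
A minimal sketch of the alternative suggested above: pick the wakeup type
from the kind of I/O rather than from the sleeper's task state. It assumes a
simplified model in which a sync wakeup never preempts the running task and a
normal wakeup preempts only when the woken task has a better (numerically
lower) priority. The names io_kind, wake_is_sync and wakeup_preempts are
hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of what the sleeper was waiting for. */
enum io_kind { IO_READ, IO_WRITE };

/*
 * Assumed policy: writes are generally async, so their completions use a
 * sync (non-preempting) wakeup; reads are generally sync, so the reader
 * gets a normal wakeup and may preempt.
 */
static bool wake_is_sync(enum io_kind kind)
{
    return kind == IO_WRITE;
}

/*
 * Decide whether the woken task preempts the running one.
 * A lower prio value means a higher priority.
 */
static bool wakeup_preempts(int woken_prio, int curr_prio, enum io_kind kind)
{
    if (wake_is_sync(kind))
        return false;               /* sync wakeup: just queue the task */
    return woken_prio < curr_prio;  /* normal wakeup: may preempt */
}
```

In this model a reader at priority 115 preempts a running task at 120, while
a writer at the same priority is only queued.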

-Mike

[aside: "D state is involuntary sleep" is kinda wrong-minded I think. The
only voluntary sleep is a yield. All others could just as well be
considered involuntary (also wrong). A good example of what I mean is the
problem with X and xmms's gl thread inverting their priorities once X
expires in the presence of a non-interactive load. X isn't running at prio
25 because it is inherently a cpu hog, and the gl thread isn't at prio 16
because it's voluntarily sleeping. The situation is exactly the
opposite. The gl thread is a huge cpu hog that is "involuntarily"
sleeping, waiting on X, who is "involuntarily" permanently running because
it can't get enough cpu to catch up with its workload so it can do what it
normally does, sleep. Also as an aside, if you want to have lots of fun
with D state, play with semaphores. Test results of using async/sync
wakeups there can be highly counter-intuitive.]

2003-08-05 07:21:09

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 16:03, Andrew Morton wrote:
> We do prefer that TASK_UNINTERRUPTIBLE processes are woken promptly so they
> can submit more IO and go back to sleep. Remember that we are artificially
> leaving the disk head idle in the expectation that the task will submit
> more I/O. It's pretty sad if the CPU scheduler leaves the anticipated task
> in the doldrums for five milliseconds.

Indeed that has been on my mind. This change doesn't affect how long it takes
to wake up. It simply prevents tasks from getting full interactive status
during the period they are doing unint. sleep.

> Very early on in AS development I was playing with adding "extra boost" to
> the anticipated-upon task, but it did appear that the stock scheduler was
> sufficiently doing the right thing anyway.

Con

2003-08-05 07:49:51

by Mike Galbraith

Subject: Re: [PATCH] O13int for interactivity

At 11:03 PM 8/4/2003 -0700, Andrew Morton wrote:
>Con Kolivas <[email protected]> wrote:
> >
> > > In short: make the same policy for an interruptible and an
> > > uninterruptible sleep.
> >
> > That's the policy that has always existed...
> >
> > Interesting that I have only seen the desired effect and haven't
> > noticed any side effect from this change so far. I'll keep experimenting
> > as much as possible (as if I wasn't going to) and see what the testers
> > find as well.
>
>We do prefer that TASK_UNINTERRUPTIBLE processes are woken promptly so they
>can submit more IO and go back to sleep. Remember that we are artificially
>leaving the disk head idle in the expectation that the task will submit
>more I/O. It's pretty sad if the CPU scheduler leaves the anticipated task
>in the doldrums for five milliseconds.

It's actually (potentially) _much_ more than that isn't it? Wakeups don't
consider the last time a task has run... the awakened task is always placed
at the back of the pack regardless of whether the tasks in front of it have
been receiving heavy doses of cpu and the awakened task has not.
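
A minimal sketch of why tail placement can cost much more than five
milliseconds. It models a single priority level as a FIFO list and assumes
every task ahead of the woken one uses its whole timeslice. The struct and
function names are hypothetical, and the real runqueue is more involved.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 16    /* illustrative capacity, not a kernel limit */

/* One priority level's FIFO list of runnable tasks, identified by pid. */
struct run_list {
    int pids[QUEUE_CAP];
    size_t len;
};

/* A wakeup appends at the tail. Returns 0 on success, -1 if the list is full. */
static int enqueue_tail(struct run_list *q, int pid)
{
    if (q->len >= QUEUE_CAP)
        return -1;
    q->pids[q->len++] = pid;
    return 0;
}

/* Number of tasks that run before pid, or -1 if pid is not queued. */
static int tasks_ahead(const struct run_list *q, int pid)
{
    for (size_t i = 0; i < q->len; i++)
        if (q->pids[i] == pid)
            return (int)i;
    return -1;
}

/* Worst-case delay if every task ahead runs a full timeslice; -1 if not queued. */
static long worst_case_wait_ms(const struct run_list *q, int pid, long timeslice_ms)
{
    int ahead = tasks_ahead(q, pid);
    return ahead < 0 ? -1 : ahead * timeslice_ms;
}
```

With three tasks already queued at the same level and the 200 ms
MAX_TIMESLICE from the patch context, the task woken last could wait up to
600 ms in this model, regardless of how recently the others have run.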

-Mike

2003-08-05 08:15:55

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 18:12, Oliver Neukum wrote:
> Am Dienstag, 5. August 2003 09:26 schrieb Con Kolivas:
> > On Tue, 5 Aug 2003 16:03, Andrew Morton wrote:
> > > We do prefer that TASK_UNINTERRUPTIBLE processes are woken promptly so
> > > they can submit more IO and go back to sleep. Remember that we are
> > > artificially leaving the disk head idle in the expectation that the
> > > task will submit more I/O. It's pretty sad if the CPU scheduler leaves
> > > the anticipated task in the doldrums for five milliseconds.
> >
> > Indeed that has been on my mind. This change doesn't affect how long it
> > takes to wake up. It simply prevents tasks from getting full interactive
> > status during the period they are doing unint. sleep.
>
> If you take that to its logical conclusion, such tasks should be woken
> immediately. Likewise, the io scheduler should be notified when you know
> that the task won't do io or will do other io, like waiting on character
> devices, go paging out or terminate.

Every experiment I've tried at putting tasks at the start of the queue instead
of the end has resulted in some form of starvation, so it should not be
possible for any user task, and I've abandoned it.

Con

2003-08-05 08:12:20

by Oliver Neukum

Subject: Re: [PATCH] O13int for interactivity

Am Dienstag, 5. August 2003 09:26 schrieb Con Kolivas:
> On Tue, 5 Aug 2003 16:03, Andrew Morton wrote:
> > We do prefer that TASK_UNINTERRUPTIBLE processes are woken promptly so they
> > can submit more IO and go back to sleep. Remember that we are artificially
> > leaving the disk head idle in the expectation that the task will submit
> > more I/O. It's pretty sad if the CPU scheduler leaves the anticipated task
> > in the doldrums for five milliseconds.
>
> Indeed that has been on my mind. This change doesn't affect how long it takes
> to wake up. It simply prevents tasks from getting full interactive status
> during the period they are doing unint. sleep.

If you take that to its logical conclusion, such tasks should be woken
immediately. Likewise, the io scheduler should be notified when you know
that the task won't do io or will do other io, like waiting on character
devices, go paging out or terminate.

Regards
Oliver

2003-08-05 08:24:03

by Mike Galbraith

Subject: Re: [PATCH] O13int for interactivity

At 06:20 PM 8/5/2003 +1000, Con Kolivas wrote:
>On Tue, 5 Aug 2003 18:12, Oliver Neukum wrote:
> > Am Dienstag, 5. August 2003 09:26 schrieb Con Kolivas:
> > > On Tue, 5 Aug 2003 16:03, Andrew Morton wrote:
> > > > We do prefer that TASK_UNINTERRUPTIBLE processes are woken promptly so
> > > > they can submit more IO and go back to sleep. Remember that we are
> > > > artificially leaving the disk head idle in the expectation that the
> > > > task will submit more I/O. It's pretty sad if the CPU scheduler leaves
> > > > the anticipated task in the doldrums for five milliseconds.
> > >
> > > Indeed that has been on my mind. This change doesn't affect how long it
> > > takes to wake up. It simply prevents tasks from getting full interactive
> > > status during the period they are doing unint. sleep.
> >
> > If you take that to its logical conclusion, such tasks should be woken
> > immediately. Likewise, the io scheduler should be notified when you know
> > that the task won't do io or will do other io, like waiting on character
> > devices, go paging out or terminate.
>
>Every experiment I've tried at putting tasks at the start of the queue
>instead of the end has resulted in some form of starvation so should not
>be possible for any user task and I've abandoned it.

(ditto:)

2003-08-05 08:39:01

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 18:27, Mike Galbraith wrote:
> At 06:20 PM 8/5/2003 +1000, Con Kolivas wrote:
> >Every experiment I've tried at putting tasks at the start of the queue
> >instead of the end has resulted in some form of starvation so should not
> >be possible for any user task and I've abandoned it.
>
> (ditto:)

Superuser access real time tasks may be worth reconsidering though...

Con

2003-08-05 09:04:52

by Mike Galbraith

Subject: Re: [PATCH] O13int for interactivity

At 06:43 PM 8/5/2003 +1000, Con Kolivas wrote:
>On Tue, 5 Aug 2003 18:27, Mike Galbraith wrote:
> > At 06:20 PM 8/5/2003 +1000, Con Kolivas wrote:
> > >Every experiment I've tried at putting tasks at the start of the queue
> > >instead of the end has resulted in some form of starvation so should not
> > >be possible for any user task and I've abandoned it.
> >
> > (ditto:)
>
>Superuser access real time tasks may be worth reconsidering though...

If they were guaranteed ultra-light, maybe, but userland is just not
trustworthy.

Better imho would be something like Davide's SOFT_RR with an additional
automatic priority adjust per cpu usage or something (cpu usage being a
[very] little bit of a latency hint, and a great 'hurt me' hint). Best
would be an API that allowed userland applications to describe their
latency requirements explicitly, with the scheduler watching users of this
API like a hawk, ever ready to sanction abusers. Anything I think about in
this area gets uncomfortably close to hard rt though, and all of the wisdom
I've heard on LKML over the years wrt separation of problem spaces comes
flooding back.

-Mike

2003-08-05 09:14:32

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 19:09, Mike Galbraith wrote:
> At 06:43 PM 8/5/2003 +1000, Con Kolivas wrote:
> >On Tue, 5 Aug 2003 18:27, Mike Galbraith wrote:
> > > At 06:20 PM 8/5/2003 +1000, Con Kolivas wrote:
> > > >Every experiment I've tried at putting tasks at the start of the queue
> > > >instead of the end has resulted in some form of starvation so should
> > > >not be possible for any user task and I've abandoned it.
> > >
> > > (ditto:)
> >
> >Superuser access real time tasks may be worth reconsidering though...
>
> If they were guaranteed ultra-light, maybe, but userland is just not
> trustworthy.

Agreed

> Better imho would be something like Davide's SOFT_RR with an additional
> automatic priority adjust per cpu usage or something (cpu usage being a
> [very] little bit of a latency hint, and a great 'hurt me' hint). Best
> would be an API that allowed userland applications to describe their
> latency requirements explicitly, with the scheduler watching users of this
> API like a hawk, ever ready to sanction abusers. Anything I think about in
> this area gets uncomfortably close to hard rt though, and all of the wisdom
> I've heard on LKML over the years wrt separation of problem spaces comes
> flooding back.

I'll pass. There's enough on my plate already. Soft_rr in some form is a
decent idea but best tackled separately.

Con

2003-08-05 10:05:49

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Oliver Neukum wrote:

>Am Dienstag, 5. August 2003 09:26 schrieb Con Kolivas:
>
>>On Tue, 5 Aug 2003 16:03, Andrew Morton wrote:
>>
>>>We do prefer that TASK_UNINTERRUPTIBLE processes are woken promptly so they
>>>can submit more IO and go back to sleep. Remember that we are artificially
>>>leaving the disk head idle in the expectation that the task will submit
>>>more I/O. It's pretty sad if the CPU scheduler leaves the anticipated task
>>>in the doldrums for five milliseconds.
>>>
>>Indeed that has been on my mind. This change doesn't affect how long it takes
>>to wake up. It simply prevents tasks from getting full interactive status
>>during the period they are doing unint. sleep.
>>
>
>If you take that to its logical conclusion, such tasks should be woken
>immediately. Likewise, the io scheduler should be notified when you know
>that the task won't do io or will do other io, like waiting on character
>devices, go paging out or terminate.
>

I don't think that is the logical conclusion because you are balancing
against other things.

As for the io scheduler, no, there is a lot that can be done (including
waiting on character devs) before it is no longer worth keeping the disk
waiting. AS really doesn't care in the slightest what a process does
between submitting IOs*, what is important is simply its IO pattern.

* except exit which is an easy case of course.


2003-08-05 10:16:55

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 15:28, Nick Piggin wrote:
> Con Kolivas wrote:
> >Quoting Nick Piggin <[email protected]>:
> Yes yes, but we come to the same conclusion no matter why you have decided
> to make the change ;) namely that you're only papering over a flaw in the
> scheduler!

This would take a redesign in the interactivity estimator. I worked on one for
a while but decided it best to stick to one infrastructure and tune it as
much as possible; especially in this stage of 2.6 blah blah...

> What happens in the same sort of workload that is using interruptible
> sleeps? Say the same make -j, NFS mounted interruptible (I think?).

Dunno. Can't say. I've only ever seen NFS D but I don't have enough test
material...

> I didn't really understand your answer a few emails ago... please just
> reiterate: if the problem is that processes sleeping too long on IO get
> too high a priority, then give all processes the same boost after they
> have slept for half a second?
>
> Also, why is this a problem exactly? Is there a difference between a
> process that would be a CPU hog but for its limited disk bandwidth, and
> a process that isn't a CPU hog? Disk IO aside, they are exactly the same
> thing to the CPU scheduler, aren't they?
>
> _wants_ to be a CPU hog, but can't due to disk

You're on the right track; I'll try and explain differently.

A truly interactive task has periods of sleeping irrespective of disk
activity. It is the time spent sleeping that the estimator uses to decide
"this task is interactive, improve its dynamic priority by 5". A true cpu
hog (eg cc1) never sleeps intentionally and the estimator sees this as "I'm a
hog; drop my priority by 5". Now if the cpu hog sleeps while waiting on disk
i/o the estimator suddenly decides to elevate its priority. If it gets to
maximum boost and then stops doing I/O and goes back to being a hog it now
starts starving other processes till its dynamic priority drops enough
again. As I said it's a design quirk (bug?) and _limiting_ how high the
priority goes if the sleep is due to I/O would be ideal but I don't have a
simple way to tell that apart from knowing that the sleep was
UNINTERRUPTIBLE. This is not as bad as it sounds as for the most part it
still is counted as sleep except that it can't ever get maximum priority
boost to be a sustained starver.
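
A minimal sketch of the estimator behaviour described above, assuming a
simplified model: sleep time accumulates as credit, run time burns it off,
and the credit maps linearly onto a bonus of -5 to +5 around the static
priority. The constants and names are illustrative only; the real
recalc_task_prio() is more involved, and here uninterruptible sleep is
simply not credited when the skip_unint flag is set.

```c
#include <assert.h>

/* Illustrative constants, not the kernel's actual values. */
#define MAX_SLEEP_AVG_MS 10000  /* cap on accumulated sleep credit */
#define MAX_BONUS        10     /* bonus spans -5..+5 around static_prio */

enum sleep_type { SLEEP_INTERRUPTIBLE, SLEEP_UNINTERRUPTIBLE };

struct task_model {
    int static_prio;    /* e.g. 120 for a nice 0 task */
    long sleep_avg_ms;  /* accumulated sleep credit */
};

static long clamp_avg(long v)
{
    if (v < 0)
        return 0;
    if (v > MAX_SLEEP_AVG_MS)
        return MAX_SLEEP_AVG_MS;
    return v;
}

/* Credit a sleep on wakeup; with skip_unint set, D-state sleep earns nothing. */
static void credit_sleep(struct task_model *t, long slept_ms,
                         enum sleep_type type, int skip_unint)
{
    if (skip_unint && type == SLEEP_UNINTERRUPTIBLE)
        return;
    t->sleep_avg_ms = clamp_avg(t->sleep_avg_ms + slept_ms);
}

/* Running burns off sleep credit. */
static void charge_run(struct task_model *t, long ran_ms)
{
    t->sleep_avg_ms = clamp_avg(t->sleep_avg_ms - ran_ms);
}

/* Lower value means higher priority. */
static int dynamic_prio(const struct task_model *t)
{
    int bonus = (int)(t->sleep_avg_ms * MAX_BONUS / MAX_SLEEP_AVG_MS)
                - MAX_BONUS / 2;
    return t->static_prio - bonus;
}
```

In this model a hog that waits on disk for long enough reaches the same
maximum bonus as a truly interactive task unless skip_unint is set, which is
the quirk being worked around.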

However, since you're a disk I/O kind of guy you may have a better solution to
this problem and give me some data I can feedback into the estimator ;-)

Con

2003-08-05 10:40:30

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 20:32, Nick Piggin wrote:
> What you are doing is restricting some range so it can adapt more quickly
> right? So you still have the problem in the cases where you are not
> restricting this range.

Avoiding it becoming interactive in the first place is the answer. Anything
more rapid and X dies dead as soon as you start moving a window for example,
and new apps are seen as cpu hogs during startup and will take _forever_ to
start under load. It's a tricky juggling act and I keep throwing more balls
at it.

Con

2003-08-05 10:33:16

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Con Kolivas wrote:

>On Tue, 5 Aug 2003 15:28, Nick Piggin wrote:
>
>>Con Kolivas wrote:
>>
>>>Quoting Nick Piggin <[email protected]>:
>>>
>>Yes yes, but we come to the same conclusion no matter why you have decided
>>to make the change ;) namely that you're only papering over a flaw in the
>>scheduler!
>>
>
>This would take a redesign in the interactivity estimator. I worked on one for
>a while but decided it best to stick to one infrastructure and tune it as
>much as possible; especially in this stage of 2.6 blah blah...
>
>
>>What happens in the same sort of workload that is using interruptible
>>sleeps? Say the same make -j, NFS mounted interruptible (I think?).
>>
>
>Dunno. Can't say. I've only ever seen NFS D but I don't have enough test
>material...
>
>
>>I didn't really understand your answer a few emails ago... please just
>>reiterate: if the problem is that processes sleeping too long on IO get
>>too high a priority, then give all processes the same boost after they
>>have slept for half a second?
>>
>>Also, why is this a problem exactly? Is there a difference between a
>>process that would be a CPU hog but for its limited disk bandwidth, and
>>a process that isn't a CPU hog? Disk IO aside, they are exactly the same
>>thing to the CPU scheduler, aren't they?
>>
>>_wants_ to be a CPU hog, but can't due to disk
>>
>
>You're on the right track; I'll try and explain differently.
>
>A truly interactive task has periods of sleeping irrespective of disk
>activity. It is the time spent sleeping that the estimator uses to decide
>"this task is interactive, improve its dynamic priority by 5". A true cpu
>hog (eg cc1) never sleeps intentionally and the estimator sees this as "I'm a
>hog; drop my priority by 5". Now if the cpu hog sleeps while waiting on disk
>i/o the estimator suddenly decides to elevate its priority. If it gets to
>maximum boost and then stops doing I/O and goes back to being a hog it now
>starts starving other processes till its dynamic priority drops enough
>again. As I said it's a design quirk (bug?) and _limiting_ how high the
>priority goes if the sleep is due to I/O would be ideal but I don't have a
>simple way to tell that apart from knowing that the sleep was
>UNINTERRUPTIBLE. This is not as bad as it sounds as for the most part it
>still is counted as sleep except that it can't ever get maximum priority
>boost to be a sustained starver.
>

But by employing the kernel's services in the shape of a blocking
syscall, all sleeps are intentional. I think what you see is interactive
apps sleep in select which is interruptible. Anyway, I'll grant you that
a true cpu hog never sleeps, but then you don't have to worry about what
happens if it were to submit IO ;)

If cc1 is doing a lot of waiting on IO, I fail to see how it should be
called a CPU hog. OK I'll stop being difficult! I understand the problem
is that its behaviour suddenly changes from IO bound to CPU hog, right?
Then it seems like the scheduler's problem is that it doesn't adapt
quickly enough to this change.

What you are doing is restricting some range so it can adapt more quickly
right? So you still have the problem in the cases where you are not
restricting this range.


2003-08-05 10:48:52

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Con Kolivas wrote:

>On Tue, 5 Aug 2003 20:32, Nick Piggin wrote:
>
>>What you are doing is restricting some range so it can adapt more quickly
>>right? So you still have the problem in the cases where you are not
>>restricting this range.
>>
>
>Avoiding it becoming interactive in the first place is the answer. Anything
>more rapid and X dies dead as soon as you start moving a window for example,
>and new apps are seen as cpu hogs during startup and will take _forever_ to
>start under load. It's a tricky juggling act and I keep throwing more balls
>at it.
>

Well, what if you give less boost for sleeping?


2003-08-05 10:54:51

by Arjan van de Ven

Subject: Re: [PATCH] O13int for interactivity

On Tue, 2003-08-05 at 12:45, Con Kolivas wrote:
> On Tue, 5 Aug 2003 20:32, Nick Piggin wrote:
> > What you are doing is restricting some range so it can adapt more quickly
> > right? So you still have the problem in the cases where you are not
> > restricting this range.
>
> Avoiding it becoming interactive in the first place is the answer. Anything
> more rapid and X dies dead as soon as you start moving a window for example,
> and new apps are seen as cpu hogs during startup and will take _forever_ to
> start under load. It's a tricky juggling act and I keep throwing more balls
> at it.

generally that's a sign that the approach might not be the best one.

Let's face it: we're trying to estimate behavior here. Result: There
ALWAYS will be mistakes in that estimator. The more complex the
estimator the fewer such cases you will have, but the more mis-estimated
such cases will be.
The only way to really deal with estimators is to *ALSO* make the price
you pay on mis-estimation acceptable. For the scheduler that most likely
means that you can't punish as hard as we do now, nor give bonuses as
much as we do now.



2003-08-05 10:51:34

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 20:48, Nick Piggin wrote:
> Con Kolivas wrote:
> >On Tue, 5 Aug 2003 20:32, Nick Piggin wrote:
> >>What you are doing is restricting some range so it can adapt more quickly
> >>right? So you still have the problem in the cases where you are not
> >>restricting this range.
> >
> >Avoiding it becoming interactive in the first place is the answer.
> > Anything more rapid and X dies dead as soon as you start moving a window
> > for example, and new apps are seen as cpu hogs during startup and will
> > take _forever_ to start under load. It's a tricky juggling act and I keep
> > throwing more balls at it.
>
> Well, what if you give less boost for sleeping?

Then it takes longer to become interactive. Take 2.6.0-test2 vanilla - audio
apps can take up to a minute to be seen as fully interactive; whether this is
a problem for your hardware or not is another matter but clearly they are
interactive using <1% cpu time on the whole.

Con

2003-08-05 11:05:52

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 20:54, Arjan van de Ven wrote:
> generally that's a sign that the approach might not be the best one.
>
> Let's face it: we're trying to estimate behavior here. Result: There
> ALWAYS will be mistakes in that estimator. The more complex the
> estimator the fewer such cases you will have, but the more mis-estimated
> such cases will be.
> The only way to really deal with estimators is to *ALSO* make the price
> you pay on mis-estimation acceptable. For the scheduler that most likely
> means that you can't punish as hard as we do now, nor give bonuses as
> much as we do now.

It is acceptable. This thread is getting carried away. Just because we
continued talking doesn't mean there is suddenly a big problem. There is no
sudden drop in performance or handling. It's a tiny tweak which helps and
there is no evidence of harm, only a theoretical concern on Nick's part which
ended up being a discussion about the merits of sleep_avg as a method of
determining interactivity. Yes there probably is a better way of doing it
(and I have embarked on one that I stopped doing), but a redesign from
scratch now is not what Ingo wants, and I see the logic in his reasoning.

Con

2003-08-05 11:07:09

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 21:03, Nick Piggin wrote:
> Con Kolivas wrote:
> >Then it takes longer to become interactive. Take 2.6.0-test2 vanilla -
> > audio apps can take up to a minute to be seen as fully interactive;
> > whether this is a problem for your hardware or not is another matter but
> > clearly they are interactive using <1% cpu time on the whole.
>
> I think this is a big problem, a minute is much too long. I guess it's
> taking this long to build up because X needs a great deal of inertia
> so that it can stay in a highly interactive state right?
>
> If so then it seems the interactivity estimator does not have enough
> information to work properly for X. In which case maybe X needs to be
> reniced, or backboosted, or have _something_ done to help out.

Well we're in agreement there. That's what all this work I've done is about.
You'll see I've not been just tweaking numbers.

Con

2003-08-05 11:03:59

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Con Kolivas wrote:

>On Tue, 5 Aug 2003 20:48, Nick Piggin wrote:
>
>>Con Kolivas wrote:
>>
>>>On Tue, 5 Aug 2003 20:32, Nick Piggin wrote:
>>>
>>>>What you are doing is restricting some range so it can adapt more quickly
>>>>right? So you still have the problem in the cases where you are not
>>>>restricting this range.
>>>>
>>>Avoiding it becoming interactive in the first place is the answer.
>>>Anything more rapid and X dies dead as soon as you start moving a window
>>>for example, and new apps are seen as cpu hogs during startup and will
>>>take _forever_ to start under load. It's a tricky juggling act and I keep
>>>throwing more balls at it.
>>>
>>Well, what if you give less boost for sleeping?
>>
>
>Then it takes longer to become interactive. Take 2.6.0-test2 vanilla - audio
>apps can take up to a minute to be seen as fully interactive; whether this is
>a problem for your hardware or not is another matter but clearly they are
>interactive using <1% cpu time on the whole.
>

I think this is a big problem, a minute is much too long. I guess it's
taking this long to build up because X needs a great deal of inertia
so that it can stay in a highly interactive state right?

If so then it seems the interactivity estimator does not have enough
information to work properly for X. In which case maybe X needs to be
reniced, or backboosted, or have _something_ done to help out.


2003-08-05 11:23:58

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Con Kolivas wrote:

>On Tue, 5 Aug 2003 21:03, Nick Piggin wrote:
>
>>Con Kolivas wrote:
>>
>>>Then it takes longer to become interactive. Take 2.6.0-test2 vanilla -
>>>audio apps can take up to a minute to be seen as fully interactive;
>>>whether this is a problem for your hardware or not is another matter but
>>>clearly they are interactive using <1% cpu time on the whole.
>>>
>>I think this is a big problem, a minute is much too long. I guess it's
>>taking this long to build up because X needs a great deal of inertia
>>so that it can stay in a highly interactive state right?
>>
>>If so then it seems the interactivity estimator does not have enough
>>information to work properly for X. In which case maybe X needs to be
>>reniced, or backboosted, or have _something_ done to help out.
>>
>
>Well we're in agreement there. That's what all this work I've done is about.
>You'll see I've not been just tweaking numbers.
>

I know you haven't been just tweaking numbers ;) But the patch that
provides different behaviour depending on whether a sleep is interruptible
or not really smelt of papering over symptoms. Now it might be that nothing
better can be done without more invasive changes, but I just thought I'd
voice my concerns.

Oh, and remember that your desktop load is devoid of make -j big compiles,
so that is not a requisite for good interactivity.


2003-08-05 11:29:47

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 21:23, Nick Piggin wrote:
> I know you haven't been just tweaking numbers ;) But in the case of the
> patch that provides different behaviour depending on whether a sleep is
> interruptible or not really smelt of papering over symptoms. Now it might
> be that nothing better can be done without more invasive changes, but I
> just thought I'd voice my concerns.

Indeed and the more discussion on the topic the better we can nut it out.
Especially on lkml where having the last word is important ;-D

> Oh, and remember that your desktop load is devoid of make -j big compiles,
> so that is not a requisite for good interactivity.

Thank goodness ;-). It's an easy way to reproduce a problem on a grander
scale.

Con

2003-08-06 10:33:04

by Voluspa

Subject: Re: [PATCH] O13int for interactivity


Mon, 4 Aug 2003 21:12:47 +0200 I wrote:

> Wine/wineserver now has the same PRI as in pure A3 on my game-test,

I have to make an addendum based on more thorough observations.
Determining which one of wine or wineserver to treat as interactive
seems to be a hard nut for O13int - maybe impossible and irrelevant as
well. Anyway, here's what is happening:

Switching from the game to a text console where "top" is running,
counting to 15 in my head (didn't have a watch on my arm), wine drops
its PRI from 25 to 16. Wineserver, which has had a PRI of 16, gains a
few points to 18, then shortly after gets elevated to 25 and stays
there. Returning to the game everything is clunky and sound choppy. It
takes a fair amount of work (panning, character movement, menu
selections etc) before wine gets its 25 PRI back. Just waiting doesn't
cut it.

A3 can also be fooled. Not by a mere switch to the text console, but by
deactivating an option which affects the whole graphic handling:

"Software standard BLT [on/off]. Enable this option if graphic anomalies
appear in the game"

After disabling it, but only the first time - on/off thereafter has no
trigger effect - A3 gives wineserver a PRI of 25. It does however
recuperate quickly, within something like 5 seconds. Just waiting is
enough. O13int is also affected by this trigger, that's how I first
experienced the PRI reversing.

Disclaimer: I'm not a gamer, and have no interest in the scheduler
being tuned for this particular scenario. It just happens to be that the
game-test is where I really can observe the differences in scheduler
behaviour.

Mvh
Mats Johannesson

2003-08-06 21:20:39

by Timothy Miller

Subject: Re: [PATCH] O13int for interactivity



Nick Piggin wrote:

> If cc1 is doing a lot of waiting on IO, I fail to see how it should be
> called a CPU hog. OK I'll stop being difficult! I understand the problem
> is that its behaviour suddenly changes from IO bound to CPU hog, right?
> Then it seems like the scheduler's problem is that it doesn't adapt
> quickly enough to this change.
>
> What you are doing is restricting some range so it can adapt more quickly
> right? So you still have the problem in the cases where you are not
> restricting this range.


For this, I reiterate my suggestion to intentionally over-shoot the
mark. If you do it right, a process will run an inappropriate length of
time only every other time slice until the oscillation dies down.


Let me give you an example. Let's say you have a process which is being
interactive, and then suddenly becomes a CPU hog.

In the case as it is (assumptions here), what happens is that the
priority is reduced by some amount until it reaches a level appropriate
for the new behavior.

I get the impression that lower numbers mean higher priority, so here goes:

- The process starts out with a priority of 10 (this may mean something
that I don't know about... just follow along).
- It becomes a CPU hog sufficient to make it NEED to be at a priority of 30.
- Over some number of time slices, the priority is changed something
like this: 10, 20, 25, 27, 28, 29, 30.


Here's my alternative suggestion -- if 10 is pure interactive and 30 is
CPU hog, and you see some change in behavior, before, you would go half
way. Now, instead, go one-and-a-half way.
- Over some number of time slices, the priority is changed like this:
10, 40, 25, 32, 27, 31, 28, 30



Let's say that you only get one time slice which is CPU hog, but others
are not, for the first case, you'd get something like this:
10, 20, 15, 12, 11, 10

For the second case, you'd get this:
10, 40, -5, 17, 7, 11, 10


Something like that. So instead of getting tricked and having to
return, it over shoots but makes up for it the next time the process is run.
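
The two adjustment rules described above can be sketched as one update step with a tunable gain. This is only a toy model: `step_toward` and `gain_pct` are hypothetical names, the integer rounding is an assumption, and nothing here is taken from the O13int patch or from the kernel's real priority calculation.

```c
/*
 * Toy model of the adjustment rule above; not kernel code.
 * Move `cur` toward `target` by gain_pct percent of the remaining
 * distance. 50 is the "go half way" rule; 150 deliberately overshoots.
 */
int step_toward(int cur, int target, int gain_pct)
{
	int diff = target - cur;
	int step = diff * gain_pct / 100;	/* truncates toward zero */

	/* Always move at least one level so the value can reach the target. */
	if (step == 0 && diff != 0)
		step = diff > 0 ? 1 : -1;

	return cur + step;
}
```

Starting at 10 with a target of 30, repeated calls with a gain of 50 give 10, 20, 25, 27, 28, 29, 30, the same as the first sequence in the message. With a gain of 150 the same start gives 10, 40, 25, 32, 29, 30 under this rounding: the value swings past the target and back, with the error roughly halving each step.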


This is a very incomplete thought and may be pure garbage, so please
forgive me if I'm being an idiot. :)

2003-08-06 21:28:50

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

> For this, I reiterate my suggestion to intentionally over-shoot the
> mark. If you do it right, a process will run an inappropriate length of
> time only every other time slice until the oscillation dies down.

Your thoughts are fine, and to some degree I do what you're prescribing, but I
take into account the behaviour of real processes in the real world and their
effect on scheduling fairness.

Con

2003-08-07 00:15:10

by Timothy Miller

Subject: Re: [PATCH] O13int for interactivity



Con Kolivas wrote:
>>For this, I reiterate my suggestion to intentionally over-shoot the
>>mark. If you do it right, a process will run an inappropriate length of
>>time only every other time slice until the oscillation dies down.
>
>
> Your thoughts are fine, and to some degree I do what you're prescribing, but I
> take into account the behaviour of real processes in the real world and their
> effect on scheduling fairness.

And I know you know a lot more about how real processes behave than I
do. I'm not saying (or thinking) anything negative about you. I'm just
trying to throw random thoughts into the mix just in case some small
part of what I say is useful inspiration for someone else such as yourself.

It is probably the case that the idea I suggest is BS and makes no real
difference or makes it worse anyhow. :)

2003-08-07 00:32:51

by Timothy Miller

Subject: Re: [PATCH] O13int for interactivity



Con Kolivas wrote:
> Quoting Timothy Miller:
>
>
>>
>>Con Kolivas wrote:
>>
>>>>For this, I reiterate my suggestion to intentionally over-shoot the
>>>>mark. If you do it right, a process will run an inappropriate length of
>>>>time only every other time slice until the oscillation dies down.
>>>
>>>
>>>Your thoughts are fine, and to some degree I do what you're prescribing,
>>>but I take into account the behaviour of real processes in the real world
>>>and their effect on scheduling fairness.
>>
>>And I know you know a lot more about how real processes behave than I
>>do. I'm not saying (or thinking) anything negative about you. I'm just
>>trying to throw random thoughts into the mix just in case some small
>>part of what I say is useful inspiration for someone else such as yourself.
>>
>>It is probably the case that the idea I suggest is BS and makes no real
>>difference or makes it worse anyhow. :)
>
>
> Nowhere do I recall saying your ideas were BS nor did I say you should stop
> throwing ideas at me. All thoughts are appreciated and considered. I'm pretty
> sure I said I do what you're suggesting anyway, bound by the limits of when
> those changes induce unfairness.

Oh, no, you have always been most gracious and kind! _I_ was saying
that my idea (or at least certain aspects of it) is probably BS. But
some aspects of it may be useful (upon reflection and with
modification), and it would seem that you have already thought of and
implemented those things, so I am very pleased. :)


2003-08-07 00:27:35

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

Quoting Timothy Miller:

>
>
> Con Kolivas wrote:
> >>For this, I reiterate my suggestion to intentionally over-shoot the
> >>mark. If you do it right, a process will run an inappropriate length of
> >>time only every other time slice until the oscillation dies down.
> >
> >
> > Your thoughts are fine, and to some degree I do what you're prescribing,
> > but I take into account the behaviour of real processes in the real world
> > and their effect on scheduling fairness.
>
> And I know you know a lot more about how real processes behave than I
> do. I'm not saying (or thinking) anything negative about you. I'm just
> trying to throw random thoughts into the mix just in case some small
> part of what I say is useful inspiration for someone else such as yourself.
>
> It is probably the case that the idea I suggest is BS and makes no real
> difference or makes it worse anyhow. :)

Nowhere do I recall saying your ideas were BS nor did I say you should stop
throwing ideas at me. All thoughts are appreciated and considered. I'm pretty
sure I said I do what you're suggesting anyway, bound by the limits of when
those changes induce unfairness.

Con

2003-08-11 13:44:45

by Rob Landley

Subject: Re: [PATCH] O13int for interactivity

On Tuesday 05 August 2003 06:32, Nick Piggin wrote:

> But by employing the kernel's services in the shape of a blocking
> syscall, all sleeps are intentional.

Wrong. Some sleeps indicate "I have run out of stuff to do right now, I'm
going to wait for a timer or another process or something to wake me up with
new work".

Some sleeps indicate "ideally this would run on an enormous ramdisk attached
to gigabit ethernet, but hard drives and internet connections are just too
slow so my true CPU-hogness is hidden by the fact I'm running on a PC instead
of a mainframe."

There are "I have nothing to do right now, and I'm okay with that" sleeps,
and there are "I have requested more work, and it should hurry up and get
here" sleeps.

Rob

2003-08-11 13:44:46

by Rob Landley

Subject: Re: [PATCH] O13int for interactivity

On Tuesday 05 August 2003 03:26, Con Kolivas wrote:
> On Tue, 5 Aug 2003 16:03, Andrew Morton wrote:
> > We do prefer that TASK_UNINTERRUPTIBLE processes are woken promptly so
> > they can submit more IO and go back to sleep. Remember that we are
> > artificially leaving the disk head idle in the expectation that the task
> > will submit more I/O. It's pretty sad if the CPU scheduler leaves the
> > anticipated task in the doldrums for five milliseconds.
>
> Indeed that has been on my mind. This change doesn't affect how long it
> takes to wake up. It simply prevents tasks from getting full interactive
> status during the period they are doing unint. sleep.
>
> > Very early on in AS development I was playing with adding "extra boost"
> > to the anticipated-upon task, but it did appear that the stock scheduler
> > was sufficiently doing the right thing anyway.
>
> Con

It seems that there's a special case, where a task that was blocked on a read
(either from a file or from swap) wants to be scheduled immediately, but with
a really short timeslice. I.E. give it the ability to submit another read
and block on it immediately, but if a single jiffy goes by and it hasn't done
it, it should go away.

This has nothing to do with the normal priority levels or being considered
interactive or not. As I said, a special case. IO_UNBLOCKED_FLAG or some
such. Maybe unnecessary...
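
A rough sketch of how such a special case might behave, as a toy model rather than anything from the kernel: the task structure, the function names and the slice lengths below are all assumptions made for illustration, and `io_unblocked` stands in for the hypothetical IO_UNBLOCKED_FLAG.

```c
#include <stdbool.h>

#define TOY_NORMAL_SLICE	100	/* jiffies; arbitrary for illustration */
#define TOY_IO_SLICE		1	/* "a single jiffy" */

struct toy_task {
	int prio;		/* lower value = higher priority (assumed) */
	int time_slice;		/* jiffies left before the task is descheduled */
	bool io_unblocked;	/* hypothetical IO_UNBLOCKED_FLAG */
};

/* Wakeup: a task returning from a blocked read gets the short slice. */
void toy_wake(struct toy_task *t, bool woke_from_read)
{
	t->io_unblocked = woke_from_read;
	t->time_slice = woke_from_read ? TOY_IO_SLICE : TOY_NORMAL_SLICE;
}

/* Pick between two runnable tasks: the flag wins, otherwise priority. */
struct toy_task *toy_pick(struct toy_task *a, struct toy_task *b)
{
	if (a->io_unblocked != b->io_unblocked)
		return a->io_unblocked ? a : b;
	return a->prio <= b->prio ? a : b;
}

/* Timer tick for the running task: true when it must give up the CPU. */
bool toy_tick(struct toy_task *t)
{
	if (--t->time_slice > 0)
		return false;
	/* Slice used up: the special case ends and normal rules apply again. */
	t->io_unblocked = false;
	t->time_slice = TOY_NORMAL_SLICE;
	return true;
}
```

The flag affects only who runs next and for how long; `prio` is never modified, which matches the point that this is separate from the normal priority levels.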

(Once again, the percentage of CPU time to devote to a task and the immediacy
of scheduling that task are in opposition. The "priority" abstraction is a
bit too simple at times...)

Rob

2003-08-11 15:46:54

by William Lee Irwin III

Subject: Re: [PATCH] O13int for interactivity

On Tuesday 05 August 2003 06:32, Nick Piggin wrote:
>> But by employing the kernel's services in the shape of a blocking
>> syscall, all sleeps are intentional.

On Mon, Aug 11, 2003 at 02:48:09AM -0400, Rob Landley wrote:
> Wrong. Some sleeps indicate "I have run out of stuff to do right now, I'm
> going to wait for a timer or another process or something to wake me up with
> new work".
> Some sleeps indicate "ideally this would run on an enormous ramdisk attached
> to gigabit ethernet, but hard drives and internet connections are just too
> slow so my true CPU-hogness is hidden by the fact I'm running on a PC instead
> of a mainframe."
> There are "I have nothing to do right now, and I'm okay with that" sleeps,
> and there are "I have requested more work, and it should hurry up and get
> here" sleeps.

Perhaps more apps should use aio.


-- wli

2003-08-11 15:59:15

by William Lee Irwin III

Subject: Re: [PATCH] O13int for interactivity

On Mon, Aug 11, 2003 at 02:57:25AM -0400, Rob Landley wrote:
> It seems that there's a special case, where a task that was blocked on a read
> (either from a file or from swap) wants to be scheduled immediately, but with
> a really short timeslice. I.E. give it the ability to submit another read
> and block on it immediately, but if a single jiffy goes by and it hasn't done
> it, it should go away.
> This has nothing to do with the normal priority levels or being considered
> interactive or not. As I said, a special case. IO_UNBLOCKED_FLAG or some
> such. Maybe unnecessary...
> (Once again, the percentage of CPU time to devote to a task and the immediacy
> of scheduling that task are in opposition. The "priority" abstraction is a
> bit too simple at times...)

This is bandwidth vs. latency. Priority isn't directly correlated to
either. There are patches floating around for more explicit cpu
bandwidth control (which IMHO would be ideal for the xmms problem).
Differentiated service with respect to latency is a bit of a different
story, and appears to have more complex semantics (!= complex code!)
than bandwidth.


-- wli

2003-08-12 02:51:55

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Rob Landley wrote:

>On Tuesday 05 August 2003 06:32, Nick Piggin wrote:
>
>
>>But by employing the kernel's services in the shape of a blocking
>>syscall, all sleeps are intentional.
>>
>
>Wrong. Some sleeps indicate "I have run out of stuff to do right now, I'm
>going to wait for a timer or another process or something to wake me up with
>new work".
>
>
>
>Some sleeps indicate "ideally this would run on an enormous ramdisk attached
>to gigabit ethernet, but hard drives and internet connections are just too
>slow so my true CPU-hogness is hidden by the fact I'm running on a PC instead
>of a mainframe."
>

I don't quite understand what you are getting at, but if you don't want to
sleep you should be able to use a non blocking syscall. But in some cases
I think there are times when you may not be able to use a non blocking call.

And if a process is a CPU hog, it's a CPU hog. If it's not, it's not. Doesn't
matter how it would behave on another system.




2003-08-12 06:12:33

by Mike Galbraith

Subject: Re: [PATCH] O13int for interactivity

At 12:51 PM 8/12/2003 +1000, Nick Piggin wrote:


>Rob Landley wrote:
>
>>On Tuesday 05 August 2003 06:32, Nick Piggin wrote:
>>
>>
>>>But by employing the kernel's services in the shape of a blocking
>>>syscall, all sleeps are intentional.
>>
>>Wrong. Some sleeps indicate "I have run out of stuff to do right now,
>>I'm going to wait for a timer or another process or something to wake me
>>up with new work".
>>
>>
>>
>>Some sleeps indicate "ideally this would run on an enormous ramdisk
>>attached to gigabit ethernet, but hard drives and internet connections
>>are just too slow so my true CPU-hogness is hidden by the fact I'm
>>running on a PC instead of a mainframe."
>
>I don't quite understand what you are getting at, but if you don't want to
>sleep you should be able to use a non blocking syscall. But in some cases
>I think there are times when you may not be able to use a non blocking call.
>And if a process is a CPU hog, its a CPU hog. If its not its not. Doesn't
>matter how it would behave on another system.

Ah, but there is something there. Take the X and xmms's gl thread thingy I
posted a while back. (X runs long enough to expire in the presence of a
couple of low priority cpu hogs. gl thread, which is a mondo cpu hog, and
normally runs and runs and runs at cpu hog priority, suddenly acquires
extreme interactive priority, and X, which is normally sleepy suddenly
becomes permanently runnable at cpu hog priority) The gl thread starts
sleeping because X isn't getting enough cpu to be able to get its work
done and go to sleep. The gl thread isn't voluntarily sleeping, and X
isn't voluntarily running. The behavior change is forced upon both.

-Mike

2003-08-12 07:07:40

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Mike Galbraith wrote:

> At 12:51 PM 8/12/2003 +1000, Nick Piggin wrote:
>
>
>> Rob Landley wrote:
>>
>>> On Tuesday 05 August 2003 06:32, Nick Piggin wrote:
>>>
>>>
>>>> But by employing the kernel's services in the shape of a blocking
>>>> syscall, all sleeps are intentional.
>>>
>>>
>>> Wrong. Some sleeps indicate "I have run out of stuff to do right
>>> now, I'm going to wait for a timer or another process or something
>>> to wake me up with new work".
>>>
>>>
>>>
>>> Some sleeps indicate "ideally this would run on an enormous ramdisk
>>> attached to gigabit ethernet, but hard drives and internet
>>> connections are just too slow so my true CPU-hogness is hidden by
>>> the fact I'm running on a PC instead of a mainframe."
>>
>>
>> I don't quite understand what you are getting at, but if you don't
>> want to sleep you should be able to use a non blocking syscall. But in
>> some cases I think there are times when you may not be able to use a
>> non blocking call.
>> And if a process is a CPU hog, it's a CPU hog. If it's not, it's not.
>> Doesn't matter how it would behave on another system.
>
>
> Ah, but there is something there. Take the X and xmms's gl thread
> thingy I posted a while back. (X runs long enough to expire in the
> presence of a couple of low priority cpu hogs. gl thread, which is a
> mondo cpu hog, and normally runs and runs and runs at cpu hog
> priority, suddenly acquires extreme interactive priority, and X, which
> is normally sleepy suddenly becomes permanently runnable at cpu hog
> priority) The gl thread starts sleeping because X isn't getting
> enough cpu to be able to get it's work done and go to sleep. The gl
> thread isn't voluntarily sleeping, and X isn't voluntarily running.
> The behavior change is forced upon both.


It does... It is I tell ya!

Look, the gl thread is probably _very_ explicitly asking to sleep. No I
don't know how X works, but I have an idea that select is generally used
as an event notification, right?

Now the gl thread is essentially saying "wait until X finishes the work
I've given it, or I get some other event": ie. "put me to sleep until
this fd becomes readable".

OK maybe your scenario is a big problem. It's not due to any imagined
semantics in the way things are sleeping. It's due to the scheduler.


2003-08-12 07:19:10

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Nick Piggin wrote:

>
>
> Mike Galbraith wrote:
>
>> At 12:51 PM 8/12/2003 +1000, Nick Piggin wrote:
>>
>>
>>> Rob Landley wrote:
>>>
>>>> On Tuesday 05 August 2003 06:32, Nick Piggin wrote:
>>>>
>>>>
>>>>> But by employing the kernel's services in the shape of a blocking
>>>>> syscall, all sleeps are intentional.
>>>>
>>>>
>>>>
>>>> Wrong. Some sleeps indicate "I have run out of stuff to do right
>>>> now, I'm going to wait for a timer or another process or something
>>>> to wake me up with new work".
>>>>
>>>>
>>>>
>>>> Some sleeps indicate "ideally this would run on an enormous ramdisk
>>>> attached to gigabit ethernet, but hard drives and internet
>>>> connections are just too slow so my true CPU-hogness is hidden by
>>>> the fact I'm running on a PC instead of a mainframe."
>>>
>>>
>>>
>>> I don't quite understand what you are getting at, but if you don't
>>> want to sleep you should be able to use a non blocking syscall. But in
>>> some cases I think there are times when you may not be able to use a
>>> non blocking call.
>>> And if a process is a CPU hog, it's a CPU hog. If it's not, it's not.
>>> Doesn't matter how it would behave on another system.
>>
>>
>>
>> Ah, but there is something there. Take the X and xmms's gl thread
>> thingy I posted a while back. (X runs long enough to expire in the
>> presence of a couple of low priority cpu hogs. gl thread, which is a
>> mondo cpu hog, and normally runs and runs and runs at cpu hog
>> priority, suddenly acquires extreme interactive priority, and X,
>> which is normally sleepy suddenly becomes permanently runnable at cpu
>> hog priority) The gl thread starts sleeping because X isn't getting
>> enough cpu to be able to get it's work done and go to sleep. The gl
>> thread isn't voluntarily sleeping, and X isn't voluntarily running.
>> The behavior change is forced upon both.
>
>
>
> It does... It is I tell ya!
>
> Look, the gl thread is probably _very_ explicitly asking to sleep. No I
> don't know how X works, but I have an idea that select is generally used
> as an event notification, right?
>
> Now the gl thread is essentially saying "wait until X finishes the work
> I've given it, or I get some other event": ie. "put me to sleep until
> this fd becomes readable".
>
> OK maybe your scenario is a big problem. Its not due to any imagined
> semantics in the way things are sleeping. Its due to the scheduler.


And no, X isn't intentionally sleeping. It's being preempted which is
obviously not intentional.

2003-08-12 09:18:29

by Mike Galbraith

Subject: Re: [PATCH] O13int for interactivity

At 05:07 PM 8/12/2003 +1000, Nick Piggin wrote:


>Mike Galbraith wrote:
>
>>At 12:51 PM 8/12/2003 +1000, Nick Piggin wrote:
>>
>>
>>>Rob Landley wrote:
>>>
>>>>On Tuesday 05 August 2003 06:32, Nick Piggin wrote:
>>>>
>>>>
>>>>>But by employing the kernel's services in the shape of a blocking
>>>>>syscall, all sleeps are intentional.
>>>>
>>>>
>>>>Wrong. Some sleeps indicate "I have run out of stuff to do right now,
>>>>I'm going to wait for a timer or another process or something to wake
>>>>me up with new work".
>>>>
>>>>
>>>>
>>>>Some sleeps indicate "ideally this would run on an enormous ramdisk
>>>>attached to gigabit ethernet, but hard drives and internet connections
>>>>are just too slow so my true CPU-hogness is hidden by the fact I'm
>>>>running on a PC instead of a mainframe."
>>>
>>>
>>>I don't quite understand what you are getting at, but if you don't want to
>>>sleep you should be able to use a non blocking syscall. But in some cases
>>>I think there are times when you may not be able to use a non blocking call.
>>>And if a process is a CPU hog, its a CPU hog. If its not its not. Doesn't
>>>matter how it would behave on another system.
>>
>>
>>Ah, but there is something there. Take the X and xmms's gl thread thingy
>>I posted a while back. (X runs long enough to expire in the presence of
>>a couple of low priority cpu hogs. gl thread, which is a mondo cpu hog,
>>and normally runs and runs and runs at cpu hog priority, suddenly
>>acquires extreme interactive priority, and X, which is normally sleepy
>>suddenly becomes permanently runnable at cpu hog priority) The gl thread
>>starts sleeping because X isn't getting enough cpu to be able to get it's
>>work done and go to sleep. The gl thread isn't voluntarily sleeping, and
>>X isn't voluntarily running.
>>The behavior change is forced upon both.
>
>
>It does... It is I tell ya!
>
>Look, the gl thread is probably _very_ explicitly asking to sleep. No I
>don't know how X works, but I have an idea that select is generally used
>as an event notification, right?

Oh, sure, it blocks because it asks for it... but not because it _wants_ to
:) It wants to create work for X fast enough to make a nice stutter free
bit of eye-candy.

>Now the gl thread is essentially saying "wait until X finishes the work
>I've given it, or I get some other event": ie. "put me to sleep until
>this fd becomes readable".

Yes. Voluntary or involuntary is just a matter of point of view.

>OK maybe your scenario is a big problem. Its not due to any imagined
>semantics in the way things are sleeping. Its due to the scheduler.

It's due to the scheduler to a point... only in that it doesn't recognize
the problem and correct it (that might be pretty hard to do). If my
hardware were fast enough that X could get the work done in the allotted
time, the problem wouldn't arise in the first place. I bet it's fairly
hard to reproduce on a really fast box. It happens easily on my box
because the combination of X and the gl thread need most of what my
hardware has to offer.

-Mike

2003-08-12 09:37:25

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Mike Galbraith wrote:

> At 05:07 PM 8/12/2003 +1000, Nick Piggin wrote:
>
>
>> Mike Galbraith wrote:
>>

snip

>>>
>>> Ah, but there is something there. Take the X and xmms's gl thread
>>> thingy I posted a while back. (X runs long enough to expire in the
>>> presence of a couple of low priority cpu hogs. gl thread, which is
>>> a mondo cpu hog, and normally runs and runs and runs at cpu hog
>>> priority, suddenly acquires extreme interactive priority, and X,
>>> which is normally sleepy suddenly becomes permanently runnable at
>>> cpu hog priority) The gl thread starts sleeping because X isn't
>>> getting enough cpu to be able to get it's work done and go to
>>> sleep. The gl thread isn't voluntarily sleeping, and X isn't
>>> voluntarily running.
>>> The behavior change is forced upon both.
>>
>>
>>
>> It does... It is I tell ya!
>>
>> Look, the gl thread is probably _very_ explicitly asking to sleep. No I
>> don't know how X works, but I have an idea that select is generally used
>> as an event notification, right?
>
>
> Oh, sure, it blocks because it asks for it... but not because it
> _wants_ to :) It wants to create work for X fast enough to make a
> nice stutter free bit of eye-candy.


Well if it doesn't want to, it could just give select a timeout of 0 though.

>
>> Now the gl thread is essentially saying "wait until X finishes the work
>> I've given it, or I get some other event": ie. "put me to sleep until
>> this fd becomes readable".
>
>
> Yes. Voluntary or involuntary is just a matter of point of view.


Well I would think a NULL, or non-zero timeout would mean it's a voluntary
sleep. If the thread has nothing to do until there is an event on the fd,
then it really does want to sleep.
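
The distinction between the two timeout forms can be shown with the standard POSIX select() call. This is a minimal sketch; the helper names `readable_now` and `wait_readable` are made up for illustration, and how xmms or X actually use select is not known from this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/select.h>

/*
 * Returns 1 if fd is readable right now, 0 if not, -1 on error.
 * A zeroed timeval makes select() return immediately instead of sleeping.
 * fd must be below FD_SETSIZE.
 */
int readable_now(int fd)
{
	fd_set rfds;
	struct timeval tv = { 0, 0 };

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	return select(fd + 1, &rfds, NULL, NULL, &tv);
}

/*
 * Sleeps until fd becomes readable: a NULL timeout means "wait
 * indefinitely". Returns 1 once readable, -1 on error (including EINTR).
 */
int wait_readable(int fd)
{
	fd_set rfds;

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	return select(fd + 1, &rfds, NULL, NULL, NULL);
}
```

A caller that loops on `readable_now` never sleeps and looks like a CPU hog to the scheduler; one that calls `wait_readable` is asking to be put to sleep, which is the sense in which the sleep is requested.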

Anyway, this whole thread arose because Con was making the scheduler do
different things for interruptible and uninterruptible sleeps which I
didn't think was a very good idea. Con thought uninterruptible implied
involuntary sleep (though there might have been some confusion).

I don't think they should be treated any differently, but hey I'm not
making any code or having any problems! Just trying to stir the pot a
bit!

>
>> OK maybe your scenario is a big problem. Its not due to any imagined
>> semantics in the way things are sleeping. Its due to the scheduler.
>
>
> It's due to the scheduler to a point... only in that it doesn't
> recognize the problem and correct it (that might be pretty hard to
> do). If my hardware were fast enough that X could get the work done
> in the allotted time, the problem wouldn't arise in the first place.
> I bet it's fairly hard to reproduce on a really fast box. It happens
> easily on my box because the combination of X and the gl thread need
> most of what my hardware has to offer.
>

I think backboost was very nice. I'd say Con could probably get a lot
further if that was in but it's not going to happen now.


2003-08-12 09:38:11

by Mike Galbraith

Subject: Re: [PATCH] O13int for interactivity

At 05:18 PM 8/12/2003 +1000, Nick Piggin wrote:

>And no, X isn't intentionally sleeping. Its being preempted which is
>obviously not intentional.

Right. Every time X wakes the gl thread, he'll lose the cpu. Once the gl
thread passes X in priority, X is pretty much doomed. (hmm... sane [hard]
backboost will probably prevent that)

-Mike

2003-08-12 09:44:23

by Mike Galbraith

Subject: Re: [PATCH] O13int for interactivity

At 07:37 PM 8/12/2003 +1000, Nick Piggin wrote:

>I think backboost was very nice. I'd say Con could probably get a lot
>further if that was in but its not going to happen now.

Agreed. Backboost is _lovely_... except for the fangs and claws :)

-Mike

2003-08-12 10:27:20

by Rob Landley

Subject: Re: [PATCH] O13int for interactivity

On Monday 11 August 2003 22:51, Nick Piggin wrote:
> Rob Landley wrote:
> >On Tuesday 05 August 2003 06:32, Nick Piggin wrote:
> >>But by employing the kernel's services in the shape of a blocking
> >>syscall, all sleeps are intentional.
> >
> >Wrong. Some sleeps indicate "I have run out of stuff to do right now, I'm
> >going to wait for a timer or another process or something to wake me up
> > with new work".
> >
> >
> >
> >Some sleeps indicate "ideally this would run on an enormous ramdisk
> > attached to gigabit ethernet, but hard drives and internet connections
> > are just too slow so my true CPU-hogness is hidden by the fact I'm
> > running on a PC instead of a mainframe."
>
> I don't quite understand what you are getting at, but if you don't want to
> sleep you should be able to use a non blocking syscall.

So you can then block on poll instead, you mean?
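
The pattern behind that question, sketched with standard POSIX calls: a non-blocking read fails with EAGAIN when there is no data, and the wait simply moves into poll(). The helper names `set_nonblocking` and `read_when_ready` are hypothetical and used only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Read from a non-blocking fd. When no data is available the read fails
 * with EAGAIN and the task sleeps in poll() instead of in read().
 * timeout_ms: -1 waits indefinitely, 0 does not wait at all.
 * Returns bytes read, 0 on EOF, or -1 with errno set (EAGAIN on timeout).
 */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
	for (;;) {
		ssize_t n = read(fd, buf, len);

		if (n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK))
			return n;

		struct pollfd pfd = { .fd = fd, .events = POLLIN };
		int ready = poll(&pfd, 1, timeout_ms);

		if (ready < 0)
			return -1;		/* poll() failed; errno is set */
		if (ready == 0) {
			errno = EAGAIN;		/* timed out, still no data */
			return -1;
		}
	}
}
```

With a timeout of -1 the task still ends up asleep waiting for data; only the system call it sleeps in has changed.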

> But in some cases
> I think there are times when you may not be able to use a non blocking
> call.
>
> And if a process is a CPU hog, its a CPU hog. If its not its not. Doesn't
> matter how it would behave on another system.

Audio playback, video playback, animated gifs in your web browser, and even
first person shooters have built in rate limiting. (Okay, games can go to an
insanely high framerate, but usually they achieve "good enough" and are happy
with that unless you're doing a benchmark with them.)

There is a certain rate of work they do, and if they can manage that they stop
working. On a system with twice as much CPU power and disks twice as fast,
kmail shouldn't use significantly more CPU keeping up with my typing. These
are "interactive" tasks.

But gzip, tar, gcc, and most cron jobs are a different type of task. They
have no built-in rate limiting. On a system with twice as much CPU and disks
twice as fast, they finish twice as quickly. They never voluntarily go idle
until they exit; when they're idle it just means they hit a bottleneck. The
system can never be "fast enough" that these quiesce themselves for a while
because they've run out of work just now.

These are hogs, often both of CPU time and I/O bandwidth. Being blocked on
I/O does not stop them from being hogs, it just means they're juggling their
hoggishness.

An mpeg player has times when it's blocked neither on CPU nor on I/O; it's
waiting until it's time to display the next frame.

Now some of this could be viewed as a spooler problem, where there's a slow
output device (the screen, the sound card, etc) and if you wanted to you
could precompute stuff into a big memory wasting buffer and then instead of
skipping because you're not getting scheduled fast enough you're skipping
because your precomputed buffer got swapped to disk. But the difference here
is that xmms or xine could do their output generation much faster than they
are, if they wanted to. The output device could be sped up. Your animated
gif can cycle too fast to see, you can fast-forward through your movie, you
can play an mpeg so it sounds like chip and dale on helium... But they
don't, they intentionally rate limit the output, and what they want in return
is low latency when the rate limiting is up.

When you're rate limiting the output, you want to accurately control the rate.
You don't want it to be too fast (timers are great at this), and you don't
want it to be too slow (you get skips or miss frames).

That's what Con's detecting. It's a heuristic. But it's a good heuristic. A
process that plays nice and yields the CPU regularly gets a priority boost.
(That's always BEEN a heuristic.)
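
As a rough illustration of the heuristic described above, here is a toy
model in C. It is a sketch only: the names and constants (toy_task,
TOY_MAX_SLEEP_AVG, TOY_MAX_BONUS) are made up for this example and are not
the definitions in kernel/sched.c or in the O13int patch. The idea is just
that voluntary sleep earns credit, running burns it, and the credit maps to
a priority bonus.

```c
#include <assert.h>

/* Toy model only: illustrative names and constants, not the kernel's. */
#define TOY_MAX_SLEEP_AVG 1000   /* cap on accumulated sleep credit, in ms */
#define TOY_MAX_BONUS     10     /* total priority swing the credit can buy */

struct toy_task {
    int sleep_avg;               /* always kept in 0..TOY_MAX_SLEEP_AVG */
};

/* Credit time the task chose to sleep (e.g. waiting on a timer). */
void toy_account_sleep(struct toy_task *t, int slept_ms)
{
    assert(slept_ms >= 0);
    t->sleep_avg += slept_ms;
    if (t->sleep_avg > TOY_MAX_SLEEP_AVG)
        t->sleep_avg = TOY_MAX_SLEEP_AVG;
}

/* Debit time the task spent on the CPU; a hog burns its credit to zero. */
void toy_account_run(struct toy_task *t, int ran_ms)
{
    assert(ran_ms >= 0);
    t->sleep_avg -= ran_ms;
    if (t->sleep_avg < 0)
        t->sleep_avg = 0;
}

/* Map credit to a bonus in [-TOY_MAX_BONUS/2, +TOY_MAX_BONUS/2]. */
int toy_bonus(const struct toy_task *t)
{
    return t->sleep_avg * TOY_MAX_BONUS / TOY_MAX_SLEEP_AVG
           - TOY_MAX_BONUS / 2;
}
```

In this model a task that sleeps 30 ms for every 10 ms it runs drifts toward
a positive bonus, while a task that only ever runs stays at the most negative
bonus. How uninterruptible (I/O) sleep should be credited is exactly the
point under debate in this thread, and the toy model does not take a side.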

The current scheduler code has moved a bit beyond this, but this is the bit I
was talking about when I disagreed with you earlier.

Rob

2003-08-12 11:09:09

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Rob Landley wrote:

>On Monday 11 August 2003 22:51, Nick Piggin wrote:
>
>>Rob Landley wrote:
>>
>>>On Tuesday 05 August 2003 06:32, Nick Piggin wrote:
>>>
>>>>But by employing the kernel's services in the shape of a blocking
>>>>syscall, all sleeps are intentional.
>>>>
>>>Wrong. Some sleeps indicate "I have run out of stuff to do right now, I'm
>>>going to wait for a timer or another process or something to wake me up
>>>with new work".
>>>
>>>
>>>
>>>Some sleeps indicate "ideally this would run on an enormous ramdisk
>>>attached to gigabit ethernet, but hard drives and internet connections
>>>are just too slow so my true CPU-hogness is hidden by the fact I'm
>>>running on a PC instead of a mainframe."
>>>
>>I don't quite understand what you are getting at, but if you don't want to
>>sleep you should be able to use a non blocking syscall.
>>
>
>So you can then block on poll instead, you mean?
>

Well, if that's what you intend, yes. Or set poll to be non-blocking.
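
For reference, a minimal sketch of the two ways of calling poll() being
discussed here. poll() with a timeout of 0 returns immediately, while a
positive or negative timeout lets the caller sleep until the descriptor is
ready. The helper names (readable_now, try_read_byte) are invented for this
example.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Returns 1 if fd is readable right now, 0 if not, -1 on error.
 * A timeout of 0 makes poll() return at once instead of sleeping. */
int readable_now(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n;

    assert(fd >= 0);
    n = poll(&pfd, 1, 0);
    if (n < 0)
        return -1;
    return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Reads one byte only if that can be done without sleeping.
 * Returns 1 if a byte was read, 0 if nothing was available, -1 on error. */
int try_read_byte(int fd, char *out)
{
    int ready = readable_now(fd);

    if (ready <= 0)
        return ready;
    return read(fd, out, 1) == 1 ? 1 : -1;
}
```

A caller that gets 0 back still has to decide what to do until data arrives:
either it has other work, or it ends up sleeping somewhere else (for example
in poll() with a non-zero timeout).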

>
>>But in some cases
>>I think there are times when you may not be able to use a non blocking
>>call.
>>
>>And if a process is a CPU hog, its a CPU hog. If its not its not. Doesn't
>>matter how it would behave on another system.
>>
>
>Audio playback, video playback, animated gifs in your web browser, and even
>first person shooters have built in rate limiting. (Okay, games can go to an
>insanely high framerate, but usually they achieve "good enough" and are happy
>with that unless you're doing a benchmark with them.)
>
>There is a certain rate of work they do, and if they can manage that they stop
>working. On a system with twice as much CPU power and disks twice as fast,
>kmail shouldn't use significantly more CPU keeping up with my typing. These
>are "interactive" tasks.
>
>But gzip, tar, gcc, and most cron jobs are a different type of task. They
>have no built-in rate limiting. On a system with twice as much CPU and disks
>twice as fast, they finish twice as quickly. They never voluntarily go idle
>until they exit; when they're idle it just means they hit a bottleneck. The
>system can never be "fast enough" that these quiesce themselves for a while
>because they've run out of work just now.
>
>These are hogs, often both of CPU time and I/O bandwidth. Being blocked on
>I/O does not stop them from being hogs, it just means they're juggling their
>hoggishness.
>

This is the CPU scheduler though. A program could be a disk/network
hog and use a few % cpu. It's obviously not a cpu hog, and should get
the cpu again soon after it is woken. Sooner than non-running cpu hogs,
anyway.

>
>An mpeg player has times when it's neither blocked on CPU or on I/O, it's
>waiting until it's time to display the next frame.
>
>Now some of this could be viewed as a spooler problem, where there's a slow
>output device (the screen, the sound card, etc) and if you wanted to you
>could precompute stuff into a big memory wasting buffer and then instead of
>skipping because you're not getting scheduled fast enough you're skipping
>because your precomputed buffer got swapped to disk. But the difference here
>is that xmms or xine could do their output generation much faster than they
>are, if they wanted to. The output device could be sped up. Your animated
>gif can cycle too fast to see, you can fast-forward through your movie, you
>can play an mpeg so it sounds like chip and dale on helium... But they
>don't, they intentionally rate limit the output, and what they want in return
>is low latency when the rate limiting is up.
>
>When you're rate limiting the output, you want to accurately control the rate.
>You don't want it to be too fast (timers are great at this), and you don't
>want it to be too slow (you get skips or miss frames).
>
>That's what Con's detecting. It's a heuristic. But it's a good heuristic. A
>process that plays nice and yields the CPU regularly gets a priority boost.
>(That's always BEEN a heuristic.)
>
>The current scheduler code has moved a bit beyond this, but this is the bit I
>was talking about when I disagreed with you earlier.
>

Yeah, I know Con is trying to detect this. It's just that detecting
it using TASK_INTERRUPTIBLE/TASK_UNINTERRUPTIBLE may not be the best
way. Suddenly your kernel compile on an NFS mount becomes interactive
for example. Then again, the way things are, Con might not have any
other option.

Mostly I agree with what you've said above.


2003-08-12 11:33:02

by Rob Landley

Subject: Re: [PATCH] O13int for interactivity

On Tuesday 12 August 2003 07:08, Nick Piggin wrote:

> >>I don't quite understand what you are getting at, but if you don't want
> >> to sleep you should be able to use a non blocking syscall.
> >
> >So you can then block on poll instead, you mean?
>
> Well if thats what you intend, yes. Or set poll to be non-blocking.

So you're still blocking for an unknown amount of time waiting for your
outstanding requests to get serviced, now you're just hiding it to
intentionally give the scheduler less information to work with.

> >These are hogs, often both of CPU time and I/O bandwidth. Being blocked
> > on I/O does not stop them from being hogs, it just means they're juggling
> > their hoggishness.
>
> This is the CPU scheduler though. A program could be a disk/network
> hog and use a few % cpu. Its obviously not a cpu hog, and should get
> the cpu again soon after it is woken. Sooner than non running cpu hogs,
> anyway.

A program that waits for a known amount of time (I.E. on a timer) cares about
when it gets woken up. A program that blocks on an event that's going to
take an unknown amount of time can't be too upset if its wakeup is after an
unknown amount of time.
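
To make the "waits for a known amount of time" case concrete, here is a
small sketch of how a rate-limited program might pace itself in user space.
It assumes a POSIX system with clock_nanosleep(); the helper name pace_wait
is made up for this example. The task sleeps until an absolute deadline, so
it knows exactly when it wants to run next, and any delay past that deadline
shows up as a skip.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance *next by period_ns and sleep until that absolute time.
 * Using an absolute deadline means time spent working between calls
 * does not accumulate as drift. Returns 0 on success, -1 on error. */
int pace_wait(struct timespec *next, long period_ns)
{
    int rc;

    assert(next != NULL && period_ns > 0 && period_ns < NSEC_PER_SEC);
    next->tv_nsec += period_ns;
    if (next->tv_nsec >= NSEC_PER_SEC) {
        next->tv_nsec -= NSEC_PER_SEC;
        next->tv_sec += 1;
    }
    /* clock_nanosleep() returns the error number directly, not via errno. */
    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, next, NULL);
    } while (rc == EINTR);
    return rc == 0 ? 0 : -1;
}
```

A player would initialise the deadline once with
clock_gettime(CLOCK_MONOTONIC, &next) and then alternate between producing
one frame and calling pace_wait(&next, period_ns). A task blocked in read()
on a slow disk has no such deadline to miss.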

Beyond that there's blocking for input from the user (latency matters) and
blocking for input from something else (latency doesn't matter), but we can't
tell that directly and have to fake our way around it with heuristics.

> >That's what Con's detecting. It's a heuristic. But it's a good
> > heuristic. A process that plays nice and yields the CPU regularly gets a
> > priority boost. (That's always BEEN a heuristic.)
> >
> >The current scheduler code has moved a bit beyond this, but this is the
> > bit I was talking about when I disagreed with you earlier.
>
> Yeah, I know Con is trying to detect this. Its just that detecting
> it using TASK_INTERRUPTIBLE/TASK_UNINTERRUPTIBLE may not be the best
> way.

Okay, if this isn't the "best way", then what is? You have yet to suggest an
alternative, and this heuristic is obviously better than nothing.

> Suddenly your kernel compile on an NFS mount becomes interactive
> for example.

Translation: Suppose the heuristics fail. If it can't fail, it's not a
heuristic, is it? Failure of heuristics must be survivable. The kernel
compile IS a rampant CPU hog, and if it's mis-identified as interactive for
some reason it'll get demoted again after using up too many time slices. In
the meantime, your PVR (think home-brewed Tivo clone) skips recording your
buffy rerun. This is something to be minimized, but the scheduler isn't
psychic. If it happens once out of every million hours of use, you're
going to see more hard drive failures and dying power supplies than
problems caused by this. (This is not sufficient for running a nuclear power
plant or automated factory, but those guys need hard realtime anyway, which
this isn't pretending to be.)

> Then again, the way things are, Con might not have any
> other option.

You're welcome to suggest a better alternative, but criticizing the current
approach without suggesting any alternative at all may not be that helpful.

> Mostly I agree with what you've said above.

Cool.

Rob

2003-08-12 11:58:16

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Rob Landley wrote:

>On Tuesday 12 August 2003 07:08, Nick Piggin wrote:
>
>
>>>>I don't quite understand what you are getting at, but if you don't want
>>>>to sleep you should be able to use a non blocking syscall.
>>>>
>>>So you can then block on poll instead, you mean?
>>>
>>Well if thats what you intend, yes. Or set poll to be non-blocking.
>>
>
>So you're still blocking for an unknown amount of time waiting for your
>outstanding requests to get serviced, now you're just hiding it to
>intentionally give the scheduler less information to work with.
>

Where are you blocking?

>
>>>These are hogs, often both of CPU time and I/O bandwidth. Being blocked
>>>on I/O does not stop them from being hogs, it just means they're juggling
>>>their hoggishness.
>>>
>>This is the CPU scheduler though. A program could be a disk/network
>>hog and use a few % cpu. Its obviously not a cpu hog, and should get
>>the cpu again soon after it is woken. Sooner than non running cpu hogs,
>>anyway.
>>
>
>A program that waits for a known amount of time (I.E. on a timer) cares about
>when it gets woken up. A program that blocks on an event that's going to
>take an unknown amount of time can't be too upset if its wakeup is after an
>unknown amount of time.
>
>Beyond that there's blocking for input from the user (latency matters) and
>blocking for input from something else (latency doesn't matter), but we can't
>tell that directly and have to fake our way around it with heuristics.
>
>
>>>That's what Con's detecting. It's a heuristic. But it's a good
>>>heuristic. A process that plays nice and yields the CPU regularly gets a
>>>priority boost. (That's always BEEN a heuristic.)
>>>
>>>The current scheduler code has moved a bit beyond this, but this is the
>>>bit I was talking about when I disagreed with you earlier.
>>>
>>Yeah, I know Con is trying to detect this. Its just that detecting
>>it using TASK_INTERRUPTIBLE/TASK_UNINTERRUPTIBLE may not be the best
>>way.
>>
>
>Okay, if this isn't the "best way", then what is? You have yet to suggest an
>alternative, and this heuristic is obviously better than nothing.
>

Well, I'm not sure what the best way is, but I'm pretty sure it's not
this ;)

I have been hearing of people complaining the scheduler is worse than
2.4, so it's not entirely obvious to me. But yeah, lots of it is trial and
error, so I'm not saying Con is wasting his time.

>
>>Suddenly your kernel compile on an NFS mount becomes interactive
>>for example.
>>
>
>Translation: Suppose the heuristics fail. If it can't fail, it's not a
>heuristic, is it? Failure of heuristics must be survivable. The kernel
>compile IS a rampant CPU hog, and if it's mis-identified as interactive for
>some reason it'll get demoted again after using up too many time slices. In
>the meantime, your PVR (think home-brewed Tivo clone) skips recording your
>buffy rerun. This is something to be minimized, but the scheduler isn't
>psychic. If it happens once out of every million hours of use, you're
>going to see more hard drive failures and dying power supplies than
>problems caused by this. (This is not sufficient for running a nuclear power
>plant or automated factory, but those guys need hard realtime anyway, which
>this isn't pretending to be.)
>

Of course. I think the problem is that people think the failure
cases are currently too common and/or the types of failure are
unacceptable.

>
>>Then again, the way things are, Con might not have any
>>other option.
>>
>
>You're welcome to suggest a better alternative, but criticizing the current
>approach without suggesting any alternative at all may not be that helpful.
>

I have been trying half-heartedly over the past week or two.


2003-08-12 15:21:52

by Timothy Miller

Subject: Re: [PATCH] O13int for interactivity



Nick Piggin wrote:

> I don't quite understand what you are getting at, but if you don't want to
> sleep you should be able to use a non blocking syscall. But in some cases
> I think there are times when you may not be able to use a non blocking
> call.
>
> And if a process is a CPU hog, its a CPU hog. If its not its not. Doesn't
> matter how it would behave on another system.

The idea is that this kind of process WANTS to be a CPU hog. If it were
not for the fact that the I/O is not immediately available, it would
never want to sleep. The only thing it ever blocks on is the read, and
this is involuntary. It doesn't use a non blocking call because it
can't continue without the data.

The question is: Does this matter for the issue of interactivity?


2003-08-12 21:11:53

by Mike Fedyk

Subject: Re: [PATCH] O13int for interactivity

On Tue, Aug 12, 2003 at 11:42:16AM +0200, Mike Galbraith wrote:
> At 05:18 PM 8/12/2003 +1000, Nick Piggin wrote:
>
> >And no, X isn't intentionally sleeping. Its being preempted which is
> >obviously not intentional.
>
> Right. Every time X wakes the gl thread, he'll lose the cpu. Once the gl
> thread passes X in priority, X is pretty much doomed. (hmm... sane [hard]
> backboost will probably prevent that)

Isn't 2.4 doing exactly that for pipes and such?

2003-08-13 02:08:15

by jw schultz

Subject: Re: [PATCH] O13int for interactivity

On Tue, Aug 12, 2003 at 09:58:04PM +1000, Nick Piggin wrote:
> I have been hearing of people complaining the scheduler is worse than
> 2.4 so its not entirely obvious to me. But yeah lots of it is trial and
> error, so I'm not saying Con is wasting his time.

I've been watching Con and Ingo's efforts with the process
scheduler and I haven't seen people complaining that the
process scheduler is worse. They have complained that
interactive processes seem to have more latency. Con has
rightly questioned whether that might be because the process
scheduler has less control over CPU time allocation than in
2.4. Remember that the process scheduler only manages the
CPU time not spent in I/O and other overhead.

If there is something in BIO chewing cycles it will wreak
havoc with latency no matter what you do about process
scheduling. The work on BIO to improve bandwidth and reduce
latency was Herculean but the growing performance gap
between CPU and I/O is a formidable challenge.


--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: [email protected]

Remember Cernan and Schmitt

2003-08-13 03:07:27

by Gene Heskett

Subject: Re: [PATCH] O13int for interactivity

On Tuesday 12 August 2003 22:08, jw schultz wrote:
>On Tue, Aug 12, 2003 at 09:58:04PM +1000, Nick Piggin wrote:
>> I have been hearing of people complaining the scheduler is worse
>> than 2.4 so its not entirely obvious to me. But yeah lots of it is
>> trial and error, so I'm not saying Con is wasting his time.
>
>I've been watching Con and Ingo's efforts with the process
>scheduler and i haven't seen people complaining that the
>process scheduler is worse. They have complained that
>interactive processes seem to have more latency. Con has
>rightly questioned whether that might be because the process
>scheduler has less control over CPU time allocation than in
>2.4. Remember that the process scheduler only manages the
>CPU time not spent in I/O and other overhead.
>
>If there is something in BIO chewing cycles it will wreak
>havoc with latency no matter what you do about process
>scheduling. The work on BIO to improve bandwidth and reduce
>latency was Herculean but the growing performance gap
>between CPU and I/O is a formidable challenge.

In thinking about this from the aspect of what I do here, this makes
quite a bit of sense. In running 2.6.0-test3, with anticipatory
scheduler, it appears the i/o intensive tasks are being pushed back
in favor of interactivity, perhaps a bit too aggressively. An amanda
estimate phase, which turns tar loose on the drives, had to be
advanced to a -10 niceness for the whole tree of processes amanda
spawns before it began to impact the setiathome use as shown by the
nice display in gkrellm. Normally there is a period for maybe 20
minutes before the tape drive fires up where the machine is virtually
unusable due to gzip hogging things, like the cpu, during which time
seti could just as easily be swapped out. It remained at around 60%!

It did not hog/lag nearly as badly as usual, and the amanda run was over
an hour longer than it would have been in 2.4.22-rc2.

It is my opinion that all this should have been at setiathome's
expense, which is also rather cpu intensive, but it didn't seem to be
without lots of forcing. This is what the original concept of
niceness was all about. Or at least that was my impression. From
what it feels like here, it seems the i/o stuff is what's being
choked, and choked pretty badly when using the anticipatory
scheduler.

I've read rumors that a boot-time option can switch it to something
else, so what do I do to switch it from the anticipatory scheduler to
whatever the alternate is, so that I can get a feel for the other
methods and results?

--
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
99.27% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attornies please note, additions to this message
by Gene Heskett are:
Copyright 2003 by Maurice Eugene Heskett, all rights reserved.

2003-08-13 03:25:05

by Nick Piggin

Subject: Re: [PATCH] O13int for interactivity



Gene Heskett wrote:

>On Tuesday 12 August 2003 22:08, jw schultz wrote:
>
>>On Tue, Aug 12, 2003 at 09:58:04PM +1000, Nick Piggin wrote:
>>
>>>I have been hearing of people complaining the scheduler is worse
>>>than 2.4 so its not entirely obvious to me. But yeah lots of it is
>>>trial and error, so I'm not saying Con is wasting his time.
>>>
>>I've been watching Con and Ingo's efforts with the process
>>scheduler and i haven't seen people complaining that the
>>process scheduler is worse. They have complained that
>>interactive processes seem to have more latency. Con has
>>rightly questioned whether that might be because the process
>>scheduler has less control over CPU time allocation than in
>>2.4. Remember that the process scheduler only manages the
>>CPU time not spent in I/O and other overhead.
>>
>>If there is something in BIO chewing cycles it will wreak
>>havoc with latency no matter what you do about process
>>scheduling. The work on BIO to improve bandwidth and reduce
>>latency was Herculean but the growing performance gap
>>between CPU and I/O is a formidable challenge.
>>
>
>In thinking about this from the aspect of what I do here, this makes
>quite a bit of sense. In running 2.6.0-test3, with anticipatory
>scheduler, it appears the i/o intensive tasks are being pushed back
>in favor of interactivity, perhaps a bit too aggressively. An amanda
>estimate phase, which turns tar loose on the drives, had to be
>advanced to a -10 niceness for the whole tree of processes amanda
>spawns before it began to impact the setiathome use as shown by the
>nice display in gkrellm. Normally there is a period for maybe 20
>minutes before the tape drive fires up where the machine is virtually
>unusable due to gzip hogging things, like the cpu, during which time
>seti could just as easily be swapped out. It remained at around 60%!
>
>It did not hog/lag near as badly as usual, and the amanda run was over
>an hour longer than it would have been in 2.4.22-rc2.
>
>It is my opinion that all this should have been at setiathomes
>expense, which is also rather cpu intensive, but it didn't seem to be
>without lots of forcing. This is what the original concept of
>niceness was all about. Or at least that was my impression. From
>what it feels like here, it seems the i/o stuff is whats being
>choked, and choked pretty badly when using the anticipatory
>scheduler.
>
>I've read rumors that a boottime option can switch it to something
>else, so what do I do to switch it from the anticipatory scheduler to
>whatever the alternate is?, so that I can get a feel for the other
>methods and results.
>

Boot with "elevator=deadline" to use the more conventional elevator.
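
The option goes on the kernel command line in the boot loader configuration.
After rebooting, one way to confirm that the option was actually passed is to
look at /proc/cmdline. The following is a small sketch of such a check; the
function names are invented for this example, and it only reports what was
requested on the command line, not which elevator the kernel ended up using.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy the value of the first "elevator=" token in cmdline into out.
 * Returns 1 if such a token was found, 0 if not (out is then empty). */
int cmdline_elevator(const char *cmdline, char *out, size_t outsz)
{
    static const char key[] = "elevator=";
    const size_t klen = sizeof(key) - 1;
    const char *p = cmdline;

    assert(cmdline != NULL && out != NULL && outsz > 0);
    while (*p) {
        size_t len;

        p += strspn(p, " \t\n");        /* skip separators */
        len = strcspn(p, " \t\n");      /* length of this token */
        if (len > klen && strncmp(p, key, klen) == 0) {
            size_t vlen = len - klen;

            if (vlen >= outsz)
                vlen = outsz - 1;       /* truncate to fit the buffer */
            memcpy(out, p + klen, vlen);
            out[vlen] = '\0';
            return 1;
        }
        p += len;
    }
    out[0] = '\0';
    return 0;
}

/* Read /proc/cmdline and report the elevator requested at boot.
 * Returns 1 if found, 0 if not present, -1 if the file cannot be read. */
int booted_elevator(char *out, size_t outsz)
{
    char buf[4096];
    FILE *f = fopen("/proc/cmdline", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof buf, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return cmdline_elevator(buf, out, outsz);
}
```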

It would be good if you could get some numbers 2.4 vs 2.6, with and
without seti running. Sounds like a long cycle though so you probably
can't be bothered!


2003-08-13 05:24:39

by Gene Heskett

Subject: Re: [PATCH] O13int for interactivity

On Tuesday 12 August 2003 23:24, Nick Piggin wrote:
>Gene Heskett wrote:
>>On Tuesday 12 August 2003 22:08, jw schultz wrote:
>>>On Tue, Aug 12, 2003 at 09:58:04PM +1000, Nick Piggin wrote:
>>>>I have been hearing of people complaining the scheduler is worse
>>>>than 2.4 so its not entirely obvious to me. But yeah lots of it
>>>> is trial and error, so I'm not saying Con is wasting his time.
>>>
>>>I've been watching Con and Ingo's efforts with the process
>>>scheduler and i haven't seen people complaining that the
>>>process scheduler is worse. They have complained that
>>>interactive processes seem to have more latency. Con has
>>>rightly questioned whether that might be because the process
>>>scheduler has less control over CPU time allocation than in
>>>2.4. Remember that the process scheduler only manages the
>>>CPU time not spent in I/O and other overhead.
>>>
>>>If there is something in BIO chewing cycles it will wreak
>>>havoc with latency no matter what you do about process
>>>scheduling. The work on BIO to improve bandwidth and reduce
>>>latency was Herculean but the growing performance gap
>>>between CPU and I/O is a formidable challenge.
>>
>>In thinking about this from the aspect of what I do here, this
>> makes quite a bit of sense. In running 2.6.0-test3, with
>> anticipatory scheduler, it appears the i/o intensive tasks are
>> being pushed back in favor of interactivity, perhaps a bit too
>> aggressively. An amanda estimate phase, which turns tar loose on
>> the drives, had to be advanced to a -10 niceness for the whole
>> tree of processes amanda spawns before it began to impact the
>> setiathome use as shown by the nice display in gkrellm. Normally
>> there is a period for maybe 20 minutes before the tape drive fires
>> up where the machine is virtually unusable due to gzip hogging
>> things, like the cpu, during which time seti could just as easily
>> be swapped out. It remained at around 60%!
>>
>>It did not hog/lag near as badly as usual, and the amanda run was
>> over an hour longer than it would have been in 2.4.22-rc2.
>>
>>It is my opinion that all this should have been at setiathomes
>>expense, which is also rather cpu intensive, but it didn't seem to
>> be without lots of forcing. This is what the original concept of
>> niceness was all about. Or at least that was my impression. From
>> what it feels like here, it seems the i/o stuff is whats being
>> choked, and choked pretty badly when using the anticipatory
>> scheduler.
>>
>>I've read rumors that a boottime option can switch it to something
>>else, so what do I do to switch it from the anticipatory scheduler
>> to whatever the alternate is?, so that I can get a feel for the
>> other methods and results.
>
>Boot with "elevator=deadline" to use the more conventional elevator.
>
>It would be good if you could get some numbers 2.4 vs 2.6, with and
>without seti running. Sounds like a long cycle though so you
> probably can't be bothered!

Not this time of the night at least, since amanda is doing her nightly
thing ATM. But that's not an impossible task, just time consuming.
Right now, 40 minutes into the backup, seti is only getting 20% of
the cpu, and I'd expect the run to be finished by about 3:50. Kernel
is 2.4.22-rc2 ATM.

Tomorrow night, I'll be running the 2.6.0-test3 kernel, and will kill
seti just before amanda starts and see how long it takes.

Then Thursday I'll be running the deadline scheduler just for
comparison, again without seti, and I'll relate the results.

Unrelated question: I've applied the 2.6 patches someone pointed me
at to the nvidia-linux-4496-pkg2 after figuring out how to get it to
unpack and leave itself behind, so X can be run on 2.6 now. But it's
a 100% total crash to exit X by any method when using it that way.

Has the patch been updated in the last couple of weeks to prevent that
now? It takes nearly half an hour to e2fsck a hundred gigs worth of
drives, and it's going to bite me if I don't let the system settle
before I crash it to reboot, finishing the reboot with the hardware
reset button.

Better yet, a fresh pointer to that site.

--
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
99.27% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attornies please note, additions to this message
by Gene Heskett are:
Copyright 2003 by Maurice Eugene Heskett, all rights reserved.

2003-08-13 05:43:48

by Andrew McGregor

Subject: Re: [PATCH] O13int for interactivity



--On Wednesday, August 13, 2003 01:24:31 AM -0400 Gene Heskett
<[email protected]> wrote:


> Unrelated question: I've applied the 2.6 patches someone pointed me
> at to the nvidia-linux-4496-pkg2 after figuring out how to get it to
> unpack and leave itself behind, so x can be run on 2.6 now. But its
> a 100% total crash to exit x by any method when using it that way.
>
> Has the patch been updated in the last couple of weeks to prevent that
> now? It takes nearly half an hour to e2fsck a hundred gigs worth of
> drives, and its going to bite me if I don't let the system settle
> before I crash it to reboot, finishing the reboot with the hardware
> reset button.
>
> Better yet, a fresh pointer to that site.
>

http://www.minion.de/

Works fine for me, as of 2.6.0-test1 (which is when I downloaded the
patch). I don't get the crash on either of my systems (GeForce2Go P3
laptop and GeForce4 Athlon desktop).

Andrew



2003-08-13 06:51:14

by Mike Galbraith

Subject: Re: [PATCH] O13int for interactivity

At 02:11 PM 8/12/2003 -0700, Mike Fedyk wrote:
>On Tue, Aug 12, 2003 at 11:42:16AM +0200, Mike Galbraith wrote:
> > At 05:18 PM 8/12/2003 +1000, Nick Piggin wrote:
> >
> > >And no, X isn't intentionally sleeping. Its being preempted which is
> > >obviously not intentional.
> >
> > Right. Every time X wakes the gl thread, he'll lose the cpu. Once the gl
> > thread passes X in priority, X is pretty much doomed. (hmm... sane [hard]
> > backboost will probably prevent that)
>
>Isn't 2.4 doing exactly that for pipes and such?

At a glance, preemption appears to be primarily a matter of timeslice in 2.4.

-Mike

2003-08-13 12:34:49

by Gene Heskett

Subject: Re: [PATCH] O13int for interactivity

On Wednesday 13 August 2003 01:43, Andrew McGregor wrote:
>--On Wednesday, August 13, 2003 01:24:31 AM -0400 Gene Heskett
>
><[email protected]> wrote:
>> Unrelated question: I've applied the 2.6 patches someone pointed
>> me at to the nvidia-linux-4496-pkg2 after figuring out how to get
>> it to unpack and leave itself behind, so x can be run on 2.6 now.
>> But its a 100% total crash to exit x by any method when using it
>> that way.
>>
>> Has the patch been updated in the last couple of weeks to prevent
>> that now? It takes nearly half an hour to e2fsck a hundred gigs
>> worth of drives, and its going to bite me if I don't let the
>> system settle before I crash it to reboot, finishing the reboot
>> with the hardware reset button.
>>
>> Better yet, a fresh pointer to that site.
>
>http://www.minion.de/
>
>Works fine for me, as of 2.6.0-test1 (which is when I downloaded the
>patch). I don't get the crash on either of my systems (GeForce2Go
> P3 laptop and GeForce4 Athlon desktop).
>
>Andrew

I see some notes about patching X, which I haven't done. That might
be it. I also double-checked that I'm running the correct makefile,
and get this:

[root@coyote NVIDIA-Linux-x86-1.0-4496-pkg2]# ls -lR * |grep Makefile
-rw-r--r-- 1 root root 3623 Jul 16 22:56 Makefile
-rw-r--r-- 1 root root 7629 Aug 5 22:24 Makefile
-rw-r--r-- 1 root root 7629 Aug 5 21:46 Makefile.kbuild
-rw-r--r-- 1 root root 4865 Aug 5 21:46 Makefile.nvidia
[root@coyote NVIDIA-Linux-x86-1.0-4496-pkg2]# cd ../NVIDIA-Linux-x86-1.0-4496-pkg2-4-2.4/
[root@coyote NVIDIA-Linux-x86-1.0-4496-pkg2-4-2.4]# ls -lR * |grep Makefile
-rw-r--r-- 1 root root 3623 Jul 16 22:56 Makefile
-rw-r--r-- 1 root root 5665 Jul 16 22:56 Makefile
[root@coyote NVIDIA-Linux-x86-1.0-4496-pkg2-4-2.4]#

My video card, from an lspci:
01:00.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX DDR] (rev b2)

And the XFree86 version is:
3.2.1-21

Interesting to note that the 'nv' driver that comes with X
does not do this. But it also has no openGL and such.
We are instructed to remove agp support from the kernel, and
use that which is in the nvidia kit, and I just checked the
.config, and it's off, so that's theoretically correct. A grep
for FB stuff returns this:

CONFIG_FB=y
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
CONFIG_FB_VESA=y
# CONFIG_FB_HGA is not set
CONFIG_FB_RIVA=y
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_VIRTUAL is not set

I'd assume the 'RIVA' fb is the correct one; it's working in
2.4, although I can induce a crash there by switching from X
to a virtual console, and then attempting to switch back to X.
That will generally bring the machine down. It is perfectly ok
to do that, repeatedly, when running the nv driver from X.

--
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
99.27% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attornies please note, additions to this message
by Gene Heskett are:
Copyright 2003 by Maurice Eugene Heskett, all rights reserved.

2003-08-13 13:48:47

by Pascal Schmidt

Subject: Re: [PATCH] O13int for interactivity

On Wed, 13 Aug 2003 07:30:11 +0200, you wrote in linux.kernel:

> Unrelated question: I've applied the 2.6 patches someone pointed me
> at to the nvidia-linux-4496-pkg2 after figuring out how to get it to
> unpack and leave itself behind, so x can be run on 2.6 now.

Do you need 3d for your testing? If not, XFree86's own nv driver seems
to work very well indeed.

--
Ciao,
Pascal

2003-08-13 14:50:24

by Gene Heskett

Subject: Re: [PATCH] O13int for interactivity

On Wednesday 13 August 2003 09:48, Pascal Schmidt wrote:
>On Wed, 13 Aug 2003 07:30:11 +0200, you wrote in linux.kernel:
>> Unrelated question: I've applied the 2.6 patches someone pointed
>> me at to the nvidia-linux-4496-pkg2 after figuring out how to get
>> it to unpack and leave itself behind, so x can be run on 2.6 now.
>
>Do you need 3d for your testing? If not, XFree86's own nv driver
> seems to work very well indeed.

Not really, since kmail is the one app that's never quit here, that and
xawtv :), so I'm going back to the nv drivers, turning on that stuff
in the kernel it needs instead, for 2.4.22-rc2 first. Then I'll see
if I can make it work to test3-mm2, which I just downloaded.

--
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
99.27% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attornies please note, additions to this message
by Gene Heskett are:
Copyright 2003 by Maurice Eugene Heskett, all rights reserved.

2003-08-14 05:04:22

by Andrew McGregor

Subject: Re: [PATCH] O13int for interactivity

Ah. I see you have framebuffer console on, whereas I have plain VGA
console only. Try turning framebuffer off; two drivers for the same
hardware may well fight over it. My X isn't patched, it just has their
driver modules and the libraries installed.
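
A minimal sketch of how one might check which framebuffer drivers a kernel .config enables, assuming the usual `CONFIG_X=y` / `# CONFIG_X is not set` line format. The function name `enabled_fb_options` is made up for illustration and is not part of any kernel tool:

```python
import re

def enabled_fb_options(config_text):
    """Return the CONFIG_FB* options set to y or m, in file order."""
    enabled = []
    for line in config_text.splitlines():
        # Tolerate mail quoting such as "> " in front of the option.
        match = re.match(r"^[>\s]*(CONFIG_FB\w*)=([ym])\s*$", line)
        if match:
            enabled.append(match.group(1))
    return enabled

# Sample lines taken from the grep output quoted in this thread.
sample = """\
CONFIG_FB=y
# CONFIG_FB_VGA16 is not set
CONFIG_FB_VESA=y
# CONFIG_FB_HGA is not set
CONFIG_FB_RIVA=y
"""
print(enabled_fb_options(sample))
# prints ['CONFIG_FB', 'CONFIG_FB_VESA', 'CONFIG_FB_RIVA']
```

With both CONFIG_FB_VESA and CONFIG_FB_RIVA in the result, two framebuffer drivers are built in alongside whatever X driver is loaded, which is the situation described above.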

Andrew

--On Wednesday, August 13, 2003 08:33:43 AM -0400 Gene Heskett
<[email protected]> wrote:

> On Wednesday 13 August 2003 01:43, Andrew McGregor wrote:
>> --On Wednesday, August 13, 2003 01:24:31 AM -0400 Gene Heskett
>>
>> <[email protected]> wrote:
>>> Unrelated question: I've applied the 2.6 patches someone pointed
>>> me at to the nvidia-linux-4496-pkg2 after figuring out how to get
>>> it to unpack and leave itself behind, so x can be run on 2.6 now.
>>> But it's a 100% total crash to exit x by any method when using it
>>> that way.
>>>
>>> Has the patch been updated in the last couple of weeks to prevent
>>> that now? It takes nearly half an hour to e2fsck a hundred gigs
>>> worth of drives, and it's going to bite me if I don't let the
>>> system settle before I crash it to reboot, finishing the reboot
>>> with the hardware reset button.
>>>
>>> Better yet, a fresh pointer to that site.
>>
>> http://www.minion.de/
>>
>> Works fine for me, as of 2.6.0-test1 (which is when I downloaded the
>> patch). I don't get the crash on either of my systems (GeForce2Go
>> P3 laptop and GeForce4 Athlon desktop).
>>
>> Andrew
>
> I see some notes about patching X, which I haven't done. That might
> be it. I also doublechecked that I'm running the correct makefile,
> and get this:
>
> [root@coyote NVIDIA-Linux-x86-1.0-4496-pkg2]# ls -lR * |grep Makefile
> -rw-r--r-- 1 root root 3623 Jul 16 22:56 Makefile
> -rw-r--r-- 1 root root 7629 Aug 5 22:24 Makefile
> -rw-r--r-- 1 root root 7629 Aug 5 21:46 Makefile.kbuild
> -rw-r--r-- 1 root root 4865 Aug 5 21:46 Makefile.nvidia
> [root@coyote NVIDIA-Linux-x86-1.0-4496-pkg2]# cd ../NVIDIA-Linux-x86-1.0-4496-pkg2-4-2.4/
> [root@coyote NVIDIA-Linux-x86-1.0-4496-pkg2-4-2.4]# ls -lR * |grep Makefile
> -rw-r--r-- 1 root root 3623 Jul 16 22:56 Makefile
> -rw-r--r-- 1 root root 5665 Jul 16 22:56 Makefile
> [root@coyote NVIDIA-Linux-x86-1.0-4496-pkg2-4-2.4]#
>
> My video card, from an lspci:
> 01:00.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX DDR] (rev b2)
>
> And the XFree86 version is:
> 3.2.1-21
>
> Interesting to note that the 'nv' driver that comes with X
> does not do this. But it also has no openGL and such.
> We are instructed to remove agp support from the kernel, and
> use that which is in the nvidia kit, and I just checked the
> .config, and it's off, so that's theoretically correct. A grep
> for FB stuff returns this:
>
> CONFIG_FB=y
># CONFIG_FB_CIRRUS is not set
># CONFIG_FB_PM2 is not set
># CONFIG_FB_CYBER2000 is not set
># CONFIG_FB_IMSTT is not set
># CONFIG_FB_VGA16 is not set
> CONFIG_FB_VESA=y
># CONFIG_FB_HGA is not set
> CONFIG_FB_RIVA=y
># CONFIG_FB_MATROX is not set
># CONFIG_FB_RADEON is not set
># CONFIG_FB_ATY128 is not set
># CONFIG_FB_ATY is not set
># CONFIG_FB_SIS is not set
># CONFIG_FB_NEOMAGIC is not set
># CONFIG_FB_3DFX is not set
># CONFIG_FB_VOODOO1 is not set
># CONFIG_FB_TRIDENT is not set
># CONFIG_FB_PM3 is not set
># CONFIG_FB_VIRTUAL is not set
>
> I'd assume the 'RIVA' fb is the correct one, it's working in
> 2.4, although I can induce a crash there by switching from X
> to a virtual console, and then attempting to switch back to X.
> That will generally bring the machine down. It is perfectly ok
> to do that, repeatedly, when running the nv driver from X.
>
> --
> Cheers, Gene
> AMD K6-III@500mhz 320M
> Athlon1600XP@1400mhz 512M
> 99.27% setiathome rank, not too shabby for a WV hillbilly
> Yahoo.com attornies please note, additions to this message
> by Gene Heskett are:
> Copyright 2003 by Maurice Eugene Heskett, all rights reserved.
>
>





2003-08-14 10:49:13

by Gene Heskett

Subject: Re: [PATCH] O13int for interactivity

On Thursday 14 August 2003 01:03, Andrew McGregor wrote:
>Ah. I see you have framebuffer console on, whereas I have plain VGA
>console only. Try turning framebuffer off; two drivers for the same
>hardware may well fight over it. My X isn't patched, it just has
> their driver modules and the libraries installed.
>
>Andrew

Currently I'm booted to test3-mm2 with the bugoff patch, using the X
nv driver, and everything in the video dept is cool. I see from dmesg
that vesafb isn't being used, so I'll take that out as well the next
time I switch to the nvidia drivers.
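
As a rough sketch, that kind of dmesg check can be scripted. It assumes framebuffer drivers prefix their log lines with their own name (`rivafb:`, `vesafb:`), as rivafb does in the snippets below; the helper name `fb_drivers_seen` is hypothetical:

```python
import re

def fb_drivers_seen(dmesg_text):
    """Return the set of '<name>fb' prefixes that logged at least one line."""
    drivers = set()
    for line in dmesg_text.splitlines():
        # Match a leading driver tag such as "rivafb:" or "vesafb:".
        match = re.match(r"^\s*(\w+fb):", line)
        if match:
            drivers.add(match.group(1))
    return drivers

# Sample lines taken from the dmesg snippets in this message.
sample = """\
PCI: Probing PCI hardware (bus 00)
rivafb: nVidia device/chipset 10DE0111
rivafb: RIVA MTRR set to ON
Console: switching to colour frame buffer device 80x30
"""
print(sorted(fb_drivers_seen(sample)))  # prints ['rivafb']
```

A driver that is compiled in but never logs anything, as vesafb appears to be here, simply does not show up in the result.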

I might add that test3-mm2 appeared to handle this morning's amanda run
more like it would have run under 2.4, a pretty nice improvement over
the bare test3, which appeared to shove amanda to the back of the
queue most of the time.

From dmesg, snippets:

Kernel command line: ro root=/dev/hda3 hdc=ide-scsi noapic vga=791
ide_setup: hdc=ide-scsi
Found and enabled local APIC!
current: c03c59c0
current->thread_info: c0454000
[...]
Initializing RT netlink socket
spurious 8259A interrupt: IRQ7.
PCI: PCI BIOS revision 2.10 entry at 0xfb4e0, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
PCI: Using IRQ router default [1106/3099] at 0000:00:00.0
rivafb: nVidia device/chipset 10DE0111
rivafb: Detected CRTC controller 0 being used
rivafb: RIVA MTRR set to ON
rivafb: PCI nVidia NV10 framebuffer ver 0.9.5b (nVidiaGeForce2-M, 32MB @ 0xE0000000)
Console: switching to colour frame buffer device 80x30
[...]
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 0000:00:11.1
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt8233 (rev 00) IDE UDMA100 controller on pci0000:00:11.1
ide0: BM-DMA at 0xd800-0xd807, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xd808-0xd80f, BIOS settings: hdc:DMA, hdd:DMA
hda: Maxtor 54610H6, ATA DISK drive
hda: IRQ probe failed (0xffffffba)
hdb: IRQ probe failed (0xffffffba)
hdb: IRQ probe failed (0xffffffba)

Those last 3 lines are new to 2.6.

Comments?

>--On Wednesday, August 13, 2003 08:33:43 AM -0400 Gene Heskett
>
><[email protected]> wrote:
>> On Wednesday 13 August 2003 01:43, Andrew McGregor wrote:
>>> --On Wednesday, August 13, 2003 01:24:31 AM -0400 Gene Heskett
>>>
>>> <[email protected]> wrote:
>>>> Unrelated question: I've applied the 2.6 patches someone
>>>> pointed me at to the nvidia-linux-4496-pkg2 after figuring out
>>>> how to get it to unpack and leave itself behind, so x can be run
>>>> on 2.6 now. But it's a 100% total crash to exit x by any method
>>>> when using it that way.
>>>>
>>>> Has the patch been updated in the last couple of weeks to
>>>> prevent that now? It takes nearly half an hour to e2fsck a
>>>> hundred gigs worth of drives, and it's going to bite me if I
>>>> don't let the system settle before I crash it to reboot,
>>>> finishing the reboot with the hardware reset button.
>>>>
>>>> Better yet, a fresh pointer to that site.
>>>
>>> http://www.minion.de/
>>>
>>> Works fine for me, as of 2.6.0-test1 (which is when I downloaded
>>> the patch). I don't get the crash on either of my systems
>>> (GeForce2Go P3 laptop and GeForce4 Athlon desktop).
>>>
>>> Andrew
>>
>> I see some notes about patching X, which I haven't done. That
>> might be it. I also doublechecked that I'm running the correct
>> makefile, and get this:
>>
>> [root@coyote NVIDIA-Linux-x86-1.0-4496-pkg2]# ls -lR * |grep Makefile
>> -rw-r--r-- 1 root root 3623 Jul 16 22:56 Makefile
>> -rw-r--r-- 1 root root 7629 Aug 5 22:24 Makefile
>> -rw-r--r-- 1 root root 7629 Aug 5 21:46 Makefile.kbuild
>> -rw-r--r-- 1 root root 4865 Aug 5 21:46 Makefile.nvidia
>> [root@coyote NVIDIA-Linux-x86-1.0-4496-pkg2]# cd ../NVIDIA-Linux-x86-1.0-4496-pkg2-4-2.4/
>> [root@coyote NVIDIA-Linux-x86-1.0-4496-pkg2-4-2.4]# ls -lR * |grep Makefile
>> -rw-r--r-- 1 root root 3623 Jul 16 22:56 Makefile
>> -rw-r--r-- 1 root root 5665 Jul 16 22:56 Makefile
>> [root@coyote NVIDIA-Linux-x86-1.0-4496-pkg2-4-2.4]#
>>
>> My video card, from an lspci:
>> 01:00.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX DDR] (rev b2)
>>
>> And the XFree86 version is:
>> 3.2.1-21
>>
>> Interesting to note that the 'nv' driver that comes with X
>> does not do this. But it also has no openGL and such.
>> We are instructed to remove agp support from the kernel, and
>> use that which is in the nvidia kit, and I just checked the
>> .config, and it's off, so that's theoretically correct. A grep
>> for FB stuff returns this:
>>
>> CONFIG_FB=y
>># CONFIG_FB_CIRRUS is not set
>># CONFIG_FB_PM2 is not set
>># CONFIG_FB_CYBER2000 is not set
>># CONFIG_FB_IMSTT is not set
>># CONFIG_FB_VGA16 is not set
>> CONFIG_FB_VESA=y
>># CONFIG_FB_HGA is not set
>> CONFIG_FB_RIVA=y
>># CONFIG_FB_MATROX is not set
>># CONFIG_FB_RADEON is not set
>># CONFIG_FB_ATY128 is not set
>># CONFIG_FB_ATY is not set
>># CONFIG_FB_SIS is not set
>># CONFIG_FB_NEOMAGIC is not set
>># CONFIG_FB_3DFX is not set
>># CONFIG_FB_VOODOO1 is not set
>># CONFIG_FB_TRIDENT is not set
>># CONFIG_FB_PM3 is not set
>># CONFIG_FB_VIRTUAL is not set
>>
>> I'd assume the 'RIVA' fb is the correct one, it's working in
>> 2.4, although I can induce a crash there by switching from X
>> to a virtual console, and then attempting to switch back to X.
>> That will generally bring the machine down. It is perfectly ok
>> to do that, repeatedly, when running the nv driver from X.
>>
>> --
>> Cheers, Gene
>> AMD K6-III@500mhz 320M
>> Athlon1600XP@1400mhz 512M
>> 99.27% setiathome rank, not too shabby for a WV hillbilly
>> Yahoo.com attornies please note, additions to this message
>> by Gene Heskett are:
>> Copyright 2003 by Maurice Eugene Heskett, all rights reserved.

--
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
99.27% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attornies please note, additions to this message
by Gene Heskett are:
Copyright 2003 by Maurice Eugene Heskett, all rights reserved.