2003-07-27 14:53:08

by Con Kolivas

Subject: [PATCH] O10int for interactivity

Here is a fairly rapid evolution of the O*int patches for interactivity thanks
to Ingo's involvement.

Changes:
I've put in some defines to clarify where the number for MAX_SLEEP_AVG now comes
from, rather than it being a magic number. In the process MAX_SLEEP_AVG (MSA)
increases ever so slightly, so that an average task that runs a full timeslice
(102ms) will drop exactly one priority in that time.
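As a rough check of the 102ms figure, the new defines can be evaluated by hand. The sketch below assumes values that are not visible in the hunks of this patch (MAX_RT_PRIO = 100, MAX_PRIO = 140, NICE_TO_PRIO(0) = 120) and takes HZ as a parameter; the function names are illustrative only and do not exist in sched.c.

```c
/* Assumed values, not shown in the patch hunks. */
enum {
	SK_MAX_RT_PRIO      = 100,
	SK_MAX_PRIO         = 140,
	SK_NICE_0_PRIO      = 120,	/* NICE_TO_PRIO(0) */
	SK_PRIO_BONUS_RATIO = 25
};

/* MAX_USER_PRIO: number of user priority levels (40 here). */
static int max_user_prio(void)
{
	return SK_MAX_PRIO - SK_MAX_RT_PRIO;
}

static int min_timeslice(int hz) { return  10 * hz / 1000; }
static int max_timeslice(int hz) { return 200 * hz / 1000; }

/* AVG_TIMESLICE: the timeslice of a nice-0 task, in ticks. */
static int avg_timeslice(int hz)
{
	return min_timeslice(hz) +
	       (max_timeslice(hz) - min_timeslice(hz)) *
	       (SK_MAX_PRIO - 1 - SK_NICE_0_PRIO) / (max_user_prio() - 1);
}

/* MAX_BONUS: number of priority steps sleep_avg can earn. */
static int max_bonus(void)
{
	return max_user_prio() * SK_PRIO_BONUS_RATIO / 100;
}

/* MAX_SLEEP_AVG: one average timeslice per bonus step. */
static int max_sleep_avg(int hz)
{
	return avg_timeslice(hz) * max_bonus();
}
```

Under these assumptions and HZ = 1000, avg_timeslice() gives 102 ticks, max_bonus() gives 10, and max_sleep_avg() gives 1020, compared with the old MAX_SLEEP_AVG of HZ = 1000.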

I've incorporated Ingo's fix for scheduling latency in a form that works for
my patch, along with the other minor tweaks.

On forking, the sleep_avg of both parent and child is set to sit exactly on a
priority bonus boundary, thus keeping their bonus the same but making them very
easy to tip to a lower priority.
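The integer arithmetic in the wake_up_forked_process() hunk can be tried out in isolation. This is only a sketch: it hard-codes MAX_BONUS = 10 and MAX_SLEEP_AVG = 1020 (the HZ = 1000 values, an assumption), and fork_sleep_avg() and bonus_steps() are made-up helper names.

```c
enum {
	MAX_BONUS      = 10,	/* assumed, HZ = 1000 */
	MAX_SLEEP_AVG  = 1020,	/* assumed, HZ = 1000 */
	PARENT_PENALTY = 100,
	CHILD_PENALTY  = 90
};

/* Number of whole bonus steps that a given sleep_avg is worth. */
static unsigned long bonus_steps(unsigned long sleep_avg)
{
	return sleep_avg * MAX_BONUS / MAX_SLEEP_AVG;
}

/*
 * Same expression as in the patch, evaluated left to right with
 * truncating integer division at every step.
 */
static unsigned long fork_sleep_avg(unsigned long sleep_avg,
				    unsigned long penalty)
{
	return sleep_avg * MAX_BONUS / MAX_SLEEP_AVG *
	       penalty / 100 * MAX_SLEEP_AVG / MAX_BONUS;
}
```

For a sleep_avg of 500 (4 bonus steps), fork_sleep_avg(500, PARENT_PENALTY) comes out at 408, the lowest value still worth 4 steps. With CHILD_PENALTY at 90 the truncation appears to leave the child one step lower, at 306, at least in this sketch.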

A tiny addition to ensure any task that runs gets charged one tick of
sleep_avg.
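A small sketch of that charge, and of why a task sitting exactly on a bonus boundary is easy to tip. The constants assume HZ = 1000 (MAX_BONUS = 10, MAX_SLEEP_AVG = 1020), and the helper names are hypothetical, not from sched.c.

```c
enum {
	MAX_BONUS     = 10,	/* assumed, HZ = 1000 */
	MAX_SLEEP_AVG = 1020	/* assumed, HZ = 1000 */
};

/* Number of whole bonus steps that a given sleep_avg is worth. */
static unsigned long bonus_steps(unsigned long sleep_avg)
{
	return sleep_avg * MAX_BONUS / MAX_SLEEP_AVG;
}

/*
 * Mirrors the check added to schedule(): if the task still holds its
 * full timeslice (it ran for less than one tick) and has any sleep_avg
 * left, charge it a single unit.
 */
static unsigned long charge_min_tick(unsigned long sleep_avg,
				     unsigned int full_timeslice,
				     unsigned int time_slice)
{
	if (full_timeslice == time_slice && sleep_avg)
		sleep_avg--;
	return sleep_avg;
}
```

A task at 408 is worth 4 bonus steps; one such charge takes it to 407, which is worth only 3.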

This patch is against 2.6.0-test1-mm2 patched up to O9int. An updated
O9int with layout corrections was posted on my website. A full O10int patch
against 2.6.0-test1 is available on my website.

Con

http://kernel.kolivas.org/2.5

patch-O10int-0307280030 :

--- linux-2.6.0-test1-mm2/kernel/sched.c 2003-07-27 14:03:16.000000000 +1000
+++ linux-2.6.0-test1ck2/kernel/sched.c 2003-07-28 00:31:39.000000000 +1000
@@ -58,6 +58,8 @@
#define USER_PRIO(p) ((p)-MAX_RT_PRIO)
#define TASK_USER_PRIO(p) USER_PRIO((p)->static_prio)
#define MAX_USER_PRIO (USER_PRIO(MAX_PRIO))
+#define AVG_TIMESLICE (MIN_TIMESLICE + ((MAX_TIMESLICE - MIN_TIMESLICE) *\
+ (MAX_PRIO-1-NICE_TO_PRIO(0))/(MAX_USER_PRIO - 1)))

/*
* These are the 'tuning knobs' of the scheduler:
@@ -68,16 +70,16 @@
*/
#define MIN_TIMESLICE ( 10 * HZ / 1000)
#define MAX_TIMESLICE (200 * HZ / 1000)
-#define TIMESLICE_GRANULARITY (HZ / 20 ?: 1)
+#define TIMESLICE_GRANULARITY (HZ/40 ?: 1)
#define CHILD_PENALTY 90
#define PARENT_PENALTY 100
#define EXIT_WEIGHT 3
#define PRIO_BONUS_RATIO 25
+#define MAX_BONUS (MAX_USER_PRIO * PRIO_BONUS_RATIO / 100)
#define INTERACTIVE_DELTA 2
-#define MAX_SLEEP_AVG (HZ)
-#define STARVATION_LIMIT (HZ)
+#define MAX_SLEEP_AVG (AVG_TIMESLICE * MAX_BONUS)
+#define STARVATION_LIMIT (MAX_SLEEP_AVG)
#define NODE_THRESHOLD 125
-#define MAX_BONUS (MAX_USER_PRIO * PRIO_BONUS_RATIO / 100)

/*
* If a task is 'interactive' then we reinsert it in the active
@@ -117,6 +119,11 @@
#define TASK_INTERACTIVE(p) \
((p)->prio <= (p)->static_prio - DELTA(p))

+#define TASK_PREEMPTS_CURR(p, rq) \
+ ((p)->prio < (rq)->curr->prio || \
+ ((p)->prio == (rq)->curr->prio && \
+ (p)->time_slice > (rq)->curr->time_slice * 2))
+
/*
* BASE_TIMESLICE scales user-nice values [ -20 ... 19 ]
* to time slice values.
@@ -341,6 +348,42 @@ static inline void __activate_task(task_
nr_running_inc(rq);
}

+static void recalc_task_prio(task_t *p)
+{
+ long sleep_time = jiffies - p->last_run - 1;
+
+ if (sleep_time > 0) {
+ p->activated = 0;
+
+ /*
+ * User tasks that sleep a long time are categorised as
+ * idle and will get just under interactive status to
+ * prevent them suddenly becoming cpu hogs and starving
+ * other processes.
+ */
+ if (p->mm && sleep_time > HZ)
+ p->sleep_avg = MAX_SLEEP_AVG *
+ (MAX_BONUS - 1) / MAX_BONUS - 1;
+ else {
+
+ /*
+ * Processes that sleep get pushed to one higher
+ * priority each time they sleep greater than
+ * one tick. -ck
+ */
+ p->sleep_avg = (p->sleep_avg * MAX_BONUS /
+ MAX_SLEEP_AVG + 1) *
+ MAX_SLEEP_AVG / MAX_BONUS;
+
+ if (p->sleep_avg > MAX_SLEEP_AVG)
+ p->sleep_avg = MAX_SLEEP_AVG;
+ }
+ }
+ p->prio = effective_prio(p);
+
+}
+
+
/*
* activate_task - move a task to the runqueue and do priority recalculation
*
@@ -350,37 +393,11 @@ static inline void __activate_task(task_
static inline void activate_task(task_t *p, runqueue_t *rq)
{
if (likely(p->last_run)){
- long sleep_time = jiffies - p->last_run - 1;
-
- if (sleep_time > 0) {
- /*
- * User tasks that sleep a long time are categorised as
- * idle and will get just under interactive status to
- * prevent them suddenly becoming cpu hogs and starving
- * other processes.
- */
- if (p->mm && sleep_time > HZ)
- p->sleep_avg = MAX_SLEEP_AVG *
- (MAX_BONUS - 1) / MAX_BONUS - 1;
- else {
-
- /*
- * Processes that sleep get pushed to one higher
- * priority each time they sleep greater than
- * one tick. -ck
- */
- p->sleep_avg = (p->sleep_avg * MAX_BONUS /
- MAX_SLEEP_AVG + 1) *
- MAX_SLEEP_AVG / MAX_BONUS;
-
- if (p->sleep_avg > MAX_SLEEP_AVG)
- p->sleep_avg = MAX_SLEEP_AVG;
- }
- }
+ p->activated = 1;
+ recalc_task_prio(p);
} else
p->last_run = jiffies;

- p->prio = effective_prio(p);
__activate_task(p, rq);
}

@@ -507,7 +524,7 @@ repeat_lock_task:
__activate_task(p, rq);
else {
activate_task(p, rq);
- if (p->prio < rq->curr->prio)
+ if (TASK_PREEMPTS_CURR(p, rq))
resched_task(rq->curr);
}
success = 1;
@@ -556,8 +573,11 @@ void wake_up_forked_process(task_t * p)
* and children as well, to keep max-interactive tasks
* from forking tasks that are max-interactive.
*/
- current->sleep_avg = current->sleep_avg * PARENT_PENALTY / 100;
- p->sleep_avg = p->sleep_avg * CHILD_PENALTY / 100;
+ current->sleep_avg = current->sleep_avg * MAX_BONUS / MAX_SLEEP_AVG *
+ PARENT_PENALTY / 100 * MAX_SLEEP_AVG /
+ MAX_BONUS;
+ p->sleep_avg = p->sleep_avg * MAX_BONUS / MAX_SLEEP_AVG *
+ CHILD_PENALTY / 100 * MAX_SLEEP_AVG / MAX_BONUS;
p->prio = effective_prio(p);
p->last_run = 0;
set_task_cpu(p, smp_processor_id());
@@ -1254,7 +1274,8 @@ void scheduler_tick(int user_ticks, int
} else
enqueue_task(p, rq->active);
} else if (!((task_timeslice(p) - p->time_slice) %
- TIMESLICE_GRANULARITY) && (p->time_slice > MIN_TIMESLICE)) {
+ TIMESLICE_GRANULARITY) && (p->time_slice > MIN_TIMESLICE) &&
+ (p->array == rq->active)) {
/*
* Running user tasks get requeued with their remaining
* timeslice after TIMESLICE_GRANULARITY provided they have at
@@ -1302,6 +1323,13 @@ need_resched:

release_kernel_lock(prev);
prev->last_run = jiffies;
+ /*
+ * If a task has run less than one tick make sure it is still
+ * charged one sleep_avg for running.
+ */
+ if (unlikely((task_timeslice(prev) == prev->time_slice) &&
+ prev->sleep_avg))
+ prev->sleep_avg--;
spin_lock_irq(&rq->lock);

/*
@@ -1349,6 +1377,13 @@ pick_next_task:
queue = array->queue + idx;
next = list_entry(queue->next, task_t, run_list);

+ if (next->activated) {
+ next->activated = 0;
+ array = next->array;
+ dequeue_task(next, array);
+ recalc_task_prio(next);
+ enqueue_task(next, array);
+ }
switch_tasks:
prefetch(next);
clear_tsk_need_resched(prev);
--- linux-2.6.0-test1-mm2/include/linux/sched.h 2003-07-24 10:31:41.000000000 +1000
+++ linux-2.6.0-test1ck2/include/linux/sched.h 2003-07-27 20:09:04.000000000 +1000
@@ -342,6 +342,7 @@ struct task_struct {

unsigned long sleep_avg;
unsigned long last_run;
+ int activated;

unsigned long policy;
cpumask_t cpus_allowed;
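The sleep_avg update in recalc_task_prio() can be sketched as a pure function for experimenting with the numbers. It assumes HZ = 1000 values for the constants, passes HZ in as a parameter, and replaces the p->mm test with a plain flag; none of these names come from the kernel.

```c
enum {
	MAX_BONUS     = 10,	/* assumed, HZ = 1000 */
	MAX_SLEEP_AVG = 1020	/* assumed, HZ = 1000 */
};

/*
 * sleep_time:   ticks slept (jiffies - last_run - 1 in the patch)
 * is_user_task: stands in for the p->mm test
 * hz:           ticks per second
 */
static unsigned long sleep_avg_after_sleep(unsigned long sleep_avg,
					   long sleep_time,
					   int is_user_task, long hz)
{
	if (sleep_time <= 0)
		return sleep_avg;

	/* Long sleepers are parked just below the top bonus step. */
	if (is_user_task && sleep_time > hz)
		return MAX_SLEEP_AVG * (MAX_BONUS - 1) / MAX_BONUS - 1;

	/* Otherwise round down to a bonus boundary, then add one step. */
	sleep_avg = (sleep_avg * MAX_BONUS / MAX_SLEEP_AVG + 1) *
		    MAX_SLEEP_AVG / MAX_BONUS;

	if (sleep_avg > MAX_SLEEP_AVG)
		sleep_avg = MAX_SLEEP_AVG;
	return sleep_avg;
}
```

With these constants a task at 500 that sleeps a few ticks moves to 510, a task already at 1020 stays capped at 1020, and a user task that sleeps longer than a second is set to 917 regardless of where it started.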


2003-07-27 16:22:17

by Wade

Subject: Re: [PATCH] O10int for interactivity

Con Kolivas wrote:
> Here is a fairly rapid evolution of the O*int patches for interactivity thanks
> to Ingo's involvement.
>
> Changes:
> I've put in some defines to clarify where the numbers for MAX_SLEEP_AVG come
> from now, rather than the number being magic. In the process it increases MSA
> ever so slightly so that an average task that runs a full timeslice (102ms)
> will drop exactly one priority in that time.
>
> I've incorporated Ingo's fix for scheduling latency in a form that works for
> my patch, along with the other minor tweaks.
>
> The parent and child sleep avg on forking is set to just on the priority bonus
> value with each fork thus keeping their bonus the same but making them very
> easy to tip to a lower priority.
>
> A tiny addition to ensure any task that runs gets charged one tick of
> sleep_avg.
>
> This patch is against 2.6.0-test1-mm2 patched up to O9int. An updated
> O9int with layout corrections was posted on my website. A full O10int patch
> against 2.6.0-test1 is available on my website.
>
> Con
>
> http://kernel.kolivas.org/2.5
>
[snip]

Anyone who has been holding back on trying these patches out should try
this one, it's _very_ smooth for me here. Thank you Con & Ingo!



2003-07-28 00:02:21

by Felipe Alfaro Solana

Subject: Re: [PATCH] O10int for interactivity

On Sun, 2003-07-27 at 17:12, Con Kolivas wrote:
> Here is a fairly rapid evolution of the O*int patches for interactivity thanks
> to Ingo's involvement.

Only one word: IMPRESSIVE

It's smooth, pretty smooth. However, as I'm a little bit picky, for
maximum smoothness, I've reniced X to -20. Good work!

2003-07-28 01:51:57

by Voluspa

Subject: Re: [PATCH] O10int for interactivity


# Resend, since vger seems to have dumped the first attempt:

On 2003-07-27 16:26:23 Wade wrote:

> Con Kolivas wrote:
>> Here is a fairly rapid evolution of the O*int patches for
>> interactivity thanks to Ingo's involvement.
[...]
> Anyone who has been holding back on trying these patches out should
> try this one, it's _very_ smooth for me here. Thank you Con & Ingo!

Incredible. I haven't tried the patches before since I experienced only
slight jerkiness - probably due to running a very light environment. No
desktop, only Enlightenment as wm. This on a PII 400. And no audio skips
in xmms (oss emulation of alsa).

Compiling 2.6.0-test1 with O10int first gave a definite smoothness when
browsing heavy pages with Opera 6.02. But the real surprise came when I
tried Baldur's Gate I under winex3. That oldie but goodie has been
impossible to play with all 2.4-s and 2.5-s (and plain -test1) due to
extreme jerks in sound, mouse and screen panning. Now... almost as
smooth as I remember it from a WinDos machine! With music and all the
sound effects! I'm almost speechless. What a job well done.

Regards,
Mats Johannesson

2003-07-28 07:36:05

by Wiktor Wodecki

Subject: Re: [PATCH] O10int for interactivity

On Mon, Jul 28, 2003 at 01:12:16AM +1000, Con Kolivas wrote:
> Here is a fairly rapid evolution of the O*int patches for interactivity thanks
> to Ingo's involvement.
>
> Changes:
> I've put in some defines to clarify where the numbers for MAX_SLEEP_AVG come
> from now, rather than the number being magic. In the process it increases MSA
> ever so slightly so that an average task that runs a full timeslice (102ms)
> will drop exactly one priority in that time.
>
> I've incorporated Ingo's fix for scheduling latency in a form that works for
> my patch, along with the other minor tweaks.
>
> The parent and child sleep avg on forking is set to just on the priority bonus
> value with each fork thus keeping their bonus the same but making them very
> easy to tip to a lower priority.
>
> A tiny addition to ensure any task that runs gets charged one tick of
> sleep_avg.
>
> This patch is against 2.6.0-test1-mm2 patched up to O9int. An updated
> O9int with layout corrections was posted on my website. A full O10int patch
> against 2.6.0-test1 is available on my website.

Okay, I applied O10 on top of 2.6.0-test2. I see the same problem I wrote to
you about yesterday with O9: when I start OpenOffice while bzip2 is running in
the background, OO becomes nearly unusable. I can type a sentence and then
watch the characters appear. I don't know whether this was always the case,
since I haven't used OO much before (I need it for university now).

--
Regards,

Wiktor Wodecki



2003-07-28 07:45:05

by Wiktor Wodecki

Subject: Re: [PATCH] O10int for interactivity


On Mon, Jul 28, 2003 at 12:55:43AM -0700, Andrew Morton wrote:
> Wiktor Wodecki wrote:
> >
> > The same problem I wrote you
> > yesterday about O9, when starting OpenOffice and bzip2'ing in the
> > background OO becomes nearly unusable
>
> There's a known problem with OpenOffice and its use of sched_yield().
> sched_yield() got changed in 2.6 and it makes OO unusable when there is
> other stuff happening.
>
> Apparently it has been fixed in recent OpenOffice versions. If you cannot
> reproduce this problem in any other application I'd be saying it is "not a
> bug".

No, I have tried others. I'll write 'OO-Update' on my ToDo as #434355
then.

--
Regards,

Wiktor Wodecki



2003-07-28 07:40:33

by Andrew Morton

Subject: Re: [PATCH] O10int for interactivity

Wiktor Wodecki wrote:
>
> The same problem I wrote you
> yesterday about O9, when starting OpenOffice and bzip2'ing in the
> background OO becomes nearly unusable

There's a known problem with OpenOffice and its use of sched_yield().
sched_yield() got changed in 2.6 and it makes OO unusable when there is
other stuff happening.

Apparently it has been fixed in recent OpenOffice versions. If you cannot
reproduce this problem in any other application I'd be saying it is "not a
bug".

2003-07-28 16:57:39

by Jose Luis Domingo Lopez

Subject: Re: [PATCH] O10int for interactivity

On Monday, 28 July 2003, at 00:55:43 -0700,
Andrew Morton wrote:

> There's a known problem with OpenOffice and its use of sched_yield().
> sched_yield() got changed in 2.6 and it makes OO unusable when there is
> other stuff happening.
>
> Apparently it has been fixed in recent OpenOffice versions. If you cannot
> reproduce this problem in any other application I'd be saying it is "not a
> bug".
>
This must be the reason behind a simple "Save..." of a small OpenOffice
Writer document taking 3 seconds with no activity on the box, and two
minutes when "make bzImage", "yes", or anything else CPU-intensive is running.

On the other hand, OO saves files (which is a CPU-bound process) only
marginally slower under heavy hard disk read activity than on an idle
system. Tested with 2.6.0-test2 and with 2.6.0-test2 with latest
scheduler patch from Ingo (2.6.0-test1-G6).

Regards,

--
Jose Luis Domingo Lopez
Linux Registered User #189436 Debian Linux Sid (Linux 2.6.0-test1-bk3)

2003-07-28 17:53:05

by Valdis Klētnieks

Subject: Re: [PATCH] O10int for interactivity

On Mon, 28 Jul 2003 01:12:16 +1000, Con Kolivas said:
> Here is a fairly rapid evolution of the O*int patches for interactivity thanks
> to Ingo's involvement.

I'm running the -O10 variant that's in Andrew's -test2-mm1 patch, and I'm
totally unable to force the CPU scheduler to misbehave.

I am, however, able to get 'xmms' to skip. The reason is that the CPU is being
scheduled quite adequately, but I/O is *NOT*.

The reason is that xmms's .ogg decoder is reading a 128K chunk every 10 seconds
or so, and doesn't do the next read till it's *really* close to running out of
data. Unfortunately, under high I/O load (which isn't all THAT high, it's a
HITACHI_DK23DA-40 in a Dell Laptop) it's possible for that 128K read to get
stuck behind other stuff that's doing heavy I/O (for instance, starting Mozilla
or OpenOffice, or sometimes a 'find' command).

I'm guessing that the anticipatory scheduler is the culprit here. Soon as I figure
out the incantations to use the deadline scheduler, I'll report back....



2003-07-28 18:25:37

by Andrew Morton

Subject: Re: [PATCH] O10int for interactivity

[email protected] wrote:
>
> I am, however, able to get 'xmms' to skip. The reason is that the CPU is being
> scheduled quite adequately, but I/O is *NOT*.
>
> ...
> I'm guessing that the anticipatory scheduler is the culprit here. Soon as I figure
> out the incantations to use the deadline scheduler, I'll report back....

Try decreasing the expiry times in /sys/block/hda/queue/iosched:

read_batch_expire
read_expire
write_batch_expire
write_expire

2003-07-28 21:31:30

by Wiktor Wodecki

Subject: Re: [PATCH] O10int for interactivity

On Mon, Jul 28, 2003 at 11:40:41AM -0700, Andrew Morton wrote:
> [email protected] wrote:
> >
> > I am, however, able to get 'xmms' to skip. The reason is that the CPU is being
> > scheduled quite adequately, but I/O is *NOT*.
> >
> > ...
> > I'm guessing that the anticipatory scheduler is the culprit here. Soon as I figure
> > out the incantations to use the deadline scheduler, I'll report back....
>
> Try decreasing the expiry times in /sys/block/hda/queue/iosched:
>
> read_batch_expire
> read_expire
> write_batch_expire
> write_expire

I noticed that when bringing a huge application out of swap (mozilla,
openoffice; I also tested the gimp with 50 images open), dividing the
values in those 4 files by 2 gives me a decent process fork. Without
this tuning, the fork (xterm) waits until the application is back up.
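For anyone wanting to repeat the experiment, here is a minimal sketch of halving the four tunables from a small C helper. It assumes each file holds a single plain integer and accepts one written back, which I have not verified; the helper names are made up, and the directory is passed in rather than hard-coded.

```c
#include <stdio.h>

/* Read one integer from a file; returns 0 on success, -1 on failure. */
static int read_tunable(const char *path, long *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%ld", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Write one integer to a file; returns 0 on success, -1 on failure. */
static int write_tunable(const char *path, long value)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%ld\n", value) < 0) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/* Halve each expiry tunable under dir; returns how many were updated. */
static int halve_as_expiries(const char *dir)
{
	static const char *const names[] = {
		"read_batch_expire", "read_expire",
		"write_batch_expire", "write_expire"
	};
	char path[512];
	int updated = 0;
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		long v;

		snprintf(path, sizeof(path), "%s/%s", dir, names[i]);
		if (read_tunable(path, &v) == 0 &&
		    write_tunable(path, v / 2) == 0)
			updated++;
	}
	return updated;
}
```

Calling halve_as_expiries("/sys/block/hda/queue/iosched") as root would be the equivalent of the manual tuning described above; a return value below 4 means at least one file could not be read or written.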

--
Regards,

Wiktor Wodecki



2003-07-28 21:47:24

by Andrew Morton

Subject: Re: [PATCH] O10int for interactivity

Wiktor Wodecki wrote:
>
> On Mon, Jul 28, 2003 at 11:40:41AM -0700, Andrew Morton wrote:
> > [email protected] wrote:
> > >
> > > I am, however, able to get 'xmms' to skip. The reason is that the CPU is being
> > > scheduled quite adequately, but I/O is *NOT*.
> > >
> > > ...
> > > I'm guessing that the anticipatory scheduler is the culprit here. Soon as I figure
> > > out the incantations to use the deadline scheduler, I'll report back....
> >
> > Try decreasing the expiry times in /sys/block/hda/queue/iosched:
> >
> > read_batch_expire
> > read_expire
> > write_batch_expire
> > write_expire
>
> I noticed that when bringing a huge application out of swap (mozilla,
> openoffice, also tested the gimp with 50 images open) that dividing
> everything by 2 in those 4 files I get a decent process fork. Without
> this tuning the fork (xterm) waits till the application is back up.

Interesting. What we have there is pretty much a straight tradeoff between
latency and throughput. It could be that the defaults are not centered in
the right spot.

It will need some careful characterisation. Maybe we can persuade Nick to
generate the mystical Documentation/as-iosched.txt?

2003-07-29 14:12:42

by Timothy Miller

Subject: Re: [PATCH] O10int for interactivity



[email protected] wrote:

> I'm guessing that the anticipatory scheduler is the culprit here. Soon as I figure
> out the incantations to use the deadline scheduler, I'll report back....
>

It would be unfortunate if AS and the interactivity scheduler were to
conflict. Is there a way we can have them talk to each other and have
AS boost some I/O requests for tasks which are marked as interactive?

It would sacrifice some throughput for the sake of interactivity, which
is what the interactivity patches do anyhow. This is a reasonable
compromise.

2003-07-29 14:30:49

by Con Kolivas

Subject: Re: [PATCH] O10int for interactivity

On Wed, 30 Jul 2003 00:21, Timothy Miller wrote:
> [email protected] wrote:
> > I'm guessing that the anticipatory scheduler is the culprit here. Soon
> > as I figure out the incantations to use the deadline scheduler, I'll
> > report back....
>
> It would be unfortunate if AS and the interactivity scheduler were to
> conflict. Is there a way we can have them talk to each other and have
> AS boost some I/O requests for tasks which are marked as interactive?
>
> It would sacrifice some throughput for the sake of interactivity, which
> is what the interactivity patches do anyhow. This is a reasonable
> compromise.

That's not as silly as it sounds. In fact it should be dead easy to
increase/decrease the amount of anticipatory time based on the bonus from
looking at the code. I dunno how the higher filesystem gods feel about this
though.

Con

2003-07-29 15:25:09

by Timothy Miller

Subject: Re: [PATCH] O10int for interactivity



Con Kolivas wrote:
> On Wed, 30 Jul 2003 00:21, Timothy Miller wrote:
>
>>[email protected] wrote:
>>
>>>I'm guessing that the anticipatory scheduler is the culprit here. Soon
>>>as I figure out the incantations to use the deadline scheduler, I'll
>>>report back....
>>
>>It would be unfortunate if AS and the interactivity scheduler were to
>>conflict. Is there a way we can have them talk to each other and have
>>AS boost some I/O requests for tasks which are marked as interactive?
>>
>>It would sacrifice some throughput for the sake of interactivity, which
>>is what the interactivity patches do anyhow. This is a reasonable
>>compromise.
>
>
> That's not as silly as it sounds. In fact it should be dead easy to
> increase/decrease the amount of anticipatory time based on the bonus from
> looking at the code. I dunno how the higher filesystem gods feel about this
> though.


On the one hand, it's nice to keep systems independent so that you can
make them separately optional, but on the other hand, if they can talk
to each other, it makes for an all-around better-performing system,
because things don't stomp on each other.

They will need to pay attention to each other's kernel config options so
as to keep or leave out whatever code communicates between them. How
hard is that to do?



2003-07-29 15:29:04

by Valdis Klētnieks

Subject: Re: [PATCH] O10int for interactivity

On Tue, 29 Jul 2003 10:21:35 EDT, Timothy Miller said:

> > I'm guessing that the anticipatory scheduler is the culprit here. Soon as
> > I figure out the incantations to use the deadline scheduler, I'll report
> > back....

> It would be unfortunate if AS and the interactivity scheduler were to
> conflict. Is there a way we can have them talk to each other and have
> AS boost some I/O requests for tasks which are marked as interactive?

Well... it turns out I was half right, sort of. My remaining glitches *were*
I/O related rather than the CPU scheduler. However, they weren't directly
related to the /sys/block/hda/queue/iosched/* values.

Turns out that at least on this laptop, 256M is just a bit tight on memory under
some conditions (well... OK... having X and xmms running, and then doing a
'tar xjvf linux-2.6.0-test1.tar.bz2' and launching OpenOffice 1.1rc1 all at once
is probably a stress test and a half ;).

Watching /proc/vmstat, it became obvious that audio skips were happening *only*
when 'pswpout' was going up - which means somebody's waiting on a page *IN*
that won't happen till another page goes *out* to swap first.....
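A rough way to automate that observation is to sample the pswpout counter and compare successive readings. The sketch below assumes /proc/vmstat consists of "name value" lines, one per counter; both helper names are invented for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find a "name value" line in vmstat-style text.
 * Returns 0 and stores the value on success, -1 if the name is absent.
 */
static int vmstat_lookup(const char *text, const char *name,
			 unsigned long *value)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Match the whole counter name, not just a prefix. */
		if (strncmp(line, name, len) == 0 && line[len] == ' ')
			return sscanf(line + len, "%lu", value) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Read the current value of one counter from /proc/vmstat. */
static int vmstat_read(const char *name, unsigned long *value)
{
	static char buf[65536];
	FILE *f = fopen("/proc/vmstat", "r");
	size_t n;

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return vmstat_lookup(buf, name, value);
}
```

Calling vmstat_read("pswpout", &v) once a second and logging whenever v increases would give timestamps to line up against the audio skips.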

Time for more pondering.. ;)



2003-07-29 15:33:35

by Timothy Miller

Subject: Re: [PATCH] O10int for interactivity



[email protected] wrote:

>
> Well... it turns out I was half right, sort of. My remaining glitches *were*
> I/O related rather than the CPU scheduler. However, they weren't directly
> related to the /sys/block/hda/queue/iosched/* values.
>
> Turns out that at least on this laptop, 256M is just a bit tight on memory under
> some conditions (well... OK... having X and xmms running, and then doing a
> 'tar xjvf linux-2.6.0-test1.tar.bz2' and launching OpenOffice 1.1rc1 all at once
> is probably a stress test and a half ;).
>
> Watching /proc/vmstat, it became obvious that audio skips were happening *only*
> when 'pswpout' was going up - which means somebody's waiting on a page *IN*
> that won't happen till another page goes *out* to swap first.....
>
> Time for more pondering.. ;)

Heh... can we prioritize swapping based on interactivity information? :)

2003-07-29 15:45:54

by Valdis Klētnieks

Subject: Re: [PATCH] O10int for interactivity

On Tue, 29 Jul 2003 11:44:11 EDT, Timothy Miller said:

> Heh... can we prioritize swapping based on interactivity information? :)

That concept has been seen before in other operating systems, and I'm
pretty sure that although Con and Ingo are doing stellar work, they will
soon have to start feeding back I/O and paging patterns into the calculations...

In the meantime, I probably need to see what tweaking the various paging
controls does... /proc/sys/vm/swappiness looks like a good place to start. ;)



2003-07-30 01:16:01

by Diego Calleja

Subject: Re: [PATCH] O10int for interactivity

On Wed, 30 Jul 2003 00:35:01 +1000, Con Kolivas wrote:

>
> That's not as silly as it sounds. In fact it should be dead easy to
> increase/decrease the amount of anticipatory time based on the bonus from
> looking at the code. I dunno how the higher filesystem gods feel about this
> though.

I've done a small patch (one line) which tries to implement that.
At as-iosched.c:as_add_request() there's:

/*
* set expire time (only used for reads) and add to fifo list
*/
arq->expires = jiffies + ad->fifo_expire[data_dir];

ad->fifo_expire[data_dir] should be /sys/block/hda/queue/iosched/read_expire
(I've not checked it and I may be wrong), so instead of adding the static
read_expire we increase/decrease it a bit based on current->static_prio.
NOTE: I don't even know if static_prio is what I'm looking for; it just sounds like it is.

diff -puN drivers/block/as-iosched.c~dyndeadline drivers/block/as-iosched.c
--- unsta.moo/drivers/block/as-iosched.c~dyndeadline 2003-07-30 02:49:34.000000000 +0200
+++ unsta.moo-diego/drivers/block/as-iosched.c 2003-07-30 02:51:06.000000000 +0200
@@ -1300,7 +1300,8 @@ static void as_add_request(struct as_dat
/*
* set expire time (only used for reads) and add to fifo list
*/
- arq->expires = jiffies + ad->fifo_expire[data_dir];
+ arq->expires = jiffies + ad->fifo_expire[data_dir] +
+ ((ad->fifo_expire[data_dir] * current->static_prio * 5)/100);
list_add_tail(&arq->fifo, &ad->fifo_list[data_dir]);
arq->state = AS_RQ_QUEUED;
as_update_arq(ad, arq); /* keep state machine up to date */

_


The patch should do the following:
read_expire=50 (the default value)

If current->static_prio is -20, the deadline given to the request
is 0; if it's 20 (well, there are only priorities up to +19, I think,
but you get the idea) the deadline is read_expire*2, and the rest fall
in between.
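The patched expression is easy to evaluate on its own. This sketch uses plain long arithmetic and a made-up function name; it takes read_expire = 50 as above and treats it as a tick count, which is an assumption.

```c
/*
 * Offset added to jiffies by the patched line:
 *   fifo_expire + (fifo_expire * static_prio * 5) / 100
 */
static long as_expire_offset(long fifo_expire, long static_prio)
{
	return fifo_expire + (fifo_expire * static_prio * 5) / 100;
}
```

With fifo_expire = 50 this gives 0 for -20, 25 for -10, 50 for 0 and 100 for +20. If static_prio is on the 100-139 scale that the CPU scheduler uses for user tasks rather than on the nice scale, the same call gives 350 for a value of 120, i.e. seven times read_expire.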

It isn't a very nice patch; first, because a deadline of 0 is wrong, I suppose.
This should be doing read_expire +/- read_expire, and it'd probably be
better to use read_expire +/- 20% of read_expire or so. The 20 could
be a value exported to sysfs...

The patch's effects haven't been tested (I have to wake up in 5 hours), but at
least it compiles and runs on a 2x box, so it can't be that bad. I hope
it helps.


Diego Calleja

2003-07-30 01:31:43

by Con Kolivas

Subject: Re: [PATCH] O10int for interactivity

On Wed, 30 Jul 2003 11:16, Diego Calleja García wrote:
> On Wed, 30 Jul 2003 00:35:01 +1000, Con Kolivas wrote:
> > That's not as silly as it sounds. In fact it should be dead easy to
> > increase/decrease the amount of anticipatory time based on the bonus from
> > looking at the code. I dunno how the higher filesystem gods feel about
> > this though.
>
> I've done a small patch (one line) which tries to implement that.
> At as-iosched.c:as_add_request() there's:

Whether a task is interactive is determined by the difference between its
static and dynamic priority:
current->static_prio - current->prio
will give you a number from -5 to +5, with +5 being a good bonus and vice versa.
However, you need to ensure that the value you are fiddling with in the i/o
scheduler is actually due to the current process[1]

On top of that, p->prio itself will give you a number from 0 to 140, with
higher meaning a lower priority task; numbers 100-140 are for user tasks
and <100 for real time tasks.
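To make the ranges concrete, here is a hypothetical sketch of turning that difference into an expiry adjustment. Nothing like scaled_expire() exists in the i/o scheduler; the +/- percentage scheme and every name below are assumptions made purely for illustration.

```c
/* Bonus in the range -5..+5; positive means the task looks interactive. */
static int interactivity_bonus(int static_prio, int prio)
{
	return static_prio - prio;
}

/*
 * Hypothetical scaling: shorten the expiry by up to pct percent for a
 * bonus of +max_delta, lengthen it by up to pct percent for -max_delta.
 */
static long scaled_expire(long base, int bonus, int max_delta, int pct)
{
	if (bonus > max_delta)
		bonus = max_delta;
	if (bonus < -max_delta)
		bonus = -max_delta;
	return base - base * pct * bonus / (100 * max_delta);
}
```

A nice-0 task (static_prio 120) running at prio 115 has a bonus of +5; with a base expiry of 50 and pct = 20, scaled_expire() gives 40, while a bonus of -5 gives 60. The caveat above still applies: this only makes sense when the request really belongs to the task whose priority is being read.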

These all change if you fiddle with the magic in bonus ratios and max rt prio
etc.

Con

[1] This is why I didn't bother posting my attempts ;)

2003-07-30 19:30:15

by d.c

Subject: Re: [PATCH] O10int for interactivity

[changing email address; several hosts block mail from *@teleline/terra.es;
which is good if they're fighting against spam]
On Wed, 30 Jul 2003 11:36:06 +1000, Con Kolivas wrote:

> The logic is in the difference between the dynamic and the static priority to
> determine if a task is interactive.
> current->static_prio - current->prio
> will give you a number of -5 to +5, with +5 being a good bonus and vice versa.
> however you need to ensure that the value you are fiddling with in the i/o
> scheduler is actually due to the current process[1]

I think current really is the process submitting the request; at least, in the
same function we have this:

if (rq_data_dir(arq->request) == READ
|| current->flags&PF_SYNCWRITE)

Which would be wrong if current isn't the process submitting the request.


Diego Calleja

2003-07-30 19:46:17

by Andrew Morton

Subject: Re: [PATCH] O10int for interactivity

[email protected] wrote:
>
> [changing email address; several hosts block mail from *@teleline/terra.es;
> which is good if they're fighting against spam]
> On Wed, 30 Jul 2003 11:36:06 +1000, Con Kolivas wrote:
>
> > The logic is in the difference between the dynamic and the static priority to
> > determine if a task is interactive.
> > current->static_prio - current->prio
> > will give you a number of -5 to +5, with +5 being a good bonus and vice versa.
> > however you need to ensure that the value you are fiddling with in the i/o
> > scheduler is actually due to the current process[1]
>
> I think current really is the process submitting the request; at least in the
> same function we've this:
>
> if (rq_data_dir(arq->request) == READ
> || current->flags&PF_SYNCWRITE)
>
> Which would be wrong if current isn't the process submitting the request.

`current' is correct for reads and synchronous writes. It is usually wrong
for normal pagecache writeback.

If we're going to do this sort of thing, the IO priority and any associated
state should be placed into struct io_context, which is the structure with
which the IO scheduler tracks per-process stuff.

The io_context is constructed just once across the lifetime of a process,
so we'd need to update it occasionally to pick up dynamic priority shifts,
changes in niceness, etc. Probably do that inside a read.

I have a vague feeling that co-opting the scheduling priority information
for use as IO priority will end up being a mistake. It may be best to
treat these things separately from the outset.

2003-07-31 06:36:48

by Nick Piggin

Subject: Re: [PATCH] O10int for interactivity



Andrew Morton wrote:

>Wiktor Wodecki wrote:
>
>>On Mon, Jul 28, 2003 at 11:40:41AM -0700, Andrew Morton wrote:
>>
>>>[email protected] wrote:
>>>
>>>>I am, however, able to get 'xmms' to skip. The reason is that the CPU is being
>>>> scheduled quite adequately, but I/O is *NOT*.
>>>>
>>>>...
>>>> I'm guessing that the anticipatory scheduler is the culprit here. Soon as I figure
>>>> out the incantations to use the deadline scheduler, I'll report back....
>>>>
>>>Try decreasing the expiry times in /sys/block/hda/queue/iosched:
>>>
>>>read_batch_expire
>>>read_expire
>>>write_batch_expire
>>>write_expire
>>>
>>I noticed that when bringing a huge application out of swap (mozilla,
>>openoffice, also tested the gimp with 50 images open) that dividing
>>everything by 2 in those 4 files I get a decent process fork. Without
>>this tuning the fork (xterm) waits till the application is back up.
>>
>>
>
>Interesting. What we have there is pretty much a straight tradeoff between
>latency and throughput. It could be that the defaults are not centered in
>the right spot.
>

Well it should help a bad case application by about 2x by doing this.
It will very roughly change efficiency of a streaming IO vs other IO
from 80% to 60% which is going too far for a default IMO. A better
idea would be to do the exec prefaulting you had in your tree...
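
For anyone wanting to repeat the "divide everything by 2" experiment, a
rough sketch in C follows. It assumes each of the four files holds a single
decimal integer and that you have permission to write to it; the helper
names halve_tunable and halve_as_expiries are invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Read one integer tunable from `path`, write back half of it (at least 1).
 * Returns the new value, or -1 on any I/O or parse error. */
static long halve_tunable(const char *path)
{
    long value;
    FILE *f;

    assert(path != NULL);

    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1) {
        fclose(f);
        return -1;
    }
    fclose(f);

    value /= 2;
    if (value < 1)
        value = 1;

    f = fopen(path, "w");
    if (!f)
        return -1;
    if (fprintf(f, "%ld\n", value) < 0) {
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0)
        return -1;
    return value;
}

/* Halve all four expiry tunables under `dir`, for example
 * "/sys/block/hda/queue/iosched". Returns the number of files updated. */
static int halve_as_expiries(const char *dir)
{
    static const char *const names[] = {
        "read_batch_expire", "read_expire",
        "write_batch_expire", "write_expire",
    };
    char path[512];
    int updated = 0;
    size_t i;

    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        snprintf(path, sizeof(path), "%s/%s", dir, names[i]);
        if (halve_tunable(path) >= 0)
            updated++;
    }
    return updated;
}
```

Calling halve_as_expiries("/sys/block/hda/queue/iosched") as root would
apply the change to hda; the settings do not survive a reboot.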

Oh, and the process scheduler can definitely be a contributing factor.
Even if it looks like your process is getting enough cpu, if your
process doesn't get woken in less than 5ms after its read completes,
then AS will give up waiting for it.

>
>It will need some careful characterisation. Maybe we can persuade Nick to
>generate the mystical Documentation/as-iosched.txt?
>

I did send one to you but not in patch form so I guess you were a
bit lazy with it! I guess I'll be doing this autotuning thing soon
so it is going to change.

2003-07-31 07:38:53

by Con Kolivas

Subject: Re: [PATCH] O10int for interactivity

On Thu, 31 Jul 2003 16:36, Nick Piggin wrote:
> Oh, and the process scheduler can definitely be a contributing factor.
> Even if it looks like your process is getting enough cpu, if your
> process doesn't get woken in less than 5ms after its read completes,
> then AS will give up waiting for it.

This part interests me. It would seem that either
1. The AS scheduler should not bother waiting at all if the process is not
going to wake up in that time
2. The process should be woken in that time to ensure the AS scheduler is not
wasting its time waiting.
or a combination of 1 and 2 depending on some heuristic deciding on how
important it is for 2 instead of 1.

No, I'm not planning on trying to implement either of these <insert usual
complaint about time and knowledge here>, but I thought I should at least
contribute my thoughts.

Con

2003-07-31 07:59:33

by Nick Piggin

Subject: Re: [PATCH] O10int for interactivity



Con Kolivas wrote:

>On Thu, 31 Jul 2003 16:36, Nick Piggin wrote:
>
>>Oh, and the process scheduler can definitely be a contributing factor.
>>Even if it looks like your process is getting enough cpu, if your
>>process doesn't get woken in less than 5ms after its read completes,
>>then AS will give up waiting for it.
>>
>
>This part interests me. It would seem that either
>1. The AS scheduler should not bother waiting at all if the process is not
>going to wake up in that time
>

It doesn't. Lacking a crystal ball, it relies on heuristics
to achieve this. It generally works.

>
>2. The process should be woken in that time to ensure the AS scheduler is not
>wasting its time waiting.
>or a combination of 1 and 2 depending on some heuristic deciding on how
>important it is for 2 instead of 1.
>

Well yes, for any IO scheduler it's important that a process being
woken for IO is run ASAP. It is really up to the process scheduler
to hash out the policy here, just keep in mind that this is an
important metric.

If the scheduler / CPU can't keep up, then AS's heuristic should
kick in quickly.
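
As an illustration of the kind of heuristic meant here (a sketch only, not
the actual AS code): keep a decaying average of how long each process takes
between an IO completing and its next request, and stop anticipating once
that average exceeds the wait window. The struct, the function names, the
1/8 decay weight and the 5ms constant are all assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>

#define ANTIC_EXPIRE_MS 5   /* the ~5ms window mentioned above; assumed */

/* Hypothetical per-process statistics; names are illustrative only. */
struct thinktime_sketch {
    unsigned long mean_ms_x8;   /* decayed mean think time, scaled by 8 */
    int samples;                /* number of intervals seen so far */
};

/* Fold one observed gap (IO completion to next request) into the mean:
 * new_mean = 7/8 * old_mean + 1/8 * gap, stored scaled by 8. */
static void thinktime_update(struct thinktime_sketch *s, unsigned long gap_ms)
{
    assert(s != NULL);
    if (s->samples == 0)
        s->mean_ms_x8 = gap_ms * 8;
    else
        s->mean_ms_x8 = (s->mean_ms_x8 * 7) / 8 + gap_ms;
    s->samples++;
}

/* Nonzero if waiting for this process's next request looks worthwhile. */
static int worth_anticipating(const struct thinktime_sketch *s)
{
    assert(s != NULL);
    if (s->samples == 0)
        return 1;   /* no history yet: give the process a chance */
    return s->mean_ms_x8 <= ANTIC_EXPIRE_MS * 8;
}
```

A process that is regularly woken late stops being waited for after a
sample or two, and earns the wait back once its gaps shrink again.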

>
>No, I'm not planning on trying to implement either of these <insert usual
>complaint about time and knowledge here>, but I thought I should at least
>contribute my thoughts.
>
>

No need! Just keep in mind that newly waking processes are important.
Not just for disk but any sort of IO: network, soundcard buffers, etc.


2003-07-31 15:00:09

by Jamie Lokier

Subject: Re: [PATCH] O10int for interactivity

Con Kolivas wrote:
> On Thu, 31 Jul 2003 16:36, Nick Piggin wrote:
> > Oh, and the process scheduler can definitely be a contributing factor.
> > Even if it looks like your process is getting enough cpu, if your
> > process doesn't get woken in less than 5ms after its read completes,
> > then AS will give up waiting for it.
>
> This part interests me. It would seem that either
> 1. The AS scheduler should not bother waiting at all if the process is not
> going to wake up in that time

How about something as simple as: if process sleeps, and AS scheduler
is waiting since last request from that process, AS scheduler stops
waiting immediately?

In other words, a hook in the process scheduler when a process goes to
sleep, to tell the AS scheduler to stop waiting.

Although this would not always be optimal, for many cases the point of
AS is that the process is continuing to run, not sleeping, and will
issue another request shortly.

-- Jamie

2003-07-31 15:42:57

by Jamie Lokier

Subject: Re: [PATCH] O10int for interactivity

Oliver Neukum wrote:
> > Although this would not always be optimal, for many cases the point of
> > AS is that the process is continuing to run, not sleeping, and will
> > issue another request shortly.
>
> How do you tell which task dirtied the page?

No idea :)

It may be easier to tell which task is _waiting_ on the page when an
I/O completes, as that is the task you are hoping will issue another
I/O to a similar place on the disk soon.

> Wouldn't giving a bonus to tasks doing file io achieve the same purpose?
> Also, isn't quickly waking up tasks more important?

I am not sure, these are just off the cuff ideas :)

That's a policy decision. Waking up such tasks _may_ be important. On
the other hand, if their dynamic priority is so low that they are
sleeping because of it, they have already used more than their fair
share of CPU recently, and they should be woken but not run immediately.

If you can figure out in advance that they wouldn't be run immediately
(e.g. due to a dynamic priority test from the task scheduler), that
would tell AS not to bother waiting.
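
A rough sketch of what such a test could look like, purely as an
illustration. It assumes the 2.6 convention that a lower prio number means
a higher priority; both function names and the cpu_idle flag are invented
here and do not exist in the kernel.

```c
#include <assert.h>

/* Would a task with `waiter_prio` preempt the one now running?
 * Lower number means higher priority. */
static int would_run_immediately(int waiter_prio, int running_prio)
{
    assert(waiter_prio >= 0 && running_prio >= 0);
    return waiter_prio < running_prio;
}

/* Hypothetical hint for the IO scheduler: only wait for the task's next
 * request if it would actually get the CPU once woken. */
static int as_should_wait_for(int waiter_prio, int running_prio, int cpu_idle)
{
    if (cpu_idle)
        return 1;   /* nothing to compete with, it runs at once */
    return would_run_immediately(waiter_prio, running_prio);
}
```

Equal priorities count as "not immediately" here, since the woken task
would have to queue behind the one already running.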

-- Jamie

2003-07-31 15:26:47

by Oliver Neukum

Subject: Re: [PATCH] O10int for interactivity


> > This part interests me. It would seem that either
> > 1. The AS scheduler should not bother waiting at all if the process is not
> > going to wake up in that time
>
> How about something as simple as: if process sleeps, and AS scheduler
> is waiting since last request from that process, AS scheduler stops
> waiting immediately?
>
> In other words, a hook in the process scheduler when a process goes to
> sleep, to tell the AS scheduler to stop waiting.
>
> Although this would not always be optimal, for many cases the point of
> AS is that the process is continuing to run, not sleeping, and will
> issue another request shortly.

How do you tell which task dirtied the page?
Wouldn't giving a bonus to tasks doing file io achieve the same purpose?
Also, isn't quickly waking up tasks more important?

Regards
Oliver

2003-07-31 22:59:19

by Nick Piggin

Subject: Re: [PATCH] O10int for interactivity



Oliver Neukum wrote:

>>>This part interests me. It would seem that either
>>>1. The AS scheduler should not bother waiting at all if the process is not
>>>going to wake up in that time
>>>
>>How about something as simple as: if process sleeps, and AS scheduler
>>is waiting since last request from that process, AS scheduler stops
>>waiting immediately?
>>

No, it's fine if the process were to sleep on something. It's the
amount of time between IOs that is important (and is measured).
Makes no difference if the process is computing something or
waiting for something really.

>>
>>In other words, a hook in the process scheduler when a process goes to
>>sleep, to tell the AS scheduler to stop waiting.
>>
>>Although this would not always be optimal, for many cases the point of
>>AS is that the process is continuing to run, not sleeping, and will
>>issue another request shortly.
>>
>
>How do you tell which task dirtied the page?
>Wouldn't giving a bonus to tasks doing file io achieve the same purpose?
>Also, isn't quickly waking up tasks more important?
>

With AS, it doesn't matter what task created the IO, it's what
task will have to wait on it. In the case of async writes, we
don't care about them because the pagecache means they
get done a long way behind the instruction pointer of the
process, so they'll be nicely laid out anyway.


2003-08-04 18:52:01

by Felipe Alfaro Solana

Subject: Re: [PATCH] O13int for interactivity

Bad news, I guess...

I'm experiencing XMMS skips with 2.6.0-test2-mm4 + O13int patch. They
are easily reproducible when browsing through the menus of
KDE/Konqueror.

My KDE session is configured with the Keramik style, using XRender
transparencies and drop-down shadows for the menus. When browsing the
"Bookmarks" Konqueror drop-down menu, XMMS pauses audio playback very
briefly. The skip starts at the moment at which I click the "Bookmarks"
menu and lasts until the menu is displayed completely on the screen. My
Konqueror "Bookmarks" menu is really big, occupying almost the entire
screen height (over 700 pixels).

The XMMS skips can also be reproduced while navigating through web pages
that require a lot of CPU horsepower, like for example,
http://www.3dwallpapers.com. When browsing through the nice wallpapers
at the site, Konqueror hogs the CPU and XMMS starts skipping.

Both scenarios can be reproduced with either XMMS or MPlayer, so I guess
it is not an isolated problem with a specific player. Also, the XMMS skips
are not reproducible with previous releases of your scheduler patches.

Hope this helps!

2003-08-04 18:58:56

by Felipe Alfaro Solana

Subject: Re: [PATCH] O13int for interactivity

On Mon, 2003-08-04 at 20:51, Felipe Alfaro Solana wrote:
> Bad news, I guess...
>
> I'm experiencing XMMS skips with 2.6.0-test2-mm4 + O13int patch. They
> are easily reproducible when browsing through the menus of
> KDE/Konqueror.
>
> My KDE session is configured with the Keramik style, using XRender
> transparencies and drop-down shadows for the menus. When browsing the
> "Bookmarks" Konqueror drop-down menu, XMMS pauses audio playback very
> briefly. The skip starts at the moment at which I click the "Bookmarks"
> menu and lasts until the menu is displayed completely on the screen. My
> Konqueror "Bookmarks" menu is really big, occupying almost the entire
> screen height (over 700 pixels).
>
> The XMMS skips can also be reproduced while navigating through web pages
> that require a lot of CPU horsepower, like for example,
> http://www.3dwallpapers.com. When browsing through the nice wallpapers
> at the site, Konqueror hogs the CPU and XMMS starts skipping.
>
> Both scenarios can be reproduced with either XMMS or MPlayer, so I guess
> it is not an isolated problem with a specific player. Also, the XMMS skips
> are not reproducible with previous releases of your scheduler patches.
>
> Hope this helps!

OK, I had the X server reniced at -20... Renicing the X server at +0
makes the XMMS skips disappear. At least, with X at +0 I haven't been able to
reproduce them anymore.


2003-08-04 21:41:19

by Con Kolivas

Subject: Re: [PATCH] O13int for interactivity

On Tue, 5 Aug 2003 04:58, Felipe Alfaro Solana wrote:
> OK, I had the X server reniced at -20... Renicing the X server at +0
> makes the XMMS skips disappear. At least, with X at +0 I haven't been able to
> reproduce them anymore.

As always, thanks Felipe.

This is good news. X should be able to make xmms skip if it's -20, and X
should still be smooth at 0.

Con

2003-08-04 22:16:57

by Felipe Alfaro Solana

Subject: Re: [PATCH] O13int for interactivity

On Mon, 2003-08-04 at 23:46, Con Kolivas wrote:
> On Tue, 5 Aug 2003 04:58, Felipe Alfaro Solana wrote:
> > OK, I had the X server reniced at -20... Renicing the X server at +0
> > makes the XMMS skips disappear. At least, with X at +0 I haven't been able to
> > reproduce them anymore.
>
> As always, thanks Felipe.

It's great to be helpful.

> This is good news. X should be able to make xmms skip if it's -20, and X
> should still be smooth at 0.

X is pretty smooth, but at certain times it feels somewhat "jumpy".
When X is under load (not a single cpu hogger, like my standard devil
while loop), the mouse cursor is jumpy and X gets CPU in bursts.
Renicing X to -20 seemed to help in those situations, but caused XMMS to
skip, so I'm not really sure which is better. Meanwhile, I've reniced X
back to +0 to try to reduce those skips.

At nice +0, I can reproduce X "jumpiness" by opening several instances
of Konqueror, loading some web pages from http://www.linuxtoday.com, for
example and then arrange them all "tiled" (more or less tiled is
perfect, too) on the screen. Then, I drag a window over them as fast as
I can, then as slow as I can. This causes a lot of repainting, increases
CPU usage and, instead of concentrating all the CPU usage on a single
process (like the devil while loop), the load is distributed among all
the Konqueror processes. That makes X feel jumpy rather than smooth.

Evolution is another kind of application that requires quite a lot of CPU
power when requested to do repainting. Many times, forcing Evolution to
repaint itself, by moving a window over it, generates a lot of "uncover"
events. Sometimes, Evolution feels pretty smooth, and other times, the
window I'm dragging over Evolution starts moving not so smoothly.

But anyway, I still feel this is getting on the right track. I think
it's a matter of time and a little tuning, but this will rock in the
end.