2003-07-30 00:34:30

by Con Kolivas

Subject: [PATCH] O11int for interactivity

Update to the interactivity patches. Not a massive improvement but
more smoothing of the corners.

Changes:
Tasks that are obviously interactive are now flagged as such by an interactive
credit. This allows them to be treated differently.

Tasks with credit are now the only ones that get rapid elevation of their
sleep avg with any sleep. The rest get their sleep time added, with a
limit of their timeslice as the maximum bonus - this has the effect of not
allowing non-interactive tasks to elevate priority rapidly, and the
limitation on bonus indirectly affects how nice affects their rise.

Tasks that accumulate >MAX_SLEEP_AVG start earning interactive
credits.
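
A rough user-space sketch of the sleep_avg rule described above, for illustration only and not taken from the patch. The struct, the function name account_sleep, and the values chosen for HZ, MAX_SLEEP_AVG and MAX_BONUS are assumptions; only the short-sleep branch is modelled.

```c
/* User-space illustration of the O11int sleep_avg rule; not kernel code. */
#define HZ            1000        /* assumed tick rate */
#define MAX_SLEEP_AVG (10 * HZ)   /* assumed value */
#define MAX_BONUS     10          /* assumed value */

struct task {
	unsigned long sleep_avg;
	unsigned long interactive_credit;
	unsigned long timeslice;  /* stand-in for task_timeslice(p) */
};

/* Account for one wakeup after sleeping sleep_time ticks. */
static void account_sleep(struct task *p, unsigned long sleep_time)
{
	if (p->interactive_credit) {
		/* Tasks with credit jump to the next bonus level. */
		p->sleep_avg = (p->sleep_avg * MAX_BONUS / MAX_SLEEP_AVG + 1) *
			       MAX_SLEEP_AVG / MAX_BONUS;
	} else {
		/* The rest earn their sleep time, capped at one timeslice. */
		if (sleep_time > p->timeslice)
			sleep_time = p->timeslice;
		p->sleep_avg += sleep_time;
	}

	/* Going past MAX_SLEEP_AVG clamps the value and earns one credit. */
	if (p->sleep_avg > MAX_SLEEP_AVG) {
		p->sleep_avg = MAX_SLEEP_AVG;
		p->interactive_credit++;
	}
}
```

With these assumed constants, a task with credit moves from 5000 to 6000 after any short sleep, while a task without credit and a 100-tick timeslice only moves from 5000 to 5100 however long it slept.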

Removed the code that detected first-time activation - the new changes make it
unnecessary.

The main function of interactive credits is to make it harder for these tasks
to fall onto the expired array. These tasks can use up their entire
sleep_avg before being expired, and even when they do expire they are put at
the head of the expired array. This prevents the starvation of non-interactive
processes that would otherwise happen if their priority remained
elevated or they got put back on the active array indefinitely (I tried all
sorts of unfair combinations). The effect of all this is that interactive tasks
take a lot longer to expire during global heavy load when they are also
cpu hungry - i.e. X takes longer to start stuttering under heavy load and
stutters for less time.
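
The requeue decision described above can be sketched as a small pure function. This is an illustration of O11int as posted here, not the kernel code itself: the enum, the struct and the name requeue_at_slice_end are hypothetical, and the interactive flag stands in for the result of TASK_INTERACTIVE(p).

```c
/* Where a task goes when its timeslice runs out (illustrative names). */
enum requeue { ACTIVE_TAIL, EXPIRED_HEAD, EXPIRED_TAIL };

struct task_state {
	int interactive;                  /* stand-in for TASK_INTERACTIVE(p) */
	unsigned long interactive_credit;
	unsigned long sleep_avg;
};

static enum requeue requeue_at_slice_end(const struct task_state *t,
					 int expired_starving)
{
	/* Interactive tasks stay active unless the expired array is starving. */
	if (t->interactive && !expired_starving)
		return ACTIVE_TAIL;

	if (t->interactive_credit) {
		/* Tasks with credit are only expired once sleep_avg is gone... */
		if (t->sleep_avg)
			return ACTIVE_TAIL;
		/* ...and even then they go to the head of the expired array. */
		return EXPIRED_HEAD;
	}
	return EXPIRED_TAIL;
}
```

The sketch leaves out the decrement of interactive_credit that the patch performs just before this decision.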

The requeuing was modified to exclude kernel threads (just in case...)

_All_ testing and comments are desired and appreciated.

Con

A full patch against 2.6.0-test2 is available on my website.

patch-O11int-0307300018 against 2.6.0-test2-mm1 is available
here:

http://kernel.kolivas.org/2.5

and here:

--- linux-2.6.0-test2-mm1/include/linux/sched.h 2003-07-28 20:48:22.000000000 +1000
+++ linux-2.6.0-test2mm1O11/include/linux/sched.h 2003-07-28 21:55:10.000000000 +1000
@@ -342,6 +342,7 @@ struct task_struct {

unsigned long sleep_avg;
unsigned long last_run;
+ unsigned long interactive_credit;
int activated;

unsigned long policy;
--- linux-2.6.0-test2-mm1/kernel/sched.c 2003-07-28 20:48:22.000000000 +1000
+++ linux-2.6.0-test2mm1O11/kernel/sched.c 2003-07-30 00:17:55.000000000 +1000
@@ -119,6 +119,9 @@
#define TASK_INTERACTIVE(p) \
((p)->prio <= (p)->static_prio - DELTA(p))

+#define JUST_INTERACTIVE_SLEEP(p) \
+ (MAX_SLEEP_AVG - (DELTA(p) * AVG_TIMESLICE))
+
#define TASK_PREEMPTS_CURR(p, rq) \
((p)->prio < (rq)->curr->prio || \
((p)->prio == (rq)->curr->prio && \
@@ -307,6 +310,14 @@ static inline void enqueue_task(struct t
p->array = array;
}

+static inline void enqueue_head_task(struct task_struct *p, prio_array_t *array)
+{
+ list_add(&p->run_list, array->queue + p->prio);
+ __set_bit(p->prio, array->bitmap);
+ array->nr_active++;
+ p->array = array;
+}
+
/*
* effective_prio - return the priority that is based on the static
* priority but is modified by bonuses/penalties.
@@ -357,33 +368,45 @@ static void recalc_task_prio(task_t *p)

/*
* User tasks that sleep a long time are categorised as
- * idle and will get just under interactive status to
+ * idle and will get just interactive status to stay active &
* prevent them suddenly becoming cpu hogs and starving
* other processes.
*/
if (p->mm && sleep_time > HZ)
- p->sleep_avg = MAX_SLEEP_AVG *
- (MAX_BONUS - 1) / MAX_BONUS - 1;
+ p->sleep_avg = JUST_INTERACTIVE_SLEEP(p);
else {
-
/*
- * Processes that sleep get pushed to one higher
+ * Processes with credit get pushed to one higher
* priority each time they sleep greater than
* one tick. -ck
*/
- p->sleep_avg = (p->sleep_avg * MAX_BONUS /
+ if (p->interactive_credit)
+ p->sleep_avg = (p->sleep_avg * MAX_BONUS /
MAX_SLEEP_AVG + 1) *
MAX_SLEEP_AVG / MAX_BONUS;
+ else {
+ /*
+ * The rest earn sleep_avg according to their sleep
+ * time up to a maximum of their timeslice size.
+ */
+ if (sleep_time > task_timeslice(p))
+ sleep_time = task_timeslice(p);
+ p->sleep_avg += sleep_time;
+ }

- if (p->sleep_avg > MAX_SLEEP_AVG)
+ /*
+ * Fully interactive tasks gain interactive credits
+ * to cash in when needed.
+ */
+ if (p->sleep_avg > MAX_SLEEP_AVG){
p->sleep_avg = MAX_SLEEP_AVG;
+ p->interactive_credit++;
+ }
}
}
p->prio = effective_prio(p);
-
}

-
/*
* activate_task - move a task to the runqueue and do priority recalculation
*
@@ -392,11 +415,8 @@ static void recalc_task_prio(task_t *p)
*/
static inline void activate_task(task_t *p, runqueue_t *rq)
{
- if (likely(p->last_run)){
- p->activated = 1;
- recalc_task_prio(p);
- } else
- p->last_run = jiffies;
+ p->activated = 1;
+ recalc_task_prio(p);

__activate_task(p, rq);
}
@@ -579,7 +599,8 @@ void wake_up_forked_process(task_t * p)
p->sleep_avg = p->sleep_avg * MAX_BONUS / MAX_SLEEP_AVG *
CHILD_PENALTY / 100 * MAX_SLEEP_AVG / MAX_BONUS;
p->prio = effective_prio(p);
- p->last_run = 0;
+ p->last_run = jiffies;
+ p->interactive_credit = 0;
set_task_cpu(p, smp_processor_id());

if (unlikely(!current->array))
@@ -1268,14 +1289,33 @@ void scheduler_tick(int user_ticks, int
p->prio = effective_prio(p);
p->time_slice = task_timeslice(p);
p->first_time_slice = 0;
+ /*
+ * This drop in interactive_credit is really just a
+ * sanity check to make sure tasks that only slept once
+ * for long enough dont act like interactive tasks
+ */
+ if (p->interactive_credit)
+ p->interactive_credit--;

if (!TASK_INTERACTIVE(p) || EXPIRED_STARVING(rq)) {
if (!rq->expired_timestamp)
rq->expired_timestamp = jiffies;
- enqueue_task(p, rq->expired);
+ /*
+ * Long term interactive tasks need to completely
+ * run out of sleep_avg to be expired, and when they
+ * do they are put at the start of the expired array
+ */
+ if (unlikely(p->interactive_credit)){
+ if (p->sleep_avg){
+ enqueue_task(p, rq->active);
+ goto out_unlock;
+ }
+ enqueue_head_task(p, rq->expired);
+ } else
+ enqueue_task(p, rq->expired);
} else
enqueue_task(p, rq->active);
- } else if (!((task_timeslice(p) - p->time_slice) %
+ } else if (p->mm && !((task_timeslice(p) - p->time_slice) %
TIMESLICE_GRANULARITY) && (p->time_slice > MIN_TIMESLICE) &&
(p->array == rq->active)) {
/*


2003-07-30 00:50:58

by Con Kolivas

Subject: Re: [PATCH] O11int for interactivity

On Wed, 30 Jul 2003 10:38, Con Kolivas wrote:
> Update to the interactivity patches. Not a massive improvement but
> more smoothing of the corners.

Woops my bad. Seems putting things even at the start of the expired array can
induce a corner case. Will post an O11.1 in a few mins to back out that part.

Con

2003-07-30 01:04:28

by Con Kolivas

Subject: Re: [PATCH] O11int for interactivity

On Wed, 30 Jul 2003 10:55, Con Kolivas wrote:
> On Wed, 30 Jul 2003 10:38, Con Kolivas wrote:
> > Update to the interactivity patches. Not a massive improvement but
> > more smoothing of the corners.
>
> Woops my bad. Seems putting things even at the start of the expired array
> can induce a corner case. Will post an O11.1 in a few mins to back out that
> part.

Here is O11.1int which backs out that part. This was only of minor help
anyway so backing it out still makes the other O11 changes worthwhile.

A full O11.1 patch against 2.6.0-test2 is available on my website.

--- linux-2.6.0-test2-mm1/kernel/sched.c 2003-07-30 10:54:54.000000000 +1000
+++ linux-2.6.0-test2mm1O11/kernel/sched.c 2003-07-30 10:46:43.000000000 +1000
@@ -310,14 +310,6 @@ static inline void enqueue_task(struct t
p->array = array;
}

-static inline void enqueue_head_task(struct task_struct *p, prio_array_t *array)
-{
- list_add(&p->run_list, array->queue + p->prio);
- __set_bit(p->prio, array->bitmap);
- array->nr_active++;
- p->array = array;
-}
-
/*
* effective_prio - return the priority that is based on the static
* priority but is modified by bonuses/penalties.
@@ -1305,13 +1297,10 @@ void scheduler_tick(int user_ticks, int
* run out of sleep_avg to be expired, and when they
* do they are put at the start of the expired array
*/
- if (unlikely(p->interactive_credit)){
- if (p->sleep_avg){
- enqueue_task(p, rq->active);
- goto out_unlock;
- }
- enqueue_head_task(p, rq->expired);
- } else
+ if (unlikely(p->interactive_credit && p->sleep_avg)){
+ enqueue_task(p, rq->active);
+ goto out_unlock;
+ }
enqueue_task(p, rq->expired);
} else
enqueue_task(p, rq->active);

2003-07-30 08:29:57

by Felipe Alfaro Solana

Subject: Re: [PATCH] O11int for interactivity

On Wed, 2003-07-30 at 02:38, Con Kolivas wrote:
> Update to the interactivity patches. Not a massive improvement but
> more smoothing of the corners.

I'm running 2.6.0-test2-mm1 + O11int.patch + O11.1int.patch and I must
say this is getting damn good! In the past, I've had to tweak scheduler
knobs to tune the engine to my taste, but since O10, this is a thing of
the past. It's working as smooth as silk...

Good work!

2003-07-30 08:45:46

by Marc-Christian Petersen

Subject: Re: [PATCH] O11int for interactivity

On Wednesday 30 July 2003 10:29, Felipe Alfaro Solana wrote:

Hi Felipe,

> I'm running 2.6.0-test2-mm1 + O11int.patch + O11.1int.patch and I must
> say this is getting damn good! In the past, I've had to tweak scheduler
> knobs to tune the engine to my taste, but since O10, this is a thing of
> the past. It's working as smooth as silk...
> Good work!
I really really wonder why I don't experience this behaviour. For me, the best
scheduler patch in the past was the one from you. I had a test last night
with O11.1 and I rebooted back into 2.4 after some hours of testing because
it is unusable for me under load, and it is no heavy load, it's just for
example a simple "make -j2 bzImage modules".

What makes me even more wondering is that 2.6.0-test1-wli tree does not suck
at all for interactivity where no scheduler changes were made.

Maybe we need both: VM fixups (we need them anyway!) and O(1) fixups so that
also my machine will be happy ;)

ciao, Marc

2003-07-30 08:42:55

by Felipe Alfaro Solana

Subject: Re: [PATCH] O11int for interactivity

On Wed, 2003-07-30 at 02:38, Con Kolivas wrote:
> Update to the interactivity patches. Not a massive improvement but
> more smoothing of the corners.

Wops! Wait a minute! O11.1 is great, but I've had a few XMMS skips that
I didn't have with O10. They're really difficult to reproduce, but I've
seen them when moving a window slowly enough to make the underlying
windows accumulate a lot of redrawing events. Also, although O11.1 feels
smooth, it's not as smooth as O10 is for me.

All in all, O11.1 is damn good, but O10 is the greatest scheduler I've
seen to date.

2003-07-30 09:38:29

by Con Kolivas

Subject: Re: [PATCH] O11int for interactivity

Quoting Marc-Christian Petersen:

> On Wednesday 30 July 2003 10:29, Felipe Alfaro Solana wrote:
>
> Hi Felipe,
>
> > I'm running 2.6.0-test2-mm1 + O11int.patch + O11.1int.patch and I must
> > say this is getting damn good! In the past, I've had to tweak scheduler
> > knobs to tune the engine to my taste, but since O10, this is a thing of
> > the past. It's working as smooth as silk...
> > Good work!
> I really really wonder why I don't experience this behaviour. For me, the best
> scheduler patch in the past was the one from you. I had a test last night
> with O11.1 and I rebooted back into 2.4 after some hours of testing because
> it is unusable for me under load, and it is no heavy load, it's just for
> example a simple "make -j2 bzImage modules".
>
> What makes me even more wondering is that 2.6.0-test1-wli tree does not suck
> at all for interactivity where no scheduler changes were made.
>
> Maybe we need both: VM fixups (we need them anyway!) and O(1) fixups so that
> also my machine will be happy ;)

The obvious question still needs to be asked here. How does vanilla compare to
vanilla +O11.1?

Con

2003-07-30 09:35:51

by Con Kolivas

Subject: Re: [PATCH] O11int for interactivity

Quoting Felipe Alfaro Solana:

> On Wed, 2003-07-30 at 02:38, Con Kolivas wrote:
> > Update to the interactivity patches. Not a massive improvement but
> > more smoothing of the corners.
>
> Wops! Wait a minute! O11.1 is great, but I've had a few XMMS skips that
> I didn't have with O10. They're really difficult to reproduce, but I've
> seen them when moving a window slowly enough to make the underlying
> windows accumulate a lot of redrawing events. Also, although O11.1 feels
> smooth, it's not as smooth as O10 is for me.

Hmm maybe a little too nice to X at the expense of other stuff. Will address
next time round.

Con

2003-07-30 09:50:31

by Marc-Christian Petersen

Subject: Re: [PATCH] O11int for interactivity

On Wednesday 30 July 2003 11:38, Con Kolivas wrote:

Hi Con,

> > Maybe we need both: VM fixups (we need them anyway!) and O(1) fixups so
> > that also my machine will be happy ;)
> The obvious question still needs to be asked here. How does vanilla compare
> to vanilla +O11.1?
Sorry, I didn't have time last night to test that as well. I was a bit sad ;)
I'll report on vanilla+O11.1int on Friday. I don't have time earlier.

ciao, Marc


2003-07-30 10:30:00

by Voluspa

Subject: Re: [PATCH] O11int for interactivity


On 2003-07-30 8:41:46 Felipe Alfaro Solana wrote:

> Wops! Wait a minute! O11.1 is great, but I've had a few XMMS skips
> that I didn't have with O10. They're really difficult to reproduce,

Can't reproduce your skips here with my light environment and O11.1 (on
a PII 400, 128 meg mem, no desktop, Enlightenment as wm). Even as I
write this my machine is under the most extreme load that I have -
natural, not artificial:

Playing a directory of mp3s with xmms.

Backing up the system from hda to a partition on hdc (the mp3s are on
that drive but another partition). This involves "rm -rf" of the old
backup, "dd if=/dev/zero of=cleanupfile" as a poor man's wipe, load is
11 there while producing a 15 gig file. "cp -a" all relevant directories
of my system, load is 3 to 7. "cp -a" the backup to a copy on the same
partition. Takes 50 minutes to complete everything.

During this operation I write a "letter" in OpenOffice 1.1rc2 and browse
the net with Opera 6.02. Apart from normal delays while swapped out
things get swapped in (top says 55 megs is swapped), everything is fully
operational. And no music skips ;-)

As to difference between O10 and O11.1 in feel... No comment. I'm too
old to catch such small variations.

Mvh
Mats Johannesson

2003-07-30 10:47:08

by Con Kolivas

Subject: Re: [PATCH] O11int for interactivity

On Wed, 30 Jul 2003 20:31, Voluspa wrote:
> On 2003-07-30 8:41:46 Felipe Alfaro Solana wrote:
> > Wops! Wait a minute! O11.1 is great, but I've had a few XMMS skips
> > that I didn't have with O10. They're really difficult to reproduce,
>
> Can't reproduce your skips here with my light environment and O11.1 (on
> a PII 400, 128 meg mem, no desktop, Enlightenment as wm). Even as I
> write this my machine is under the most extreme load that I have -
> natural, not artificial:

Good test. Thanks.

> As to difference between O10 and O11.1 in feel... No comment. I'm too
> old to catch such small variations.

That's good, most of the difference was supposed to be in extremely unusual
circumstances. Felipe's issue is something I was concerned might happen (not
specifically an audio issue per se but audio is a sensitive way to pick it
up) which is why all testing is important.

Con

2003-07-30 11:20:32

by Eugene Teo

Subject: Re: [PATCH] O11int for interactivity

[~] $ uname -a
Linux amaryllis 2.6.0-test2-mm1-kj1+O11.1int #1 Wed Jul 30 18:57:37 SGT
2003 i686 GNU/Linux

Applied mm1, kj1 (kernel-janitor), and your O11.1int patch.

when X is niced with -10, there is a 1-2 sec pause whenever I switch
from X to console.

when X is niced with 0, there are no skips, no pauses, nothing whatsoever.
even when I maximise aterm, there are no interruptions on my xmms when
i hide, and unhide aterm. nothing unusual when i compile kernels.

this is perhaps the best patch i ever applied from you.

Cheers,
Eugene

<quote sender="Con Kolivas">
> On Wed, 30 Jul 2003 20:31, Voluspa wrote:
> > On 2003-07-30 8:41:46 Felipe Alfaro Solana wrote:
> > > Wops! Wait a minute! O11.1 is great, but I've had a few XMMS skips
> > > that I didn't have with O10. They're really difficult to reproduce,
> >
> > Can't reproduce your skips here with my light environment and O11.1 (on
> > a PII 400, 128 meg mem, no desktop, Enlightenment as wm). Even as I
> > write this my machine is under the most extreme load that I have -
> > natural, not artificial:
>
> Good test. Thanks.
>
> > As to difference between O10 and O11.1 in feel... No comment. I'm too
> > old to catch such small variations.
>
> That's good, most of the difference was supposed to be in extremely unusual
> circumstances. Felipe's issue is something I was concerned might happen (not
> specifically an audio issue per se but audio is a sensitive way to pick it
> up) which is why all testing is important.
>
> Con
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/



2003-07-30 12:56:49

by Felipe Alfaro Solana

Subject: Re: [PATCH] O11int for interactivity

On Wed, 2003-07-30 at 11:38, Con Kolivas wrote:

> The obvious question still needs to be asked here. How does vanilla compare to
> vanilla +O11.1?

Vanilla has serious interactivity problems for me. Vanilla + O11.1 is
the second best scheduler I've ever used (the best is clearly O10). So
this is indeed good news, at least for me :-)

2003-07-30 13:31:57

by Wade

Subject: Re: [PATCH] O11int for interactivity

Felipe Alfaro Solana wrote:
> On Wed, 2003-07-30 at 11:38, Con Kolivas wrote:
>
>
>>The obvious question still needs to be asked here. How does vanilla compare to
>>vanilla +O11.1?
>
>
> Vanilla has serious interactivity problems for me. Vanilla + O11.1 is
> the second best scheduler I've ever used (the best is clearly O10). So
> this is indeed good news, at least for me :-)
>

O10 is better (the best I've used so far) for me too.

2003-07-30 15:40:09

by Apurva Mehta

Subject: Re: [PATCH] O11int for interactivity

* Con Kolivas [30-07-2003 13:36]:
> Update to the interactivity patches. Not a massive improvement but
> more smoothing of the corners.
>
> [snip]
>
> _All_ testing and comments are desired and appreciated.

Well, I just put my system under severe natural load using
2.6.0-test2-mm1 and I must say that my system remained as responsive
as one could expect. I have a Pentium III 500 MHz with 192 MB RAM.

Here's what load was like:
1) Compiling 2.6.0-test2-mm1-O11.1int :)
2) updating spamassassin bayesian filters (This mainly causes heavy
disk i/o, the CPU usage is not so high)
3) procmail filtering ~40 emails simultaneously.
4) playing ogg's in xmms
5) switching between viewing pdf's and web-browsing

Window switching remained pretty snappy and there was one skip in
the music. The pdf's scrolled at a decent speed too.

All in all I would say that I really cannot expect anything more in
terms of responsiveness from my hardware. IMO, there is a limit to the
magic a good scheduler can do on limited hardware resources.

The scheduler has certainly come a long way since 2.6.0-test1
(plus all subsequent patches and versions). Great work.

- Apurva

2003-07-31 13:50:17

by Szonyi Calin

Subject: Re: [PATCH] O11int for interactivity

Con Kolivas said:
> On Wed, 30 Jul 2003 10:55, Con Kolivas wrote:
>> On Wed, 30 Jul 2003 10:38, Con Kolivas wrote:
>> > Update to the interactivity patches. Not a massive improvement but
>> more smoothing of the corners.
>>
> Here is O11.1int which backs out that part. This was only of minor help
> anyway so backing it out still makes the other O11 changes worthwhile.
>
> A full O11.1 patch against 2.6.0-test2 is available on my website.
>

A little bit better than O10 but mplayer still skips frames while
doing a make bzImage in the background

--
# fortune
fortune: write error on /dev/null --- please empty the bit bucket


-----------------------------------------
This email was sent using SquirrelMail.
"Webmail for nuts!"
http://squirrelmail.org/


2003-07-31 13:51:45

by Con Kolivas

Subject: Re: [PATCH] O11int for interactivity

On Thu, 31 Jul 2003 23:55, Szonyi Calin wrote:
> Con Kolivas said:
> > On Wed, 30 Jul 2003 10:55, Con Kolivas wrote:
> >> On Wed, 30 Jul 2003 10:38, Con Kolivas wrote:
> >> > Update to the interactivity patches. Not a massive improvement but
> >>
> >> more smoothing of the corners.
> >
> > Here is O11.1int which backs out that part. This was only of minor help
> > anyway so backing it out still makes the other O11 changes worthwhile.
> >
> > A full O11.1 patch against 2.6.0-test2 is available on my website.
>
> A little bit better than O10 but mplayer still skips frames while
> doing a make bzImage in the background

Can you tell me what top shows mplayer scores in the PRI column during all
this?

Con

2003-07-31 15:26:48

by Måns Rullgård

Subject: Re: [PATCH] O11int for interactivity

"Szonyi Calin" writes:

>>> > Update to the interactivity patches. Not a massive improvement but
>>> more smoothing of the corners.
>>>
>> Here is O11.1int which backs out that part. This was only of minor help
>> anyway so backing it out still makes the other O11 changes worthwhile.
>>
>> A full O11.1 patch against 2.6.0-test2 is available on my website.
>>
>
> A little bit better than O10 but mplayer still skips frames while
> doing a make bzImage in the background

If you used a sane player this wouldn't happen. Every few seconds
there will be a frame that takes a little longer than average to
decode, and unless the player can get all the CPU time it wants, there
will be skips. The solution is to buffer a few decoded frames
somewhere, preferably in video memory. That will give you the extra
time to decode the difficult frames, and then catch up with some easy
ones.
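
The buffering argument above can be illustrated with a small simulation. This is a simplified model written for this purpose, not how mplayer or any other player works: the function name count_late_frames, the idea that frame i is due at (i + depth) * period, and the rule that a buffer slot frees up when frame i - depth is shown are all assumptions.

```c
#include <stddef.h>

/*
 * Count frames that miss their display time when the decoder may run
 * up to `depth` frames ahead of the display. Times are in arbitrary
 * ticks; decode_time[i] is the CPU time needed for frame i.
 */
static size_t count_late_frames(const long *decode_time, size_t n,
				long period, size_t depth)
{
	long now = 0;      /* time at which the decoder is next idle */
	size_t late = 0;

	for (size_t i = 0; i < n; i++) {
		/* The slot for frame i is free once frame i - depth is shown. */
		long slot_free = (i >= depth) ? (long)i * period : 0;
		long deadline = (long)(i + depth) * period;

		if (now < slot_free)
			now = slot_free;   /* wait for a free buffer slot */
		now += decode_time[i];     /* decode the frame */
		if (now > deadline)
			late++;            /* frame missed its display time */
	}
	return late;
}
```

In this model, twelve frames at a period of 40 ticks with decode times of 30 ticks, except for two difficult frames of 70 ticks at positions 3 and 8, give 6 late frames with depth 1 and none with depth 3: the extra slots let the decoder catch up on the easy frames.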

If the scheduler can be tweaked so even mplayer does the right thing,
that's good. It could solve other problems more difficult to address
in the application.

--
Måns Rullgård

2003-07-31 16:58:21

by Moritz Muehlenhoff

Subject: Re: [PATCH] O11int for interactivity

In stuga.ml.linux.kernel, you wrote:
> _All_ testing and comments are desired and appreciated.

Playing "Baldur's Gate II" in winex in plain 2.6.0-test2 and
all tested 2.5 kernels before led to heavy audio stutters
and 100% CPU consumption, while with 2.4 it consumes about
70%.

O11.1 on top of 2.6.0-test2 fixes that problem; it now
behaves just like 2.4. Thanks a lot.

Cheers,
Moritz

2003-07-31 21:38:42

by William Lee Irwin III

Subject: Re: [PATCH] O11int for interactivity

On Thu, Jul 31, 2003 at 04:55:54PM +0300, Szonyi Calin wrote:
> A little bit better than O10 but mplayer still skips frames while
> doing a make bzImage in the background

Could you do the following during an mp3 skipping test please:

vmstat 1 | tee -a vmstat.log

n=1; while true; do /usr/sbin/readprofile -n -m /boot/System.map-`uname -r` | sort -k 2,2 > profile.log.$n ; n=$(($n + 1)) ; sleep 1 ; done

If you could stop both the vmstat and the readprofile loop shortly
after the skip (not _too_ shortly, at least 1 second after it) I'd be
much obliged.


--- wli

2003-07-31 21:56:27

by William Lee Irwin III

Subject: Re: [PATCH] O11int for interactivity

On Thu, Jul 31, 2003 at 02:43:14PM -0700, William Lee Irwin III wrote:
> Could you make sure that you're not using 1A? (vanilla 1 and 1B are
> both fine for these purposes).
> Also, can I get a before/after of the following during an mp3 skip test?
> vmstat 1 | tee -a vmstat.log
> n=1; while true; do /usr/sbin/readprofile -n -m /boot/System.map-`uname -r` | sort -k 2,2 > profile.log.$n; n=$(( $n + 1 )); sleep 1; done
>
> If you could stop the logging shortly after the skip in the kernel that
> does skip (but not _too_ shortly after, give it at least 1 second) I
> would be much obliged. The "before" picture is most important. An
> "after" picture might also be helpful, but isn't strictly necessary.

Please don't forget to boot with profile=1


-- wli

2003-07-31 21:44:07

by William Lee Irwin III

Subject: Re: [PATCH] O11int for interactivity

On Wed, Jul 30, 2003 at 10:43:22AM +0200, Marc-Christian Petersen wrote:
> What makes me even more wondering is that 2.6.0-test1-wli tree does not suck
> at all for interactivity where no scheduler changes were made.

Could you make sure that you're not using 1A? (vanilla 1 and 1B are
both fine for these purposes).

Also, can I get a before/after of the following during an mp3 skip test?

vmstat 1 | tee -a vmstat.log

n=1; while true; do /usr/sbin/readprofile -n -m /boot/System.map-`uname -r` | sort -k 2,2 > profile.log.$n; n=$(( $n + 1 )); sleep 1; done

If you could stop the logging shortly after the skip in the kernel that
does skip (but not _too_ shortly after, give it at least 1 second) I
would be much obliged. The "before" picture is most important. An
"after" picture might also be helpful, but isn't strictly necessary.

Thanks.


-- wli

2003-07-31 21:54:12

by Robert Love

Subject: Re: [PATCH] O11int for interactivity

On Thu, 2003-07-31 at 14:38, William Lee Irwin III wrote:

> If you could stop both the vmstat and the readprofile loop shortly
> after the skip (not _too_ shortly, at least 1 second after it) I'd be
> much obliged.

Just an FYI, Szonyi, you will need to boot the kernel with profile=n
where n is some number like 5, in order to use readprofile.

Robert Love


2003-08-01 10:47:09

by Marc-Christian Petersen

Subject: Re: [PATCH] O11int for interactivity

On Thursday 31 July 2003 23:43, William Lee Irwin III wrote:

Hi William,

> On Wed, Jul 30, 2003 at 10:43:22AM +0200, Marc-Christian Petersen wrote:
> > What makes me even more wondering is that 2.6.0-test1-wli tree does not
> > suck at all for interactivity where no scheduler changes were made.
> Could you make sure that you're not using 1A? (vanilla 1 and 1B are
> both fine for these purposes).
> Also, can I get a before/after of the following during an mp3 skip test?
> vmstat 1 | tee -a vmstat.log
> n=1; while true; do /usr/sbin/readprofile -n -m /boot/System.map-`uname -r`
> | sort -k 2,2 > profile.log.$n; n=$(( $n + 1 )); sleep 1; done
> If you could stop the logging shortly after the skip in the kernel that
> does skip (but not _too_ shortly after, give it at least 1 second) I
> would be much obliged. The "before" picture is most important. An
> "after" picture might also be helpful, but isn't strictly necessary.
Sure, I'll test it this weekend (mostly tomorrow). Stay tuned.

Thanks for your interest in fixing this.

ciao, Marc

2003-08-02 22:32:38

by Marc-Christian Petersen

Subject: Re: [PATCH] O11int for interactivity

On Friday 01 August 2003 12:44, Marc-Christian Petersen wrote:

Hi William,

> > Could you make sure that you're not using 1A? (vanilla 1 and 1B are
> > both fine for these purposes).
> > Also, can I get a before/after of the following during an mp3 skip test?
> > vmstat 1 | tee -a vmstat.log
> > n=1; while true; do /usr/sbin/readprofile -n -m /boot/System.map-`uname
> > -r`
> > | sort -k 2,2 > profile.log.$n; n=$(( $n + 1 )); sleep 1; done
> > If you could stop the logging shortly after the skip in the kernel that
> > does skip (but not _too_ shortly after, give it at least 1 second) I
> > would be much obliged. The "before" picture is most important. An
> > "after" picture might also be helpful, but isn't strictly necessary.
> Sure, I'll test it this weekend (mostly tomorrow). Stay tuned.
> Thanks for your interest in fixing this.

Sorry, I was busy till now. Ok, now I have time. Well, but reading your mail
again, you are interested in vmstat 1 if an mp3 skip occurs. I have to say,
that I never ever got an mp3 skip with your tree (-wli1 I am using).

and now? :)

ciao, Marc


2003-08-02 22:54:08

by William Lee Irwin III

Subject: Re: [PATCH] O11int for interactivity

On Sat, Aug 02, 2003 at 11:27:45PM +0200, Marc-Christian Petersen wrote:
> Sorry, I was busy till now. Ok, now I have time. Well, but reading your mail
> again, you are interested in vmstat 1 if an mp3 skip occurs. I have to say,
> that I never ever got an mp3 skip with your tree (-wli1 I am using).
> and now? :)
> ciao, Marc

I actually wanted it from a kernel that behaved pathologically so I can
tell what issue it is.


-- wli

2003-08-02 23:20:03

by Marc-Christian Petersen

Subject: Re: [PATCH] O11int for interactivity

On Sunday 03 August 2003 00:55, William Lee Irwin III wrote:

Hi William,

> > mail again, you are interested in vmstat 1 if an mp3 skip occurs. I have
> > to say, that I never ever got an mp3 skip with your tree (-wli1 I am
> > using). and now? :)
> I actually wanted it from a kernel that behaved pathologically so I can
> tell what issue it is.

aah, ic. OK. Compiling 2.6.0-test2 mainline now.

ciao, Marc

2003-08-04 19:07:31

by Marc-Christian Petersen

Subject: Re: [PATCH] O11int for interactivity

On Sunday 03 August 2003 01:19, Marc-Christian Petersen wrote:

Hi William,

> > I actually wanted it from a kernel that behaved pathologically so I can
> > tell what issue it is.
> aah, ic. OK. Compiling 2.6.0-test2 mainline now.
sorry, I compiled it but during the compilation I had to go. Had some private
problems :-( ... It will take some time (within this week) to get all these
numbers.

Sorry for the delay.

ciao, Marc

2003-08-04 19:52:29

by William Lee Irwin III

Subject: Re: [PATCH] O11int for interactivity

On Sunday 03 August 2003 01:19, Marc-Christian Petersen wrote:
>> aah, ic. OK. Compiling 2.6.0-test2 mainline now.

On Mon, Aug 04, 2003 at 09:06:51PM +0200, Marc-Christian Petersen wrote:
> sorry, I compiled it but during the compilation I had to go. Had some private
> problems :-( ... It will take some time (within this week) to get all these
> numbers.
> Sorry for the delay.
> ciao, Marc

Testing thus far seems to contradict my original ideas. I'm going to
need help from io scheduler -type people to instrument the things I'm
looking at properly.


-- wli

2003-08-05 00:58:02

by Nick Piggin

Subject: Re: [PATCH] O11int for interactivity

William Lee Irwin III wrote:

>On Sunday 03 August 2003 01:19, Marc-Christian Petersen wrote:
>
>>>aah, ic. OK. Compiling 2.6.0-test2 mainline now.
>>>
>
>On Mon, Aug 04, 2003 at 09:06:51PM +0200, Marc-Christian Petersen wrote:
>
>>sorry, I compiled it but during the compilation I had to go. Had some private
>>problems :-( ... It will take some time (within this week) to get all these
>>numbers.
>>Sorry for the delay.
>>ciao, Marc
>>
>
>Testing thus far seems to contradict my original ideas. I'm going to
>need help from io scheduler -type people to instrument the things I'm
>looking at properly.
>
>

I'm an IO scheduler type person! What help do you need? I haven't been
following the thread.



2003-08-05 02:40:08

by William Lee Irwin III

Subject: Re: [PATCH] O11int for interactivity

On Tue, Aug 05, 2003 at 10:56:16AM +1000, Nick Piggin wrote:
> I'm an IO scheduler type person! What help do you need? I haven't been
> following the thread.

I'm not sure it was in the thread. Basically, the testers appear to
associate skips with changes in writeout and/or readin behavior (either
large amounts of writeout or low amounts of readin), though vmstat
behavior similar to that surrounding a skip doesn't appear to
guarantee one.

e.g. on line 448 (this is a vmstat log passed through cat -n) there was
a skip:

438 procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
439 r b swpd free buff cache si so bi bo in cs us sy id wa
440 1 6 0 3452 160248 274696 0 0 34960 7936 1450 1776 7 23 0 70
441 4 3 0 4416 155824 277712 0 0 9744 27736 1348 1430 30 25 0 45
442 0 7 0 4172 148604 285688 0 0 9040 16476 1395 1296 25 33 0 43
443 2 7 0 3456 145228 289160 0 0 6160 32728 1592 1317 29 26 0 46
444 0 6 0 6200 142256 289732 0 0 10512 20596 1323 794 25 26 0 48
445 0 6 0 3420 139124 295700 0 0 6680 20160 1283 853 14 22 0 64
446 11 6 0 3740 133528 301032 0 0 8084 20216 1308 941 21 34 0 45
447 0 7 0 2480 131044 304496 0 0 1424 24480 1290 1093 26 32 0 41
448 0 9 0 2372 128616 307008 0 0 908 27260 1263 258 6 18 0 76
449 1 6 0 5316 120960 311976 0 0 2064 19252 1243 698 9 24 0 67
450 3 5 0 2080 100960 334900 0 0 13964 16076 1685 1340 8 43 0 50
451 2 6 0 1932 74936 360496 0 0 22176 12292 1495 1359 8 92 0 0
452 2 5 0 5004 67216 365220 0 0 12944 12316 1280 1079 5 42 0 53
453 0 8 0 2264 56148 379780 0 0 1808 29180 1293 629 9 43 0 49
454 0 6 0 2476 64688 371320 0 0 11536 15608 1262 551 7 40 0 53
455 0 9 0 3620 52776 382020 0 0 2468 23836 1283 463 8 34 0 58
456 5 8 0 3012 52904 382600 0 0 1796 30540 1348 400 5 26 0 69
457 1 5 0 4440 45332 388380 0 0 1620 27980 1263 323 16 28 0 56

The load IIRC was some kind of io to an IDE disk while xmms played.

About all I can tell is that when there is a skip, bi is low, but
the converse does not hold. This appears to be independent of io
scheduler (I had them try deadline too), and I'm very unsure what to
make of it. I originally suspected thundering herds from waitqueue
hashing but things appear to contradict that given the low cs rates.
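The "bi is low but the converse does not hold" observation can be checked mechanically against a log like the one above. The following is only a minimal Python sketch: it assumes the log went through cat -n (so the first field is the line number), and the cutoff of 2000 for "low" bi is a hypothetical value chosen for illustration, not something the testers used. The function name is likewise made up.

```python
# Column order after the cat -n line number, taken from the vmstat header above.
FIELDS = ["r", "b", "swpd", "free", "buff", "cache", "si", "so",
          "bi", "bo", "in", "cs", "us", "sy", "id", "wa"]


def low_bi_lines(log_text, threshold=2000):
    """Return the cat -n line numbers whose bi value is below threshold.

    threshold is an assumed, illustrative cutoff, not a value from the thread.
    """
    hits = []
    for line in log_text.splitlines():
        parts = line.split()
        # Skip the header rows and anything that is not a numbered data row.
        if len(parts) != len(FIELDS) + 1 or not all(p.isdigit() for p in parts):
            continue
        row = dict(zip(FIELDS, map(int, parts[1:])))
        if row["bi"] < threshold:
            hits.append(int(parts[0]))
    return hits


# A few rows copied from the log above.
SAMPLE = """\
446 11 6 0 3740 133528 301032 0 0 8084 20216 1308 941 21 34 0 45
447 0 7 0 2480 131044 304496 0 0 1424 24480 1290 1093 26 32 0 41
448 0 9 0 2372 128616 307008 0 0 908 27260 1263 258 6 18 0 76
449 1 6 0 5316 120960 311976 0 0 2064 19252 1243 698 9 24 0 67
453 0 8 0 2264 56148 379780 0 0 1808 29180 1293 629 9 43 0 49
"""

if __name__ == "__main__":
    print(low_bi_lines(SAMPLE))
```

On these rows it flags 447, 448 and 453, while only 448 was reported as a skip, which is the "converse does not hold" part.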

I'm collecting instrumentation patches to see what's going on. The
first order of business is probably getting the testers to run with
sleepometer to see if and where they're blocking, but given the io bits
that are observable some elevator instrumentation might help too (and
whatever it takes to figure out if a driver is spinning wildly too!).


Thanks.

-- wli

2003-08-05 03:14:28

by William Lee Irwin III

Subject: Re: [PATCH] O11int for interactivity

On Tue, Aug 05, 2003 at 01:07:44PM +1000, Nick Piggin wrote:
> Let me know if you come up with anything significant ;)

Well, I was vaguely hoping a useful way to instrument the io stuff
would already be out there.


-- wli

2003-08-05 03:09:09

by Nick Piggin

Subject: Re: [PATCH] O11int for interactivity



William Lee Irwin III wrote:

>On Tue, Aug 05, 2003 at 10:56:16AM +1000, Nick Piggin wrote:
>
>>I'm an IO scheduler type person! What help do you need? I haven't been
>>following the thread.
>>
>
>I'm not sure it was in the thread. Basically, the testers appear to
>associate skips with changes in writeout and/or readin behavior (either
>large amounts of writeout or low amounts of readin), though the effect
>of behavior similar to that surrounding a skip doesn't appear to
>guarantee a skip.
>

Right.

snip vmstat

>
>The load IIRC was some kind of io to an IDE disk while xmms played.
>
>About all I can tell is that when there is a skip, bi is low, but
>the converse does not hold. This appears to be independent of io
>scheduler (I had them try deadline too), and I'm very unsure what to
>make of it. I originally suspected thundering herds from waitqueue
>hashing but things appear to contradict that given the low cs rates.
>

So yeah it could easily be that for example the cpu scheduler is
causing the skip and the low IO rates.

>
>I'm collecting instrumentation patches to see what's going on. The
>first order of business is probably getting the testers to run with
>sleepometer to see if and where they're blocking, but given the io bits
>that are observable some elevator instrumentation might help too (and
>whatever it takes to figure out if a driver is spinning wildly too!).
>

Let me know if you come up with anything significant ;)

2003-08-05 03:30:19

by William Lee Irwin III

Subject: Re: [PATCH] O11int for interactivity

William Lee Irwin III wrote:
>> Well, I was vaguely hoping a useful way to instrument the io stuff
>> would already be out there.

On Tue, Aug 05, 2003 at 01:23:08PM +1000, Nick Piggin wrote:
> Not really.
> For a process doing blocking reads you could measure the time
> from when a process submits a read to when it gets the result.
> I suppose you also need some minimum rate too but I really can't
> see that being the problem here.

I'm at least aware of patches for 2.4.x that log io scheduling
decisions in the driver, which is basically what I was hoping for.

On a higher level, are you thinking there's some indication the
io schedulers themselves aren't involved? Or that something higher-
level should be instrumented? If so, what?


-- wli

2003-08-05 03:24:14

by Nick Piggin

Subject: Re: [PATCH] O11int for interactivity



William Lee Irwin III wrote:

>On Tue, Aug 05, 2003 at 01:07:44PM +1000, Nick Piggin wrote:
>
>>Let me know if you come up with anything significant ;)
>>
>
>Well, I was vaguely hoping a useful way to instrument the io stuff
>would already be out there.
>
>

Not really.
For a process doing blocking reads you could measure the time
from when a process submits a read to when it gets the result.
I suppose you also need some minimum rate too but I really can't
see that being the problem here.
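The measurement described here (time from submitting a blocking read to getting the result) can be approximated from userspace. What follows is a rough Python sketch under stated assumptions, not an existing tool: the function names and the 4096-byte chunk size are hypothetical, and a file already in the page cache will show near-zero latencies, so it only says something about the disk when the data is cold.

```python
import os
import time


def timed_reads(path, chunk_size=4096):
    """Read path sequentially; return a list of (bytes_read, seconds) per read()."""
    samples = []
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            start = time.perf_counter()
            data = os.read(fd, chunk_size)   # blocking read
            elapsed = time.perf_counter() - start
            if not data:                     # end of file
                break
            samples.append((len(data), elapsed))
    finally:
        os.close(fd)
    return samples


def worst_latency(samples):
    """Longest single read() in seconds, or 0.0 if nothing was read."""
    return max((secs for _, secs in samples), default=0.0)
```

Running timed_reads() on the mp3 while the write load is going, and looking at worst_latency(), would give a crude idea of whether any single read stalls long enough to drain a player's buffer.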


2003-08-05 03:39:44

by Nick Piggin

Subject: Re: [PATCH] O11int for interactivity



William Lee Irwin III wrote:

>William Lee Irwin III wrote:
>
>>>Well, I was vaguely hoping a useful way to instrument the io stuff
>>>would already be out there.
>>>
>
>On Tue, Aug 05, 2003 at 01:23:08PM +1000, Nick Piggin wrote:
>
>>Not really.
>>For a process doing blocking reads you could measure the time
>>from when a process submits a read to when it gets the result.
>>I suppose you also need some minimum rate too but I really can't
>>see that being the problem here.
>>
>
>I'm at least aware of patches for 2.4.x that log io scheduling
>decisions in the driver, which is basically what I was hoping for.
>
>On a higher level, are you thinking there's some indication the
>io schedulers themselves aren't involved? Or that something higher-
>level should be instrumented? If so, what?
>

Yes, that's what I think. Reading an mp3 shouldn't take a lot of
disk power, and seeing as it's sustaining 20MB/s of writes, and
the default AS biases reads quite heavily over writes, it would
be very surprising if the io scheduler were the problem.

Of course some minimum read _latency_ would be required: this
could actually be done easily with strace come to think of it.

Maybe some xmms mapped memory is being swapped out? But that
would be more of a VM problem.

Get the test run when the mp3 is in ram: if the skips can't be
reproduced then the io side would be worth looking into further.
My guess is the process scheduler, though.


2003-08-05 04:54:43

by Valdis Klētnieks

Subject: Re: [PATCH] O11int for interactivity

On Tue, 05 Aug 2003 13:38:34 +1000, Nick Piggin said:

> Of course some minimum read _latency_ would be required: this
> could actually be done easily with strace come to think of it.
>
> Maybe some xmms mapped memory is being swapped out? But that
> would be more of a VM problem.

I was seeing some CPU-related pauses, but once Con's work got to O7 or so,
those disappeared. I'm *quite* convinced that the remaining glitches
are VM related, mostly because every glitch seems to be associated with
an increase in the 'pswpout' field in /proc/vmstat (yes, I tested with stuff like
"while :; do cat /proc/vmstat; sleep 1; done").

The *odd* part is that the pgpgin, pgpgout, and pswpin numbers do *NOT*
seem to be correlated. High I/O loads from read/write don't seem to cause
a problem - untarring the Linux distro won't do it, running badblocks won't do it.

But if somebody has to swap out, all hell breaks loose...
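Raw /proc/vmstat dumps make it hard to see which counter moved in a given second. A small Python sketch of the same polling that prints per-interval deltas for the four counters named above; the helper names and the one-second interval are assumptions for illustration, and timestamps would still have to be matched against the glitches by hand.

```python
import time

# The counters discussed above.
WATCHED = ("pgpgin", "pgpgout", "pswpin", "pswpout")


def parse_vmstat(text):
    """Parse /proc/vmstat-style "name value" lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def deltas(prev, cur, names=WATCHED):
    """Increase of each watched counter between two samples (missing counts as 0)."""
    return {n: cur.get(n, 0) - prev.get(n, 0) for n in names}


def poll(path="/proc/vmstat", interval=1.0):
    """Print one timestamped line of deltas per interval until interrupted."""
    with open(path) as f:
        prev = parse_vmstat(f.read())
    while True:
        time.sleep(interval)
        with open(path) as f:
            cur = parse_vmstat(f.read())
        print(time.strftime("%H:%M:%S"), deltas(prev, cur))
        prev = cur
```

A glitch that lines up with a nonzero pswpout delta while the other three stay flat would match the pattern described above.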

Hmm.. looking at mm/page_io.c, it seems swap_writepage() calls get_swap_bio
with GFP_NOIO, while swap_readpage() uses GFP_KERNEL. I wonder if that GFP_NOIO is
causing ugliness - that's really __GFP_WAIT, and the comments in bio_alloc() are
pretty clear that it can block. And remember we're not getting into this code unless
we're already under memory pressure....

(And if somebody tells me how to instrument a -test2-mm4 kernel so I can tell if
I'm on crack or not, I'll happily do so....)



2003-08-05 05:01:44

by William Lee Irwin III

Subject: Re: [PATCH] O11int for interactivity

On Tue, 05 Aug 2003 13:38:34 +1000, Nick Piggin said:
>> Of course some minimum read _latency_ would be required: this
>> could actually be done easily with strace come to think of it.
>> Maybe some xmms mapped memory is being swapped out? But that
>> would be more of a VM problem.

On Tue, Aug 05, 2003 at 12:54:10AM -0400, Valdis Klētnieks wrote:
> I was seeing some CPU-related pauses, but once Con's work got to O7 or so,
> those disappeared. I'm *quite* convinced that the remaining glitches
> are VM related, mostly because every glitch seems to be associated with
> an increase in the 'pswpout' field in /proc/vmstat (yes, I tested
> with stuff like "while :; do cat /proc/vmstat; sleep 1; done").

Could I get logs of the stuff?


On Tue, Aug 05, 2003 at 12:54:10AM -0400, Valdis Klētnieks wrote:
> The *odd* part is that the pgpgin, pgpgout, and pswpin numbers do
> *NOT* seem to be correlated. High I/O loads from read/write don't
> seem to cause a problem - untarring the Linux distro won't do it,
> running badblocks won't do it. But if somebody has to swap out, all
> hell breaks loose...

Is the swapfile/partition on the same disk as the music? Is the disk IDE?


On Tue, Aug 05, 2003 at 12:54:10AM -0400, Valdis Klētnieks wrote:
> Hmm.. looking at mm/page_io.c, it seems swap_writepage() calls
> get_swap_bio with GFP_NOIO, while swap_readpage() uses GFP_KERNEL. I
> wonder if that GFP_NOIO is causing ugliness - that's really
> __GFP_WAIT, and the comments in bio_alloc() are pretty clear that it
> can block. And remember we're not getting into this code unless
> we're already under memory pressure....
> (And if somebody tells me how to instrument a -test2-mm4 kernel so I
> can tell if I'm on crack or not, I'll happily do so....)

Well, sleepometer is around, but probably needs merging.


-- wli

2003-08-05 05:54:22

by Andrew Morton

Subject: Re: [PATCH] O11int for interactivity

Valdis Klētnieks wrote:
>
> The *odd* part is that the pgpgin, pgpgout, and pswpin numbers do *NOT*
> seem to be correlated. High I/O loads from read/write don't seem to cause
> a problem - untarring the Linux distro won't do it, running badblocks won't do it.
>
> But if somebody has to swap out, all hell breaks loose...

swapout tends to happen via page reclaim, whereas normal writeback does
not.

What's the difference? When swapout is happening you can expect increased
latency in the page allocator.

My guess is that xmms is getting throttled in try_to_free_pages().

There is a very good argument for giving !SCHED_OTHER tasks "special
treatment" in the VM. ie:

a) exempt them from balance_dirty_pages() throttling treatment altogether

b) let them dip further into the page reserves in __alloc_pages.

iirc, -aa kernels do some of this. As does the Digeo kernel. Just haven't
got around to it in 2.6. It's pretty simple.

If xmms isn't running SCHED_FIFO/SCHED_RR, well, you lose.

The instrumentation to add is page allocation latency.


Another possibility is that xmms is getting stuck in a read. The
anticipatory scheduler is currently rather tuned for throughput. Judging
by the vmstat trace which was posted, we have a classic
read-stream-vs-write-stream going on. We trade off latency versus
throughput; perhaps wrongly. You can decrease latency (at the expense of
throughput) by decreasing the settings in /sys/block/hda/queue/iosched.

To a point, it is a nice linear tradeoff, and someone should put the time
in to tweak and characterise it.
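Before tweaking, it helps to record what the settings currently are. A hedged Python sketch that just reads every file under the iosched directory mentioned above and computes scaled-down candidate values. It deliberately names no tunables, since which files exist depends on the kernel and io scheduler in use; the function names and the scaling factor are hypothetical, and nothing is written back.

```python
import os


def read_tunables(directory="/sys/block/hda/queue/iosched"):
    """Return {file name: contents} for every regular file in directory."""
    values = {}
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.isfile(full):
            with open(full) as f:
                values[name] = f.read().strip()
    return values


def scaled(values, factor):
    """Scale every integer-valued tunable by factor; non-numeric ones are skipped.

    factor is an illustrative knob, e.g. 0.5 to halve each setting.
    """
    return {name: int(int(v) * factor)
            for name, v in values.items() if v.isdigit()}
```

Printing read_tunables() before and after each change keeps a record of what was tried, which is most of what "characterise it" needs.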

2003-08-05 07:12:05

by Nick Piggin

Subject: Re: [PATCH] O11int for interactivity

Andrew Morton wrote:

>Valdis Klētnieks wrote:
>
>>The *odd* part is that the pgpgin, pgpgout, and pswpin numbers do *NOT*
>> seem to be correlated. High I/O loads from read/write don't seem to cause
>> a problem - untarring the Linux distro won't do it, running badblocks won't do it.
>>
>> But if somebody has to swap out, all hell breaks loose...
>>
>
>swapout tends to happen via page reclaim, whereas normal writeback does
>not.
>
>What's the difference? When swapout is happening you can expect increased
>latency in the page allocator.
>
>My guess is that xmms is getting throttled in try_to_free_pages().
>
>There is a very good argument for giving !SCHED_OTHER tasks "special
>treatment" in the VM. ie:
>
>a) exempt them from balance_dirty_pages() throttling treatment altogether
>
>b) let them dip further into the page reserves in __alloc_pages.
>
>iirc, -aa kernels do some of this. As does the Digeo kernel. Just haven't
>got around to it in 2.6. It's pretty simple.
>
>If xmms isn't running SCHED_FIFO/SCHED_RR, well, you lose.
>
>The instrumentation to add is page allocation latency.
>
>
>Another possibility is that xmms is getting stuck in a read. The
>anticipatory scheduler is currently rather tuned for throughput. Judging
>by the vmstat trace which was posted, we have a classic
>read-stream-vs-write-stream going on. We trade off latency versus
>throughput; perhaps wrongly. You can decrease latency (at the expense of
>throughput) by decreasing the settings in /sys/block/hda/queue/iosched.
>
>To a point, it is a nice linear tradeoff, and someone should put the time
>in to tweak and characterise it.
>
>

With 1 "big" writer if xmms submits a read, it should be serviced within
50ms if there is no TCQ, possibly more, but likely to be less. The reading
process should then be able to read for up to 200ms before writes are
serviced again.


2003-08-05 17:05:56

by Martin Josefsson

Subject: Re: [PATCH] O11int for interactivity

On Tue, 2003-08-05 at 07:55, Andrew Morton wrote:

> Another possibility is that xmms is getting stuck in a read. The
> anticipatory scheduler is currently rather tuned for throughput. Judging
> by the vmstat trace which was posted, we have a classic
> read-stream-vs-write-stream going on. We trade off latency versus
> throughput; perhaps wrongly. You can decrease latency (at the expense of
> throughput) by decreasing the settings in /sys/block/hda/queue/iosched.
>
> To a point, it is a nice linear tradeoff, and someone should put the time
> in to tweak and characterise it.

I believe it was my trace wli posted.
No swapping was going on; swappiness was set to 30.

X was quite jerky and uninteractive during this and sometimes it froze
for up to 5 seconds (the sound usually stopped during the freezing).

Since there wasn't any swapping going on and quite a lot of cpu left we
either have quite some latency when reading back parts of X that
previously got discarded or massive stalls in kernelspace somewhere.

One thing I noticed was that when evolution started checking for new
mail in a lot of folders I got a lot of seeks and the throughput
naturally decreased, but X got really responsive again. This points away
from X being discarded and read back in from disk, since that would take
some time with all those seeks as well.

The machine this was tested on is a pIII 700 with 704MB ram and IDE
disks (everything was against the same disk)

--
/Martin