2006-03-14 09:49:50

by Mike Galbraith

Subject: [2.6.16-rc6 patch] remove sleep_avg multiplier

Greetings,

The patchlet below removes the sleep_avg multiplier. This multiplier
was necessary back when we had 10 seconds of dynamic range in sleep_avg,
but now that we only have one second, it causes that one second to be
compressed down to 100ms in some cases. This is particularly noticeable
when compiling a kernel in a slow NFS mount, and I believe it to be a
very likely candidate for other recently reported network related
interactivity problems.

In testing, I can detect no negative impact of this removal. IMHO, this
constitutes a bug-fix, and as such is suitable for 2.6.16.

-Mike

Signed-off-by: Mike Galbraith <[email protected]>

--- linux-2.6.16rc6/kernel/sched.c.org 2006-03-14 10:30:35.000000000 +0100
+++ linux-2.6.16rc6/kernel/sched.c 2006-03-14 10:31:13.000000000 +0100
@@ -707,12 +707,6 @@
 					DEF_TIMESLICE);
 		} else {
 			/*
-			 * The lower the sleep avg a task has the more
-			 * rapidly it will rise with sleep time.
-			 */
-			sleep_time *= (MAX_BONUS - CURRENT_BONUS(p)) ? : 1;
-
-			/*
 			 * Tasks waking from uninterruptible sleep are
 			 * limited in their sleep_avg rise as they
 			 * are likely to be waiting on I/O
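
To make the "one second compressed down to 100ms" claim concrete, here is a minimal user-space sketch in C. It is not the kernel code: the millisecond units, the constant values and the helper names current_bonus() and credit_sleep() are assumptions chosen to match the one-second range and tenfold compression described above, and CURRENT_BONUS(p) is approximated as scaling linearly with sleep_avg.

```c
#include <stdio.h>

/* Assumed values for illustration only: a 1 s sleep_avg range and 10 bonus levels. */
#define MAX_SLEEP_AVG_MS 1000
#define MAX_BONUS        10

/* Hypothetical stand-in for CURRENT_BONUS(p): bonus grows linearly with sleep_avg. */
static int current_bonus(int sleep_avg_ms)
{
    return sleep_avg_ms * MAX_BONUS / MAX_SLEEP_AVG_MS;
}

/*
 * Credit one sleep period to sleep_avg and return the new value.
 * use_multiplier != 0 models the line the patch removes;
 * use_multiplier == 0 models the patched behaviour.
 */
static int credit_sleep(int sleep_avg_ms, int sleep_ms, int use_multiplier)
{
    if (use_multiplier) {
        int mult = MAX_BONUS - current_bonus(sleep_avg_ms);
        /* Same effect as the GNU "x ? : 1" shorthand used in the removed line. */
        sleep_ms *= mult ? mult : 1;
    }

    sleep_avg_ms += sleep_ms;
    if (sleep_avg_ms > MAX_SLEEP_AVG_MS)   /* clamp to the available range */
        sleep_avg_ms = MAX_SLEEP_AVG_MS;
    return sleep_avg_ms;
}
```

Under these assumptions, a task that starts with a sleep_avg of zero and sleeps for 100ms reaches the full 1000ms with the multiplier (credit_sleep(0, 100, 1)), but only 100ms without it (credit_sleep(0, 100, 0)). That is the tenfold compression of the range that the patch removes.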



2006-03-14 09:59:12

by Ingo Molnar

Subject: Re: [2.6.16-rc6 patch] remove sleep_avg multiplier


* Mike Galbraith <[email protected]> wrote:

> Greetings,
>
> The patchlet below removes the sleep_avg multiplier. This multiplier
> was necessary back when we had 10 seconds of dynamic range in sleep_avg,
> but now that we only have one second, it causes that one second to be
> compressed down to 100ms in some cases. This is particularly noticeable
> when compiling a kernel in a slow NFS mount, and I believe it to be a
> very likely candidate for other recently reported network related
> interactivity problems.
>
> In testing, I can detect no negative impact of this removal. IMHO, this
> constitutes a bug-fix, and as such is suitable for 2.6.16.
>
> -Mike
>
> Signed-off-by: Mike Galbraith <[email protected]>

looks good to me. The biggest complaint against the current scheduler is
over-eager interactivity boosting - this patch moderates that in a
smooth way.

Acked-by: Ingo Molnar <[email protected]>

Ingo

2006-03-14 10:06:56

by Con Kolivas

Subject: Re: [2.6.16-rc6 patch] remove sleep_avg multiplier

On Tuesday 14 March 2006 20:56, Ingo Molnar wrote:
> * Mike Galbraith <[email protected]> wrote:
> > Greetings,
> >
> > The patchlet below removes the sleep_avg multiplier. This multiplier
> > was necessary back when we had 10 seconds of dynamic range in sleep_avg,
> > but now that we only have one second, it causes that one second to be
> > compressed down to 100ms in some cases. This is particularly noticeable
> > when compiling a kernel in a slow NFS mount, and I believe it to be a
> > very likely candidate for other recently reported network related
> > interactivity problems.
> >
> > In testing, I can detect no negative impact of this removal. IMHO, this
> > constitutes a bug-fix, and as such is suitable for 2.6.16.

> looks good to me. The biggest complaint against the current scheduler is
> over-eager interactivity boosting - this patch moderates that in a
> smooth way.

I actually think Mike is right about the change, but has anyone else tested
this patch to also confirm "it has no negative impact" warranting it's rapid
inclusion in 2.6.16?

Cheers,
Con

2006-03-14 10:11:05

by Con Kolivas

Subject: Re: [2.6.16-rc6 patch] remove sleep_avg multiplier

On Tuesday 14 March 2006 21:05, Con Kolivas wrote:
> On Tuesday 14 March 2006 20:56, Ingo Molnar wrote:
> > * Mike Galbraith <[email protected]> wrote:
> > > Greetings,
> > >
> > > The patchlet below removes the sleep_avg multiplier. This multiplier
> > > was necessary back when we had 10 seconds of dynamic range in
> > > sleep_avg, but now that we only have one second, it causes that one
> > > second to be compressed down to 100ms in some cases. This is
> > > particularly noticeable when compiling a kernel in a slow NFS mount,
> > > and I believe it to be a very likely candidate for other recently
> > > reported network related interactivity problems.
> > >
> > > In testing, I can detect no negative impact of this removal. IMHO,
> > > this constitutes a bug-fix, and as such is suitable for 2.6.16.
> >
> > looks good to me. The biggest complaint against the current scheduler is
> > over-eager interactivity boosting - this patch moderates that in a
> > smooth way.
>
> I actually think Mike is right about the change, but has anyone else tested
> this patch to also confirm "it has no negative impact" warranting it's
> rapid inclusion in 2.6.16?

/me smacks himself for misusing "it's"

How about an interbench run before and after Mike?

Cheers,
Con

2006-03-14 11:54:52

by Mike Galbraith

Subject: Re: [2.6.16-rc6 patch] remove sleep_avg multiplier

On Tue, 2006-03-14 at 21:10 +1100, Con Kolivas wrote:
> On Tuesday 14 March 2006 21:05, Con Kolivas wrote:
> > On Tuesday 14 March 2006 20:56, Ingo Molnar wrote:
> > > * Mike Galbraith <[email protected]> wrote:
> > > > Greetings,
> > > >
> > > > The patchlet below removes the sleep_avg multiplier. This multiplier
> > > > was necessary back when we had 10 seconds of dynamic range in
> > > > sleep_avg, but now that we only have one second, it causes that one
> > > > second to be compressed down to 100ms in some cases. This is
> > > > particularly noticeable when compiling a kernel in a slow NFS mount,
> > > > and I believe it to be a very likely candidate for other recently
> > > > reported network related interactivity problems.
> > > >
> > > > In testing, I can detect no negative impact of this removal. IMHO,
> > > > this constitutes a bug-fix, and as such is suitable for 2.6.16.
> > >
> > > looks good to me. The biggest complaint against the current scheduler is
> > > over-eager interactivity boosting - this patch moderates that in a
> > > smooth way.
> >
> > I actually think Mike is right about the change, but has anyone else tested
> > this patch to also confirm "it has no negative impact" warranting it's
> > rapid inclusion in 2.6.16?
>
> /me smacks himself for misusing "it's"
>
> How about an interbench run before and after Mike?

Nothing against interbench, but how about something more concrete...
like a very modest parallel kernel compile in a slow NFS mount. No need
to interpret results, it pokes you dead in the eye.

With my full change set, you _will_ see differences with interbench.
Interbench will say you're better off without my changes in fact. Run
any of the known scheduler exploits without my changes, and then with,
and you'll likely consider revising interbench a little methinks ;-)

-Mike

2006-03-14 12:07:32

by Con Kolivas

Subject: Re: [2.6.16-rc6 patch] remove sleep_avg multiplier

On Tuesday 14 March 2006 22:56, Mike Galbraith wrote:
> On Tue, 2006-03-14 at 21:10 +1100, Con Kolivas wrote:
> > On Tuesday 14 March 2006 21:05, Con Kolivas wrote:
> > > On Tuesday 14 March 2006 20:56, Ingo Molnar wrote:
> > > > * Mike Galbraith <[email protected]> wrote:
> > > > > Greetings,
> > > > >
> > > > > The patchlet below removes the sleep_avg multiplier. This
> > > > > multiplier was necessary back when we had 10 seconds of dynamic
> > > > > range in sleep_avg, but now that we only have one second, it causes
> > > > > that one second to be compressed down to 100ms in some cases. This
> > > > > is particularly noticeable when compiling a kernel in a slow NFS
> > > > > mount, and I believe it to be a very likely candidate for other
> > > > > recently reported network related interactivity problems.
> > > > >
> > > > > In testing, I can detect no negative impact of this removal. IMHO,
> > > > > this constitutes a bug-fix, and as such is suitable for 2.6.16.
> > > >
> > > > looks good to me. The biggest complaint against the current scheduler
> > > > is over-eager interactivity boosting - this patch moderates that in a
> > > > smooth way.
> > >
> > > I actually think Mike is right about the change, but has anyone else
> > > tested this patch to also confirm "it has no negative impact"
> > > warranting it's rapid inclusion in 2.6.16?
> >
> > /me smacks himself for misusing "it's"
> >
> > How about an interbench run before and after Mike?
>
> Nothing against interbench, but how about something more concrete...
> like a very modest parallel kernel compile in a slow NFS mount. No need
> to interpret results, it pokes you dead in the eye.

I have no nfs to test it, nor do I have a network where I could set it up. I
was simply suggesting if there is negligible difference on interbench on
common workloads it strengthens your statement of "no negative impact" with
some harder evidence. If others are willing to run with your change without
any further testing then so be it; I think there's likely to be a net
positive outcome. It's just unusual to be guided without anyone else testing
it.

> With my full change set, you _will_ see differences with interbench.
> Interbench will say you're better off without my changes in fact. Run
> any of the known scheduler exploits without my changes, and then with,
> and you'll likely consider revising interbench a little methinks ;-)

Not really; interbench is after interactivity, and exploit prone designs don't
necessarily have bad interactivity. If you can reproduce the nfs case as an
extra load for interbench I'd love to include it.

Cheers,
Con

2006-03-14 12:23:50

by Mike Galbraith

Subject: Re: [2.6.16-rc6 patch] remove sleep_avg multiplier

On Tue, 2006-03-14 at 23:07 +1100, Con Kolivas wrote:
> On Tuesday 14 March 2006 22:56, Mike Galbraith wrote:
> > With my full change set, you _will_ see differences with interbench.
> > Interbench will say you're better off without my changes in fact. Run
> > any of the known scheduler exploits without my changes, and then with,
> > and you'll likely consider revising interbench a little methinks ;-)
>
> Not really; interbench is after interactivity, and exploit prone designs don't
> necessarily have bad interactivity. If you can reproduce the nfs case as an
> extra load for interbench I'd love to include it.

Yes, interbench tries to assess interactivity, but it gets it totally
wrong sometimes. It runs its measurement at a high priority, and calls
the result good if it was able to get as much cpu as it wants. The very
code responsible for good interbench numbers is also responsible for
starvation problems. It's the long sleep logic. That logic makes my
box suck rocks under thud and irman2.

Don't forget, every one of the exploits I test with were posted by
people who were experiencing scheduler problems in real life. Try to
use your box while running those exploits, and then tell me that you
agree with interbench's assessment.

-Mike

2006-03-14 12:30:01

by Con Kolivas

Subject: Re: [2.6.16-rc6 patch] remove sleep_avg multiplier

On Tuesday 14 March 2006 23:24, Mike Galbraith wrote:
> On Tue, 2006-03-14 at 23:07 +1100, Con Kolivas wrote:
> > On Tuesday 14 March 2006 22:56, Mike Galbraith wrote:
> > > With my full change set, you _will_ see differences with interbench.
> > > Interbench will say you're better off without my changes in fact. Run
> > > any of the known scheduler exploits without my changes, and then with,
> > > and you'll likely consider revising interbench a little methinks ;-)
> >
> > Not really; interbench is after interactivity, and exploit prone designs
> > don't necessarily have bad interactivity. If you can reproduce the nfs
> > case as an extra load for interbench I'd love to include it.
>
> Yes, interbench tries to assess interactivity, but it gets it totally
> wrong sometimes. It runs its measurement at a high priority, and calls
> the result good if it was able to get as much cpu as it wants.

You misunderstand the code and/or my intent. There is a thread at high
priority that does timing and signalling only, but the loads and benchmarked
simulations are run at normal priority (nice 0) or at a nice level set by
yourself in the options.

> The very
> code responsible for good interbench numbers is also responsible for
> starvation problems. It's the long sleep logic. That logic makes my
> box suck rocks under thud and irman2.
>
> Don't forget, every one of the exploits I test with were posted by
> people who were experiencing scheduler problems in real life. Try to
> use your box while running those exploits, and then tell me that you
> agree with interbench's assessment.

Ok you feel interbench is an irrelevant benchmark for your test case and I'm
not going to bother arguing since it doesn't claim to test every single
situation.

That doesn't change the fact that your patch has only been tested by yourself.
Don't forget I'm still agreeing with your change, I'm just suggesting the
usual patch precautions.

Cheers,
Con

2006-03-14 12:36:28

by Con Kolivas

Subject: Re: [2.6.16-rc6 patch] remove sleep_avg multiplier

> On Tuesday 14 March 2006 23:24, Mike Galbraith wrote:
> > The very
> > code responsible for good interbench numbers is also responsible for
> > starvation problems. It's the long sleep logic. That logic makes my
> > box suck rocks under thud and irman2.

Oh and I do appreciate that an ultimately interactive design may well be also
ultimately exploitable. Interbench never claimed to test for exploits.

Cheers,
Con

2006-03-14 12:39:23

by Mike Galbraith

Subject: Re: [2.6.16-rc6 patch] remove sleep_avg multiplier

On Tue, 2006-03-14 at 23:29 +1100, Con Kolivas wrote:
> On Tuesday 14 March 2006 23:24, Mike Galbraith wrote:
> >
> > Don't forget, every one of the exploits I test with were posted by
> > people who were experiencing scheduler problems in real life. Try to
> > use your box while running those exploits, and then tell me that you
> > agree with interbench's assessment.
>
> Ok you feel interbench is an irrelevant benchmark for your test case and I'm
> not going to bother arguing since it doesn't claim to test every single
> situation.

Yes. Interbench's opinion is irrelevant to me wrt this problem.

> That doesn't change the fact that your patch has only been tested by yourself.
> Don't forget I'm still agreeing with your change, I'm just suggesting the
> usual patch precautions.

Sure. Let's get people interested in testing this ASAP. OTOH, let's
not delay this simple and (IMHO) dead obvious fix getting into 2.6.16
simply because I'm the only one who _has_ done massive amounts of
testing ;-)

-Mike

2006-03-14 12:47:55

by Con Kolivas

Subject: Re: [2.6.16-rc6 patch] remove sleep_avg multiplier

On Tuesday 14 March 2006 23:40, Mike Galbraith wrote:
> On Tue, 2006-03-14 at 23:29 +1100, Con Kolivas wrote:
> > On Tuesday 14 March 2006 23:24, Mike Galbraith wrote:
> > > Don't forget, every one of the exploits I test with were posted by
> > > people who were experiencing scheduler problems in real life. Try to
> > > use your box while running those exploits, and then tell me that you
> > > agree with interbench's assessment.
> >
> > Ok you feel interbench is an irrelevant benchmark for your test case and
> > I'm not going to bother arguing since it doesn't claim to test every
> > single situation.
>
> Yes. Interbench's opinion is irrelevant to me wrt this problem.

Ok one last try to explain where I'm coming from and then I'll give up ...

Interbench's opinion is not irrelevant to me on this because it may help your
nfs case but interbench does tell me what happens with X, video, audio etc.
It's precisely because it quantifies those other scenarios that I care.

Cheers,
Con

2006-03-14 12:58:43

by Mike Galbraith

Subject: Re: [2.6.16-rc6 patch] remove sleep_avg multiplier

On Tue, 2006-03-14 at 23:47 +1100, Con Kolivas wrote:
> On Tuesday 14 March 2006 23:40, Mike Galbraith wrote:
> > On Tue, 2006-03-14 at 23:29 +1100, Con Kolivas wrote:
> > > On Tuesday 14 March 2006 23:24, Mike Galbraith wrote:
> > > > Don't forget, every one of the exploits I test with were posted by
> > > > people who were experiencing scheduler problems in real life. Try to
> > > > use your box while running those exploits, and then tell me that you
> > > > agree with interbench's assessment.
> > >
> > > Ok you feel interbench is an irrelevant benchmark for your test case and
> > > I'm not going to bother arguing since it doesn't claim to test every
> > > single situation.
> >
> > Yes. Interbench's opinion is irrelevant to me wrt this problem.
>
> Ok one last try to explain where I'm coming from and then I'll give up ...
>
> Interbench's opinion is not irrelevant to me on this because it may help your
> nfs case but interbench does tell me what happens with X, video, audio etc.
> It's precisely because it quantifies those other scenarios that I care.

Sure, and I'm not trying to knock interbench. I used it as yet another
test to my changes as I made them. I just disagree with its opinion.

(I didn't misunderstand the code either, I observed it in action,
interpreted the difference between reaction to stock, and reaction to my
changes, and then went straight to the long sleep logic and [tweak] made
the numbers identical to guarantee that I understood)

-Mike