2010-01-21 21:54:41

by Alex Chiang

Subject: scheduler vs hardware? (was Re: another i7 (linux) bug?)

Scheduler folks cc'ed.

* Luca Zini <[email protected]>:
> Dear all,
> I found your addresses in the thread
> (http://bugzilla.kernel.org/show_bug.cgi?id=15064) that I have been following.

In general, it's better to send mail to the mailing lists in case
individual developers don't have time/expertise to help you.

Which kernel are you using?

This may be an issue for the scheduler folks since in your
experiment, the only thing you're changing is the nice level,
not the hardware.

> I had the same problem that now seems to be solved. There is
> also another peculiarity in my system:
>
> if I run a cpu intensive process with the lowest priority (19
> from man nice) I obtain much better performance than with the
> highest priority available (-20 from man nice).
>
> For example the same file is processed by lame in 8.7 seconds
> at the lowest priority, and in 12 seconds at the highest
> priority. Before posting a bug I would like to understand if
> this is a problem related to the i7 mobile (my processor is an
> i7 Q720).
>
> As far as I tested on the same laptop series (dell studio 15),
> with the same kernel this problem does not exist.
>
> Can you test your i7 mobile system to help me to understand if
> it is related to the processor?
>
> Regards, Luca Zini
>


2010-01-22 07:19:24

by Mike Galbraith

Subject: Re: scheduler vs hardware? (was Re: another i7 (linux) bug?)

> if I run a cpu intensive process with the lowest priority (19
> from man nice) I obtain much better performance than with the
> highest priority available (-20 from man nice).
>
> For example the same file is processed by lame in 8.7 seconds
> at the lowest priority, and in 12 seconds at the highest
> priority. Before posting a bug I would like to understand if
> this is a problem related to the i7 mobile (my processor is an
> i7 Q720).
>
> As far as I tested on the same laptop series (dell studio 15),
> with the same kernel this problem does not exist.

So you only see this on the i7. That's odd. Can you try 33-rc5?

Posting a reliable reproducer would be nice. It'd also be nice to see
what all is running when you see this, and where.

-Mike

2010-01-22 08:51:44

by Peter Zijlstra

Subject: Re: scheduler vs hardware? (was Re: another i7 (linux) bug?)

On Fri, 2010-01-22 at 08:19 +0100, Mike Galbraith wrote:
> > if I run a cpu intensive process with the lowest priority (19
> > from man nice) I obtain much better performance than with the
> > highest priority available (-20 from man nice).
> >
> > For example the same file is processed by lame in 8.7 seconds
> > at the lowest priority, and in 12 seconds at the highest
> > priority. Before posting a bug I would like to understand if
> > this is a problem related to the i7 mobile (my processor is an
> > i7 Q720).
> >
> > As far as I tested on the same laptop series (dell studio 15),
> > with the same kernel this problem does not exist.
>
> So you only see this on the i7. That's odd. Can you try 33-rc5?
>
> Posting a reliable reproducer would be nice. It'd also be nice to see
> what all is running when you see this, and where.

Using a sample from: http://lame.sourceforge.net/quality.php

My laptop does:


# time nice -n 19 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null
LAME 3.98.2 64bits (http://www.mp3dev.org/)
Using polyphase lowpass filter, transition band: 19383 Hz - 19916 Hz
Encoding youcantdothat.wav to <stdout>
Encoding as 44.1 kHz j-stereo MPEG-1 Layer III VBR(q=0)
Frame | CPU time/estim | REAL time/estim | play/CPU | ETA
1147/1147 (100%)| 0:03/ 0:03| 0:03/ 0:03| 9.3341x| 0:00
32 [ 1] *
256 [1009] %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
320 [ 137] %%%%%%%%%
----------------------------------------------------------------------------
kbps LR MS % long switch short %
263.4 99.1 0.9 90.5 5.5 4.0

real 0m3.273s
user 0m3.217s
sys 0m0.022s


# time nice -n 0 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null
LAME 3.98.2 64bits (http://www.mp3dev.org/)
Using polyphase lowpass filter, transition band: 19383 Hz - 19916 Hz
Encoding youcantdothat.wav to <stdout>
Encoding as 44.1 kHz j-stereo MPEG-1 Layer III VBR(q=0)
Frame | CPU time/estim | REAL time/estim | play/CPU | ETA
1147/1147 (100%)| 0:01/ 0:01| 0:01/ 0:01| 27.239x| 0:00
32 [ 1] *
256 [1009] %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
320 [ 137] %%%%%%%%%
----------------------------------------------------------------------------
kbps LR MS % long switch short %
263.4 99.1 0.9 90.5 5.5 4.0

real 0m1.121s
user 0m1.102s
sys 0m0.013s


# time nice -n -20 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null
LAME 3.98.2 64bits (http://www.mp3dev.org/)
Using polyphase lowpass filter, transition band: 19383 Hz - 19916 Hz
Encoding youcantdothat.wav to <stdout>
Encoding as 44.1 kHz j-stereo MPEG-1 Layer III VBR(q=0)
Frame | CPU time/estim | REAL time/estim | play/CPU | ETA
1147/1147 (100%)| 0:01/ 0:01| 0:01/ 0:01| 27.239x| 0:00
32 [ 1] *
256 [1009] %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
320 [ 137] %%%%%%%%%
----------------------------------------------------------------------------
kbps LR MS % long switch short %
263.4 99.1 0.9 90.5 5.5 4.0

real 0m1.112s
user 0m1.093s
sys 0m0.018s



On a Nehalem class server machine it does:


# time nice -n 19 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null
LAME 3.98.2 64bits (http://www.mp3dev.org/)
Using polyphase lowpass filter, transition band: 19383 Hz - 19916 Hz
Encoding youcantdothat.wav to <stdout>
Encoding as 44.1 kHz j-stereo MPEG-1 Layer III VBR(q=0)
Frame | CPU time/estim | REAL time/estim | play/CPU | ETA
1147/1147 (100%)| 0:00/ 0:00| 0:00/ 0:00| 32.926x| 0:00
32 [ 1] *
256 [1009] %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
320 [ 137] %%%%%%%%%
----------------------------------------------------------------------------
kbps LR MS % long switch short %
263.4 99.1 0.9 90.5 5.5 4.0

real 0m0.932s
user 0m0.917s
sys 0m0.005s


# time nice -n 0 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null
LAME 3.98.2 64bits (http://www.mp3dev.org/)
Using polyphase lowpass filter, transition band: 19383 Hz - 19916 Hz
Encoding youcantdothat.wav to <stdout>
Encoding as 44.1 kHz j-stereo MPEG-1 Layer III VBR(q=0)
Frame | CPU time/estim | REAL time/estim | play/CPU | ETA
1147/1147 (100%)| 0:00/ 0:00| 0:00/ 0:00| 32.568x| 0:00
32 [ 1] *
256 [1009] %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
320 [ 137] %%%%%%%%%
----------------------------------------------------------------------------
kbps LR MS % long switch short %
263.4 99.1 0.9 90.5 5.5 4.0

real 0m0.927s
user 0m0.922s
sys 0m0.003s


# time nice -n -20 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null
LAME 3.98.2 64bits (http://www.mp3dev.org/)
Using polyphase lowpass filter, transition band: 19383 Hz - 19916 Hz
Encoding youcantdothat.wav to <stdout>
Encoding as 44.1 kHz j-stereo MPEG-1 Layer III VBR(q=0)
Frame | CPU time/estim | REAL time/estim | play/CPU | ETA
1147/1147 (100%)| 0:00/ 0:00| 0:00/ 0:00| 32.926x| 0:00
32 [ 1] *
256 [1009] %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
320 [ 137] %%%%%%%%%
----------------------------------------------------------------------------
kbps LR MS % long switch short %
263.4 99.1 0.9 90.5 5.5 4.0

real 0m0.919s
user 0m0.914s
sys 0m0.005s
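
A single run per nice level is easy to disturb on a busy desktop. As a minimal sketch (not something posted in the thread), the helper below repeats one command at several nice levels and prints the wall-clock time of each run. The name bench_nice is hypothetical; the sketch assumes GNU date (for %N) and awk, and negative nice values only take effect when it is run as root.

```shell
#!/bin/sh
# bench_nice RUNS "LEVELS" CMD...   (hypothetical helper, not from the thread)
# Runs CMD RUNS times at each nice level in LEVELS and prints the elapsed
# wall-clock time of every run. Output of CMD itself is discarded.
bench_nice() {
    runs=$1
    levels=$2
    shift 2
    for level in $levels; do
        i=1
        while [ "$i" -le "$runs" ]; do
            start=$(date +%s.%N)            # %N (nanoseconds) is a GNU date extension
            nice -n "$level" "$@" > /dev/null 2>&1
            end=$(date +%s.%N)
            # Let awk do the floating-point subtraction.
            awk -v l="$level" -v i="$i" -v s="$start" -v e="$end" \
                'BEGIN { printf "nice=%s run=%d real=%.3fs\n", l, i, e - s }'
            i=$((i + 1))
        done
    done
}
```

For example, as root: bench_nice 3 "19 0 -20" lame -b 256 -V0 -h youcantdothat.wav -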

2010-01-22 09:11:08

by Mike Galbraith

Subject: Re: scheduler vs hardware? (was Re: another i7 (linux) bug?)

On Fri, 2010-01-22 at 09:51 +0100, Peter Zijlstra wrote:
> On Fri, 2010-01-22 at 08:19 +0100, Mike Galbraith wrote:
> > > if I run a cpu intensive process with the lowest priority (19
> > > from man nice) I obtain much better performance than with the
> > > highest priority available (-20 from man nice).
> > >
> > > For example the same file is processed by lame in 8.7 seconds
> > > at the lowest priority, and in 12 seconds at the highest
> > > priority. Before posting a bug I would like to understand if
> > > this is a problem related to the i7 mobile (my processor is an
> > > i7 Q720).
> > >
> > > As far as I tested on the same laptop series (dell studio 15),
> > > with the same kernel this problem does not exist.
> >
> > So you only see this on the i7. That's odd. Can you try 33-rc5?
> >
> > Posting a reliable reproducer would be nice. It'd also be nice to see
> > what all is running when you see this, and where.
>
> Using a sample from: http://lame.sourceforge.net/quality.php
>
> My laptop does:
>
>
> # time nice -n 19 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null

> real 0m3.273s
> user 0m3.217s
> sys 0m0.022s
>
>
> # time nice -n 0 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null

> real 0m1.121s
> user 0m1.102s
> sys 0m0.013s
>
>
> # time nice -n -20 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null

> real 0m1.112s
> user 0m1.093s
> sys 0m0.018s
>
>
>
> On a Nehalem class server machine it does:
>
>
> # time nice -n 19 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null

> real 0m0.932s
> user 0m0.917s
> sys 0m0.005s
>
>
> # time nice -n 0 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null

> real 0m0.927s
> user 0m0.922s
> sys 0m0.003s
>
>
> # time nice -n -20 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null

> real 0m0.919s
> user 0m0.914s
> sys 0m0.005s

Weird. Here there is zip squat difference, as expected with 1 thread.

time nice -n 19 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null

real 0m0.912s
user 0m0.908s
sys 0m0.000s

time nice -n 0 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null

real 0m0.912s
user 0m0.904s
sys 0m0.004s

time nice -n -20 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null

real 0m0.912s
user 0m0.904s
sys 0m0.004s

(bah, who needs a nehalem;)

-Mike

2010-01-22 09:25:25

by Mike Galbraith

Subject: Re: scheduler vs hardware? (was Re: another i7 (linux) bug?)

On Fri, 2010-01-22 at 09:51 +0100, Peter Zijlstra wrote:

> My laptop does:
>
>
> # time nice -n 19 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null

> real 0m3.273s
> user 0m3.217s
> sys 0m0.022s
>
>
> # time nice -n 0 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null

> real 0m1.121s
> user 0m1.102s
> sys 0m0.013s

I don't see any way that can be scheduler (knocks wood). Could it be
cpufreq? IIRC there is or was an ignore nice gizmo in there.

-Mike
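
One candidate for the "ignore nice gizmo" is the ondemand governor's ignore_nice_load tunable: when it is set to 1, CPU time used by niced tasks is not counted as load, so a nice 19 job may never push the clock up. The sketch below prints the active governor per CPU and the value of that tunable wherever it is found. The function name show_cpufreq and the SYSFS_CPU override are hypothetical, and the sysfs location of the tunable differs between kernel versions, which is why it is searched for rather than hard-coded.

```shell
#!/bin/sh
# show_cpufreq: print each CPU's cpufreq governor and any ignore_nice_load
# tunable found below the cpu sysfs directory.
# SYSFS_CPU is a hypothetical override, useful for testing on a fake tree.
show_cpufreq() {
    base=${SYSFS_CPU:-/sys/devices/system/cpu}
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] && echo "$f: $(cat "$f")"
    done
    # The tunable lives in different places depending on the kernel version,
    # so search for it instead of assuming a path.
    find "$base" -name ignore_nice_load 2>/dev/null | while read -r f; do
        echo "$f: $(cat "$f")"
    done
}
```

If the tunable reads 1 and the governor is ondemand, writing 0 to it (as root) and repeating the timing would show whether cpufreq explains the slow nice 19 run.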

2010-01-22 09:35:54

by Peter Zijlstra

Subject: Re: scheduler vs hardware? (was Re: another i7 (linux) bug?)

On Fri, 2010-01-22 at 10:25 +0100, Mike Galbraith wrote:
> On Fri, 2010-01-22 at 09:51 +0100, Peter Zijlstra wrote:
>
> > My laptop does:
> >
> >
> > # time nice -n 19 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null
>
> > real 0m3.273s
> > user 0m3.217s
> > sys 0m0.022s
> >
> >
> > # time nice -n 0 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null
>
> > real 0m1.121s
> > user 0m1.102s
> > sys 0m0.013s
>
> I don't see any way that can be scheduler (knocks wood). Could it be
> cpufreq? IIRC there is or was an ignore nice gizmo in there.

That's a laptop with full desktop bloat, I'm sure either firefox or evo
farted, or both.

2010-01-22 11:22:36

by Luca Zini

Subject: Re: scheduler vs hardware? (was Re: another i7 (linux) bug?)

First of all, sorry for the misunderstanding with Alex Chiang; I was trying to
collect some data from other i7 mobile users to isolate the problem
before posting.

I tried different things: I disabled SpeedStep in the BIOS and the results
were more sensible (higher priority, lower execution time). The same thing
happens if I disable throttling in software by selecting the "aggressive
powersave" settings.

So I suppose it is something related, directly or indirectly, to frequency
scaling or Turbo Boost (unfortunately I have no BIOS option to disable only
Turbo Boost).
Here are the results of the same test that Peter Zijlstra ran:

time sudo nice -n 19 lame -b 256 -V0 -h youcantdothat.wav 2&> /dev/null

real 0m1.105s
user 0m1.090s
sys 0m0.010s

time sudo nice -n 0 lame -b 256 -V0 -h youcantdothat.wav 2&> /dev/null

real 0m1.108s
user 0m1.100s
sys 0m0.010s

time sudo nice -n -20 lame -b 256 -V0 -h youcantdothat.wav 2&> /dev/null

real 0m1.354s
user 0m1.330s
sys 0m0.000s

These are almost the same results that I obtained before.
Looking at top, there is no other process using the CPU (the main ones are
Xorg and kopete).

I tested this on 2.6.31-17-server (Ubuntu 9.10), 2.6.31-17-generic (Ubuntu
9.10), and the default kernel of Fedora 12 (live CD).
As soon as possible I'll try 33-rc5 as requested by Mike Galbraith (right now
I can't shut down the computer).

OK, a last-minute update!
I slightly modified i7z to save a log to disk. If the i7z output is reliable, I
suppose it shows what the problem is:
with nice set to 19, processor 3 reaches the maximum speed; with nice set to
-20, its maximum value is 300-400 MHz below the maximum. (I attach the two
logs, named after the nice level used, obtained by running lame on a bigger
file.)
Please note that the logs may contain some values read before the start and/or
after the end of the running process (lame).

The next update will be the results with the 33-rc5 kernel.

Luca
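
For readers without a patched i7z, a rough substitute is to sample the "cpu MHz" fields of /proc/cpuinfo while the encoder runs. This is only a sketch under assumptions: cpu_mhz and log_mhz_while are hypothetical helper names, and on many kernels /proc/cpuinfo reports the frequency requested through cpufreq rather than the effective turbo clock, so it may not show the 300-400 MHz difference that i7z reports.

```shell
#!/bin/sh
# cpu_mhz [FILE]: print the "cpu MHz" value of every core on one line.
# FILE defaults to /proc/cpuinfo.
cpu_mhz() {
    awk -F': *' '/^cpu MHz/ { printf "%s%s", sep, $2; sep = " " } END { print "" }' \
        "${1:-/proc/cpuinfo}"
}

# log_mhz_while FILE CMD...: run CMD in the background and print one
# cpu_mhz sample per second until CMD exits; returns CMD's exit status.
log_mhz_while() {
    src=$1
    shift
    "$@" > /dev/null 2>&1 &
    pid=$!
    while kill -0 "$pid" 2>/dev/null; do
        cpu_mhz "$src"
        sleep 1
    done
    wait "$pid"
}
```

For example: log_mhz_while /proc/cpuinfo nice -n 19 lame -b 256 -V0 -h youcantdothat.wav - > log19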


Attachments:
log-20 (1.42 kB)
log19 (1.21 kB)

2010-01-22 16:05:09

by Chris Friesen

Subject: Re: scheduler vs hardware? (was Re: another i7 (linux) bug?)


On 01/22/2010 05:22 AM, Luca Zini wrote:

> with nice set to 19 a processor 3 reach the maximum speed, with nice set to
> -20 its maximum value is 300-400 mhz under the maximum value.

I wonder if at lower nice levels the timing on something changes,
forcing multiple cores to be active just often enough to keep the i7
from doing the turbo boost thing, whereas with higher nice levels
everything is serialized allowing the boost to be enabled.

Chris

2010-01-22 20:15:07

by Luca Zini

Subject: Re: scheduler vs hardware? (was Re: another i7 (linux) bug?)

vanilla 33-rc5 runs ok:
with nice set to -20
real 0m8.649s
user 0m8.600s
sys 0m0.030s


with nice set to 19:
real 0m8.658s
user 0m8.600s
sys 0m0.040s

thank you!
Luca


On Friday 22 January 2010 08:19:18 am Mike Galbraith wrote:
> > if I run a cpu intensive process with the lowest priority (19
> > from man nice) I obtain much better performance than with the
> > highest priority available (-20 from man nice).
> >
> > For example the same file is processed by lame in 8.7 seconds
> > at the lowest priority, and in 12 seconds at the highest
> > priority. Before posting a bug I would like to understand if
> > this is a problem related to the i7 mobile (my processor is an
> > i7 Q720).
> >
> > As far as I tested on the same laptop series (dell studio 15),
> > with the same kernel this problem does not exist.
>
> So you only see this on the i7. That's odd. Can you try 33-rc5?
>
> Posting a reliable reproducer would be nice. It'd also be nice to see
> what all is running when you see this, and where.
>
> -Mike
>

2010-01-23 09:43:29

by Ingo Molnar

Subject: Re: scheduler vs hardware? (was Re: another i7 (linux) bug?)


* Peter Zijlstra <[email protected]> wrote:

> On Fri, 2010-01-22 at 08:19 +0100, Mike Galbraith wrote:
> > > if I run a cpu intensive process with the lowest priority (19
> > > from man nice) I obtain much better performance than with the
> > > highest priority available (-20 from man nice).
> > >
> > > For example the same file is processed by lame in 8.7 seconds
> > > at the lowest priority, and in 12 seconds at the highest
> > > priority. Before posting a bug I would like to understand if
> > > this is a problem related to the i7 mobile (my processor is an
> > > i7 Q720).
> > >
> > > As far as I tested on the same laptop series (dell studio 15),
> > > with the same kernel this problem does not exist.
> >
> > So you only see this on the i7. That's odd. Can you try 33-rc5?
> >
> > Posting a reliable reproducer would be nice. It'd also be nice to see
> > what all is running when you see this, and where.
>
> Using a sample from: http://lame.sourceforge.net/quality.php
>
> My laptop does:
>
>
> # time nice -n 19 lame -b 256 -V0 -h youcantdothat.wav - > /dev/null

A 'perf stat --repeat 3' run would tell us more about the underlying reasons,
I think.

Ingo
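
A sketch of what such a run could look like for the three nice levels used in this thread. The wrapper name perf_by_nice and the PERF override are hypothetical; perf stat prints its counters on stderr, so the encoder's stdout can still be sent to /dev/null, and nice -20 only takes effect as root.

```shell
#!/bin/sh
# perf_by_nice CMD...: run CMD three times under perf stat at nice 19, 0 and -20.
# PERF is a hypothetical override for the perf binary (handy for a dry run).
perf_by_nice() {
    for level in 19 0 -20; do
        echo "== nice $level ==" >&2
        # "--" keeps perf from interpreting the options of nice and CMD.
        ${PERF:-perf} stat --repeat 3 -- nice -n "$level" "$@"
    done
}
```

For example, as root: perf_by_nice lame -b 256 -V0 -h youcantdothat.wav - > /dev/null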