2007-09-17 09:21:55

by Rob Hussey

Subject: Scheduler benchmarks - a follow-up

Hi all,

After posting some benchmarks involving cfs
(http://lkml.org/lkml/2007/9/13/385), I got some feedback, so I
decided to do a follow-up that'll hopefully fill in the gaps many
people wanted to see filled.

This time around I've done the benchmarks against 2.6.21, 2.6.22-ck1,
and 2.6.23-rc6-cfs-devel (latest git as of 12 hours ago). All three
.configs are attached. The benchmarks consist of lat_ctx and
hackbench, both with a growing number of processes, as well as
pipe-test. All benchmarks were also run bound to a single core.
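
(One minimal way to bind such a run to a single CPU is the taskset
utility; the exact binding method used for these runs isn't stated in the
thread, so the lines below are only an illustrative sketch.)

  # Pin the whole benchmark loop to the first logical CPU (CPU 0); the
  # commands mirror the unbound invocations listed further down.
  taskset -c 0 bash -c 'for((i=2; i < 201; i++)); do lat_ctx -s 0 $i; done'
  taskset -c 0 bash -c 'for((i=1; i < 100; i++)); do hackbench $i; done'
  taskset -c 0 ./pipe-test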

Since this time there are hundreds of lines of data, I'll post a
reasonable amount here and attach the data files. There are graphs
again this time, which I'll post links to as well as attach.

I'll start with some selected numbers, which are preceded by the
command used for the benchmark.

for((i=2; i < 201; i++)); do lat_ctx -s 0 $i; done:
(the left most column is the number of processes ($i))

2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel

15 5.88 4.85 5.14
16 5.80 4.77 4.76
17 5.91 4.84 4.92
18 5.79 4.86 4.83
19 5.89 4.94 4.93
20 5.78 4.81 5.13
21 5.88 5.02 4.94
22 5.79 4.79 4.84
23 5.93 4.86 5.05
24 5.73 4.76 4.90
25 6.00 4.94 5.19

for((i=1; i < 100; i++)); do hackbench $i; done:

2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel

80 9.75 8.95 9.52
81 11.54 8.87 9.57
82 11.29 8.92 9.67
83 10.76 8.96 9.82
84 12.04 9.20 9.91
85 11.74 9.39 10.09
86 12.01 9.37 10.18
87 11.39 9.43 10.13
88 12.48 9.60 10.38
89 11.85 9.77 10.52
90 13.78 9.76 10.65

pipe-test:
(the left most column is the run #)

2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel

1 13.84 12.59 13.01
2 13.90 12.57 13.00
3 13.84 12.62 13.06
4 13.87 12.61 13.04
5 13.82 12.62 13.03
6 13.86 12.60 13.02
7 13.85 12.61 13.02
8 13.88 12.45 13.04
9 13.83 12.46 13.03
10 13.88 12.46 13.03

Bound to Single core:

for((i=2; i < 201; i++)); do lat_ctx -s 0 $i; done:

2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel

15 2.90 2.76 2.21
16 2.88 2.79 2.36
17 2.87 2.77 2.52
18 2.86 2.78 2.66
19 2.89 2.72 2.81
20 2.87 2.72 2.95
21 2.86 2.69 3.10
22 2.88 2.72 3.26
23 2.86 2.71 3.39
24 2.84 2.72 3.56
25 2.82 2.73 3.72


for((i=1; i < 100; i++)); do hackbench $i; done:

2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel

80 14.29 10.86 12.03
81 14.40 11.25 12.17
82 15.00 11.42 12.33
83 14.87 11.12 12.51
84 15.37 11.42 12.66
85 15.75 11.68 12.79
86 15.64 11.95 12.95
87 15.80 11.64 13.12
88 15.70 11.91 13.25
89 15.10 12.19 13.42
90 16.24 12.53 13.54

pipe-test:

2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel

1 9.27 8.50 8.55
2 9.27 8.47 8.55
3 9.28 8.47 8.54
4 9.28 8.48 8.54
5 9.28 8.48 8.54
6 9.29 8.46 8.54
7 9.28 8.47 8.55
8 9.29 8.47 8.55
9 9.29 8.45 8.54
10 9.28 8.46 8.54

Links to the graphs (the .dat files are in the same directory):
http://www.healthcarelinen.com/misc/benchmarks/lat_ctx_benchmark2.png
http://www.healthcarelinen.com/misc/benchmarks/hackbench_benchmark2.png
http://www.healthcarelinen.com/misc/benchmarks/pipe-test_benchmark2.png
http://www.healthcarelinen.com/misc/benchmarks/BOUND_lat_ctx_benchmark2.png
http://www.healthcarelinen.com/misc/benchmarks/BOUND_hackbench_benchmark2.png
http://www.healthcarelinen.com/misc/benchmarks/BOUND_pipe-test_benchmark2.png

The only analysis I'll offer is that both sd and cfs are improvements,
and I'm glad that there is a lot of work being done in this area of
linux development. Much respect to Con Kolivas, Ingo Molnar, and Roman
Zippel, as well all the others who have contributed.

Any feedback is welcome.

Regards,
Rob


Attachments:
BOUND_hackbench_benchmark2.png (6.12 kB)
BOUND_lat_ctx_benchmark2.png (8.21 kB)
BOUND_pipe-test_benchmark2.png (3.31 kB)
hackbench_benchmark2.png (5.71 kB)
lat_ctx_benchmark2.png (8.67 kB)
pipe-test_benchmark2.png (3.65 kB)
data_files.tar.bz2 (5.85 kB)
config-2.6.21.bz2 (8.20 kB)
config-2.6.22-ck1.bz2 (8.16 kB)
config-2.6.23-rc6-cfs-devel.bz2 (7.81 kB)

2007-09-17 11:18:59

by Ed Tomlinson

Subject: Re: Scheduler benchmarks - a follow-up

Rob,

I gather this was with the complete -ck patchset? It would be interesting to see if just SD
performed as well. If it does, CFS needs more work. If not, there are other things in -ck
that really do improve performance and should be looked into.

Thanks
Ed Tomlinson

On September 17, 2007, Rob Hussey wrote:
> Hi all,
>
> After posting some benchmarks involving cfs
> (http://lkml.org/lkml/2007/9/13/385), I got some feedback, so I
> decided to do a follow-up that'll hopefully fill in the gaps many
> people wanted to see filled.
>
> This time around I've done the benchmarks against 2.6.21, 2.6.22-ck1,
> and 2.6.23-rc6-cfs-devel (latest git as of 12 hours ago). All three
> .configs are attached. The benchmarks consist of lat_ctx and
> hackbench, both with a growing number of processes, as well as
> pipe-test. All benchmarks were also run bound to a single core.
>
> Since this time there are hundreds of lines of data, I'll post a
> reasonable amount here and attach the data files. There are graphs
> again this time, which I'll post links to as well as attach.
>
> I'll start with some selected numbers, which are preceded by the
> command used for the benchmark.
>
> for((i=2; i < 201; i++)); do lat_ctx -s 0 $i; done:
> (the left most column is the number of processes ($i))
>
> 2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel
>
> 15 5.88 4.85 5.14
> 16 5.80 4.77 4.76
> 17 5.91 4.84 4.92
> 18 5.79 4.86 4.83
> 19 5.89 4.94 4.93
> 20 5.78 4.81 5.13
> 21 5.88 5.02 4.94
> 22 5.79 4.79 4.84
> 23 5.93 4.86 5.05
> 24 5.73 4.76 4.90
> 25 6.00 4.94 5.19
>
> for((i=1; i < 100; i++)); do hackbench $i; done:
>
> 2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel
>
> 80 9.75 8.95 9.52
> 81 11.54 8.87 9.57
> 82 11.29 8.92 9.67
> 83 10.76 8.96 9.82
> 84 12.04 9.20 9.91
> 85 11.74 9.39 10.09
> 86 12.01 9.37 10.18
> 87 11.39 9.43 10.13
> 88 12.48 9.60 10.38
> 89 11.85 9.77 10.52
> 90 13.78 9.76 10.65
>
> pipe-test:
> (the left most column is the run #)
>
> 2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel
>
> 1 13.84 12.59 13.01
> 2 13.90 12.57 13.00
> 3 13.84 12.62 13.06
> 4 13.87 12.61 13.04
> 5 13.82 12.62 13.03
> 6 13.86 12.60 13.02
> 7 13.85 12.61 13.02
> 8 13.88 12.45 13.04
> 9 13.83 12.46 13.03
> 10 13.88 12.46 13.03
>
> Bound to Single core:
>
> for((i=2; i < 201; i++)); do lat_ctx -s 0 $i; done:
>
> 2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel
>
> 15 2.90 2.76 2.21
> 16 2.88 2.79 2.36
> 17 2.87 2.77 2.52
> 18 2.86 2.78 2.66
> 19 2.89 2.72 2.81
> 20 2.87 2.72 2.95
> 21 2.86 2.69 3.10
> 22 2.88 2.72 3.26
> 23 2.86 2.71 3.39
> 24 2.84 2.72 3.56
> 25 2.82 2.73 3.72
>
>
> for((i=1; i < 100; i++)); do hackbench $i; done:
>
> 2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel
>
> 80 14.29 10.86 12.03
> 81 14.40 11.25 12.17
> 82 15.00 11.42 12.33
> 83 14.87 11.12 12.51
> 84 15.37 11.42 12.66
> 85 15.75 11.68 12.79
> 86 15.64 11.95 12.95
> 87 15.80 11.64 13.12
> 88 15.70 11.91 13.25
> 89 15.10 12.19 13.42
> 90 16.24 12.53 13.54
>
> pipe-test:
>
> 2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel
>
> 1 9.27 8.50 8.55
> 2 9.27 8.47 8.55
> 3 9.28 8.47 8.54
> 4 9.28 8.48 8.54
> 5 9.28 8.48 8.54
> 6 9.29 8.46 8.54
> 7 9.28 8.47 8.55
> 8 9.29 8.47 8.55
> 9 9.29 8.45 8.54
> 10 9.28 8.46 8.54
>
> Links to the graphs (the .dat files are in the same directory):
> http://www.healthcarelinen.com/misc/benchmarks/lat_ctx_benchmark2.png
> http://www.healthcarelinen.com/misc/benchmarks/hackbench_benchmark2.png
> http://www.healthcarelinen.com/misc/benchmarks/pipe-test_benchmark2.png
> http://www.healthcarelinen.com/misc/benchmarks/BOUND_lat_ctx_benchmark2.png
> http://www.healthcarelinen.com/misc/benchmarks/BOUND_hackbench_benchmark2.png
> http://www.healthcarelinen.com/misc/benchmarks/BOUND_pipe-test_benchmark2.png
>
> The only analysis I'll offer is that both sd and cfs are improvements,
> and I'm glad that there is a lot of work being done in this area of
> linux development. Much respect to Con Kolivas, Ingo Molnar, and Roman
> Zippel, as well all the others who have contributed.
>
> Any feedback is welcome.
>
> Regards,
> Rob
>


2007-09-17 11:28:17

by Ingo Molnar

Subject: Re: Scheduler benchmarks - a follow-up


* Rob Hussey <[email protected]> wrote:

> Hi all,
>
> After posting some benchmarks involving cfs
> (http://lkml.org/lkml/2007/9/13/385), I got some feedback, so I
> decided to do a follow-up that'll hopefully fill in the gaps many
> people wanted to see filled.

thanks for the update!

> I'll start with some selected numbers, which are preceded by the
> command used for the benchmark.
>
> for((i=2; i < 201; i++)); do lat_ctx -s 0 $i; done:
> (the left most column is the number of processes ($i))
>
> 2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel
>
> 15 5.88 4.85 5.14
> 16 5.80 4.77 4.76

the unbound results are harder to compare because CFS changed SMP
balancing to saturate multiple cores better - but this can result in a
micro-benchmark slowdown if the other core is idle (and one of the
benchmark tasks runs on one core and the other runs on the first core).
This affects lat_ctx and pipe-test. (I'll have a look at the hackbench
behavior.)

> Bound to Single core:

these are the more comparable (apples to apples) tests. Usually the most
stable of them is pipe-test:

> pipe-test:
>
> 2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel
>
> 1 9.27 8.50 8.55
> 2 9.27 8.47 8.55
> 3 9.28 8.47 8.54
> 4 9.28 8.48 8.54
> 5 9.28 8.48 8.54

so -ck1 is 0.8% faster in this particular test. (but still, there can be
caching effects in either direction - so i usually run the test on both
cores/CPUs to see whether there's any systematic spread in the results.
The cache-layout related random spread can be as high as 10% on some
systems!)

many things happened between 2.6.22-ck1 and 2.6.23-cfs-devel that could
affect performance of this test. My initial guess would be sched_clock()
overhead. Could you send me your system's 'dmesg' output when running a
2.6.22 (or -ck1) kernel? Chances are that your TSC got marked unstable,
this turns on a much less precise but also faster sched_clock()
implementation. CFS uses the TSC even if the time-of-day code marked it
as unstable - going for the more precise but slightly slower variant.
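
(A quick way to spot whether the TSC was marked unstable is to grep the
boot log; the pattern below is only illustrative, the interesting
messages are the TSC/clocksource ones.)

  # Show TSC/clocksource related boot messages; a line reporting the TSC
  # as unstable would support the theory above.
  dmesg | grep -iE 'tsc|clocksource'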

To test this theory, could you apply the patch below to cfs-devel (if
you are interested in further testing this) - this changes the cfs-devel
version of sched_clock() to have a low-resolution fallback like v2.6.22
does. Does this result in any measurable increase in performance?

(there's also a new sched-devel.git tree out there - if you update to it
you'll need to re-pull it against a pristine Linus git head.)

Ingo

---
arch/i386/kernel/tsc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux/arch/i386/kernel/tsc.c
===================================================================
--- linux.orig/arch/i386/kernel/tsc.c
+++ linux/arch/i386/kernel/tsc.c
@@ -110,9 +110,9 @@ unsigned long long native_sched_clock(vo
* very important for it to be as fast as the platform
* can achive it. )
*/
- if (unlikely(!tsc_enabled && !tsc_unstable))
+ if (1 || unlikely(!tsc_enabled && !tsc_unstable))
/* No locking but a rare wrong value is not a big deal: */
- return (jiffies_64 - INITIAL_JIFFIES) * (1000000000 / HZ);
+ return jiffies_64 * (1000000000 / HZ);

/* read the Time Stamp Counter: */
rdtscll(this_offset);

2007-09-17 11:47:31

by Ingo Molnar

Subject: Re: Scheduler benchmarks - a follow-up


* Ed Tomlinson <[email protected]> wrote:

> I gather this was with the complete -ck patchset? It would be
> interesting to see if just SD performed as well. If it does, CFS
> needs more work. If not, there are other things in -ck that really do
> improve performance and should be looked into.

also see:

http://lkml.org/lkml/2007/9/17/172

i think at least part of the differences is due to the different
sched_clock() accuracy and performance in v2.6.22-ck versus v2.6.23-cfs.

Ingo

2007-09-17 13:05:38

by Ingo Molnar

Subject: Re: Scheduler benchmarks - a follow-up


* Rob Hussey <[email protected]> wrote:

> http://www.healthcarelinen.com/misc/benchmarks/BOUND_hackbench_benchmark2.png

heh - am i the only one impressed by the consistency of the blue line in
this graph? :-) [ and the green line looks a bit like a .. staircase? ]

i've meanwhile tested hackbench 90 and the performance difference
between -ck and -cfs-devel seems to be mostly down to the more precise
(but slower) sched_clock() introduced in v2.6.23 and to the startup
penalty of freshly created tasks.

Putting back the 2.6.22 version and tweaking the startup penalty gives
this:

[hackbench 90, smaller is better]

sched-devel.git sched-devel.git+lowres-sched-clock+dsp
--------------- --------------------------------------
5.555 5.149
5.641 5.149
5.572 5.171
5.583 5.155
5.532 5.111
5.540 5.138
5.617 5.176
5.542 5.119
5.587 5.159
5.553 5.177
--------------------------------------
avg: 5.572 avg: 5.150 (-8.1%)

('lowres-sched-clock' is the patch i sent in the previous mail. 'dsp' is
a disable-startup-penalty patch that is in the latest sched-devel.git)

i have used your .config to conduct this test.
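
(A repeated run like the table above can be scripted; the harness
actually used isn't shown in the thread, and the awk below assumes
hackbench reports its result as a "Time: <seconds>" line.)

  # Run hackbench 90 ten times and print the average of the reported times.
  for i in $(seq 1 10); do
          ./hackbench 90
  done | awk '/^Time:/ { sum += $2; n++ } END { printf "avg: %.3f\n", sum / n }'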

can you reproduce this with the (very-) latest sched-devel git tree:

git-pull git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel.git

plus with the low-res-sched-clock patch (re-) attached below?

Ingo
---
arch/i386/kernel/tsc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux/arch/i386/kernel/tsc.c
===================================================================
--- linux.orig/arch/i386/kernel/tsc.c
+++ linux/arch/i386/kernel/tsc.c
@@ -110,9 +110,9 @@ unsigned long long native_sched_clock(vo
* very important for it to be as fast as the platform
* can achive it. )
*/
- if (unlikely(!tsc_enabled && !tsc_unstable))
+ if (1 || unlikely(!tsc_enabled && !tsc_unstable))
/* No locking but a rare wrong value is not a big deal: */
- return (jiffies_64 - INITIAL_JIFFIES) * (1000000000 / HZ);
+ return jiffies_64 * (1000000000 / HZ);

/* read the Time Stamp Counter: */
rdtscll(this_offset);

2007-09-17 14:01:44

by Jos Poortvliet

Subject: Re: [ck] Re: Scheduler benchmarks - a follow-up

On 9/17/07, Ingo Molnar <[email protected]> wrote:
>
> * Rob Hussey <[email protected]> wrote:
>
> > http://www.healthcarelinen.com/misc/benchmarks/BOUND_hackbench_benchmark2.png
>
> heh - am i the only one impressed by the consistency of the blue line in
> this graph? :-) [ and the green line looks a bit like a .. staircase? ]

Looks lovely, though as long as lower is better, that staircase does a
nice job ;-)

> i've meanwhile tested hackbench 90 and the performance difference
> between -ck and -cfs-devel seems to be mostly down to the more precise
> (but slower) sched_clock() introduced in v2.6.23 and to the startup
> penalty of freshly created tasks.
>
> Putting back the 2.6.22 version and tweaking the startup penalty gives
> this:
>
> [hackbench 90, smaller is better]
>
> sched-devel.git sched-devel.git+lowres-sched-clock+dsp
> --------------- --------------------------------------
> 5.555 5.149
> 5.641 5.149
> 5.572 5.171
> 5.583 5.155
> 5.532 5.111
> 5.540 5.138
> 5.617 5.176
> 5.542 5.119
> 5.587 5.159
> 5.553 5.177
> --------------------------------------
> avg: 5.572 avg: 5.150 (-8.1%)

Hmmm. So cfs was 0.8% slower than ck in Rob's test, and here it
became 8% faster, so... it should be faster than CK - provided these
results hold across different tests.

But this is all microbenchmarks, which won't have much effect in real
life, right? Besides, will the lowres sched clock patch get in?

> ('lowres-sched-clock' is the patch i sent in the previous mail. 'dsp' is
> a disable-startup-penalty patch that is in the latest sched-devel.git)
>
> i have used your .config to conduct this test.
>
> can you reproduce this with the (very-) latest sched-devel git tree:
>
> git-pull git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel.git
>
> plus with the low-res-sched-clock patch (re-) attached below?
>
> Ingo
> ---
> arch/i386/kernel/tsc.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Index: linux/arch/i386/kernel/tsc.c
> ===================================================================
> --- linux.orig/arch/i386/kernel/tsc.c
> +++ linux/arch/i386/kernel/tsc.c
> @@ -110,9 +110,9 @@ unsigned long long native_sched_clock(vo
> * very important for it to be as fast as the platform
> * can achive it. )
> */
> - if (unlikely(!tsc_enabled && !tsc_unstable))
> + if (1 || unlikely(!tsc_enabled && !tsc_unstable))
> /* No locking but a rare wrong value is not a big deal: */
> - return (jiffies_64 - INITIAL_JIFFIES) * (1000000000 / HZ);
> + return jiffies_64 * (1000000000 / HZ);
>
> /* read the Time Stamp Counter: */
> rdtscll(this_offset);
>

2007-09-17 14:12:48

by Ingo Molnar

Subject: Re: [ck] Re: Scheduler benchmarks - a follow-up


* Jos Poortvliet <[email protected]> wrote:

> On 9/17/07, Ingo Molnar <[email protected]> wrote:
> >
> > * Rob Hussey <[email protected]> wrote:
> >
> > > http://www.healthcarelinen.com/misc/benchmarks/BOUND_hackbench_benchmark2.png
> >
> > heh - am i the only one impressed by the consistency of the blue line in
> > this graph? :-) [ and the green line looks a bit like a .. staircase? ]
>
> Looks lovely, though as long as lower is better, that staircase does a
> nice job ;-)

lower is better, but you have to take the thing below into account:

> > i've meanwhile tested hackbench 90 and the performance difference
> > between -ck and -cfs-devel seems to be mostly down to the more precise
> > (but slower) sched_clock() introduced in v2.6.23 and to the startup
> > penalty of freshly created tasks.
> >
> > Putting back the 2.6.22 version and tweaking the startup penalty gives
> > this:
> >
> > [hackbench 90, smaller is better]
> >
> > sched-devel.git sched-devel.git+lowres-sched-clock+dsp
> > --------------- --------------------------------------
> > 5.555 5.149
> > 5.641 5.149
> > 5.572 5.171
> > 5.583 5.155
> > 5.532 5.111
> > 5.540 5.138
> > 5.617 5.176
> > 5.542 5.119
> > 5.587 5.159
> > 5.553 5.177
> > --------------------------------------
> > avg: 5.572 avg: 5.150 (-8.1%)
>
> Hmmm. So cfs was 0.8% slower than ck in Rob's test, and here it
> became 8% faster, so... it should be faster than CK - provided these
> results hold across different tests.

on my box the TSC overhead has hit CFS quite hard, i'm not sure that's
true on Rob's box. So i'd expect them to be in roughly the same area.

> But this is all microbenchmarks, which won't have much effect in real
> life, right? [...]

yeah, it's much less pronounced in real life - a context-switch rate
above 10,000/sec is already excessive - while for example the lat_ctx
test generates close to a million context switches a second.
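
(The system-wide context-switch rate is easy to watch while such a test
runs; both commands below use standard interfaces, and the one-second
sampling interval is just an example.)

  # The "cs" column of vmstat is context switches per second, updated every second.
  vmstat 1
  # Or sample the raw counter: the "ctxt" line in /proc/stat is cumulative,
  # so the difference across one second is the per-second rate.
  a=$(awk '/^ctxt/ {print $2}' /proc/stat); sleep 1
  b=$(awk '/^ctxt/ {print $2}' /proc/stat); echo $((b - a))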

> [...] Besides, will the lowres sched clock patch get in?

i dont think so - we want precise/accurate scheduling before
performance. (otherwise tasks working off the timer tick could steal
away cycles without being accounted for them fairly, and could starve
out all other tasks.) Unless the difference was really huge in real life
- but it isnt.

Ingo

2007-09-17 19:44:38

by Willy Tarreau

Subject: Re: Scheduler benchmarks - a follow-up


On Mon, Sep 17, 2007 at 09:45:59PM +0200, Oleg Verych wrote:
> The copy list, removed by Ingo is restored. Playing fair game, Willy!

Sorry Oleg, I don't understand why you added me to this thread. And I
don't understand at all what your intent was :-/

I've left others Cc too since they wonder like me.

Regards,
Willy

> Roman, please, find whole thread here:
> http://thread.gmane.org/gmane.linux.kernel/580665
>
> > From: Ingo Molnar <[email protected]>
> > Newsgroups: gmane.linux.kernel,gmane.linux.kernel.ck
> > Subject: Re: Scheduler benchmarks - a follow-up
> > Date: Mon, 17 Sep 2007 13:27:07 +0200
> []
> > In-Reply-To: <[email protected]>
> > User-Agent: Mutt/1.5.14 (2007-02-12)
> > Received-SPF: softfail (mx2: transitioning domain of elte.hu does not designate 157.181.1.14 as permitted sender) client-ip=157.181.1.14; [email protected]; helo=elvis.elte.hu;
> []
> > Archived-At: <http://permalink.gmane.org/gmane.linux.kernel/580689>
>
> >
> > * Rob Hussey <[email protected]> wrote:
> >
> >> Hi all,
> >>
> >> After posting some benchmarks involving cfs
> >> (http://lkml.org/lkml/2007/9/13/385), I got some feedback, so I
> >> decided to do a follow-up that'll hopefully fill in the gaps many
> >> people wanted to see filled.
> >
> > thanks for the update!
> >
> >> I'll start with some selected numbers, which are preceded by the
> >> command used for the benchmark.
> >>
> >> for((i=2; i < 201; i++)); do lat_ctx -s 0 $i; done:
> >> (the left most column is the number of processes ($i))
> >>
> >> 2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel
> >>
> >> 15 5.88 4.85 5.14
> >> 16 5.80 4.77 4.76
> >
> > the unbound results are harder to compare because CFS changed SMP
> > balancing to saturate multiple cores better - but this can result in a
> > micro-benchmark slowdown if the other core is idle (and one of the
> > benchmark tasks runs on one core and the other runs on the first core).
> > This affects lat_ctx and pipe-test. (I'll have a look at the hackbench
> > behavior.)
> >
> >> Bound to Single core:
> >
> > these are the more comparable (apples to apples) tests. Usually the most
> > stable of them is pipe-test:
> >
> >> pipe-test:
> >>
> >> 2.6.21 2.6.22-ck1 2.6.23-rc6-cfs-devel
> >>
> >> 1 9.27 8.50 8.55
> >> 2 9.27 8.47 8.55
> >> 3 9.28 8.47 8.54
> >> 4 9.28 8.48 8.54
> >> 5 9.28 8.48 8.54
> >
> > so -ck1 is 0.8% faster in this particular test. (but still, there can be
> > caching effects in either direction - so i usually run the test on both
> > cores/CPUs to see whether there's any systematic spread in the results.
> > The cache-layout related random spread can be as high as 10% on some
> > systems!)
> >
> > many things happened between 2.6.22-ck1 and 2.6.23-cfs-devel that could
> > affect performance of this test. My initial guess would be sched_clock()
> > overhead. Could you send me your system's 'dmesg' output when running a
> > 2.6.22 (or -ck1) kernel? Chances are that your TSC got marked unstable,
> > this turns on a much less precise but also faster sched_clock()
> > implementation. CFS uses the TSC even if the time-of-day code marked it
> > as unstable - going for the more precise but slightly slower variant.
> >
> > To test this theory, could you apply the patch below to cfs-devel (if
> > you are interested in further testing this) - this changes the cfs-devel
> > version of sched_clock() to have a low-resolution fallback like v2.6.22
> > does. Does this result in any measurable increase in performance?
> >
> > (there's also a new sched-devel.git tree out there - if you update to it
> > you'll need to re-pull it against a pristine Linus git head.)
> >
> > Ingo
> >
> > ---
> > arch/i386/kernel/tsc.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > Index: linux/arch/i386/kernel/tsc.c
> >===================================================================
> > --- linux.orig/arch/i386/kernel/tsc.c
> > +++ linux/arch/i386/kernel/tsc.c
> > @@ -110,9 +110,9 @@ unsigned long long native_sched_clock(vo
> > * very important for it to be as fast as the platform
> > * can achive it. )
> > */
> > - if (unlikely(!tsc_enabled && !tsc_unstable))
> > + if (1 || unlikely(!tsc_enabled && !tsc_unstable))
> > /* No locking but a rare wrong value is not a big deal: */
> > - return (jiffies_64 - INITIAL_JIFFIES) * (1000000000 / HZ);
> > + return jiffies_64 * (1000000000 / HZ);
> >
> > /* read the Time Stamp Counter: */
> > rdtscll(this_offset);

2007-09-17 19:51:21

by Oleg Verych

Subject: Re: Scheduler benchmarks - a follow-up

On Mon, Sep 17, 2007 at 09:43:42PM +0200, Willy Tarreau wrote:
>
> On Mon, Sep 17, 2007 at 09:45:59PM +0200, Oleg Verych wrote:
> > The copy list, removed by Ingo is restored. Playing fair game, Willy!
>
> Sorry Oleg, I don't understand why you added me to this thread. And I
> don't understand at all what your intent was :-/
>
> I've left others Cc too since they wonder like me.

Why? Answer: <http://mid.gmane.org/[email protected]>

IMHO you shouldn't leave public lists in the reply like this. Whatever.
____

2007-09-17 20:03:09

by Ingo Molnar

Subject: Re: Scheduler benchmarks - a follow-up


* Willy Tarreau <[email protected]> wrote:

> On Mon, Sep 17, 2007 at 09:45:59PM +0200, Oleg Verych wrote:
> > The copy list, removed by Ingo is restored. Playing fair game, Willy!
>
> Sorry Oleg, I don't understand why you added me to this thread. And I
> don't understand at all what your intent was :-/
>
> I've left others Cc too since they wonder like me.

i'm wondering too.

in my posted debugging answer, instead of replying privately to Rob, i
have also Cc:-ed lkml and ck-list. (perhaps i shouldnt even have Cc:-ed
the lists, as my posted debugging questions and suggestions were mostly
relevant to Rob and to me - but it's better to keep such things archived
on the lists, if someone wants to read them. In the first mail i also
Cc:-ed Mike and Peter who changed the scheduler recently - but i dropped
them after my first reply when it became apparent that it's not their
changes that are affected. With all lists still Cc:-ed, so that people
can follow it if they really want to.)

I'd expect any new results to be posted by Rob to everyone he wishes to
send them to - like he did it in the past.

Or is Oleg perhaps complaining about the fact that i Cc:-ed this back to
both affected email lists, for everyone to read? Would be a weird
argument.

Ingo

2007-09-17 20:05:48

by Ingo Molnar

Subject: Re: Scheduler benchmarks - a follow-up


* Oleg Verych <[email protected]> wrote:

> On Mon, Sep 17, 2007 at 09:43:42PM +0200, Willy Tarreau wrote:
> >
> > On Mon, Sep 17, 2007 at 09:45:59PM +0200, Oleg Verych wrote:
> > > The copy list, removed by Ingo is restored. Playing fair game, Willy!
> >
> > Sorry Oleg, I don't understand why you added me to this thread. And I
> > don't understand at all what your intent was :-/
> >
> > I've left others Cc too since they wonder like me.
>
> Why? Answer: <http://mid.gmane.org/[email protected]>
>
> IMHO you shouldn't leave public lists in the reply like this. Whatever.

i'm even more confused. Could you please explain your point to me? I
really cannot follow your argument. Thanks,

Ingo

2007-09-17 20:23:21

by Ingo Molnar

Subject: Re: Scheduler benchmarks - a follow-up


* Ed Tomlinson <[email protected]> wrote:

> Rob,
>
> I gather this was with the complete -ck patchset? It would be
> interesting to see if just SD performed as well. If it does, CFS
> needs more work. If not, there are other things in -ck that really do
> improve performance and should be looked into.

yeah. The biggest item in -ck besides SD is swap-prefetch, but that
shouldnt have an effect in this case. I _think_ that most of the
measured difference is due to scheduler details though. Right now my
estimation is that with the patch i sent to Rob, and with latest
sched-devel.git, CFS should perform as good or better than SD, even in
these micro-benchmarks. (but i cannot tell what will happen on Rob's
machine - so i'm keeping an open mind towards any other fixables :-) I'm
curious about the next round of numbers (if Rob has time to do them).

Ingo

2007-09-17 20:36:23

by Ingo Molnar

Subject: Re: Scheduler benchmarks - a follow-up


* Ingo Molnar <[email protected]> wrote:

> i've meanwhile tested hackbench 90 and the performance difference
> between -ck and -cfs-devel seems to be mostly down to the more precise
> (but slower) sched_clock() introduced in v2.6.23 and to the startup
> penalty of freshly created tasks.

Rob, another thing i just noticed in your .configs: you have
CONFIG_PREEMPT=y enabled. Would it be possible to get a testrun with
that disabled? That gives the best throughput and context-switch latency
numbers. (CONFIG_PREEMPT might also have preemption artifacts - there's
one report of it having _worse_ desktop latencies on certain hardware
than !CONFIG_PREEMPT.)
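
(For anyone following along, which preemption model a kernel was built
with can be read straight from its .config, and switching it is a
menuconfig option; the menu path below is approximate for 2.6.2x.)

  # Show the preemption-related options in the current build config.
  grep -E '^CONFIG_PREEMPT' .config
  # Switch to "No Forced Preemption (Server)" under "Processor type and
  # features -> Preemption Model" (menu names approximate), then rebuild
  # and boot the new kernel.
  make menuconfig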

Ingo

2007-09-17 20:42:53

by Willy Tarreau

Subject: Re: Scheduler benchmarks - a follow-up

On Mon, Sep 17, 2007 at 10:06:40PM +0200, Oleg Verych wrote:
> On Mon, Sep 17, 2007 at 09:43:42PM +0200, Willy Tarreau wrote:
> >
> > On Mon, Sep 17, 2007 at 09:45:59PM +0200, Oleg Verych wrote:
> > > The copy list, removed by Ingo is restored. Playing fair game, Willy!
> >
> > Sorry Oleg, I don't understand why you added me to this thread. And I
> > don't understand at all what your intent was :-/
> >
> > I've left others Cc too since they wonder like me.
>
> Why? Answer: <http://mid.gmane.org/[email protected]>
>
> IMHO you shouldn't leave public lists in the reply like this. Whatever.

Nothing prevents Roman and me from exchanging our respective points of
view on LKML when those are about other activities on the list; I don't
get your point.

I find it particularly weird the way you hijacked an existing thread,
adding references to a 4-day-old mail and adding people in Cc, just as
one would add fuel to try to start some sort of flamewar.

Sorry for you, but this has no chance to succeed. First, don't consider
that people are enemies just because they use harsh words to tell each
other what they feel about a specific subject (otherwise, you'd find
hundreds of enemies of everyone here). Second, if there are some aspects
of the mail above that you want explained, you can mail us privately;
you're not necessarily forced to hijack an unrelated discussion.

Now I'm not interested in following up in this thread; it's already
polluted. Please let people here discuss numbers and experiments.

Thanks,
Willy

2007-09-18 01:44:51

by Rob Hussey

Subject: Re: Scheduler benchmarks - a follow-up

On 9/17/07, Ingo Molnar <[email protected]> wrote:
>
> * Rob Hussey <[email protected]> wrote:
>
> > http://www.healthcarelinen.com/misc/benchmarks/BOUND_hackbench_benchmark2.png
>
> heh - am i the only one impressed by the consistency of the blue line in
> this graph? :-) [ and the green line looks a bit like a .. staircase? ]
>
> i've meanwhile tested hackbench 90 and the performance difference
> between -ck and -cfs-devel seems to be mostly down to the more precise
> (but slower) sched_clock() introduced in v2.6.23 and to the startup
> penalty of freshly created tasks.
>
> Putting back the 2.6.22 version and tweaking the startup penalty gives
> this:
>
> [hackbench 90, smaller is better]
>
> sched-devel.git sched-devel.git+lowres-sched-clock+dsp
> --------------- --------------------------------------
> 5.555 5.149
> 5.641 5.149
> 5.572 5.171
> 5.583 5.155
> 5.532 5.111
> 5.540 5.138
> 5.617 5.176
> 5.542 5.119
> 5.587 5.159
> 5.553 5.177
> --------------------------------------
> avg: 5.572 avg: 5.150 (-8.1%)
>
> ('lowres-sched-clock' is the patch i sent in the previous mail. 'dsp' is
> a disable-startup-penalty patch that is in the latest sched-devel.git)
>
> i have used your .config to conduct this test.
>
> can you reproduce this with the (very-) latest sched-devel git tree:
>
> git-pull git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel.git
>
> plus with the low-res-sched-clock patch (re-) attached below?
>
> Ingo
> ---
> arch/i386/kernel/tsc.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Index: linux/arch/i386/kernel/tsc.c
> ===================================================================
> --- linux.orig/arch/i386/kernel/tsc.c
> +++ linux/arch/i386/kernel/tsc.c
> @@ -110,9 +110,9 @@ unsigned long long native_sched_clock(vo
> * very important for it to be as fast as the platform
> * can achive it. )
> */
> - if (unlikely(!tsc_enabled && !tsc_unstable))
> + if (1 || unlikely(!tsc_enabled && !tsc_unstable))
> /* No locking but a rare wrong value is not a big deal: */
> - return (jiffies_64 - INITIAL_JIFFIES) * (1000000000 / HZ);
> + return jiffies_64 * (1000000000 / HZ);
>
> /* read the Time Stamp Counter: */
> rdtscll(this_offset);
> -

Sorry it took so long for me to get back.

Ok, to start the dmesg output for 2.6.22-ck1 is attached. The relevant
lines seem to be:
[ 27.691348] checking TSC synchronization [CPU#0 -> CPU#1]: passed.
[ 27.995427] Time: tsc clocksource has been installed.

I've updated to the latest sched-devel git, and applied the patch
above. I ran it through the same tests, but this time only while bound
to a single core. Some selected numbers:

lat_ctx -s 0 $i (the left most number is $i):

15 3.09
16 3.09
17 3.11
18 3.07
19 2.99
20 3.09
21 3.05
22 3.11
23 3.05
24 3.08
25 3.06

hackbench $i:

80 11.720
81 11.698
82 11.888
83 12.094
84 12.232
85 12.351
86 12.512
87 12.680
88 12.736
89 12.861
90 13.103

pipe-test (the left most number is the run #):

1 8.85
2 8.80
3 8.84
4 8.82
5 8.82
6 8.80
7 8.82
8 8.82
9 8.85
10 8.83

Once again, graphs:
http://www.healthcarelinen.com/misc/benchmarks/BOUND_PATCHED_lat_ctx_benchmark.png
http://www.healthcarelinen.com/misc/benchmarks/BOUND_PATCHED_hackbench_benchmark.png
http://www.healthcarelinen.com/misc/benchmarks/BOUND_PATCHED_pipe-test_benchmark.png

I saw in your other email that you'd like for me to try with
CONFIG_PREEMPT disabled. I should have a chance to try that very soon.

Regards,
Rob


Attachments:
dmesg-2.6.22-ck1.bz2 (10.99 kB)
BOUND_PATCHED_hackbench_benchmark.png (6.35 kB)
BOUND_PATCHED_lat_ctx_benchmark.png (9.21 kB)
BOUND_PATCHED_pipe-test_benchmark.png (3.84 kB)
data_files2.tar.bz2 (1.37 kB)

2007-09-18 04:30:33

by Rob Hussey

Subject: Re: Scheduler benchmarks - a follow-up

On 9/17/07, Ingo Molnar <[email protected]> wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> > i've meanwhile tested hackbench 90 and the performance difference
> > between -ck and -cfs-devel seems to be mostly down to the more precise
> > (but slower) sched_clock() introduced in v2.6.23 and to the startup
> > penalty of freshly created tasks.
>
> Rob, another thing i just noticed in your .configs: you have
> CONFIG_PREEMPT=y enabled. Would it be possible to get a testrun with
> that disabled? That gives the best throughput and context-switch latency
> numbers. (CONFIG_PREEMPT might also have preemption artifacts - there's
> one report of it having _worse_ desktop latencies on certain hardware
> than !CONFIG_PREEMPT.)

I reverted the patch from before since it didn't seem to help. Do you
think it may have to do with my system having Hyper-Threading enabled?
I should have pointed out before that I don't really have a dual-core
system, just a P4 with Hyper-Threading (I loosely used core to refer
to processor).
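
(For reference, /proc/cpuinfo distinguishes the two cases: on a
HyperThreading-only P4 the "siblings" count is 2 while "cpu cores" is 1,
whereas a real dual-core reports "cpu cores : 2"; field names as on
typical x86 kernels of this era.)

  grep -E 'model name|siblings|cpu cores' /proc/cpuinfo | sort -u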

Some new numbers for 2.6.23-rc6-cfs-devel (!CONFIG_PREEMPT and bound
to single processor)

lat_ctx:

15 2.73
16 2.74
17 2.81
18 2.74
19 2.74
20 2.73
21 2.60
22 2.74
23 2.72
24 2.74
25 2.74

hackbench:

80 11.578
81 11.991
82 11.914
83 12.026
84 12.226
85 12.347
86 12.552
87 12.655
88 13.011
89 12.941
90 13.237

pipe-test:

1 9.58
2 9.58
3 9.58
4 9.58
5 9.58
6 9.58
7 9.58
8 9.58
9 9.58
10 9.58

The obligatory graphs:
http://www.healthcarelinen.com/misc/benchmarks/BOUND_NOPREEMPT_lat_ctx_benchmark.png
http://www.healthcarelinen.com/misc/benchmarks/BOUND_NOPREEMPT_hackbench_benchmark.png
http://www.healthcarelinen.com/misc/benchmarks/BOUND_NOPREEMPT_pipe-test_benchmark.png

A cursory glance suggests that performance wrt lat_ctx and hackbench
has increased (lower numbers), but degraded quite a lot for pipe-test.
The numbers for pipe-test are extremely stable though, while the
numbers for hackbench are more erratic (which isn't saying much since
the original numbers gave nearly a straight line). I'm still willing
to try out any more ideas.

Regards,
Rob

2007-09-18 04:53:44

by Willy Tarreau

Subject: Re: Scheduler benchmarks - a follow-up

Hi Rob,

On Tue, Sep 18, 2007 at 12:30:05AM -0400, Rob Hussey wrote:
> I should have pointed out before that I don't really have a dual-core
> system, just a P4 with Hyper-Threading (I loosely used core to refer
> to processor).

Just for reference, we call them "siblings", not "cores" on HT. I believe
that a line "Sibling:" appears in /proc/cpuinfo BTW.

Regards,
Willy

2007-09-18 04:58:18

by Rob Hussey

Subject: Re: Scheduler benchmarks - a follow-up

On 9/18/07, Willy Tarreau <[email protected]> wrote:
> Hi Rob,
>
> On Tue, Sep 18, 2007 at 12:30:05AM -0400, Rob Hussey wrote:
> > I should have pointed out before that I don't really have a dual-core
> > system, just a P4 with Hyper-Threading (I loosely used core to refer
> > to processor).
>
> Just for reference, we call them "siblings", not "cores" on HT. I believe
> that a line "Sibling:" appears in /proc/cpuinfo BTW.

Thanks, I was searching for the right word but couldn't come up with it.

2007-09-18 06:41:04

by Ingo Molnar

Subject: Re: Scheduler benchmarks - a follow-up


* Rob Hussey <[email protected]> wrote:

> A cursory glance suggests that performance wrt lat_ctx and hackbench
> has increased (lower numbers), but degraded quite a lot for pipe-test.
> The numbers for pipe-test are extremely stable though, while the
> numbers for hackbench are more erratic (which isn't saying much since
> the original numbers gave nearly a straight line). I'm still willing
> to try out any more ideas.

pipe-test is a very stable workload, and is thus quite sensitive to the
associativity of the CPU cache. Even killing the task and repeating the
same test isnt enough to get rid of the systematic skew that this can
cause. I've seen divergence of up to 10% in pipe-test. One way to test
it is to run pipe-test, then to stop it, then to "ssh localhost" (this
in itself uses up a couple of pipe objects and file objects and changes
the cache layout picture), then run pipe-test again, then again "ssh
localhost", etc. Via this trick one can often see cache-layout
artifacts. How much 'skew' does pipe-test have on your system if you try
this manually?
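
(Spelled out as a loop, the procedure would look something like the
sketch below; the pipe-test invocation and the run count are
illustrative.)

  # Alternate bound pipe-test runs with "ssh localhost" so each run starts
  # from a slightly different pipe/file object layout in the caches.
  for run in 1 2 3 4 5; do
          taskset -c 0 ./pipe-test
          ssh localhost true
  done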

Ingo

2007-09-18 08:23:17

by Rob Hussey

Subject: Re: Scheduler benchmarks - a follow-up

On 9/18/07, Ingo Molnar <[email protected]> wrote:
>
> * Rob Hussey <[email protected]> wrote:
>
> > A cursory glance suggests that performance wrt lat_ctx and hackbench
> > has increased (lower numbers), but degraded quite a lot for pipe-test.
> > The numbers for pipe-test are extremely stable though, while the
> > numbers for hackbench are more erratic (which isn't saying much since
> > the original numbers gave nearly a straight line). I'm still willing
> > to try out any more ideas.
>
> pipe-test is a very stable workload, and is thus quite sensitive to the
> associativity of the CPU cache. Even killing the task and repeating the
> same test isnt enough to get rid of the systematic skew that this can
> cause. I've seen divergence of up to 10% in pipe-test. One way to test
> it is to run pipe-test, then to stop it, then to "ssh localhost" (this
> in itself uses up a couple of pipe objects and file objects and changes
> the cache layout picture), then run pipe-test again, then again "ssh
> localhost", etc. Via this trick one can often see cache-layout
> artifacts. How much 'skew' does pipe-test have on your system if you try
> this manually?
>

I did 7 data sets of 5 runs each using this method. With pipe-test
bound to one sibling, there were 10 unique values in these 7 sets. The
lowest value was 9.22, the highest value was 9.62, and the median of
the unique values was 9.47. So the deviation from the median for the lowest
and highest values was {-0.25, 0.15}. The numbers were even tighter for
pipe-test not bound to a single sibling: {-0.07, 0.12}.

2007-09-18 08:48:42

by Ingo Molnar

Subject: Re: Scheduler benchmarks - a follow-up


* Rob Hussey <[email protected]> wrote:

> The obligatory graphs:
> http://www.healthcarelinen.com/misc/benchmarks/BOUND_NOPREEMPT_lat_ctx_benchmark.png
> http://www.healthcarelinen.com/misc/benchmarks/BOUND_NOPREEMPT_hackbench_benchmark.png
> http://www.healthcarelinen.com/misc/benchmarks/BOUND_NOPREEMPT_pipe-test_benchmark.png

btw., it's likely that if you turn off CONFIG_PREEMPT for .21 and for
.22-ck1 they'll improve a bit too - so it's not fair to put the .23
!PREEMPT numbers on the graph as the PREEMPT numbers of the other
kernels. (it shows the .23 scheduler being faster than it really is)

> A cursory glance suggests that performance wrt lat_ctx and hackbench
> has increased (lower numbers), but degraded quite a lot for pipe-test.
> The numbers for pipe-test are extremely stable though, while the
> numbers for hackbench are more erratic (which isn't saying much since
> the original numbers gave nearly a straight line). I'm still willing
> to try out any more ideas.

the pipe-test behavior looks like an outlier. !PREEMPT only removes code
(which makes the code faster), so this could be a cache layout artifact.
(or perhaps we preempt at a different point which is disadvantageous to
caching?) Pipe-test is equivalent to "lat_ctx -s 0 2" so if there was a
genuine slowdown it would show up in the lat_ctx graph - but the graph
shows a speedup.

Ingo

2007-09-18 09:45:27

by Rob Hussey

Subject: Re: Scheduler benchmarks - a follow-up

On 9/18/07, Ingo Molnar <[email protected]> wrote:
>
> * Rob Hussey <[email protected]> wrote:
>
> > The obligatory graphs:
> > http://www.healthcarelinen.com/misc/benchmarks/BOUND_NOPREEMPT_lat_ctx_benchmark.png
> > http://www.healthcarelinen.com/misc/benchmarks/BOUND_NOPREEMPT_hackbench_benchmark.png
> > http://www.healthcarelinen.com/misc/benchmarks/BOUND_NOPREEMPT_pipe-test_benchmark.png
>
> btw., it's likely that if you turn off CONFIG_PREEMPT for .21 and for
> .22-ck1 they'll improve a bit too - so it's not fair to put the .23
> !PREEMPT numbers on the graph as the PREEMPT numbers of the other
> kernels. (it shows the .23 scheduler being faster than it really is)
>

The graphs are really just to show where the new numbers fit in. Plus,
I was too lazy to run all the numbers again.

> > A cursory glance suggests that performance wrt lat_ctx and hackbench
> > has increased (lower numbers), but degraded quite a lot for pipe-test.
> > The numbers for pipe-test are extremely stable though, while the
> > numbers for hackbench are more erratic (which isn't saying much since
> > the original numbers gave nearly a straight line). I'm still willing
> > to try out any more ideas.
>
> the pipe-test behavior looks like an outlier. !PREEMPT only removes code
> (which makes the code faster), so this could be a cache layout artifact.
> (or perhaps we preempt at a different point which is disadvantageous to
> caching?) Pipe-test is equivalent to "lat_ctx -s 0 2" so if there was a
> genuine slowdown it would show up in the lat_ctx graph - but the graph
> shows a speedup.
>

Interestingly, every set of lat_ctx -s 0 2 numbers I run on the
!PREEMPT kernel is on average higher than with PREEMPT (around 2.84
for !PREEMPT and 2.4 for PREEMPT). Anything higher than around 2 or 3
(such as lat_ctx -s 0 8) gives lower average numbers for !PREEMPT.

Regards,
Rob

2007-09-18 09:49:07

by Ingo Molnar

Subject: Re: Scheduler benchmarks - a follow-up


* Rob Hussey <[email protected]> wrote:

> On 9/18/07, Ingo Molnar <[email protected]> wrote:
> >
> > * Rob Hussey <[email protected]> wrote:
> >
> > > The obligatory graphs:
> > > http://www.healthcarelinen.com/misc/benchmarks/BOUND_NOPREEMPT_lat_ctx_benchmark.png
> > > http://www.healthcarelinen.com/misc/benchmarks/BOUND_NOPREEMPT_hackbench_benchmark.png
> > > http://www.healthcarelinen.com/misc/benchmarks/BOUND_NOPREEMPT_pipe-test_benchmark.png
> >
> > btw., it's likely that if you turn off CONFIG_PREEMPT for .21 and for
> > .22-ck1 they'll improve a bit too - so it's not fair to put the .23
> > !PREEMPT numbers on the graph as the PREEMPT numbers of the other
> > kernels. (it shows the .23 scheduler being faster than it really is)
> >
>
> The graphs are really just to show where the new numbers fit in. Plus,
> I was too lazy to run all the numbers again.

yeah - the graphs are completely OK (and they are really nice and
useful), i just wanted to point this out for completeness.

> > the pipe-test behavior looks like an outlier. !PREEMPT only removes
> > code (which makes the code faster), so this could be a cache layout
> > artifact. (or perhaps we preempt at a different point which is
> > disadvantageous to caching?) Pipe-test is equivalent to "lat_ctx -s
> > 0 2" so if there was a genuine slowdown it would show up in the
> > lat_ctx graph - but the graph shows a speedup.
>
> Interestingly, every set of lat_ctx -s 0 2 numbers I run on the
> !PREEMPT kernel is on average higher than with PREEMPT (around 2.84
> for !PREEMPT and 2.4 for PREEMPT). Anything higher than around 2 or 3
> (such as lat_ctx -s 0 8) gives lower average numbers for !PREEMPT.

perhaps this 2 task ping-pong is somehow special in that it manages to
fit into L1 cache much better under PREEMPT than under !PREEMPT.
(usually the opposite is true) At 3 tasks or more things dont fit
anymore (or the special alignment is gone) so the faster !PREEMPT code
wins.

Ingo