2008-06-27 12:03:23

by Peter Zijlstra

Subject: [PATCH 00/30] SMP-group balancer - take 3

Hi,

Another go at SMP fairness for group scheduling.

This code needs some serious testing...

However, on my system performance doesn't tank as much as it used to.
I've run sysbench and volanomark benchmarks.

The machine is a Quad core (Intel Q9450) with 4GB of RAM.
Fedora9 - x86_64

sysbench-0.4.8 + postgresql-8.3.3
volanomark-2.5.0.9 + openjdk-1.6.0

I've used cgroup group scheduling.

cgroup:/ - means all tasks are in the root group
cgroup:/foo - means all tasks are in a subgroup

mkdir /cgroup/foo
for i in `cat /cgroup/tasks`; do
	echo $i > /cgroup/foo/tasks
done
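
For anyone scripting the same setup, here is a minimal Python sketch of the loop above. It assumes a cgroup hierarchy mounted at /cgroup as in this mail; the function name move_all_tasks and the (moved, failed) return value are made up for illustration. It writes one PID per write, as the shell loop does, and records PIDs whose write fails (for example because the task exited in the meantime) instead of aborting.

```python
import os
import tempfile


def move_all_tasks(root, group):
    """Move every task listed in <root>/tasks into <root>/<group>/tasks.

    Returns (moved, failed): two lists of PID strings.
    """
    target_dir = os.path.join(root, group)
    os.makedirs(target_dir, exist_ok=True)

    with open(os.path.join(root, "tasks")) as f:
        pids = f.read().split()

    moved, failed = [], []
    for pid in pids:
        try:
            # One PID per write, mirroring `echo $i > .../tasks`.
            with open(os.path.join(target_dir, "tasks"), "w") as f:
                f.write(pid + "\n")
            moved.append(pid)
        except OSError:
            # For example, the task exited after the list was read.
            failed.append(pid)
    return moved, failed


if __name__ == "__main__":
    # Dry run against a scratch directory instead of a real /cgroup mount.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "tasks"), "w") as f:
            f.write("1\n2\n3\n")
        print(move_all_tasks(root, "foo"))
```

On a real system the call would be move_all_tasks("/cgroup", "foo"), run as root.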

The patches are against: tip/auto-sched-next of a few days ago.

---

.25

[root@twins sysbench-0.4.8]# ./doit-psql-256-60sec
1: transactions: 50514 (841.90 per sec.)
2: transactions: 98745 (1645.73 per sec.)
4: transactions: 192682 (3211.31 per sec.)
8: transactions: 192082 (3201.26 per sec.)
16: transactions: 188891 (3147.95 per sec.)
32: transactions: 182364 (3039.12 per sec.)
64: transactions: 169412 (2822.94 per sec.)
128: transactions: 139505 (2323.95 per sec.)
256: transactions: 131516 (2188.98 per sec.)

[root@twins vmark]# LOOP_CLIENT_COUNT=1000 ./loopclient.sh 2>&1 | grep Average
Average throughput = 113350 messages per second
Average throughput = 112230 messages per second
Average throughput = 113125 messages per second


.26-rc

cgroup:/

[root@twins sysbench-0.4.8]# ./doit-psql-256-60sec
1: transactions: 50553 (842.54 per sec.)
2: transactions: 98625 (1643.74 per sec.)
4: transactions: 191351 (3189.12 per sec.)
8: transactions: 193525 (3225.32 per sec.)
16: transactions: 190516 (3175.10 per sec.)
32: transactions: 186914 (3114.96 per sec.)
64: transactions: 178940 (2981.78 per sec.)
128: transactions: 156430 (2606.00 per sec.)
256: transactions: 134929 (2246.63 per sec.)

[root@twins vmark]# LOOP_CLIENT_COUNT=1000 ./loopclient.sh 2>&1 | grep Average
Average throughput = 124089 messages per second
Average throughput = 121962 messages per second
Average throughput = 121223 messages per second


cgroup:/foo

[root@twins sysbench-0.4.8]# ./doit-psql-256-60sec
1: transactions: 50246 (837.43 per sec.)
2: transactions: 97466 (1624.41 per sec.)
4: transactions: 179609 (2993.43 per sec.)
8: transactions: 190931 (3182.07 per sec.)
16: transactions: 189882 (3164.50 per sec.)
32: transactions: 184649 (3077.14 per sec.)
64: transactions: 178200 (2969.46 per sec.)
128: transactions: 158835 (2646.14 per sec.)
256: transactions: 142100 (2366.51 per sec.)

[root@twins vmark]# LOOP_CLIENT_COUNT=1000 ./loopclient.sh 2>&1 | grep Average
Average throughput = 117789 messages per second
Average throughput = 118154 messages per second
Average throughput = 118945 messages per second


.26-rc-smp-group

cgroup:/

[root@twins sysbench-0.4.8]# ./doit-psql-256-60sec
1: transactions: 50137 (835.61 per sec.)
2: transactions: 97406 (1623.41 per sec.)
4: transactions: 170755 (2845.88 per sec.)
8: transactions: 187406 (3123.35 per sec.)
16: transactions: 186865 (3114.18 per sec.)
32: transactions: 183559 (3059.03 per sec.)
64: transactions: 176834 (2946.70 per sec.)
128: transactions: 158882 (2647.04 per sec.)
256: transactions: 145081 (2415.81 per sec.)

[root@twins vmark]# LOOP_CLIENT_COUNT=1000 ./loopclient.sh 2>&1 | grep Average
Average throughput = 121499 messages per second
Average throughput = 120181 messages per second
Average throughput = 119775 messages per second


cgroup:/foo

[root@twins sysbench-0.4.8]# ./doit-psql-256-60sec
1: transactions: 49564 (826.06 per sec.)
2: transactions: 96642 (1610.67 per sec.)
4: transactions: 183081 (3051.29 per sec.)
8: transactions: 187553 (3125.79 per sec.)
16: transactions: 185435 (3090.45 per sec.)
32: transactions: 182314 (3038.25 per sec.)
64: transactions: 174527 (2908.22 per sec.)
128: transactions: 159321 (2654.24 per sec.)
256: transactions: 140167 (2333.82 per sec.)

[root@twins vmark]# LOOP_CLIENT_COUNT=1000 ./loopclient.sh 2>&1 | grep Average
Average throughput = 130208 messages per second
Average throughput = 129086 messages per second
Average throughput = 129362 messages per second
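
To make the volanomark runs easier to compare, the following Python sketch averages the three "Average throughput" values per configuration and prints the change of the smp-group kernel relative to plain .26-rc. The numbers are copied from the runs above; the dictionary keys and the helper relative_change are only illustrative names. With three runs per configuration this is a rough summary, not a statistical test.

```python
from statistics import mean

# Volanomark "Average throughput" values (messages per second),
# copied from the runs above.
VMARK = {
    ".25": [113350, 112230, 113125],
    ".26-rc cgroup:/": [124089, 121962, 121223],
    ".26-rc cgroup:/foo": [117789, 118154, 118945],
    ".26-rc-smp-group cgroup:/": [121499, 120181, 119775],
    ".26-rc-smp-group cgroup:/foo": [130208, 129086, 129362],
}


def relative_change(new, old):
    """Percentage change of mean(new) relative to mean(old)."""
    return (mean(new) / mean(old) - 1.0) * 100.0


if __name__ == "__main__":
    for name, runs in VMARK.items():
        print(f"{name:30s} mean = {mean(runs):10.1f} msg/s")
    for group in ("cgroup:/", "cgroup:/foo"):
        delta = relative_change(
            VMARK[f".26-rc-smp-group {group}"], VMARK[f".26-rc {group}"]
        )
        print(f"smp-group vs .26-rc, {group}: {delta:+.1f}%")
```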


--


2008-06-27 12:46:48

by Ingo Molnar

Subject: Re: [PATCH 00/30] SMP-group balancer - take 3


* Peter Zijlstra <[email protected]> wrote:

> Hi,
>
> Another go at SMP fairness for group scheduling.
>
> This code needs some serious testing...
>
> However, on my system performance doesn't tank as much as it used to.
> I've run sysbench and volanomark benchmarks.
>
> The machine is a Quad core (Intel Q9450) with 4GB of RAM.
> Fedora9 - x86_64
>
> sysbench-0.4.8 + postgresql-8.3.3
> volanomark-2.5.0.9 + openjdk-1.6.0
>
> I've used cgroup group scheduling.

cool. I have applied your patches to a new temporary topic,
tip/sched/devel.smp-group-balance. If that works out fine in testing
then we can merge it back into sched/devel.

Thanks Peter,

Ingo

2008-06-27 17:34:44

by Dhaval Giani

Subject: Re: [PATCH 00/30] SMP-group balancer - take 3

On Fri, Jun 27, 2008 at 01:41:09PM +0200, Peter Zijlstra wrote:
> Hi,
>
> Another go at SMP fairness for group scheduling.
>
> This code needs some serious testing...
>
> However, on my system performance doesn't tank as much as it used to.
> I've run sysbench and volanomark benchmarks.
>
> The machine is a Quad core (Intel Q9450) with 4GB of RAM.
> Fedora9 - x86_64
>
> sysbench-0.4.8 + postgresql-8.3.3
> volanomark-2.5.0.9 + openjdk-1.6.0
>
> I've used cgroup group scheduling.
>
> cgroup:/ - means all tasks are in the root group
> cgroup:/foo - means all tasks are in a subgroup
>
> mkdir /cgroup/foo
> for i in `cat /cgroup/tasks`; do
> echo $i > /cgroup/foo/tasks
> done
>
> The patches are against: tip/auto-sched-next of a few days ago.
>
> ---
>
> .25
>
> [root@twins sysbench-0.4.8]# ./doit-psql-256-60sec
> 1: transactions: 50514 (841.90 per sec.)
> 2: transactions: 98745 (1645.73 per sec.)
> 4: transactions: 192682 (3211.31 per sec.)
> 8: transactions: 192082 (3201.26 per sec.)
> 16: transactions: 188891 (3147.95 per sec.)
> 32: transactions: 182364 (3039.12 per sec.)
> 64: transactions: 169412 (2822.94 per sec.)
> 128: transactions: 139505 (2323.95 per sec.)
> 256: transactions: 131516 (2188.98 per sec.)
>
> [root@twins vmark]# LOOP_CLIENT_COUNT=1000 ./loopclient.sh 2>&1 | grep Average
> Average throughput = 113350 messages per second
> Average throughput = 112230 messages per second
> Average throughput = 113125 messages per second
>
>
> .26-rc
>
> cgroup:/
>
> [root@twins sysbench-0.4.8]# ./doit-psql-256-60sec
> 1: transactions: 50553 (842.54 per sec.)
> 2: transactions: 98625 (1643.74 per sec.)
> 4: transactions: 191351 (3189.12 per sec.)
> 8: transactions: 193525 (3225.32 per sec.)
> 16: transactions: 190516 (3175.10 per sec.)
> 32: transactions: 186914 (3114.96 per sec.)
> 64: transactions: 178940 (2981.78 per sec.)
> 128: transactions: 156430 (2606.00 per sec.)
> 256: transactions: 134929 (2246.63 per sec.)
>
> [root@twins vmark]# LOOP_CLIENT_COUNT=1000 ./loopclient.sh 2>&1 | grep Average
> Average throughput = 124089 messages per second
> Average throughput = 121962 messages per second
> Average throughput = 121223 messages per second
>
>
> cgroup:/foo
>
> [root@twins sysbench-0.4.8]# ./doit-psql-256-60sec
> 1: transactions: 50246 (837.43 per sec.)
> 2: transactions: 97466 (1624.41 per sec.)
> 4: transactions: 179609 (2993.43 per sec.)
> 8: transactions: 190931 (3182.07 per sec.)
> 16: transactions: 189882 (3164.50 per sec.)
> 32: transactions: 184649 (3077.14 per sec.)
> 64: transactions: 178200 (2969.46 per sec.)
> 128: transactions: 158835 (2646.14 per sec.)
> 256: transactions: 142100 (2366.51 per sec.)
>
> [root@twins vmark]# LOOP_CLIENT_COUNT=1000 ./loopclient.sh 2>&1 | grep Average
> Average throughput = 117789 messages per second
> Average throughput = 118154 messages per second
> Average throughput = 118945 messages per second
>
>
> .26-rc-smp-group
>
> cgroup:/
>
> [root@twins sysbench-0.4.8]# ./doit-psql-256-60sec
> 1: transactions: 50137 (835.61 per sec.)
> 2: transactions: 97406 (1623.41 per sec.)
> 4: transactions: 170755 (2845.88 per sec.)
> 8: transactions: 187406 (3123.35 per sec.)
> 16: transactions: 186865 (3114.18 per sec.)
> 32: transactions: 183559 (3059.03 per sec.)
> 64: transactions: 176834 (2946.70 per sec.)
> 128: transactions: 158882 (2647.04 per sec.)
> 256: transactions: 145081 (2415.81 per sec.)
>
> [root@twins vmark]# LOOP_CLIENT_COUNT=1000 ./loopclient.sh 2>&1 | grep Average
> Average throughput = 121499 messages per second
> Average throughput = 120181 messages per second
> Average throughput = 119775 messages per second
>
>
> cgroup:/foo
>
> [root@twins sysbench-0.4.8]# ./doit-psql-256-60sec
> 1: transactions: 49564 (826.06 per sec.)
> 2: transactions: 96642 (1610.67 per sec.)
> 4: transactions: 183081 (3051.29 per sec.)
> 8: transactions: 187553 (3125.79 per sec.)
> 16: transactions: 185435 (3090.45 per sec.)
> 32: transactions: 182314 (3038.25 per sec.)
> 64: transactions: 174527 (2908.22 per sec.)
> 128: transactions: 159321 (2654.24 per sec.)
> 256: transactions: 140167 (2333.82 per sec.)
>
> [root@twins vmark]# LOOP_CLIENT_COUNT=1000 ./loopclient.sh 2>&1 | grep Average
> Average throughput = 130208 messages per second
> Average throughput = 129086 messages per second
> Average throughput = 129362 messages per second

Some fairness numbers from tip/master

kernel compiles with even number of threads
/cgroup/a
[dhaval@mordor a]$ time make -j8
real 1m53.033s
user 1m28.785s
sys 0m22.224s

/cgroup/b
[dhaval@mordor b]$ time make -j16
real 1m51.826s
user 1m29.022s
sys 0m21.911s

kernel compile with odd number of threads
/cgroup/a
[dhaval@mordor a]$ time make -j7
real 1m49.441s
user 1m26.962s
sys 0m21.698s

/cgroup/b
[dhaval@mordor b]$ time make -j13
real 1m50.418s
user 1m26.888s
sys 0m21.508s
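
A small Python sketch for comparing the wall-clock times of the two groups. It assumes the builds in /cgroup/a and /cgroup/b ran concurrently, which the numbers suggest but the mail does not state; parse_time and gap_percent are illustrative helper names. If the groups get equal CPU shares regardless of the -j value, the two "real" times should be close.

```python
import re


def parse_time(text):
    """Convert a shell `time` value such as '1m53.033s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if m is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def gap_percent(a, b):
    """Difference between two times, as a percentage of the longer one."""
    ta, tb = parse_time(a), parse_time(b)
    return abs(ta - tb) / max(ta, tb) * 100.0


# "real" times copied from the runs above.
REAL = {
    "even (-j8 vs -j16)": ("1m53.033s", "1m51.826s"),
    "odd (-j7 vs -j13)": ("1m49.441s", "1m50.418s"),
}

if __name__ == "__main__":
    for name, (a, b) in REAL.items():
        print(f"{name}: a={parse_time(a):.3f}s b={parse_time(b):.3f}s "
              f"gap={gap_percent(a, b):.2f}%")
```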

Running infinite loops in parallel (5 in one group, 2 in another)

8789 - 8793 belong to /cgroup/a
8794, 8795 belong to /cgroup/b

When we start:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 8795 dhaval    20   0  1720  264  212 R 54.6  0.0   0:06.31 test
 8794 dhaval    20   0  1720  264  212 R 45.6  0.0   0:06.91 test
 8790 dhaval    20   0  1720  264  212 R 23.0  0.0   0:07.29 test
 8789 dhaval    20   0  1720  260  212 R 22.6  0.0   0:07.80 test
 8791 dhaval    20   0  1720  264  212 R 18.3  0.0   0:07.28 test
 8792 dhaval    20   0  1720  260  212 R 18.3  0.0   0:07.01 test
 8793 dhaval    20   0  1720  260  212 R 18.0  0.0   0:06.93 test

After some time:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 8794 dhaval    20   0  1720  264  212 R 49.9  0.0   0:46.98 test
 8795 dhaval    20   0  1720  264  212 R 49.9  0.0   0:52.61 test
 8793 dhaval    20   0  1720  260  212 R 20.3  0.0   0:24.96 test
 8789 dhaval    20   0  1720  260  212 R 20.0  0.0   0:24.83 test
 8790 dhaval    20   0  1720  264  212 R 20.0  0.0   0:24.32 test
 8791 dhaval    20   0  1720  264  212 R 20.0  0.0   0:23.29 test
 8792 dhaval    20   0  1720  260  212 R 20.0  0.0   0:25.04 test

But these numbers are not very stable. Also it takes a long time (~1min)
to converge here.

The results look really good though.
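
The two top snapshots above can be reduced to a per-group share and a within-group spread with a short Python sketch. The %CPU values and the PID-to-group mapping are copied from this mail; summarize is an illustrative helper name. In both snapshots each group ends up with roughly half of the total CPU; what changes over time is how evenly the share is split inside each group.

```python
# PID-to-group mapping and %CPU values copied from the top output above.
GROUP_A = {8789, 8790, 8791, 8792, 8793}
GROUP_B = {8794, 8795}

START = {8795: 54.6, 8794: 45.6, 8790: 23.0, 8789: 22.6,
         8791: 18.3, 8792: 18.3, 8793: 18.0}
LATER = {8794: 49.9, 8795: 49.9, 8793: 20.3, 8789: 20.0,
         8790: 20.0, 8791: 20.0, 8792: 20.0}


def summarize(sample):
    """Return each group's share of total %CPU and its max-min spread."""
    total = sum(sample.values())
    result = {}
    for name, pids in (("a", GROUP_A), ("b", GROUP_B)):
        cpu = [sample[pid] for pid in pids]
        result[name] = {
            "share": sum(cpu) / total,
            "spread": max(cpu) - min(cpu),
        }
    return result


if __name__ == "__main__":
    for label, sample in (("start", START), ("later", LATER)):
        for name, stats in summarize(sample).items():
            print(f"{label:5s} group {name}: share={stats['share']:.3f} "
                  f"spread={stats['spread']:.1f}")
```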

--
regards,
Dhaval

2008-06-28 17:08:56

by Dhaval Giani

Subject: Re: [PATCH 00/30] SMP-group balancer - take 3

Hi,

I get this at bootup

------------[ cut here ]------------
WARNING: at kernel/lockdep.c:2738 check_flags+0x8a/0x12d()
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.26-rc8-tip #5
[<c0226971>] warn_on_slowpath+0x41/0x7b
[<c024207e>] ? trace_hardirqs_off+0xb/0xd
[<c0207fef>] ? native_sched_clock+0x8b/0x9d
[<c022d490>] ? __sysctl_head_next+0x98/0x9f
[<c057b286>] ? _spin_unlock+0x1d/0x20
[<c022d490>] ? __sysctl_head_next+0x98/0x9f
[<c0244a54>] ? __lock_acquire+0xd96/0xda5
[<c024189b>] check_flags+0x8a/0x12d
[<c0244a9e>] lock_acquire+0x3b/0x89
[<c021cb50>] ? tg_shares_up+0x0/0x170
[<c021b074>] walk_tg_tree+0x2c/0x9f
[<c021b048>] ? walk_tg_tree+0x0/0x9f
[<c02190f7>] ? tg_nop+0x0/0x5
[<c0220d24>] update_shares+0x54/0x5d
[<c0220d86>] try_to_wake_up+0x59/0x22b
[<c0220f80>] wake_up_process+0xf/0x11
[<c0237c94>] kthread_create+0x68/0x98
[<c0234d75>] ? worker_thread+0x0/0xc2
[<c0235207>] __create_workqueue_key+0x19e/0x1ee
[<c0234d75>] ? worker_thread+0x0/0xc2
[<c076dd9b>] init_workqueues+0x4c/0x5d
[<c075b36e>] kernel_init+0xcf/0x255
[<c0329858>] ? trace_hardirqs_on_thunk+0xc/0x10
[<c024379e>] ? trace_hardirqs_on_caller+0x10b/0x136
[<c0329858>] ? trace_hardirqs_on_thunk+0xc/0x10
[<c0203a92>] ? restore_nocheck_notrace+0x0/0xe
[<c075b29f>] ? kernel_init+0x0/0x255
[<c075b29f>] ? kernel_init+0x0/0x255
[<c0204623>] kernel_thread_helper+0x7/0x10
=======================
---[ end trace 4eaa2a86a8e2da22 ]---
possible reason: unannotated irqs-on.
irq event stamp: 1892
hardirqs last enabled at (1891): [<c02437d4>] trace_hardirqs_on+0xb/0xd
hardirqs last disabled at (1892): [<c024207e>] trace_hardirqs_off+0xb/0xd
softirqs last enabled at (1548): [<c022b6eb>] __do_softirq+0x13e/0x146
softirqs last disabled at (1541): [<c022b72d>] do_softirq+0x3a/0x52
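
To read the call path out of a trace like the one above, it helps to drop the frames marked with "?", which the kernel's stack dumper could not confirm as part of the call chain. A minimal Python sketch follows; the regular expression and the name reliable_frames are illustrative, and SAMPLE holds only a few lines copied from the trace above.

```python
import re

# Matches lines such as "[<c0220d24>] update_shares+0x54/0x5d"; the optional
# "? " marks a frame the stack dumper could not confirm.
FRAME = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(trace):
    """Return symbol names of frames not marked with '?', innermost first."""
    frames = []
    for line in trace.splitlines():
        m = FRAME.match(line.strip())
        if m and not m.group("guess"):
            frames.append(m.group("sym"))
    return frames


# A few lines copied from the trace above.
SAMPLE = """\
[<c0226971>] warn_on_slowpath+0x41/0x7b
[<c024207e>] ? trace_hardirqs_off+0xb/0xd
[<c024189b>] check_flags+0x8a/0x12d
[<c0244a9e>] lock_acquire+0x3b/0x89
[<c021cb50>] ? tg_shares_up+0x0/0x170
[<c021b074>] walk_tg_tree+0x2c/0x9f
[<c0220d24>] update_shares+0x54/0x5d
[<c0220d86>] try_to_wake_up+0x59/0x22b
"""

if __name__ == "__main__":
    print(" <- ".join(reliable_frames(SAMPLE)))
```

Applied to the excerpt, this leaves the chain from try_to_wake_up through update_shares and walk_tg_tree to lock_acquire, where check_flags raises the warning.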

--
regards,
Dhaval

2008-06-30 13:00:30

by Ingo Molnar

Subject: Re: [PATCH 00/30] SMP-group balancer - take 3


* Dhaval Giani <[email protected]> wrote:

> Hi,
>
> I get this at bootup
>
> ------------[ cut here ]------------
> WARNING: at kernel/lockdep.c:2738 check_flags+0x8a/0x12d()
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.26-rc8-tip #5

please check latest tip/master. This is the commit that should fix it:

----------------
| commit 2d452c9b10caeec455eb5e56a0ef4ed485178213
| Author: Ingo Molnar <[email protected]>
| Date: Sun Jun 29 15:01:59 2008 +0200
|
| sched: sched_clock_cpu() based cpu_clock(), lockdep fix
|
| Vegard Nossum reported:
|
| > WARNING: at kernel/lockdep.c:2738 check_flags+0x142/0x160()
----------------

Ingo

2008-06-30 14:54:52

by Dhaval Giani

Subject: Re: [PATCH 00/30] SMP-group balancer - take 3

On Mon, Jun 30, 2008 at 02:59:56PM +0200, Ingo Molnar wrote:
>
> * Dhaval Giani <[email protected]> wrote:
>
> > Hi,
> >
> > I get this at bootup
> >
> > ------------[ cut here ]------------
> > WARNING: at kernel/lockdep.c:2738 check_flags+0x8a/0x12d()
> > Modules linked in:
> > Pid: 1, comm: swapper Not tainted 2.6.26-rc8-tip #5
>
> please check latest tip/master. This is the commit that should fix it:
>

Nope, does not :(. Still get,

------------[ cut here ]------------
WARNING: at kernel/lockdep.c:2662 check_flags+0x7c/0x10b()
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.26-rc8 #2
[<c0122a6d>] warn_on_slowpath+0x41/0x5d
[<c013ba0e>] ? find_usage_backwards+0xb4/0xd5
[<c013ba0e>] ? find_usage_backwards+0xb4/0xd5
[<c013ba0e>] ? find_usage_backwards+0xb4/0xd5
[<c013bc2b>] ? check_usage+0x23/0x58
[<c013bcd1>] ? check_prev_add_irq+0x71/0x85
[<c013be48>] ? check_prev_add+0x3b/0x17f
[<c013bfe6>] ? check_prevs_add+0x5a/0xb2
[<c013c0e8>] ? validate_chain+0xaa/0x29c
[<c013def5>] check_flags+0x7c/0x10b
[<c013dfb4>] lock_acquire+0x30/0x7e
[<c01187b2>] ? tg_shares_up+0x0/0x100
[<c01186b6>] walk_tg_tree+0x2c/0x96
[<c011868a>] ? walk_tg_tree+0x0/0x96
[<c0118907>] ? tg_nop+0x0/0x5
[<c011894e>] update_shares+0x42/0x4a
[<c011b87a>] try_to_wake_up+0x4c/0x11f
[<c011b95c>] wake_up_process+0xf/0x11
[<c01331b5>] kthread_create+0x6c/0x9c
[<c0130739>] ? worker_thread+0x0/0xd2
[<c024218c>] ? __spin_lock_init+0x24/0x47
[<c0130ceb>] create_workqueue_thread+0x2b/0x45
[<c0130739>] ? worker_thread+0x0/0xd2
[<c0130e3a>] __create_workqueue_key+0x115/0x14d
[<c05c8854>] ? kernel_init+0x0/0x93
[<c05d7594>] init_workqueues+0x4c/0x5d
[<c05c880d>] do_basic_setup+0x8/0x1e
[<c05c88ac>] kernel_init+0x58/0x93
[<c0104557>] kernel_thread_helper+0x7/0x10
=======================
---[ end trace 4eaa2a86a8e2da22 ]---
possible reason: unannotated irqs-on.
irq event stamp: 10216
hardirqs last enabled at (10215): [<c013e52b>] debug_check_no_locks_freed+0x9d/0xa7
hardirqs last disabled at (10216): [<c0107f91>] native_sched_clock+0x50/0xb8
softirqs last enabled at (9922): [<c0127171>] __do_softirq+0xdf/0xe6
softirqs last disabled at (9915): [<c01271b1>] do_softirq+0x39/0x51

--
regards,
Dhaval

2008-07-01 10:57:55

by Dhaval Giani

Subject: Re: [PATCH 00/30] SMP-group balancer - take 3

On Mon, Jun 30, 2008 at 08:23:57PM +0530, Dhaval Giani wrote:
> On Mon, Jun 30, 2008 at 02:59:56PM +0200, Ingo Molnar wrote:
> >
> > * Dhaval Giani <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I get this at bootup
> > >
> > > ------------[ cut here ]------------
> > > WARNING: at kernel/lockdep.c:2738 check_flags+0x8a/0x12d()
> > > Modules linked in:
> > > Pid: 1, comm: swapper Not tainted 2.6.26-rc8-tip #5
> >
> > please check latest tip/master. This is the commit that should fix it:
> >
>
> Nope, does not :(. Still get,
>

Ah, turns out my git-fetch did not work so well. I just pulled the
latest tip, and it seems to have been fixed. Sorry for the noise.

Thanks,
--
regards,
Dhaval