2005-05-07 13:43:40

by Con Kolivas

Subject: [PATCH] implement nice support across physical cpus on SMP

SMP balancing is currently designed purely with throughput in mind. This
working patch implements a mechanism for supporting 'nice' across physical
cpus without impacting throughput.
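
(The diff itself went out as an attachment and is not reproduced in this
archive. As a minimal standalone sketch of the idea, with every name and
constant assumed for illustration rather than taken from the actual patch:
the balancer compares a priority-weighted load instead of a raw one.)

/* Standalone model, not the actual patch: each runqueue tracks a
 * prio_bias summed over its queued tasks, and the load balancer
 * compares a weighted load, so a CPU busy with nice 19 work looks
 * lighter than one running nice 0 tasks. */
#include <stdio.h>

struct cpu {
    int  nr_running;   /* runnable tasks on this CPU */
    long prio_bias;    /* assumed per-task weight, e.g. 20 - nice, summed */
};

static long weighted_load(const struct cpu *c)
{
    return c->nr_running * 128 + c->prio_bias;   /* 128 ~ SCHED_LOAD_SCALE */
}

int main(void)
{
    struct cpu hog  = { .nr_running = 1, .prio_bias = 1  };  /* one nice 19 task */
    struct cpu fair = { .nr_running = 2, .prio_bias = 40 };  /* two nice 0 tasks */

    /* The balancer should steer new nice 0 work toward the lighter CPU. */
    printf("nice19 cpu: %ld, nice0 cpu: %ld\n",
           weighted_load(&hog), weighted_load(&fair));
    return 0;
}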

This is a version for stable kernel 2.6.11.*

Carlos, if you could test this with your test case it would be appreciated.

Ingo, comments?

Cheers,
Con



2005-05-07 17:59:41

by Carlos Carvalho

Subject: Re: [PATCH] implement nice support across physical cpus on SMP

Con Kolivas ([email protected]) wrote on 7 May 2005 23:42:
>SMP balancing is currently designed purely with throughput in mind. This
>working patch implements a mechanism for supporting 'nice' across physical
>cpus without impacting throughput.
>
>This is a version for stable kernel 2.6.11.*
>
>Carlos, if you could test this with your test case it would be appreciated.

Unfortunately it doesn't seem to have any effect:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
184 user1 39 19 7220 5924 520 R 99.9 1.1 209:40.68 mi41
266 user2 25 0 1760 480 420 R 50.5 0.1 86:36.31 xdipole1
227 user3 25 0 155m 62m 640 R 49.5 12.3 95:07.89 b170-se.x

Note that the nice 19 job monopolizes one processor while the other
two nice 0 ones share a single processor.

This is really a showstopper for this kind of application :-(

2005-05-07 21:45:46

by Con Kolivas

Subject: Re: [PATCH] implement nice support across physical cpus on SMP

On Sun, 8 May 2005 03:59, Carlos Carvalho wrote:
> Con Kolivas ([email protected]) wrote on 7 May 2005 23:42:
> >SMP balancing is currently designed purely with throughput in mind. This
> >working patch implements a mechanism for supporting 'nice' across
> > physical cpus without impacting throughput.
> >
> >This is a version for stable kernel 2.6.11.*
> >
> >Carlos, if you could test this with your test case it would be
> > appreciated.
>
> Unfortunately it doesn't seem to have any effect:
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 184 user1 39 19 7220 5924 520 R 99.9 1.1 209:40.68 mi41
> 266 user2 25 0 1760 480 420 R 50.5 0.1 86:36.31 xdipole1
> 227 user3 25 0 155m 62m 640 R 49.5 12.3 95:07.89 b170-se.x
>
> Note that the nice 19 job monopolizes one processor while the other
> two nice 0 ones share a single processor.
>
> This is really a showstopper for this kind of application :-(

Ok, back to the drawing board. I have to try to figure out why it doesn't work
for your case. I tried it on 4x with lots of cpu bound tasks, so I'm not sure
why it doesn't help with yours.

Cheers,
Con



2005-05-09 11:25:18

by Markus Tornqvist

Subject: Re: [PATCH] implement nice support across physical cpus on SMP

I beg to differ with Mr. Carvalho's assessment of this patch;
it works like a charm, and then some.

The rest of the message is just my analysis of the situation
run on a Dell PowerEdge 2850, dual hyperthreaded Xeon EM64Ts, with
Debian Pure64 Sarge installed.

Mr. Carvalho, is the program you saw the failure with open source, or
would it otherwise be possible for me to get the code and try to
reproduce this?

My two cents say this is going in :)

When replying, please keep me in the Cc, as I'm not subscribed.

The rest of this message is just the "raw" data on my experiment.

$ cat load.sh
#!/bin/sh

# Start $1 busy loops (default 1); the last two run at nice 19.
if [ -n "$1" ]; then
    count=$1
else
    count=1
fi

cur=0
while [ "$cur" -lt "$count" ]; do
    cur=$((cur + 1))
    if [ "$cur" -eq $((count - 1)) ] || [ "$cur" -eq "$count" ]; then
        nice -n 19 ./load_base.sh &
    else
        ./load_base.sh &
    fi
done

$ cat load_base.sh
#!/bin/sh

# pure CPU burner
while true; do a=1; done


$ ./load.sh 5
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3918 mjt 34 0 5660 1136 936 R 99.9 0.0 1:34.30 load_base.sh
3917 mjt 35 0 5660 1136 936 R 99.5 0.0 1:34.30 load_base.sh
3916 mjt 34 0 5660 1136 936 R 59.7 0.0 0:53.16 load_base.sh
3919 mjt 39 19 5660 1136 936 R 6.0 0.0 0:05.97 load_base.sh
3920 mjt 39 19 5660 1136 936 R 3.0 0.0 0:02.62 load_base.sh

PID USER PR NI VIRT SHR S %CPU %MEM TIME+ #C COMMAND
3917 mjt 26 0 5660 936 R 99.9 0.0 3:37.61 0 load_base.sh
3918 mjt 25 0 5660 936 R 99.9 0.0 3:37.60 3 load_base.sh
3916 mjt 26 0 5660 936 R 52.7 0.0 2:02.37 2 load_base.sh
3919 mjt 39 19 5660 936 R 7.0 0.0 0:13.80 1 load_base.sh
3920 mjt 39 19 5660 936 R 3.0 0.0 0:06.05 2 load_base.sh

top - 11:09:24 up 15:30, 2 users, load average: 4.99, 3.55, 1.63
PID USER PR NI VIRT SHR S %CPU %MEM TIME+ #C COMMAND
3917 mjt 25 0 5660 936 R 99.6 0.0 6:11.35 0 load_base.sh
3918 mjt 24 0 5660 936 R 99.6 0.0 6:11.34 3 load_base.sh
3916 mjt 39 0 5660 936 R 65.7 0.0 3:28.95 2 load_base.sh
3919 mjt 39 19 5660 936 R 7.0 0.0 0:23.54 1 load_base.sh
3920 mjt 39 19 5660 936 R 3.0 0.0 0:10.33 2 load_base.sh

top - 11:10:57 up 15:32, 2 users, load average: 4.99, 3.94, 1.95
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3917 mjt 22 0 5660 1136 936 R 99.5 0.0 7:51.62 load_base.sh
3918 mjt 21 0 5660 1136 936 R 99.5 0.0 7:51.61 load_base.sh
3916 mjt 39 0 5660 1136 936 R 53.7 0.0 4:25.26 load_base.sh
3919 mjt 39 19 5660 1136 936 R 7.0 0.0 0:29.92 load_base.sh
3920 mjt 39 19 5660 1136 936 R 3.0 0.0 0:13.13 load_base.sh

top - 11:12:32 up 15:33, 2 users, load average: 4.99, 4.22, 2.24
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
3917 mjt 35 0 5660 1136 936 R 99.9 0.0 9:28.56 0 load_base.sh
3918 mjt 34 0 5660 1136 936 R 99.5 0.0 9:28.54 3 load_base.sh
3916 mjt 35 0 5660 1136 936 R 61.7 0.0 5:19.77 2 load_base.sh
3919 mjt 39 19 5660 1136 936 R 6.0 0.0 0:36.07 1 load_base.sh
3920 mjt 39 19 5660 1136 936 R 3.0 0.0 0:15.82 2 load_base.sh

$ ./load.sh 7
top - 11:13:49 up 15:35, 2 users, load average: 5.17, 4.40, 2.45
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
3952 mjt 29 0 5660 1140 936 R 99.9 0.0 0:33.53 2 load_base.sh
3950 mjt 31 0 5660 1140 936 R 99.5 0.0 0:33.33 1 load_base.sh
3953 mjt 30 0 5660 1140 936 R 55.7 0.0 0:16.82 3 load_base.sh
3951 mjt 39 0 5660 1140 936 R 43.8 0.0 0:16.70 3 load_base.sh
3949 mjt 39 0 5660 1140 936 R 23.9 0.0 0:13.18 0 load_base.sh
3954 mjt 39 19 5660 1140 936 R 2.0 0.0 0:00.64 0 load_base.sh
3955 mjt 39 19 5660 1140 936 R 2.0 0.0 0:00.64 0 load_base.sh

top - 11:14:53 up 15:36, 2 users, load average: 6.38, 4.91, 2.76
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
3950 mjt 23 0 5660 1140 936 R 99.9 0.0 1:39.67 1 load_base.sh
3952 mjt 21 0 5660 1140 936 R 99.9 0.0 1:39.87 2 load_base.sh
3951 mjt 39 0 5660 1140 936 R 52.7 0.0 0:49.91 3 load_base.sh
3953 mjt 22 0 5660 1140 936 R 47.8 0.0 0:49.95 3 load_base.sh
3949 mjt 39 0 5660 1140 936 R 43.8 0.0 0:38.70 0 load_base.sh
3954 mjt 39 19 5660 1140 936 R 2.0 0.0 0:01.90 0 load_base.sh
3955 mjt 39 19 5660 1140 936 R 2.0 0.0 0:01.90 0 load_base.sh

--
mjt



2005-05-09 11:28:39

by Markus Tornqvist

Subject: Re: [ck] Re: [PATCH] implement nice support across physical cpus on SMP

On Mon, May 09, 2005 at 02:24:46PM +0300, Markus Törnqvist wrote:
>The rest of the message is just my analysis of the situation

Typing faster than thinking syndrome, running late for an exam ;)

--
mjt



2005-05-09 11:47:35

by Con Kolivas

Subject: Re: [ck] Re: [PATCH] implement nice support across physical cpus on SMP

On Mon, 9 May 2005 21:24, Markus Törnqvist wrote:
> I beg to differ with Mr. Carvalho's assessment of this patch;
> it works like a charm, and then some.
>
> The rest of the message is just my analysis of the situation
> run on a Dell PowerEdge 2850, dual hyperthreaded Xeon EM64Ts, with
> Debian Pure64 Sarge installed.

Thanks for feedback.

> PID USER PR NI VIRT SHR S %CPU %MEM TIME+ #C COMMAND
> 3917 mjt 26 0 5660 936 R 99.9 0.0 3:37.61 0 load_base.sh
> 3918 mjt 25 0 5660 936 R 99.9 0.0 3:37.60 3 load_base.sh
> 3916 mjt 26 0 5660 936 R 52.7 0.0 2:02.37 2 load_base.sh
> 3919 mjt 39 19 5660 936 R 7.0 0.0 0:13.80 1 load_base.sh
> 3920 mjt 39 19 5660 936 R 3.0 0.0 0:06.05 2 load_base.sh
>
> top - 11:09:24 up 15:30, 2 users, load average: 4.99, 3.55, 1.63
> PID USER PR NI VIRT SHR S %CPU %MEM TIME+ #C COMMAND
> 3917 mjt 25 0 5660 936 R 99.6 0.0 6:11.35 0 load_base.sh
> 3918 mjt 24 0 5660 936 R 99.6 0.0 6:11.34 3 load_base.sh
> 3916 mjt 39 0 5660 936 R 65.7 0.0 3:28.95 2 load_base.sh
> 3919 mjt 39 19 5660 936 R 7.0 0.0 0:23.54 1 load_base.sh
> 3920 mjt 39 19 5660 936 R 3.0 0.0 0:10.33 2 load_base.sh

These runs don't look absolutely "ideal", as one nice 19 task is bound to
cpu1; however, since you're running hyperthreading, it would seem the SMT
nice code is keeping that in check anyway (0:23 vs 6:11).

> top - 11:12:32 up 15:33, 2 users, load average: 4.99, 4.22, 2.24
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
> 3917 mjt 35 0 5660 1136 936 R 99.9 0.0 9:28.56 0 load_base.sh
> 3918 mjt 34 0 5660 1136 936 R 99.5 0.0 9:28.54 3 load_base.sh
> 3916 mjt 35 0 5660 1136 936 R 61.7 0.0 5:19.77 2 load_base.sh
> 3919 mjt 39 19 5660 1136 936 R 6.0 0.0 0:36.07 1 load_base.sh
> 3920 mjt 39 19 5660 1136 936 R 3.0 0.0 0:15.82 2 load_base.sh
>
> $ ./load.sh 7
> top - 11:13:49 up 15:35, 2 users, load average: 5.17, 4.40, 2.45
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
> 3952 mjt 29 0 5660 1140 936 R 99.9 0.0 0:33.53 2 load_base.sh
> 3950 mjt 31 0 5660 1140 936 R 99.5 0.0 0:33.33 1 load_base.sh
> 3953 mjt 30 0 5660 1140 936 R 55.7 0.0 0:16.82 3 load_base.sh
> 3951 mjt 39 0 5660 1140 936 R 43.8 0.0 0:16.70 3 load_base.sh
> 3949 mjt 39 0 5660 1140 936 R 23.9 0.0 0:13.18 0 load_base.sh
> 3954 mjt 39 19 5660 1140 936 R 2.0 0.0 0:00.64 0 load_base.sh
> 3955 mjt 39 19 5660 1140 936 R 2.0 0.0 0:00.64 0 load_base.sh
>
> top - 11:14:53 up 15:36, 2 users, load average: 6.38, 4.91, 2.76
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
> 3950 mjt 23 0 5660 1140 936 R 99.9 0.0 1:39.67 1 load_base.sh
> 3952 mjt 21 0 5660 1140 936 R 99.9 0.0 1:39.87 2 load_base.sh
> 3951 mjt 39 0 5660 1140 936 R 52.7 0.0 0:49.91 3 load_base.sh
> 3953 mjt 22 0 5660 1140 936 R 47.8 0.0 0:49.95 3 load_base.sh
> 3949 mjt 39 0 5660 1140 936 R 43.8 0.0 0:38.70 0 load_base.sh
> 3954 mjt 39 19 5660 1140 936 R 2.0 0.0 0:01.90 0 load_base.sh
> 3955 mjt 39 19 5660 1140 936 R 2.0 0.0 0:01.90 0 load_base.sh

These runs pretty much confirm what I found to happen. My test machine for
this was also 4x. I can't see how the code would behave differently on 2x.
Perhaps if I make the prio_bias multiplied into the cpu load instead of
added to it, it will be less affected by SCHED_LOAD_SCALE. The attached
patch was confirmed during testing to also provide smp distribution
according to nice on 4x. Carlos, I know your machine is in production, so
testing may not be easy for you. Please try this on top if you have time.

Cheers,
Con

---
This patch alters the effect priority bias has on busy rebalancing by
multiplying the cpu load by the total priority instead of adding it.

Signed-off-by: Con Kolivas <[email protected]>
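
(Again, the diff itself is only in the attachment. A standalone sketch of
the arithmetic being described, with the bias mapping, nice 0 as 20 and
nice 19 as 1, assumed purely for illustration:)

#include <stdio.h>

#define SCHED_LOAD_SCALE 128UL          /* one task's worth of load in 2.6.11 */

/* Assumed mapping for illustration: nice 0 -> 20, nice 19 -> 1. */
static unsigned long prio_bias(int nice)
{
    return (unsigned long)(20 - nice);
}

int main(void)
{
    unsigned long raw = 2 * SCHED_LOAD_SCALE;   /* two runnable tasks */

    /* Original patch: bias added to the load.  The nice 0 / nice 19
     * difference is ~40 out of ~300 -- easily lost in the noise. */
    printf("additive:       nice0=%lu nice19=%lu\n",
           raw + 2 * prio_bias(0), raw + 2 * prio_bias(19));

    /* This patch: bias multiplied into the load, so a nice 19 queue
     * looks far lighter regardless of SCHED_LOAD_SCALE. */
    printf("multiplicative: nice0=%lu nice19=%lu\n",
           raw * prio_bias(0), raw * prio_bias(19));
    return 0;
}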


Attachments:
alter_prio_bias.diff (912.00 B)

2005-05-09 18:56:13

by Markus Tornqvist

Subject: Re: [ck] Re: [PATCH] implement nice support across physical cpus on SMP

On Mon, May 09, 2005 at 09:47:05PM +1000, Con Kolivas wrote:
>
>Thanks for feedback.

For once I can give something back, it seems; thus it's my pleasure.

>> top - 11:09:24 up 15:30, 2 users, load average: 4.99, 3.55, 1.63
>> PID USER PR NI VIRT SHR S %CPU %MEM TIME+ #C COMMAND
>> 3917 mjt 25 0 5660 936 R 99.6 0.0 6:11.35 0 load_base.sh
>> 3918 mjt 24 0 5660 936 R 99.6 0.0 6:11.34 3 load_base.sh
>> 3916 mjt 39 0 5660 936 R 65.7 0.0 3:28.95 2 load_base.sh
>> 3919 mjt 39 19 5660 936 R 7.0 0.0 0:23.54 1 load_base.sh
>> 3920 mjt 39 19 5660 936 R 3.0 0.0 0:10.33 2 load_base.sh
>
>These runs don't look absolutely "ideal" as one nice 19 task is bound to cpu1
>however since you're running hyperthreading it would seem the SMT nice code
>is keeping that under check anyway (0:23 vs 6:11)

So let no one touch the SMT code as long as it works...

>These runs pretty much confirm what I found to happen. My test machine for
>this was also 4x. I can't see how the code would behave differently on 2x.

Would anyone, on whichever list you happen to be subscribed to, care to
replicate these results on 2x and report?

Thank you.

>Perhaps if I make the prio_bias multiplied into the cpu load instead of
>added to it, it will be less affected by SCHED_LOAD_SCALE. The attached
>patch was confirmed during testing to also provide smp distribution
>according to nice on 4x. Carlos, I know your machine is in production, so
>testing may not be easy for you. Please try this on top if you have time.

I have no idea about SCHED_LOAD_SCALE, I'm afraid, but I will give
this latest patch a run while I'm at it.

The load.sh is the same one I posted before.

$ ./load.sh 5
top - 19:41:33 up 9 min, 2 users, load average: 0.40, 0.10, 0.03
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
2798 mjt 25 0 5660 1140 936 R 99.9 0.0 0:02.04 2 load_base.sh
2799 mjt 25 0 5660 1140 936 R 99.9 0.0 0:02.04 3 load_base.sh
2797 mjt 25 0 5660 1140 936 R 51.8 0.0 0:01.16 0 load_base.sh
2801 mjt 39 19 5660 1140 936 R 7.0 0.0 0:00.12 1 load_base.sh
2800 mjt 39 19 5660 1140 936 R 3.0 0.0 0:00.05 0 load_base.sh

top - 19:42:20 up 10 min, 2 users, load average: 2.83, 0.78, 0.26
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
2798 mjt 35 0 5660 1140 936 R 99.5 0.0 0:48.55 2 load_base.sh
2799 mjt 35 0 5660 1140 936 R 99.5 0.0 0:48.55 3 load_base.sh
2797 mjt 34 0 5660 1140 936 R 61.7 0.0 0:27.43 0 load_base.sh
2801 mjt 39 19 5660 1140 936 R 6.0 0.0 0:03.11 1 load_base.sh
2800 mjt 39 19 5660 1140 936 R 3.0 0.0 0:01.35 0 load_base.sh

top - 19:43:00 up 10 min, 2 users, load average: 3.88, 1.31, 0.46
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
2798 mjt 24 0 5660 1140 936 R 99.9 0.0 1:29.13 2 load_base.sh
2799 mjt 24 0 5660 1140 936 R 99.5 0.0 1:29.12 3 load_base.sh
2797 mjt 24 0 5660 1140 936 R 49.8 0.0 0:50.19 0 load_base.sh
2801 mjt 39 19 5660 1140 936 R 7.0 0.0 0:05.76 1 load_base.sh
2800 mjt 39 19 5660 1140 936 R 3.0 0.0 0:02.48 0 load_base.sh

$ ./load.sh 7
top - 19:43:49 up 11 min, 2 users, load average: 4.98, 1.97, 0.73
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
2807 mjt 21 0 5660 1140 936 R 99.9 0.0 0:27.73 0 load_base.sh
2804 mjt 34 0 5660 1140 936 R 99.6 0.0 0:23.46 1 load_base.sh
2808 mjt 20 0 5660 1140 936 R 99.6 0.0 0:25.51 2 load_base.sh
2805 mjt 39 0 5660 1140 936 R 39.8 0.0 0:08.12 3 load_base.sh
2806 mjt 33 0 5660 1140 936 R 37.9 0.0 0:12.46 3 load_base.sh
2788 mjt 20 0 5168 1092 832 R 1.0 0.0 0:00.56 3 top
2809 mjt 39 19 5660 1140 936 R 1.0 0.0 0:00.41 3 load_base.sh
2810 mjt 39 19 5660 1144 936 R 1.0 0.0 0:00.40 3 load_base.sh

top - 19:44:20 up 12 min, 2 users, load average: 5.78, 2.45, 0.92
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
2807 mjt 22 0 5660 1140 936 R 99.9 0.0 0:56.56 0 load_base.sh
2804 mjt 35 0 5660 1140 936 R 99.5 0.0 0:52.29 1 load_base.sh
2808 mjt 21 0 5660 1140 936 R 99.5 0.0 0:54.34 2 load_base.sh
2805 mjt 35 0 5660 1140 936 R 33.8 0.0 0:15.99 3 load_base.sh
2806 mjt 39 0 5660 1140 936 R 21.9 0.0 0:20.22 3 load_base.sh
2788 mjt 20 0 5168 1092 832 R 1.0 0.0 0:00.65 3 top
2809 mjt 39 19 5660 1140 936 R 1.0 0.0 0:00.80 3 load_base.sh
2810 mjt 39 19 5660 1144 936 R 1.0 0.0 0:00.79 3 load_base.sh

top - 19:45:00 up 12 min, 2 users, load average: 6.37, 3.02, 1.18
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
2804 mjt 28 0 5660 1140 936 R 99.9 0.0 1:32.02 1 load_base.sh
2807 mjt 35 0 5660 1140 936 R 99.9 0.0 1:36.29 0 load_base.sh
2808 mjt 33 0 5660 1140 936 R 99.5 0.0 1:34.07 2 load_base.sh
2806 mjt 27 0 5660 1140 936 R 30.9 0.0 0:31.09 3 load_base.sh
2805 mjt 39 0 5660 1140 936 R 24.9 0.0 0:26.81 3 load_base.sh
2809 mjt 39 19 5660 1140 936 R 1.0 0.0 0:01.34 3 load_base.sh
2810 mjt 39 19 5660 1144 936 R 1.0 0.0 0:01.33 3 load_base.sh

Then I decided to do something crazier and renice some pids, to see
what happens...

top - 19:45:45 up 13 min, 2 users, load average: 6.70, 3.58, 1.45
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
2807 mjt 37 0 5660 1140 936 R 99.9 0.0 2:21.19 0 load_base.sh
2804 mjt 30 0 5660 1140 936 R 99.5 0.0 2:16.92 1 load_base.sh
2806 mjt 36 0 5660 1140 936 R 41.8 0.0 0:43.73 2 load_base.sh
2805 mjt 39 0 5660 1140 936 R 21.9 0.0 0:39.13 2 load_base.sh
2809 mjt 39 19 5660 1140 936 R 6.0 0.0 0:02.10 3 load_base.sh
2808 mjt 39 10 5660 1140 936 R 2.0 0.0 2:16.01 2 load_base.sh
2810 mjt 39 19 5660 1144 936 R 2.0 0.0 0:01.96 2 load_base.sh

top - 19:46:20 up 14 min, 2 users, load average: 6.83, 3.95, 1.66
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
2804 mjt 35 0 5660 1140 936 R 99.6 0.0 2:52.38 1 load_base.sh
2805 mjt 37 0 5660 1140 936 R 99.6 0.0 0:54.29 3 load_base.sh
2807 mjt 21 0 5660 1140 936 R 99.6 0.0 2:56.65 0 load_base.sh
2806 mjt 21 0 5660 1140 936 R 23.9 0.0 0:53.67 2 load_base.sh
2808 mjt 39 10 5660 1140 936 R 21.9 0.0 2:34.49 2 load_base.sh
2809 mjt 39 19 5660 1140 936 R 2.0 0.0 0:02.66 2 load_base.sh
2810 mjt 39 19 5660 1144 936 R 2.0 0.0 0:02.57 2 load_base.sh

top - 19:47:00 up 14 min, 2 users, load average: 6.91, 4.33, 1.88
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
2805 mjt 32 0 5660 1140 936 R 99.9 0.0 1:34.05 3 load_base.sh
2807 mjt 37 0 5660 1140 936 R 99.9 0.0 3:36.42 0 load_base.sh
2804 mjt 30 0 5660 1140 936 R 99.6 0.0 3:32.15 1 load_base.sh
2806 mjt 36 0 5660 1140 936 R 40.8 0.0 1:07.04 2 load_base.sh
2808 mjt 39 10 5660 1140 936 R 21.9 0.0 2:41.09 2 load_base.sh
2809 mjt 39 19 5660 1140 936 R 2.0 0.0 0:03.32 2 load_base.sh
2810 mjt 39 19 5660 1144 936 R 1.0 0.0 0:03.23 2 load_base.sh

And sudo renice before I call it a day

top - 19:48:27 up 16 min, 2 users, load average: 7.21, 5.05, 2.34
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
2806 mjt 28 0 5660 1140 936 R 99.9 0.0 1:57.55 2 load_base.sh
2807 mjt 27 0 5660 1140 936 R 99.9 0.0 5:00.99 0 load_base.sh
2810 mjt 26 -10 5660 1144 936 R 96.3 0.0 0:05.56 3 load_base.sh
2804 mjt 39 0 5660 1140 936 R 24.8 0.0 4:57.88 1 load_base.sh
2808 mjt 36 10 5660 1140 936 R 9.5 0.0 2:56.90 1 load_base.sh
2805 mjt 39 0 5660 1140 936 R 1.0 0.0 2:29.56 1 load_base.sh

top - 19:49:10 up 16 min, 2 users, load average: 7.65, 5.46, 2.61
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
2806 mjt 24 0 5660 1140 936 R 99.9 0.0 2:39.96 2 load_base.sh
2807 mjt 23 0 5660 1140 936 R 99.9 0.0 5:43.40 0 load_base.sh
2810 mjt 26 -10 5660 1144 936 R 99.9 0.0 0:45.87 1 load_base.sh
2805 mjt 39 0 5660 1140 936 R 17.5 0.0 2:36.59 3 load_base.sh
2808 mjt 39 10 5660 1140 936 R 8.7 0.0 2:59.79 3 load_base.sh
2804 mjt 27 0 5660 1140 936 R 7.1 0.0 5:03.47 3 load_base.sh

top - 19:49:45 up 17 min, 2 users, load average: 7.36, 5.63, 2.77
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ #C COMMAND
2806 mjt 31 0 5660 1140 936 R 99.9 0.0 3:15.66 2 load_base.sh
2807 mjt 29 0 5660 1140 936 R 99.9 0.0 6:19.10 0 load_base.sh
2810 mjt 26 -10 5660 1144 936 R 99.9 0.0 1:16.74 3 load_base.sh
2804 mjt 39 0 5660 1140 936 R 17.5 0.0 5:09.32 1 load_base.sh
2805 mjt 39 0 5660 1140 936 R 17.5 0.0 2:45.17 1 load_base.sh
2808 mjt 39 10 5660 1140 936 R 8.7 0.0 3:02.87 1 load_base.sh

Seems good enough under this very fabricated stress; hopefully someone
can tell me a good practical application with processes of different
nices going all over the place, so I can try something else.

But these processes are still clinging a bit to cpu 1 here; that's probably
another SMT feature.
Who tests this on SMP without SMT? Anyone? You! Over there!

This box will go into production soon and then I can maybe get some
glimpses of what happens in practice, and that's about it.
And it'll probably run everything with default values, except mysql at
a light -5.

Thanks!

--
mjt



2005-05-09 23:54:07

by Carlos Carvalho

Subject: Re: [ck] Re: [PATCH] implement nice support across physical cpus on SMP

Con Kolivas ([email protected]) wrote on 9 May 2005 21:47:
>Perhaps if I make the prio_bias multiplied into the cpu load instead of
>added to it, it will be less affected by SCHED_LOAD_SCALE. The attached
>patch was confirmed during testing to also provide smp distribution
>according to nice on 4x.

It seems to work. I've tested it for a few hours on the same machine
and the 2 nice 0 processes take the bulk of the cpu time, while that
cpu bound program running at nice 19 takes only about 7%.

Maybe it's a bit early to say it's fine, but it does seem much better
than before, so I think it should go into the tree.

Thanks a lot!

2005-05-11 02:55:19

by Con Kolivas

Subject: Re: [ck] Re: [PATCH] implement nice support across physical cpus on SMP

On Tue, 10 May 2005 09:54 am, Carlos Carvalho wrote:
> Con Kolivas ([email protected]) wrote on 9 May 2005 21:47:
> >Perhaps if I make the prio_bias multiplied into the cpu load instead of
> >added to it, it will be less affected by SCHED_LOAD_SCALE. The attached
> >patch was confirmed during testing to also provide smp distribution
> >according to nice on 4x.
>
> It seems to work. I've tested it for a few hours on the same machine
> and the 2 nice 0 processes take the bulk of the cpu time, while that
> cpu bound program running at nice 19 takes only about 7%.
>
> Maybe it's a bit early to say it's fine, but it does seem much better
> than before, so I think it should go into the tree.
>
> Thanks a lot!

My pleasure. Thanks for testing.

I'll roll up these patches for rc4 and make smp nice balancing a config option
for ultimate flexibility.

Cheers,
Con

2005-05-11 07:20:36

by Ingo Molnar

Subject: Re: [SMP NICE] [PATCH 2/2] SCHED: Make SMP nice a config option


ack on the first patch - but please don't make it a .config option!
Either it's good enough so that everyone can use it, or it isn't.

Ingo

2005-05-12 10:49:51

by Con Kolivas

Subject: Re: [SMP NICE] [PATCH 2/2] SCHED: Make SMP nice a config option

On Wed, 11 May 2005 17:20, Ingo Molnar wrote:
> ack on the first patch - but please dont make it a .config option!
> Either it's good enough so that everyone can use it, or it isnt.

Makes a heck of a lot of sense to me. I guess I was just being paranoid /
defensive for no good reason. The first patch alone should suffice.

Cheers,
Con



2005-05-16 11:34:09

by Con Kolivas

Subject: Re: [SMP NICE] [PATCH] SCHED: Implement nice support across physical cpus on SMP

On Wed, 11 May 2005 13:04, Con Kolivas wrote:
> Andrew please consider for inclusion in -mm

It looks like I missed my window of opportunity and the SMP balancing design
has been restructured in the latest -mm again, so this patch will have to wait
another generation. Carlos, Markus, you'll have to wait till that code settles
down (if ever) before I (or someone else) rewrite it for inclusion in -mm,
followed by mainline. The patch you currently have will work fine for
2.6.11* and 2.6.12*.

Cheers,
Con



2005-05-16 18:33:15

by Markus Tornqvist

Subject: Re: [SMP NICE] [PATCH] SCHED: Implement nice support across physical cpus on SMP

On Mon, May 16, 2005 at 09:33:09PM +1000, Con Kolivas wrote:
>
>It looks like I missed my window of opportunity and the SMP balancing design
>has been restructured in the latest -mm again, so this patch will have to wait

...incredible...

--
mjt



2005-05-17 13:39:52

by Carlos Carvalho

Subject: Re: [SMP NICE] [PATCH] SCHED: Implement nice support across physical cpus on SMP

Con Kolivas ([email protected]) wrote on 16 May 2005 21:33:
>On Wed, 11 May 2005 13:04, Con Kolivas wrote:
>> Andrew please consider for inclusion in -mm
>
>It looks like I missed my window of opportunity and the SMP balancing design
>has been restructured in the latest -mm again, so this patch will have to wait
>another generation. Carlos, Markus, you'll have to wait till that code settles
>down (if ever) before I (or someone else) rewrite it for inclusion in -mm,
>followed by mainline. The patch you currently have will work fine for
>2.6.11* and 2.6.12*.

That's a pity. What's more important, however, is that this misfeature
of the scheduler should be corrected ASAP. The nice control is a
traditional UNIX characteristic and should have higher priority in
the patch inclusion queue than other scheduler improvements.

2005-05-18 11:31:44

by Markus Tornqvist

Subject: Re: [SMP NICE] [PATCH] SCHED: Implement nice support across physical cpus on SMP

On Tue, May 17, 2005 at 10:39:28AM -0300, Carlos Carvalho wrote:
>That's a pity. What's more important, however, is that this misfeature
>of the scheduler should be corrected ASAP. The nice control is a
>traditional UNIX characteristic and should have higher priority in
>the patch inclusion queue than other scheduler improvements.

Linux is not a traditional unix, but that doesn't mean the support
shouldn't exist.

My suggestion is that whoever broke the interface, rendering Con's
patch (which mingo had acked) useless, should merge the patch.

Thanks!

--
mjt



2005-05-18 13:45:59

by Con Kolivas

Subject: Re: [SMP NICE] [PATCH] SCHED: Implement nice support across physical cpus on SMP

On Wed, 18 May 2005 21:30, Markus Törnqvist wrote:
> On Tue, May 17, 2005 at 10:39:28AM -0300, Carlos Carvalho wrote:
> >That's a pity. What's more important, however, is that this misfeature
> >of the scheduler should be corrected ASAP. The nice control is a
> >traditional UNIX characteristic and should have higher priority in
> >the patch inclusion queue than other scheduler improvements.
>
> Linux is not a traditional unix, but that doesn't mean the support
> shouldn't exist.
>
> My suggestion is that whoever broke the interface, rendering Con's
> patch (which mingo had acked) useless, should merge the patch.

Unrealistic. We are in a constant state of development, the direction of which
is determined by who is hacking on what, when - as opposed to "we need this
feature or fix now so let's direct all our efforts to that". Unfortunately the
SMP balancing changes need more than one iteration of a mainline kernel
before being incorporated due to the potential for regression, so "SMP nice",
if it is based on this new code, is (at a guess) 6 months away from becoming
part of mainline. Of course my patch could go into mainline in its current
form and the SMP balancing code in -mm could be modified with that in place,
rather than the other way around, but I just didn't get in early enough for
that to happen ;)

Cheers,
Con



2005-05-21 05:01:27

by Con Kolivas

Subject: [PATCH] SCHED: Implement nice support across physical cpus on SMP

Ok I've respun the smp nice support for the cpu scheduler modifications that
are in current -mm. Tested on 2.6.12-rc4-mm2 on 4x and seems to work fine.

Con
---



2005-05-23 09:28:37

by Con Kolivas

Subject: [PATCH] SCHED: change_prio_bias_only_if_queued

On Sat, 21 May 2005 15:00, Con Kolivas wrote:
> Ok I've respun the smp nice support for the cpu scheduler modifications
> that are in current -mm. Tested on 2.6.12-rc4-mm2 on 4x and seems to work
> fine.

Thanks to Peter Williams for noting that I should only change the prio_bias
in set_user_nice if the task is queued.
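
(A minimal userspace model of the fix being described; all identifiers and
the bias scale are assumptions, not the actual kernel diff:)

#include <stdio.h>
#include <stdbool.h>

#define NICE_TO_PRIO(n) (120 + (n))     /* kernel's static_prio mapping */
#define MAX_PRIO        140

struct runqueue { long prio_bias; };
struct task {
    int static_prio;
    bool queued;                        /* actually on a runqueue? */
    struct runqueue *rq;
};

static long bias_of(const struct task *p)
{
    return MAX_PRIO - p->static_prio;
}

static void set_user_nice(struct task *p, int nice)
{
    if (p->queued)                      /* the fix: dequeued tasks never */
        p->rq->prio_bias -= bias_of(p); /* contributed their bias */
    p->static_prio = NICE_TO_PRIO(nice);
    if (p->queued)
        p->rq->prio_bias += bias_of(p);
}

int main(void)
{
    struct runqueue rq = { .prio_bias = 0 };
    struct task a = { NICE_TO_PRIO(0), true,  &rq };  /* queued */
    struct task b = { NICE_TO_PRIO(0), false, &rq };  /* sleeping */

    rq.prio_bias = bias_of(&a);         /* only 'a' is queued */
    set_user_nice(&b, 19);              /* must leave rq.prio_bias at 20 */
    printf("prio_bias = %ld\n", rq.prio_bias);
    return 0;
}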

Con
---



2005-05-23 10:07:51

by Con Kolivas

Subject: [PATCH] SCHED: account_rt_tasks_in_prio_bias

On Mon, 23 May 2005 19:28, Con Kolivas wrote:
> On Sat, 21 May 2005 15:00, Con Kolivas wrote:
> > Ok I've respun the smp nice support for the cpu scheduler modifications
> > that are in current -mm. Tested on 2.6.12-rc4-mm2 on 4x and seems to work
> > fine.
>
> Thanks to Peter Williams for noting that I should only change the prio_bias
> in set_user_nice if the task is queued.

And for completeness, real time tasks should contribute their real time
priority level to prio_bias instead of their nice level.
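
(Sketched with the same assumed names as above rather than taken from the
diff, the change amounts to weighting RT tasks by rt_priority:)

#include <stdio.h>
#include <stdbool.h>

#define MAX_PRIO    140
#define MAX_RT_PRIO 100                 /* prios 0..99 are real time */

struct task {
    int  static_prio;                   /* 100..139, from nice */
    int  rt_priority;                   /* 1..99 for RT tasks */
    bool rt;                            /* SCHED_FIFO / SCHED_RR */
};

static long bias_of(const struct task *p)
{
    if (p->rt)                          /* weight RT tasks by rt_priority, */
        return (MAX_PRIO - MAX_RT_PRIO) + p->rt_priority;   /* not nice */
    return MAX_PRIO - p->static_prio;   /* normal tasks: from nice level */
}

int main(void)
{
    struct task nice0 = { .static_prio = 120, .rt = false };
    struct task fifo  = { .rt_priority = 50,  .rt = true  };

    printf("nice 0 bias = %ld, RT 50 bias = %ld\n",
           bias_of(&nice0), bias_of(&fifo));
    return 0;
}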

Con
---

