2007-05-02 23:08:11

by Al Boldi

Subject: Re: [ck] [REPORT] 2.6.21.1 vs 2.6.21-sd046 vs 2.6.21-cfs-v6

Con Kolivas wrote:
> On Monday 30 April 2007 18:05, Michael Gerdau wrote:
> > meanwhile I've redone my numbercrunching tests with the following
> > kernels: 2.6.21.1 (mainline)
> > 2.6.21-sd046
> > 2.6.21-cfs-v6
> > running on a dualcore x86_64.
> > [I will run the same test with 2.6.21.1-cfs-v7 over the next days,
> > likely tonight]
:
:
> > However from these figures it seems as if sd does provide for the
> > fairest (as in equal share for all) scheduling among the 3 schedulers
> > tested.
>
> Looks good, thanks. Ingo's been hard at work since then and has v8 out by
> now. SD has not changed so you wouldn't need to do the whole lot of tests
> on SD again unless you don't trust some of the results.

Well, I tried cfs-v8 and it still shows some nice-level regressions wrt
mainline/sd. SD's nice levels look rather solid, implying fairness.


Thanks!

--
Al


2007-05-03 00:48:53

by William Lee Irwin III

Subject: Re: [ck] [REPORT] 2.6.21.1 vs 2.6.21-sd046 vs 2.6.21-cfs-v6

Con Kolivas wrote:
>> Looks good, thanks. Ingo's been hard at work since then and has v8 out by
>> now. SD has not changed so you wouldn't need to do the whole lot of tests
>> on SD again unless you don't trust some of the results.

On Thu, May 03, 2007 at 02:11:39AM +0300, Al Boldi wrote:
> Well, I tried cfs-v8 and it still shows some nice-level regressions wrt
> mainline/sd. SD's nice levels look rather solid, implying fairness.

That's odd. The ->load_weight changes should've improved that quite
a bit. There may be something slightly off in how lag is computed,
or maybe the O(n) lag issue Ting Yang spotted is biting you.

Also, I should say that the nice number affairs don't imply fairness
per se. The way that works is that when tasks have "weights" (like
nice levels in UNIX) the definition of fairness changes so that each
task gets shares of CPU bandwidth proportional to its weight instead
of one share for one task.

It takes a bit closer inspection than feel tests to see if weighted
fairness is properly implemented. One thing to try is running a number
of identical CPU hogs at the same time at different nice levels for a
fixed period of time (e.g. 1 or 2 minutes) so they're in competition
with each other and seeing what percent of the CPU each gets. From
there you can figure out how many shares each is getting for its nice
level. Trying different mixtures of nice levels and different numbers
of tasks should give consistent results for the shares of CPU bandwidth
the CPU hogs get for being at a particular nice level. A scheduler gets
"bonus points" (i.e. is considered better at prioritizing) for the user
being able to specify how the weightings come out. The finer-grained
the control, the more bonus points.
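
For concreteness, here is a minimal sketch (not from this thread) of the kind
of test being described: fork identical CPU hogs at a few example nice levels,
let them compete for a fixed period, and read each hog's CPU time back with
wait4() so the shares can be compared. The nice levels and runtime below are
arbitrary illustrative choices.

#define _GNU_SOURCE                     /* for the wait4() prototype on glibc */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/resource.h>

#define NHOGS   4
#define RUNTIME 120                     /* seconds the hogs compete */

int main(void)
{
        const int nice_levels[NHOGS] = { 0, 5, 10, 15 };  /* example mix */
        pid_t pids[NHOGS];
        double cpu[NHOGS], total = 0.0;
        int i;

        for (i = 0; i < NHOGS; i++) {
                pids[i] = fork();
                if (pids[i] == 0) {
                        nice(nice_levels[i]);   /* renice this hog  */
                        alarm(RUNTIME);         /* SIGALRM ends it  */
                        for (;;)
                                ;               /* burn CPU         */
                }
        }

        for (i = 0; i < NHOGS; i++) {           /* reap and collect usage */
                struct rusage ru;
                wait4(pids[i], NULL, 0, &ru);
                cpu[i] = ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6;
                total += cpu[i];
        }

        for (i = 0; i < NHOGS; i++)
                printf("nice %3d: %7.2f s CPU, share %5.1f%%\n",
                       nice_levels[i], cpu[i], 100.0 * cpu[i] / total);
        return 0;
}

Repeating this with a different mix of nice levels and task counts gives the
numbers to plug into the arithmetic further down.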

Con might want to take a stab at letting users specify the weights
for each nice level individually.

CFS actually has a second set of weights for tasks, namely the
timeslice for a given task. At the moment, they're all equal. It should
be the case that the shorter the timeslice a given task has, the less
latency it gets. So there is a fair amount of room for it to maneuver
with respect to feel tests. It really needs to be done numerically to
get results we can be sure mean something.

The way this goes is task t_i gets a percent of the CPU p_i when the
tasks t_1, t_2, ..., t_n are all competing, and task t_i has nice level
n_i. The share corresponding to nice level n_i is then

        p_i
w_i = -------
      sum p_j

One thing to check for is that if two tasks have the same nice level
that their weights come out about equal. So for t_i and t_j, if n_i
= n_j then you check that at least approximately, w_i = w_j, or even
p_i = p_j, since we're not starting and stopping tasks in the midst of
the test. Also, you can't simplify sum p_j to 1, since the set of tasks
may not be the only things running.

The other thing to do is try a different number of tasks with a
different mix of nice levels. The weight w_i for a given nice
level n_i should be the same even in a different mix of tasks
and nice levels if the nice levels are the same.
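
As a worked illustration (again not from the thread, and with made-up %CPU
figures), the snippet below does that bookkeeping: for one mix it computes
w_i = p_i / sum p_j, and, as one way of comparing nice levels across mixes,
it also prints each hog's share relative to a nice-0 hog in the same mix.
Substitute the percentages an actual run produces.

#include <stdio.h>

struct hog {
        int    nice;            /* nice level of the CPU hog  */
        double pct;             /* measured %CPU for that hog */
};

int main(void)
{
        struct hog mix[] = {            /* illustrative figures only */
                {  0, 42.0 },
                {  0, 41.0 },
                { 10, 11.0 },
        };
        int n = sizeof(mix) / sizeof(mix[0]);
        double sum = 0.0, ref = 0.0;
        int i;

        for (i = 0; i < n; i++)
                sum += mix[i].pct;
        for (i = 0; i < n; i++)
                if (mix[i].nice == 0) {         /* pick a nice-0 reference */
                        ref = mix[i].pct;
                        break;
                }

        for (i = 0; i < n; i++)
                printf("nice %3d: p = %5.1f%%  w = %.3f  vs nice 0 = %.2f\n",
                       mix[i].nice, mix[i].pct, mix[i].pct / sum,
                       ref > 0.0 ? mix[i].pct / ref : 0.0);
        return 0;
}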

If this sounds too far out, there's nothing to worry about. You can
just run the different numbers of tasks with different mixes of nice
levels and post the %cpu numbers. Or if that's still a bit far out
for you, a test that does all this is eventually going to get written.


-- wli

2007-05-03 03:47:52

by Al Boldi

Subject: Re: [ck] [REPORT] 2.6.21.1 vs 2.6.21-sd046 vs 2.6.21-cfs-v6

William Lee Irwin III wrote:
> Con Kolivas wrote:
> >> Looks good, thanks. Ingo's been hard at work since then and has v8 out
> >> by now. SD has not changed so you wouldn't need to do the whole lot of
> >> tests on SD again unless you don't trust some of the results.
>
> On Thu, May 03, 2007 at 02:11:39AM +0300, Al Boldi wrote:
> > Well, I tried cfs-v8 and it still shows some nice-level regressions wrt
> > mainline/sd. SD's nice levels look rather solid, implying fairness.
>
> That's odd. The ->load_weight changes should've improved that quite
> a bit. There may be something slightly off in how lag is computed,
> or maybe the O(n) lag issue Ting Yang spotted is biting you.

Is it not biting you too?

> Also, I should say that the nice number affairs don't imply fairness
> per se. The way that works is that when tasks have "weights" (like
> nice levels in UNIX) the definition of fairness changes so that each
> task gets shares of CPU bandwidth proportional to its weight instead
> of one share for one task.

Ok, but you can easily expose scheduler unfairness by using nice levels as
relative magnifiers; provided nice levels are implemented correctly.

> It takes a bit closer inspection than feel tests to see if weighted
> fairness is properly implemented. One thing to try is running a number
> of identical CPU hogs at the same time at different nice levels for a
> fixed period of time (e.g. 1 or 2 minutes) so they're in competition
> with each other and seeing what percent of the CPU each gets. From
> there you can figure out how many shares each is getting for its nice
> level. Trying different mixtures of nice levels and different numbers
> of tasks should give consistent results for the shares of CPU bandwidth
> the CPU hogs get for being at a particular nice level. A scheduler gets
> "bonus points" (i.e. is considered better at prioritizing) for the user
> being able to specify how the weightings come out. The finer-grained
> the control, the more bonus points.
>
> Con might want to take a stab at letting users specify the weights
> for each nice level individually.
>
> CFS actually has a second set of weights for tasks, namely the
> timeslice for a given task. At the moment, they're all equal. It should
> be the case that the shorter the timeslice a given task has, the less
> latency it gets. So there is a fair amount of room for it to maneuver
> with respect to feel tests. It really needs to be done numerically to
> get results we can be sure mean something.
>
> The way this goes is task t_i gets a percent of the CPU p_i when the
> tasks t_1, t_2, ..., t_n are all competing, and task t_i has nice level
> n_i. The share corresponding to nice level n_i is then
>
>         p_i
> w_i = -------
>       sum p_j
>
> One thing to check for is that if two tasks have the same nice level
> that their weights come out about equal. So for t_i and t_j, if n_i
> = n_j then you check that at least approximately, w_i = w_j, or even
> p_i = p_j, since we're not starting and stopping tasks in the midst of
> the test. Also, you can't simplify sum p_j to 1, since the set of tasks
> may not be the only things running.
>
> The other thing to do is try a different number of tasks with a
> different mix of nice levels. The weight w_i for a given nice
> level n_i should be the same even in a different mix of tasks
> and nice levels if the nice levels are the same.
>
> If this sounds too far out, there's nothing to worry about. You can
> just run the different numbers of tasks with different mixes of nice
> levels and post the %cpu numbers. Or if that's still a bit far out
> for you, a test that does all this is eventually going to get written.

chew.c does exactly that, just make sure sched_granularity_ms >= 5,000,000.


Thanks!

--
Al


2007-05-03 04:35:05

by William Lee Irwin III

Subject: Re: [ck] [REPORT] 2.6.21.1 vs 2.6.21-sd046 vs 2.6.21-cfs-v6

William Lee Irwin III wrote:
>> That's odd. The ->load_weight changes should've improved that quite
>> a bit. There may be something slightly off in how lag is computed,
>> or maybe the O(n) lag issue Ting Yang spotted is biting you.

On Thu, May 03, 2007 at 06:51:43AM +0300, Al Boldi wrote:
> Is it not biting you too?

I'm a kernel programmer. I'm not an objective tester.

It also happens to be the case that I personally have never encountered
a performance problem with any of the schedulers, mainline included, on
any system I use interactively. So my "user experience" is not valuable.


William Lee Irwin III wrote:
>> Also, I should say that the nice number affairs don't imply fairness
>> per se. The way that works is that when tasks have "weights" (like
>> nice levels in UNIX) the definition of fairness changes so that each
>> task gets shares of CPU bandwidth proportional to its weight instead
>> of one share for one task.

On Thu, May 03, 2007 at 06:51:43AM +0300, Al Boldi wrote:
> Ok, but you can easily expose scheduler unfairness by using nice levels as
> relative magnifiers; provided nice levels are implemented correctly.

This doesn't really fit in with anything I'm aware of.


William Lee Irwin III wrote:
>> The other thing to do is try a different number of tasks with a
>> different mix of nice levels. The weight w_i for a given nice
>> level n_i should be the same even in a different mix of tasks
>> and nice levels if the nice levels are the same.
>> If this sounds too far out, there's nothing to worry about. You can
>> just run the different numbers of tasks with different mixes of nice
>> levels and post the %cpu numbers. Or if that's still a bit far out
>> for you, a test that does all this is eventually going to get written.

On Thu, May 03, 2007 at 06:51:43AM +0300, Al Boldi wrote:
> chew.c does exactly that, just make sure sched_granularity_ms >= 5,000,000.

Please post the source of chew.c


-- wli

2007-05-03 06:38:58

by Al Boldi

Subject: Re: [ck] [REPORT] 2.6.21.1 vs 2.6.21-sd046 vs 2.6.21-cfs-v6

William Lee Irwin III wrote:
> William Lee Irwin III wrote:
> >> That's odd. The ->load_weight changes should've improved that quite
> >> a bit. There may be something slightly off in how lag is computed,
> >> or maybe the O(n) lag issue Ting Yang spotted is biting you.
>
> On Thu, May 03, 2007 at 06:51:43AM +0300, Al Boldi wrote:
> > Is it not biting you too?
>
> I'm a kernel programmer. I'm not an objective tester.
>
> It also happens to be the case that I personally have never encountered
> a performance problem with any of the schedulers, mainline included, on
> any system I use interactively. So my "user experience" is not valuable.
>
> William Lee Irwin III wrote:
> >> Also, I should say that the nice number affairs don't imply fairness
> >> per se. The way that works is that when tasks have "weights" (like
> >> nice levels in UNIX) the definition of fairness changes so that each
> >> task gets shares of CPU bandwidth proportional to its weight instead
> >> of one share for one task.
>
> On Thu, May 03, 2007 at 06:51:43AM +0300, Al Boldi wrote:
> > Ok, but you can easily expose scheduler unfairness by using nice levels
> > as relative magnifiers; provided nice levels are implemented correctly.
>
> This doesn't really fit in with anything I'm aware of.

You are not the first person who doesn't understand what I'm talking about.
Don't worry about it.

> William Lee Irwin III wrote:
> >> The other thing to do is try a different number of tasks with a
> >> different mix of nice levels. The weight w_i for a given nice
> >> level n_i should be the same even in a different mix of tasks
> >> and nice levels if the nice levels are the same.
> >> If this sounds too far out, there's nothing to worry about. You can
> >> just run the different numbers of tasks with different mixes of nice
> >> levels and post the %cpu numbers. Or if that's still a bit far out
> >> for you, a test that does all this is eventually going to get written.
>
> On Thu, May 03, 2007 at 06:51:43AM +0300, Al Boldi wrote:
> > chew.c does exactly that, just make sure sched_granularity_ms >=
> > 5,000,000.
>
> Please post the source of chew.c

Attached.


Thanks!

--
Al



Attachments:
chew.c (1.11 kB)

2007-05-03 07:22:46

by William Lee Irwin III

Subject: Re: [ck] [REPORT] 2.6.21.1 vs 2.6.21-sd046 vs 2.6.21-cfs-v6

On Thu, May 03, 2007 at 09:42:51AM +0300, Al Boldi wrote:
>         sched_rr_get_interval(0, &ts);
>         printf("pid %d, prio %3d, interval of %d nsec\n",
>                getpid(), getpriority(PRIO_PROCESS, 0), ts.tv_nsec);

Oh dear. What are you trying to figure out from the task's timeslice?
That's not even meaningful in cfs.


On Thu, May 03, 2007 at 09:42:51AM +0300, Al Boldi wrote:
>         start = last = stamp();
>         while (1) {
>                 cur = stamp();
>                 delta = cur - last;
>                 if (delta > thresh_ticks) {
>                         act = last - start;
>                         printf("pid %d, prio %3d, out for %4llu ms, ran for %4llu ms, load %3llu%\n",
>                                getpid(), getpriority(PRIO_PROCESS, 0),
>                                delta/1000, act/1000, (act*100)/(cur-start));
>                         start = cur = stamp();
>                 }
>                 last = cur;
>         }
>
>         return 0;
> }

This is looking for scheduling latencies, which are necessarily O(tasks).
This is not the way to do it.


-- wli

2007-05-03 07:57:21

by Al Boldi

Subject: Re: [ck] [REPORT] 2.6.21.1 vs 2.6.21-sd046 vs 2.6.21-cfs-v6

William Lee Irwin III wrote:
> On Thu, May 03, 2007 at 09:42:51AM +0300, Al Boldi wrote:
> >         sched_rr_get_interval(0, &ts);
> >         printf("pid %d, prio %3d, interval of %d nsec\n",
> >                getpid(), getpriority(PRIO_PROCESS, 0), ts.tv_nsec);
>
> Oh dear. What are you trying to figure out from the task's timeslice?

chew.c is general purpose. It's not specific to cfs.

> That's not even meaningful in cfs.
>
> On Thu, May 03, 2007 at 09:42:51AM +0300, Al Boldi wrote:
> >         start = last = stamp();
> >         while (1) {
> >                 cur = stamp();
> >                 delta = cur - last;
> >                 if (delta > thresh_ticks) {
> >                         act = last - start;
> >                         printf("pid %d, prio %3d, out for %4llu ms, ran for %4llu ms, load %3llu%\n",
> >                                getpid(), getpriority(PRIO_PROCESS, 0),
> >                                delta/1000, act/1000, (act*100)/(cur-start));
> >                         start = cur = stamp();
> >                 }
> >                 last = cur;
> >         }
> >
> >         return 0;
> > }
>
> This is looking for scheduling latencies, which are necessarily O(tasks).

This exposes the actual per-process latency/wait time, runtime, and CPU load,
irrespective of whether it is O(tasks) or not.

> This is not the way to do it.

You'll need to boot into /bin/sh and run two or more chew instances to see any
reasonable results.

Then tell me whether this is the way to do it or not.


Thanks!

--
Al